Commit Graph

30 Commits

Author SHA1 Message Date
Thierry Reding 9026ba7223 gpu: host1x: Use tegra_dev_iommu_get_stream_id()
Use the newly implemented tegra_dev_iommu_get_stream_id() helper to
encapsulate and centralize the IOMMU stream ID access.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2023-01-27 17:41:49 +01:00
Mikko Perttunen d5179020f5 gpu: host1x: External timeout/cancellation for fences
Currently all fences have a 30 second timeout to ensure they are
cleaned up if the fence never completes otherwise. However, this
one size fits all solution doesn't actually fit in every case,
such as syncpoint waiting where we want to be able to have timeouts
longer than 30 seconds. As such, we want to be able to give control
over fence cancellation to the caller (and maybe eventually get rid
of the internal timeout altogether).

Here we add this cancellation mechanism by essentially adding a
function for entering the timeout path by function call, and changing
the syncpoint wait function to use it.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2023-01-26 15:55:38 +01:00
Mikko Perttunen c24973ed79 gpu: host1x: Implement job tracking using DMA fences
In anticipation of removal of the intr API, implement job tracking
using DMA fences instead. The main two things about this are
making cdma_update schedule the work since fence completion can
now be called from interrupt context, and some complication in
ensuring the callback is not running when we free the fence.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2023-01-26 15:55:38 +01:00
Mikko Perttunen 5b7239c17c gpu: host1x: Initialize syncval in channel_submit()
During the refactoring of channel_submit(), assignment of syncval was
moved but it is also used in channel_submit(). Add this assignment back
to channel_submit() as well.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-07-08 17:35:19 +02:00
Mikko Perttunen 1411796f20 gpu: host1x: Rewrite job opcode sequence
For new (Tegra186+) SoCs, use a new ('full-featured') job opcode
sequence that is compatible with virtualization. In particular,
the Host1x hardware in Tegra234 is more strict regarding the sequence,
requiring ACQUIRE_MLOCK-SETCLASS-SETSTREAMID opcodes to occur in
that sequence without gaps (except for SETPAYLOAD), so let's do it
properly in one go now.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-07-08 16:27:53 +02:00
Mikko Perttunen 2486254781 gpu: host1x: Program context stream ID on submission
Add code to do stream ID switching at the beginning of a job. The
stream ID is switched to the stream ID specified by the context
passed in the job structure.

Before switching the stream ID, an OP_DONE wait is done on the
channel's engine to ensure that there is no residual ongoing
work that might do DMA using the new stream ID.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-07-08 16:27:52 +02:00
Dmitry Osipenko 6b6776e2ab gpu: host1x: Add initial runtime PM and OPP support
Add runtime PM and OPP support to the Host1x driver. For the starter we
will keep host1x always-on because dynamic power management require a major
refactoring of the driver code since lot's of code paths are missing the
RPM handling and we're going to remove some of these paths in the future.

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30
Tested-by: Paul Fertser <fercerpav@gmail.com> # PAZ00 T20
Tested-by: Nicolas Chauvet <kwizart@gmail.com> # PAZ00 T20 and TK1 T124
Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-12-16 14:07:07 +01:00
Mikko Perttunen e902585fc8 gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
Add support for inserting syncpoint waits in the CDMA pushbuffer.
These waits need to be done in HOST1X class, while gather submitted
by the application execute in engine class.

Support is added by converting the gather list of job into a command
list that can include both gathers and waits. When the job is
submitted, these commands are pushed as the appropriate opcodes
on the CDMA pushbuffer.

Also supported are waits relative to the start of the job,
which are useful for jobs doing multiple things with an engine
that doesn't natively support pipelining.

While at it, use 32-bit waits on chips that support them.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-08-10 14:41:19 +02:00
Mikko Perttunen c78f837ae3 gpu: host1x: Add no-recovery mode
Add a new property for jobs to enable or disable recovery i.e.
CPU increments of syncpoints to max value on job timeout. This
allows for a more solid model for hanged jobs, where userspace
doesn't need to guess if a syncpoint increment happened because
the job completed, or because job timeout was triggered.

On job timeout, we stop the channel, NOP all future jobs on the
channel using the same syncpoint, mark the syncpoint as locked
and resume the channel from the next job, if any.

The future jobs are NOPed, since because we don't do the CPU
increments, the value of the syncpoint is no longer synchronized,
and any waiters would become confused if a future job incremented
the syncpoint. The syncpoint is marked locked to ensure that any
future jobs cannot increment the syncpoint either, until the
application has recognized the situation and reallocated the
syncpoint.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-08-10 14:40:23 +02:00
Mikko Perttunen 2aed4f5ab0 gpu: host1x: Cleanup and refcounting for syncpoints
Add reference counting for allocated syncpoints to allow keeping
them allocated while jobs are referencing them. Additionally,
clean up various places using syncpoint IDs to use host1x_syncpt
pointers instead.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-31 17:42:13 +02:00
Thomas Gleixner 9952f6918d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 228 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:29:52 -07:00
Arnd Bergmann 8bbad1ba31 gpu: host1x: Program stream ID to bypass without SMMU
If SMMU support is not available, fall back to programming the bypass
stream ID (0x7f).

Fixes: de5469c21f ("gpu: host1x: Program the channel stream ID")
Suggested-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
[treding@nvidia.com: rebase this on top of a later build fix]
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-04-11 17:40:35 +02:00
Stefan Agner e154592a1d gpu: host1x: Fix compile error when IOMMU API is not available
In case the IOMMU API is not available compiling host1x fails with
the following error:

  In file included from drivers/gpu/host1x/hw/host1x06.c:27:
  drivers/gpu/host1x/hw/channel_hw.c: In function ‘host1x_channel_set_streamid’:
  drivers/gpu/host1x/hw/channel_hw.c:118:30: error: implicit declaration of function
    ‘dev_iommu_fwspec_get’; did you mean ‘iommu_fwspec_free’?  [-Werror=implicit-function-declaration]
  struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent);
                              ^~~~~~~~~~~~~~~~~~~~
                              iommu_fwspec_free

Fixes: de5469c21f ("gpu: host1x: Program the channel stream ID")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-04-11 10:35:39 +02:00
Thierry Reding 67a82dbc0a gpu: host1x: Support 40-bit addressing
Tegra186 and later support 40 bits of address space. Additional
registers need to be programmed to store the full 40 bits of push
buffer addresses.

Since command stream gathers can also reside in buffers in a 40-bit
address space, a new variant of the GATHER opcode is also introduced.
It takes two parameters: the first parameter contains the lower 32
bits of the address and the second parameter contains bits 32 to 39.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:35 +01:00
Thierry Reding de5469c21f gpu: host1x: Program the channel stream ID
When processing command streams, make sure the host1x's stream ID is
programmed for the channel so that addresses are properly translated
through the SMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:33 +01:00
Thierry Reding b7c61d511d gpu: host1x: Resize channel register region on Tegra186 and later
The register region allocated per channel was decreased from 16384 bytes
to 256 bytes on Tegra186 and later. Resize the region to make sure every
channel (instead of only the first) is properly programmed.

Suggested-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-27 17:18:26 +01:00
Thierry Reding ac330f45c7 gpu: host1x: Drop unnecessary host1x argument
Functions taking a pointer to a host1x syncpoint as an argument don't
need to specify a pointer to a host1x instance because it can be
obtained from the syncpoint.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:51:01 +02:00
Thierry Reding 24c94e166d gpu: host1x: Remove wait check support
The job submission userspace ABI doesn't support this and there are no
plans to implement it, so all of this code is dead and can be removed.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:50:04 +02:00
Mikko Perttunen 2316f29fb5 gpu: host1x: Enable gather filter
The gather filter is a feature present on Tegra124 and newer where the
hardware prevents GATHERed command buffers from executing commands
normally reserved for the CDMA pushbuffer which is maintained by the
kernel driver.

This commit enables the gather filter on all supporting hardware.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:52 +02:00
Mikko Perttunen c3f52220f2 gpu: host1x: Enable Tegra186 syncpoint protection
Since Tegra186 the Host1x hardware allows syncpoints to be assigned to
specific channels, preventing any other channels from incrementing
them.

Enable this feature where available and assign syncpoints to channels
when submitting a job. Syncpoints are currently never unassigned from
channels since that would require extra work and is unnecessary with
the current channel allocation model.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:52 +02:00
Mikko Perttunen 8474b02531 gpu: host1x: Refactor channel allocation code
This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:

- The previous code could deadlock due to an interaction
  between the 'reflock' mutex and CDMA timeout handling.
  This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
  of channel ownership handling into channel.c

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:38 +02:00
Thierry Reding 0b8070d12e gpu: host1x: Whitespace cleanup for readability
Insert a number of blank lines in places where they increase readability
of the code. Also collapse various variable declarations to shorten some
functions and finally rewrite some code for readability.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2016-06-23 11:59:30 +02:00
Thierry Reding 5c0d8d386b gpu: host1x: Use unsigned int consistently for IDs
IDs can never be negative so use unsigned int. In some instances an
explicitly sized type (such as u32) was used for no particular reason,
so turn those into unsigned int as well for consistency.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2016-06-23 11:59:24 +02:00
Thierry Reding b40d02bf96 gpu: host1x: Use struct host1x_bo pointers in traces
Rather than cast to a u32 use the struct host1x_bo pointers directly.
This avoid annoying warnings for 64-bit builds.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2014-11-13 16:11:32 +01:00
Arto Merilainen f5a954fed9 gpu: host1x: Add syncpoint base support
This patch adds support for hardware syncpoint bases. This creates
a simple mechanism to stall the command FIFO until an operation is
completed.

Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2013-10-31 09:55:48 +01:00
Thierry Reding fc3be3e8fc gpu: host1x: Use relative include paths
This is slightly safer than adding -Idrivers/gpu/host1x to cflags-y.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2013-10-31 09:55:40 +01:00
Thierry Reding 35d747a81d gpu: host1x: Expose syncpt and channel functionality
Expose the buffer objects, syncpoint and channel functionality in the
public public header so that drivers can use them.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2013-10-31 09:20:11 +01:00
Thierry Reding e1e906448d gpu: host1x: Make host1x header file public
In preparation to support host1x clients other than DRM, move this
header into a public location.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2013-10-31 09:20:10 +01:00
Terje Bergstrom 6236451d83 gpu: host1x: Add debug support
Add support for host1x debugging. Adds debugfs entries, and dumps
channel state to UART in case of stuck job.

Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de>
Tested-by: Thierry Reding <thierry.reding@avionic-design.de>
Tested-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
2013-04-22 12:32:46 +02:00
Terje Bergstrom 6579324a41 gpu: host1x: Add channel support
Add support for host1x client modules, and host1x channels to submit
work to the clients.

Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de>
Tested-by: Thierry Reding <thierry.reding@avionic-design.de>
Tested-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
2013-04-22 12:32:43 +02:00