Commit Graph

59 Commits

Author SHA1 Message Date
Thierry Reding 3e9c458433 gpu: host1x: Do not use mapping cache for job submissions
Buffer mappings used in job submissions are usually small and not
rapidly reused as opposed to framebuffers (which are usually large and
rapidly reused, for example when page-flipping between double-buffered
framebuffers). Avoid going through the mapping cache for these buffers
since the cache would also lead to leaks if nobody is ever releasing
the cache's last reference. For DRM/KMS these last references are
dropped when the framebuffers are removed and therefore no longer
needed.

While at it, also add a note about the need to explicitly remove the
final reference to the mapping in the cache.
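
The reference-counting behaviour described above can be sketched in a few lines. This is a user-space Python model, not kernel code, and every name in it (`Mapping`, `MappingCache`, `pin_uncached`, and so on) is hypothetical. It assumes the cache holds one reference of its own per mapping, which is why a cached mapping survives an unpin until something removes it explicitly.

```python
class Mapping:
    """Hypothetical stand-in for a buffer mapping with a reference count."""

    def __init__(self, bo, device):
        self.bo = bo
        self.device = device
        self.refcount = 1  # reference held by whoever created the mapping


class MappingCache:
    """Hypothetical per-client cache that keeps one reference of its own."""

    def __init__(self):
        self.entries = {}

    def pin(self, bo, device):
        key = (bo, device)
        mapping = self.entries.get(key)
        if mapping is None:
            mapping = Mapping(bo, device)  # refcount 1 is the cache's reference
            self.entries[key] = mapping
        mapping.refcount += 1  # the caller's reference
        return mapping

    def unpin(self, mapping):
        mapping.refcount -= 1  # the cache's reference is still held

    def remove(self, bo, device):
        """Explicitly drop the cache's final reference (framebuffer removal)."""
        mapping = self.entries.pop((bo, device))
        mapping.refcount -= 1
        return mapping


def pin_uncached(bo, device):
    """Job-submission path: no cache, so the caller holds the only reference."""
    return Mapping(bo, device)


def unpin_uncached(mapping):
    mapping.refcount -= 1
    return mapping.refcount == 0  # True when the mapping can be freed
```

In this model a framebuffer mapping only reaches a count of zero after `remove()`, whereas a job-submission mapping is released by its matching unpin, which is the leak the commit avoids.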

Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-04-06 15:12:36 +02:00
Randy Dunlap fe696ccb27 gpu: host1x: Fix a kernel-doc warning
Add @cache description to eliminate a kernel-doc warning.

include/linux/host1x.h:104: warning: Function parameter or member 'cache' not described in 'host1x_client'

Fixes: 1f39b1dfa5 ("drm/tegra: Implement buffer object cache")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: linux-tegra@vger.kernel.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-04-06 15:08:17 +02:00
Dmitry Osipenko 9ca790f446 gpu: host1x: Add host1x_channel_stop()
Add host1x_channel_stop(), which waits until the channel becomes idle and
then stops the channel hardware. This is needed for supporting
suspend/resume by host1x drivers since the hardware state is lost after
power-gating, thus the channel needs to be stopped before the client
enters suspend.

Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30
Tested-by: Paul Fertser <fercerpav@gmail.com> # PAZ00 T20
Tested-by: Nicolas Chauvet <kwizart@gmail.com> # PAZ00 T20 and TK1 T124
Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-12-16 14:07:07 +01:00
Mikko Perttunen 46f226c93d drm/tegra: Add NVDEC driver
Add support for booting and using NVDEC on Tegra210, Tegra186
and Tegra194 to the Host1x and TegraDRM drivers. Booting in
secure mode is not currently supported.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-12-16 14:07:06 +01:00
Thierry Reding 1f39b1dfa5 drm/tegra: Implement buffer object cache
This cache is used to avoid mapping and unmapping buffer objects
unnecessarily. Mappings are cached per client and stay hot until
the buffer object is destroyed.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-12-16 14:07:06 +01:00
Thierry Reding c6aeaf56f4 drm/tegra: Implement correct DMA-BUF semantics
DMA-BUF requires that each device that accesses a DMA-BUF attaches to it
separately. To do so the host1x_bo_pin() and host1x_bo_unpin() functions
need to be reimplemented so that they can return a mapping, which either
represents an attachment or a map of the driver's own GEM object.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-12-16 14:07:06 +01:00
Mikko Perttunen 0fddaa85d6 gpu: host1x: Add option to skip firewall for a job
The new UAPI will have its own firewall, and we don't want to run
the firewall in the Host1x driver for those jobs. As such, add a
parameter to host1x_job_alloc to specify if we want to skip the
firewall in the Host1x driver.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-08-10 14:42:49 +02:00
Mikko Perttunen e902585fc8 gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
Add support for inserting syncpoint waits in the CDMA pushbuffer.
These waits need to be done in the HOST1X class, while gathers submitted
by the application execute in the engine class.

Support is added by converting the gather list of a job into a command
list that can include both gathers and waits. When the job is
submitted, these commands are pushed as the appropriate opcodes
on the CDMA pushbuffer.

Also supported are waits relative to the start of the job,
which are useful for jobs doing multiple things with an engine
that doesn't natively support pipelining.

While at it, use 32-bit waits on chips that support them.
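
The conversion from a mixed command list to pushbuffer entries might look roughly like the following. This is a Python model rather than the driver's code: the `Gather` and `Wait` types, the symbolic entry names (`"GATHER"`, `"SETCLASS"`, `"WAIT_SYNCPT"`) and the 32-bit wraparound of relative thresholds are all assumptions made for illustration, not real host1x opcodes.

```python
from dataclasses import dataclass


@dataclass
class Gather:
    """Hypothetical gather command: a run of words at some offset."""
    words: int
    offset: int


@dataclass
class Wait:
    """Hypothetical syncpoint wait; `relative` counts from the job start."""
    syncpt_id: int
    threshold: int
    relative: bool = False


def build_pushbuffer(cmds, job_start_value):
    """Turn a job's command list into symbolic pushbuffer entries."""
    out = []
    for cmd in cmds:
        if isinstance(cmd, Gather):
            out.append(("GATHER", cmd.offset, cmd.words))
            continue
        threshold = cmd.threshold
        if cmd.relative:
            # Assumed: syncpoint values are 32-bit counters that wrap around.
            threshold = (job_start_value + threshold) & 0xFFFFFFFF
        # Waits run in the HOST1X class, then the engine class is restored.
        out.append(("SETCLASS", "HOST1X"))
        out.append(("WAIT_SYNCPT", cmd.syncpt_id, threshold))
        out.append(("SETCLASS", "ENGINE"))
    return out
```

Keeping gathers and waits in one ordered list preserves the submission order, so a wait placed between two gathers is emitted between them.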

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-08-10 14:41:19 +02:00
Mikko Perttunen 17a298e9ac gpu: host1x: Add job release callback
Add a callback field to the job structure, to be called just before
the job is to be freed. This allows the job's submitter to clean
up any of its own state, such as decrementing runtime PM refcounts.
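
A minimal sketch of the idea, written as a Python model with hypothetical names (`Job`, `Engine`, `pm_refcount`): the callback runs once, when the last job reference is dropped, and the submitter uses it to release its own state.

```python
class Job:
    """Hypothetical reference-counted job with an optional release callback."""

    def __init__(self, release=None):
        self.refcount = 1
        self.release = release

    def get(self):
        self.refcount += 1
        return self

    def put(self):
        """Drop one reference; run the callback just before the job is freed."""
        self.refcount -= 1
        if self.refcount == 0:
            if self.release is not None:
                self.release(self)
            return True  # the job would be freed at this point
        return False


class Engine:
    """Hypothetical submitter holding one runtime PM reference per job."""

    def __init__(self):
        self.pm_refcount = 0

    def submit(self):
        self.pm_refcount += 1
        return Job(release=self._job_released)

    def _job_released(self, job):
        self.pm_refcount -= 1
```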

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-08-10 14:41:02 +02:00
Mikko Perttunen c78f837ae3 gpu: host1x: Add no-recovery mode
Add a new property for jobs to enable or disable recovery, i.e.
CPU increments of syncpoints to max value on job timeout. This
allows for a more solid model for hung jobs, where userspace
doesn't need to guess if a syncpoint increment happened because
the job completed, or because job timeout was triggered.

On job timeout, we stop the channel, NOP all future jobs on the
channel using the same syncpoint, mark the syncpoint as locked
and resume the channel from the next job, if any.

The future jobs are NOPed because, since we don't do the CPU
increments, the value of the syncpoint is no longer synchronized,
and any waiters would become confused if a future job incremented
the syncpoint. The syncpoint is marked locked to ensure that any
future jobs cannot increment the syncpoint either, until the
application has recognized the situation and reallocated the
syncpoint.
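
The timeout sequence above can be modelled as follows. This is a simplified Python sketch with hypothetical names (`Syncpoint`, `Job`, `handle_timeout`, `may_submit`); stopping and restarting the channel hardware is not modelled, only the bookkeeping.

```python
class Syncpoint:
    """Hypothetical syncpoint that can be marked locked after a timeout."""

    def __init__(self, syncpt_id):
        self.id = syncpt_id
        self.locked = False


class Job:
    """Hypothetical queued job that increments one syncpoint."""

    def __init__(self, name, syncpt):
        self.name = name
        self.syncpt = syncpt
        self.nop = False


def handle_timeout(queue, timed_out):
    """Handle a timeout of `timed_out`; return the jobs the channel resumes with."""
    # The channel would be stopped here (not modelled).
    remaining = [job for job in queue if job is not timed_out]
    # NOP every later job on the channel that uses the same syncpoint.
    for job in remaining:
        if job.syncpt is timed_out.syncpt:
            job.nop = True
    # Lock the syncpoint until the application reallocates it.
    timed_out.syncpt.locked = True
    # The channel resumes from the next job, if any.
    return remaining


def may_submit(syncpt):
    """New jobs must not increment a locked syncpoint."""
    return not syncpt.locked
```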

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-08-10 14:40:23 +02:00
Mikko Perttunen 687db2207b gpu: host1x: Add DMA fence implementation
Add an implementation of dma_fences based on syncpoints. Syncpoint
interrupts are used to signal fences. Additionally, after
software signaling has been enabled, a 30 second timeout is started.
If the syncpoint threshold is not reached within this period,
the fence is signaled with an -ETIMEDOUT error code. This is to
allow fences that would never reach their syncpoint threshold to
be cleaned up. The timeout can potentially be removed in the future
after job tracking code has been refactored.
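
The fence life cycle can be illustrated with a small Python model. It is only a sketch: the real implementation is interrupt-driven, whereas this version is polled with an explicit `now` argument, the class and method names are hypothetical, and syncpoint wraparound is ignored for brevity.

```python
import errno

FENCE_TIMEOUT_SECONDS = 30  # timeout quoted in the commit message


class SyncptFence:
    """Hypothetical fence that signals at a syncpoint threshold or on timeout."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.signaled = False
        self.error = 0
        self.deadline = None

    def enable_signaling(self, now):
        """Start the timeout once software signaling has been enabled."""
        self.deadline = now + FENCE_TIMEOUT_SECONDS

    def update(self, syncpt_value, now):
        """Re-evaluate the fence; stands in for the syncpoint interrupt."""
        if self.signaled:
            return True
        if syncpt_value >= self.threshold:  # wraparound ignored in this sketch
            self.signaled = True
        elif self.deadline is not None and now >= self.deadline:
            self.error = -errno.ETIMEDOUT
            self.signaled = True
        return self.signaled
```

A fence that times out is still signaled, so waiters are released, but `error` lets them tell the two outcomes apart.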

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-08-10 14:39:50 +02:00
Thierry Reding 0cfe5a6e75 gpu: host1x: Split up client initialization and registration
In some cases we may need to initialize the host1x client first before
registering it. This commit adds a new helper that will do nothing but
the initialization of the data structure.

At the same time, the initialization is removed from the registration
function. Note, however, that for simplicity we explicitly initialize
the client when the host1x_client_register() function is called, as
opposed to the low-level __host1x_client_register() function. This
allows existing callers to remain unchanged.
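
The layering described here can be sketched as below. The Python names are hypothetical stand-ins: `client_register()` models host1x_client_register() and `low_level_register()` models __host1x_client_register(). The check that rejects an uninitialized client is an assumption added to make the ordering visible, not something the commit message states.

```python
class Client:
    """Hypothetical client with separate initialization and registration state."""

    def __init__(self, name):
        self.name = name
        self.initialized = False
        self.registered = False


def client_init(client):
    """New helper: only initializes the data structure."""
    client.initialized = True


def low_level_register(client):
    """Registration without initialization (models the low-level function)."""
    if not client.initialized:
        # Assumed behaviour for illustration; the real function may differ.
        raise RuntimeError("client must be initialized before registration")
    client.registered = True


def client_register(client):
    """Convenience wrapper: initialize, then register, so callers stay unchanged."""
    client_init(client)
    low_level_register(client)
```

Existing callers keep using the wrapper, while a driver that needs the client set up early can call the two steps separately.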

Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-05-17 12:31:05 +02:00
Thierry Reding 933deb8c7b gpu: host1x: Add early init and late exit callbacks
These callbacks can be used by client drivers to run code during early
init and during late exit. Early init callbacks are run prior to the
regular init callbacks while late exit callbacks run after the regular
exit callbacks.
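
The callback ordering can be sketched as follows. This is a Python model with hypothetical names; it assumes that all early init callbacks run before any regular init callback, and that exit paths walk the clients in reverse order, neither of which the commit message spells out.

```python
class Client:
    """Hypothetical client; the early init and late exit callbacks are optional."""

    def __init__(self, name, log, has_early_init=False, has_late_exit=False):
        self.name = name
        self.log = log
        self.has_early_init = has_early_init
        self.has_late_exit = has_late_exit

    def _record(self, step):
        self.log.append(f"{self.name}:{step}")


def device_init(clients):
    """Run early init callbacks first, then the regular init callbacks."""
    for client in clients:
        if client.has_early_init:
            client._record("early_init")
    for client in clients:
        client._record("init")


def device_exit(clients):
    """Run regular exit callbacks first, then the late exit callbacks."""
    for client in reversed(clients):
        client._record("exit")
    for client in reversed(clients):
        if client.has_late_exit:
            client._record("late_exit")
```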

Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-31 17:42:14 +02:00
Mikko Perttunen f5ba33fb96 gpu: host1x: Reserve VBLANK syncpoints at initialization
On T20-T148 chips, the bootloader can set up a boot splash
screen with DC configured to increment syncpoint 26/27
at VBLANK. Because of this we shouldn't allow these syncpoints
to be allocated until DC has been reset and will no longer
increment them in the background.

As such, on these chips, reserve those two syncpoints at
initialization, and only mark them free once the DC
driver has indicated it's safe to do so.
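
A small allocator model shows the effect of the reservation. It is a Python sketch with hypothetical names (`SyncptPool`, `release_vblank`); only the syncpoint numbers 26 and 27 come from the commit message.

```python
VBLANK_SYNCPTS = (26, 27)  # syncpoints the bootloader may leave incrementing


class SyncptPool:
    """Hypothetical syncpoint allocator that can hold back the VBLANK syncpoints."""

    def __init__(self, count, reserve_vblank):
        self.free = set(range(count))
        self.reserved = set()
        if reserve_vblank:
            for syncpt_id in VBLANK_SYNCPTS:
                self.free.discard(syncpt_id)
                self.reserved.add(syncpt_id)

    def alloc(self):
        """Return the lowest free syncpoint ID, or None if none is left."""
        if not self.free:
            return None
        syncpt_id = min(self.free)
        self.free.remove(syncpt_id)
        return syncpt_id

    def release_vblank(self, syncpt_id):
        """Called once the display controller no longer increments the syncpoint."""
        if syncpt_id in self.reserved:
            self.reserved.remove(syncpt_id)
            self.free.add(syncpt_id)
```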

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-31 17:42:13 +02:00
Mikko Perttunen 2aed4f5ab0 gpu: host1x: Cleanup and refcounting for syncpoints
Add reference counting for allocated syncpoints to allow keeping
them allocated while jobs are referencing them. Additionally,
clean up various places using syncpoint IDs to use host1x_syncpt
pointers instead.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-31 17:42:13 +02:00
Mikko Perttunen 86cec7ece3 gpu: host1x: Allow syncpoints without associated client
Syncpoints don't need to be associated with any client,
so remove the property, and expose host1x_syncpt_alloc.
This will allow allocating syncpoints without prior knowledge
of the engine that it will be used with.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30 19:53:24 +02:00
Mikko Perttunen a24f98176d gpu: host1x: Use different lock classes for each client
To avoid false lockdep warnings, give each client lock a different
lock class, passed from the initialization site by macro.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30 19:37:20 +02:00
Sowjanya Komatineni cf5153e433 media: gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
With the split of MIPI calibration into tegra_mipi_calibrate() and
tegra_mipi_wait(), MIPI clock is not kept enabled and mutex is not locked
till the calibration is done.

So, this patch keeps the MIPI clock enabled and the mutex locked from the
start of calibration until it is done.

To let the calibration process go through its finite sequence codes before
the calibration logic waits for the pads' idle state, a wait time of
75 usec is added to make sure it sees the idle state and applies the
results.

This patch renames tegra_mipi_calibrate() to tegra_mipi_start_calibration()
and tegra_mipi_wait() to tegra_mipi_finish_calibration() to be in line
with their usage.
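
The start/finish pairing can be modelled like this. It is a Python sketch, not the driver: the class name, the `pads_idle` argument and the boolean return value are assumptions, and the 75 usec settle time is left out.

```python
import threading


class MipiDevice:
    """Hypothetical MIPI calibration device with a clock and a mutex."""

    def __init__(self):
        self.lock = threading.Lock()
        self.clock_enabled = False

    def start_calibration(self):
        """Trigger calibration; the lock and the clock stay held afterwards."""
        self.lock.acquire()
        self.clock_enabled = True
        # The calibration trigger itself is not modelled.

    def finish_calibration(self, pads_idle):
        """Wait for completion, then release the clock and the mutex."""
        try:
            return bool(pads_idle)  # False stands in for a timeout
        finally:
            self.clock_enabled = False
            self.lock.release()
```

The point of the pairing is that nothing else can touch the calibration unit, or gate its clock, between the two calls.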

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-08-28 15:12:38 +02:00
Sowjanya Komatineni b3f1b76071 gpu: host1x: mipi: Split tegra_mipi_calibrate() and tegra_mipi_wait()
SW can trigger MIPI pads calibration any time after power on
but calibration results will be latched and applied to the pads
by MIPI CAL unit only when the link is in LP-11 state and then
status register will be updated.

For CSI, pads calibration is triggered during CSI stream enable,
where the CSI receiver is kept ready prior to sensor or CSI
transmitter stream start.

So, pads may not be in LP-11 at this time and waiting for the
calibration to be done immediately after calibration start will
result in a timeout.

This patch splits tegra_mipi_calibrate() and tegra_mipi_wait()
so triggering for calibration and waiting for it to complete can
happen at different stages.

Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-07-17 16:06:14 +02:00
Sowjanya Komatineni 767598d447 gpu: host1x: mipi: Update tegra_mipi_request() to be node based
The Tegra CSI driver needs a separate MIPI device for each channel, as
calibration of the corresponding MIPI pads for each channel should
happen independently.

So, this patch updates the tegra_mipi_request() API to add a device_node
pointer argument to allow creating a MIPI device for a specific device
node rather than a device.

Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-07-17 16:06:13 +02:00
Colton Lewis 2fd2bc7f49 gpu: host1x: Correct trivial kernel-doc inconsistencies
Silence documentation build warnings by adding kernel-doc fields.

./include/linux/host1x.h:69: warning: Function parameter or member 'parent' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'usecount' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'lock' not described in 'host1x_client'

Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-06-16 18:59:45 +02:00
Thierry Reding 501be6c1c7 drm/tegra: Fix SMMU support on Tegra124 and Tegra210
When testing whether or not to enable the use of the SMMU, consult the
supported DMA mask rather than the actually configured DMA mask, since
the latter might already have been restricted.

Fixes: 2d9384ff91 ("drm/tegra: Relax IOMMU usage criteria on old Tegra")
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-04-28 11:44:07 +02:00
Dave Airlie fd7226fbb2 drm/tegra: Changes for v5.6-rc1
Merge tag 'drm/tegra/for-5.6-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next

drm/tegra: Changes for v5.6-rc1

This contains a small set of mostly fixes and some minor improvements.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200111004835.2412858-1-thierry.reding@gmail.com
2020-01-15 16:21:28 +10:00
Thierry Reding fd67e9c6ed drm/tegra: Do not implement runtime PM
The Tegra DRM driver heavily relies on the implementations for runtime
suspend/resume to be called at specific times. Unfortunately, there are
some cases where that doesn't work. One example is if the user disables
runtime PM for a given subdevice. Another example is that the PM core
acquires a reference to runtime PM during system sleep, effectively
preventing devices from going into low power modes. This is intentional
to avoid nasty race conditions, but it also causes system sleep to not
function properly on all Tegra systems.

Fix this by not implementing runtime PM at all. Instead, a minimal,
reference-counted suspend/resume infrastructure is added to the host1x
bus. This has the benefit that it can be used regardless of the system
power state (or any transitions we might be in), or whether or not the
user allows runtime PM.

Atomic modesetting guarantees that these functions will end up being
called at the right point in time, so the pitfalls for the more generic
runtime PM do not apply here.
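
A reference-counted suspend/resume scheme of this kind can be sketched as follows. The Python names (`Client`, `client_resume`, `client_suspend`, `usecount`) are hypothetical; the model only shows that the hardware callbacks run on the first resume and the last suspend.

```python
class Client:
    """Hypothetical client with a use count guarding suspend/resume."""

    def __init__(self, name):
        self.name = name
        self.usecount = 0
        self.log = []


def client_resume(client):
    """Power up on the first user only; later users just take a reference."""
    if client.usecount == 0:
        client.log.append("resume")
    client.usecount += 1


def client_suspend(client):
    """Power down when the last user drops its reference."""
    if client.usecount == 0:
        raise RuntimeError("unbalanced suspend")
    client.usecount -= 1
    if client.usecount == 0:
        client.log.append("suspend")
```

Because the count is kept by the bus code itself, it behaves the same regardless of the system power state or of whether the user has disabled runtime PM.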

Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-01-10 16:37:43 +01:00
Thierry Reding 608f43ad27 gpu: host1x: Rename "parent" to "host"
Rename the host1x clients' parent to "host" because that more closely
describes what it is. The parent can be confused with the parent device
in terms of the device hierarchy. Subsequent patches will add a new
member that refers to the parent in that hierarchy.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-01-10 16:37:38 +01:00
Daniel Vetter 6c56e8adc0 drm-misc-next for v5.6:
Merge tag 'drm-misc-next-2019-12-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.6:

UAPI Changes:
- Add support for DMA-BUF HEAPS.

Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.

Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fall back to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is stepping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializing.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.

Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 support to komeda.
- Remove global copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer dereference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
2019-12-17 13:57:54 +01:00
Daniel Vetter 35bd71dd1c drm/tegra: Delete host1x_bo_ops->k(un)map
It doesn't have any callers anymore.

Aside: The ->mmap/munmap hooks have somewhat confusing names; they don't
do userspace mmaps, but a kernel vmap. I think most places use vmap
for this, except ttm, which uses kmap for vmap for added confusion.
mmap seems to be used entirely for userspace mappings set up through
the mmap(2) syscall.

Reviewed-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: linux-tegra@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20191118103536.17675-3-daniel.vetter@ffwll.ch
2019-11-25 22:36:01 +01:00
Thierry Reding ab4f81bfc2 gpu: host1x: Add direction flags to relocations
Add direction flags to host1x relocations performed during job pinning.
These flags indicate the kinds of accesses that hardware is allowed to
perform on the relocated buffers.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29 15:04:34 +01:00
Thierry Reding 80327ce3d4 gpu: host1x: Overhaul host1x_bo_{pin,unpin}() API
The host1x_bo_pin() and host1x_bo_unpin() APIs are used to pin and unpin
buffers during host1x job submission. Pinning currently returns the SG
table and the DMA address (an IOVA if an IOMMU is used or a physical
address if no IOMMU is used) of the buffer. The DMA address is only used
for buffers that are relocated, whereas the host1x driver will map
gather buffers into its own IOVA space so that they can be processed by
the CDMA engine.

This approach has a couple of issues. On one hand it's not very useful
to return a DMA address for the buffer if host1x doesn't need it. On the
other hand, returning the SG table of the buffer is suboptimal because a
single SG table cannot be shared for multiple mappings, because the DMA
address is stored within the SG table, and the DMA address may be
different for different devices.

Subsequent patches will move the host1x driver over to the DMA API which
doesn't work with a single shared SG table. Fix this by returning a new
SG table each time a buffer is pinned. This allows the buffer to be
referenced by multiple jobs for different engines.

Change the prototypes of host1x_bo_pin() and host1x_bo_unpin() to take a
struct device *, specifying the device for which the buffer should be
pinned. This is required in order to be able to properly construct the
SG table. While at it, make host1x_bo_pin() return the SG table because
that allows us to return an ERR_PTR()-encoded error code if we need to,
or return NULL to signal that we don't need the SG table to be remapped
and can simply use the DMA address as-is. At the same time, returning
the DMA address is made optional because in the example of command
buffers, host1x doesn't need to know the DMA address since it will have
to create its own mapping anyway.
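
Why a fresh SG table per pin matters can be shown with a small Python model. The names (`BufferObject`, `dma_map`, `PAGE_SIZE`) and the address values are hypothetical; the sketch only demonstrates that per-device DMA addresses stored in separate tables cannot overwrite each other.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class BufferObject:
    """Hypothetical buffer object backed by a number of pages."""

    def __init__(self, num_pages):
        self.num_pages = num_pages

    def pin(self, device):
        """Return a fresh scatter-gather table for `device` on every call."""
        entries = [{"page": i, "dma_address": None} for i in range(self.num_pages)]
        return {"device": device, "entries": entries}


def dma_map(sgt, iova_base):
    """Store device-specific DMA addresses inside the table itself."""
    for index, entry in enumerate(sgt["entries"]):
        entry["dma_address"] = iova_base + index * PAGE_SIZE
    return sgt
```

With a single shared table, mapping the buffer for a second device would have replaced the addresses the first device relies on.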

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29 15:04:34 +01:00
Thierry Reding aacdf19849 drm/tegra: Move IOMMU group into host1x client
Handling of the IOMMU group attachment is common to all clients, so move
the group into the client to simplify code.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28 11:18:37 +01:00
Thierry Reding caccddcfc4 gpu: host1x: Request channels for clients, not devices
A struct device doesn't carry much information that a channel might be
interested in, but the client very much does. Request channels for the
clients rather than their parent devices and store a pointer to them
in order to have that information available when needed.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28 11:18:33 +01:00
Dave Airlie dfd03396d7 drm/tegra: Changes for v5.3-rc1
Merge tag 'drm/tegra/for-5.3-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next

drm/tegra: Changes for v5.3-rc1

This contains a couple of small improvements and cleanups for the Tegra
DRM driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621150753.19550-1-thierry.reding@gmail.com
2019-06-25 12:59:43 +10:00
Thierry Reding 1e390478cf gpu: host1x: Increase maximum DMA segment size
Recent versions of the DMA API debug code have started to warn about
violations of the maximum DMA segment size. This is because the segment
size defaults to 64 KiB, which can easily be exceeded in large buffer
allocations such as used in DRM/KMS for framebuffers.

Technically the Tegra SMMU and ARM SMMU don't have a maximum segment
size (they map individual pages irrespective of whether they are
contiguous or not), so the choice of 4 MiB is a bit arbitrary here. The
maximum segment size is a 32-bit unsigned integer, though, so we can't
set it to the correct maximum size, which would be the size of the
aperture.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05 15:06:03 +02:00
Thomas Gleixner 1621633323 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 1
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not write to the free software foundation inc
  51 franklin street fifth floor boston ma 02110 1301 usa

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option [no]_[pad]_[ctrl] any later version this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 51 franklin street fifth floor boston ma
  02110 1301 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 176 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154040.652910950@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 11:28:39 +02:00
Thierry Reding 326bbd79fd gpu: host1x: Use not explicitly sized types
The number of words and the offset in a gather don't need to be
explicitly sized, so make them unsigned int instead.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:51:37 +02:00
Thierry Reding 06490bb99e gpu: host1x: Rename relocarray -> relocs for consistency
All other array variables use a plural, and this is the only one using
the *array suffix. This is confusing, so rename it for consistency.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:51:25 +02:00
Thierry Reding bf3d41ccab gpu: host1x: Store pointer to client in jobs
Rather than storing some identifier derived from the application
context that can't be used concretely anywhere, store a pointer to the
client directly so that accesses can be made directly through that
client object.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:50:24 +02:00
Thierry Reding 24c94e166d gpu: host1x: Remove wait check support
The job submission userspace ABI doesn't support this and there are no
plans to implement it, so all of this code is dead and can be removed.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:50:04 +02:00
Thierry Reding 617dd7cc49 gpu: host1x: syncpt: Request syncpoints per client
Rather than request syncpoints for a struct device *, request them for a
struct host1x_client *. This is important because subsequent patches are
going to break the assumption that host1x will always be the parent for
devices requesting a syncpoint. It's also a more natural choice because
host1x clients are really the only ones that will know how to deal with
syncpoints.

Note that host1x clients are always guaranteed to be children of host1x,
regardless of their location in the device tree.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:51 +02:00
Mikko Perttunen 8474b02531 gpu: host1x: Refactor channel allocation code
This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:

- The previous code could deadlock due to an interaction
  between the 'reflock' mutex and CDMA timeout handling.
  This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
  of channel ownership handling into channel.c

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:38 +02:00
Dmitry Osipenko a2b78b0d53 gpu: host1x: Correct swapped arguments in the is_addr_reg() definition
Arguments of the .is_addr_reg() are swapped in the definition of the
function, which is quite confusing.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:24:18 +02:00
Dmitry Osipenko 0f563a4bf6 gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall
Several channels could be made to write the same unit concurrently via
the SETCLASS opcode, and trusting userspace is a bad idea. It should be
possible to drop the per-client channel reservation and add per-unit
locking by inserting MLOCKs into the command stream to re-allow the
SETCLASS opcode, but that will be much more work. Let's forbid the
unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:23:50 +02:00
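
The check described in the commit above can be sketched as a per-word test in userspace C. This is a minimal illustration, not the driver's firewall code. It assumes the opcode sits in the top four bits of a command word, that SETCLASS is opcode 0, and that the class ID occupies bits 6 to 15; all example_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: opcode in bits 31:28, SETCLASS being opcode 0. */
#define EXAMPLE_OPCODE_SETCLASS 0x0u

/* Build a SETCLASS word under the assumed layout (offset, class, mask). */
uint32_t example_setclass(uint32_t class_id, uint32_t offset, uint32_t mask)
{
	return (EXAMPLE_OPCODE_SETCLASS << 28) |
	       ((offset & 0xfff) << 16) |
	       ((class_id & 0x3ff) << 6) |
	       (mask & 0x3f);
}

/*
 * Reject a SETCLASS word that selects a class other than the one the job
 * was submitted for. Words with any other opcode are not judged here.
 */
bool example_word_allowed(uint32_t word, uint32_t job_class)
{
	uint32_t opcode = word >> 28;
	uint32_t class_id = (word >> 6) & 0x3ff;

	if (opcode != EXAMPLE_OPCODE_SETCLASS)
		return true;

	return class_id == job_class;
}
```

With a job class of 0x51, example_word_allowed(example_setclass(0x51, 0, 0), 0x51) accepts the word, while a SETCLASS to any other class is refused.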
Dmitry Osipenko d0fbbdff2e drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL
The waitchecks, along with multiple syncpoints per submit, are not ready
for use yet, so let's forbid them for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:16:37 +02:00
Thierry Reding 466749f13e gpu: host1x: Flesh out kerneldoc
Improve kerneldoc for the public parts of the host1x infrastructure in
preparation for adding a driver-specific part to the GPU documentation.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 13:58:43 +02:00
Arto Merilainen 0ae797a8ba drm/tegra: Add VIC support
This patch adds support for Video Image Compositor engine which
can be used for 2d operations.

Signed-off-by: Andrew Chew <achew@nvidia.com>
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-04-05 18:11:48 +02:00
Thierry Reding 87904c3e82 drm/tegra: dsi: Enhance runtime power management
The MIPI DSI output on Tegra SoCs requires some external logic to
calibrate the MIPI pads before a video signal can be transmitted. This
MIPI calibration logic needs to be powered on while the MIPI pads are
being used, which is currently done as part of the DSI driver's probe
implementation.

This is suboptimal because it will leave the MIPI calibration logic
powered up even if the DSI output is never used.

On Tegra114 and earlier this behaviour also causes the driver to hang
while trying to power up the MIPI calibration logic because the power
partition that contains the MIPI calibration logic will be powered on
by the display controller at output pipeline configuration time. Thus
the power up sequence for the MIPI calibration logic happens before
its power partition is guaranteed to be enabled.

Fix this by splitting up the API into a request/free pair of functions
that manage the runtime dependency between the DSI and the calibration
modules (no registers are accessed) and a set of enable, calibrate and
disable functions that program the MIPI calibration logic at points in
time where the power partition is really enabled.

While at it, make sure that the runtime power management also works in
ganged mode, which is currently also broken.

Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2016-08-24 15:58:57 +02:00
Thierry Reding b4a20144e0 gpu: host1x: Export host1x_syncpt_read()
This function is used to read the current value of the syncpt and is
useful in situations where drivers don't schedule work and wait for the
syncpoint to increment. One particular use-case is using the syncpoint
as a VBLANK counter.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2015-04-02 18:46:20 +02:00
Thierry Reding f4c5cf88fb gpu: host1x: Provide a proper struct bus_type
Previously the struct bus_type exported by the host1x infrastructure was
only a very basic skeleton. Turn that implementation into a more full-
fledged bus to support proper probe ordering and power management.

Note that the bus infrastructure needs to be available before any of the
drivers can be registered. This is automatically ensured if all drivers
are built as loadable modules (via symbol dependencies). If all drivers
are built-in there are no such guarantees and the link order determines
the initcall ordering. Adjust drivers/gpu/Makefile to make sure that the
host1x bus infrastructure is initialized prior to any of its users (only
drm/tegra currently).

v2: Fix building host1x and tegra-drm as modules
    Reported-by: Dave Airlie <airlied@gmail.com>

Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Mark Zhang <markz@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2015-01-27 10:09:14 +01:00
Thierry Reding 536e171522 gpu: host1x: Call ->remove() only when a device is bound
When a driver's ->probe() function fails, the host1x bus must not call
its ->remove() function because the driver will already have cleaned up
in the error handling path in ->probe().

Signed-off-by: Thierry Reding <treding@nvidia.com>
2015-01-23 12:07:00 +01:00
Thierry Reding 961e3beae3 drm/tegra: Make job submission 64-bit safe
Job submission currently relies on the fact that struct drm_tegra_reloc
and struct host1x_reloc are the same size and uses a simple call to the
copy_from_user() function to copy them to kernel space. This causes the
handle to be stored in the buffer object field, which then needs a cast
to a 32 bit integer to resolve it to a proper buffer object pointer and
store it back in the buffer object field.

On 64-bit architectures that will no longer work, since pointers are 64
bits wide whereas handles will remain 32 bits. This causes the sizes of
both structures to become different and copying will no longer work.

Fix this by adding a new function, host1x_reloc_get_user(), that copies
the structures field by field.

While at it, use substructures for the command and target buffers in
struct host1x_reloc for better readability. Also use unsized types to
make it more obvious that this isn't part of userspace ABI.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2014-08-04 10:07:36 +02:00
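
The field-by-field copy described in the commit above can be sketched in userspace C as follows. This is a simplified illustration, not the driver code: the struct layouts, the toy handle table and every example_* name are assumptions made for this sketch, and a plain pointer dereference stands in for the kernel's user-memory accessors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace layout: fixed-width fields, 32-bit handles. */
struct example_user_reloc {
	struct { uint32_t handle; uint32_t offset; } cmdbuf;
	struct { uint32_t handle; uint32_t offset; } target;
	uint32_t shift;
	uint32_t pad;
};

/* Hypothetical buffer object; stands in for the real kernel type. */
struct example_bo { uint32_t handle; };

/* Hypothetical kernel-side layout: pointers and unsized types. */
struct example_kernel_reloc {
	struct { struct example_bo *bo; unsigned long offset; } cmdbuf;
	struct { struct example_bo *bo; unsigned long offset; } target;
	unsigned long shift;
};

/* Toy handle table; a handle that is not listed is treated as invalid. */
static struct example_bo example_bos[] = { { 1 }, { 2 } };

struct example_bo *example_bo_lookup(uint32_t handle)
{
	size_t i;

	for (i = 0; i < sizeof(example_bos) / sizeof(example_bos[0]); i++)
		if (example_bos[i].handle == handle)
			return &example_bos[i];

	return NULL;
}

/* Copy field by field instead of copying raw bytes between the layouts. */
int example_reloc_get(struct example_kernel_reloc *dst,
		      const struct example_user_reloc *src)
{
	dst->cmdbuf.bo = example_bo_lookup(src->cmdbuf.handle);
	dst->cmdbuf.offset = src->cmdbuf.offset;
	dst->target.bo = example_bo_lookup(src->target.handle);
	dst->target.offset = src->target.offset;
	dst->shift = src->shift;

	return (dst->cmdbuf.bo && dst->target.bo) ? 0 : -1;
}
```

Because each field is converted on its own, the two structures no longer need to have the same size, so the copy works whether pointers are 32 or 64 bits wide.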