Show the number of pending waiters in the debugfs status file.
This is useful for testing to verify that waiters do not leak
or accumulate incorrectly.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Syncpoints don't need to be associated with any client,
so remove the property, and expose host1x_syncpt_alloc.
This will allow allocating syncpoints without prior knowledge
of the engine that it will be used with.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
To avoid false lockdep warnings, give each client lock a different
lock class, passed from the initialization site by macro.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
New driver:
Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfh579AAoJEAx081l5xIa+GqoP/0amz+ZN7y/L7+f32CRinJ7/
3e4xjXNDmtWG4Whe/WKjlYmbAcvSdWV/4HYpurW2BFJnOAB/5lIqYcS/PyqErPzA
w4EpRoJ+ZdFgmlDH0vdsDwPLT/HFmhUN9AopNkoZpbSMxrManSj5QgmePXyiKReP
Q+ZAK5UW5AdOVY4bgXUSEkVq2eilCLXf+bSBR/LrVQuNgu7GULX8SIy/Y1CuMtv8
LgzzjLKfIZaIWC+F/RU7BxJ7YnrVq7z7yXnUx8j2416+k/Wwe+BeSUCSZstT7q9G
UkX8jWfR7ZKqhwP+UQeSwDbHkALz7lv88nyjQdxJZ3SrXRe4hy14YjxnR4maeNAj
3TAYSdcAMWyRHqeEZIZ7Hj5sQtTq5OZAoIjxzH3vpVdAnnAkcWoF77pqxV8XPqTC
nw40DihAxQOshGwMkjd5DqkEwnMv43Hs1WTVYu9dPTOfOdqPNt+Vqp7Xl9Z46+kV
k6PDcx60T9ayDW1QZ6MoIXHta9E7ixzu7gYBL3vP4LuporY0uNG3bzF3CMvof1BK
sHYcYTdZkqbTD2d6rHV+TbpPQXgTtlej9qVlQM4SeX37Xtc7LxCYpnpUHKz2S/fK
1vyeGPgdytHblwlxwZOPZ4R2I/HTfnITdr4kMcJHhxAsEewfW1Rd4+stQqVJ2Mph
Vz+CFP2BngivGFz5vuky
=4H8J
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Not a major amount of change, the i915 trees got split into display
and gt trees to better facilitate higher level review, and there's a
major refactoring of i915 GEM locking to use more core kernel concepts
(like ww-mutexes). msm gets per-process pagetables, older AMD SI cards
get DC support, nouveau got a bump in displayport support with common
code extraction from i915.
Outside of drm this contains a couple of patches for hexint
moduleparams which you've acked, and a virtio common code tree that
you should also get via it's regular path.
New driver:
- Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config"
* tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm: (1494 commits)
drm/ingenic: Fix bad revert
drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_init
drm/amdgpu: Remove warning for virtual_display
drm/amdgpu: kfd_initialized can be static
drm/amd/pm: setup APU dpm clock table in SMU HW initialization
drm/amdgpu: prevent spurious warning
drm/amdgpu/swsmu: fix ARC build errors
drm/amd/display: Fix OPTC_DATA_FORMAT programming
drm/amd/display: Don't allow pstate if no support in blank
drm/panfrost: increase readl_relaxed_poll_timeout values
MAINTAINERS: Update entry for st7703 driver after the rename
Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"
drm/amd/display: HDMI remote sink need mode validation for Linux
drm/amd/display: Change to correct unit on audio rate
drm/amd/display: Avoid set zero in the requested clk
drm/amdgpu: align frag_end to covered address space
drm/amdgpu: fix NULL pointer dereference for Renoir
drm/vmwgfx: fix regression in thp code due to ttm init refactor.
drm/amdgpu/swsmu: add interrupt work handler for smu11 parts
drm/amdgpu/swsmu: add interrupt work function
...
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().
struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).
It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.
To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
With the split of MIPI calibration into tegra_mipi_calibrate() and
tegra_mipi_wait(), MIPI clock is not kept enabled and mutex is not locked
till the calibration is done.
So, this patch keeps MIPI clock enabled and mutex locked after triggering
start of calibration till its done.
To let calibration process go through its finite sequence codes before
calibration logic waiting for pads idle state added wait time of 75usec
to make sure it sees idle state to apply the results.
This patch renames tegra_mipi_calibrate() as tegra_mipi_start_calibration()
and tegra_mipi_wait() as tegra_mipi_finish_calibration() to be inline
with their usage.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
When job hangs and there is a memory error pointing at channel's push
buffer, it is very handy to know the push buffer's state. This patch
makes the push buffer's state to be dumped into KMSG in addition to the
job's gathers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Once channel's job is hung, it dumps the channel's state into KMSG before
tearing down the offending job. If multiple channels hang at once, then
they dump messages simultaneously, making the debug info unreadable, and
thus, useless. This patch adds mutex which allows only one channel to emit
debug messages at a time.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This patch fixes gather's BO refcounting on a pinning error. Gather's BO
won't be leaked now if something goes wrong.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
We don't need to hold and pin original BOs of the gathers in a case of
enabled firewall because in this case gather's content is copied and the
copy is used by the executed job.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
SW can trigger MIPI pads calibration any time after power on
but calibration results will be latched and applied to the pads
by MIPI CAL unit only when the link is in LP-11 state and then
status register will be updated.
For CSI, trigger of pads calibration happen during CSI stream
enable where CSI receiver is kept ready prior to sensor or CSI
transmitter stream start.
So, pads may not be in LP-11 at this time and waiting for the
calibration to be done immediate after calibration start will
result in timeout.
This patch splits tegra_mipi_calibrate() and tegra_mipi_wait()
so triggering for calibration and waiting for it to complete can
happen at different stages.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use readl_relaxed_poll_timeout() in tegra_mipi_wait() to simplify
the code.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tegra CSI driver need a separate MIPI device for each channel as
calibration of corresponding MIPI pads for each channel should
happen independently.
So, this patch updates tegra_mipi_request() API to add a device_node
pointer argument to allow creating mipi device for specific device
node rather than a device.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Currently when a host1x device driver is unregistered, it is not
detached from the host1x controller, which means that the device
will stay around and when the driver is registered again, it may
bind to the old, stale device rather than the new one that was
created from scratch upon driver registration. This in turn can
cause various weird crashes within the driver core because it is
confronted with a device that was already deleted.
Fix this by detaching the driver from the host1x controller when
it is unregistered. This ensures that the deleted device also is
no longer present in the device list that drivers will bind to.
Reported-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
In order to remove the dependency on the simple-bus compatible string,
which causes the OF driver core to register all child devices, make the
host1x driver explicitly register its children.
Signed-off-by: Thierry Reding <treding@nvidia.com>
host1x_debug_init() must be reverted in an error handling path.
This is already fixed in the remove function since commit 44156eee91
("gpu: host1x: Clean up debugfs on removal")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tegra124 and Tegra210 support addressing more than 32 bits of physical
memory. However, since their host1x does not support the wide GATHER
opcode, they should use the SMMU if at all possible to ensure that all
the system memory can be used for command buffers, irrespective of
whether or not the host1x firewall is enabled.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
When testing whether or not to enable the use of the SMMU, consult the
supported DMA mask rather than the actually configured DMA mask, since
the latter might already have been restricted.
Fixes: 2d9384ff91 ("drm/tegra: Relax IOMMU usage criteria on old Tegra")
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
These are a couple of quick fixes for regressions that were found during
the first two weeks of the merge window.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAl48S+kTHHRyZWRpbmdA
bnZpZGlhLmNvbQAKCRDdI6zXfz6zodLzD/sGXZVXj2Y/IH2gLv2soU9PIsy43xr7
y6D4kZIWPgGuQD+GNbU7I0k5zMQffJ3fWogXLnWyMhxj/RgjX1O0h6WWspBmWXlS
zUtv9u4wkn2eVrKoRUyV694RJFqG3wmppYwPnWrx0UJFBQptUhp98XTUIXD634Vp
FwM4ya+Yc0ucBc982XmZ/hkPZYnduQF9eZTnR0xbA/xZj4G3s9unNy/2mjVDrpZM
MjYDzfXSMiVHL1ROLZLCsQVrfUMYoUgB8ill0o0DxE5kHL6KpA55j28WzxBwbZQQ
iSxDbjtNPnQV3pR5/sBLpLP1CeYJ1dY9TVUcy+6u+f3SFcJUhM6LEsi5W2EgcBKT
j+0Tag0i12cFvEDo1NbK4SGCp3OmeuFB0nAPT7OGtUMByWC+uhgspCQn0ZqVjEB6
Iy4EGrhPqbUCEsF72VR6amgi0sd2lkyGaCY1F3Do2hrHs/EgrFN3bKDRhM63oAGT
y9vH5eV92Q0CDAFpu+V6kBLOLod6dmyKhYexZdDfiA24BLsDoDeUafFlSi8luQmj
3bwX7agPmvtaIyBakmLG7nRHK0EhNaP82EpxyZ/a3PHH9CWUa+jC3CsqlO3tvbBZ
02mkmvXnYCyrL+qdh/abHfPktKQ49QM5JrHCp96hNLc27qyqyfzbC4FYtgA/3JCo
Fn8pINhC7Mc/dg==
=W1SC
-----END PGP SIGNATURE-----
Merge tag 'drm/tegra/for-5.6-rc1-fixes' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Fixes for v5.6-rc1
These are a couple of quick fixes for regressions that were found during
the first two weeks of the merge window.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200206172753.2185390-1-thierry.reding@gmail.com
The DMA direction is only used by the DMA API, so there is no use in
setting it when a buffer object isn't mapped with the DMA API.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
This partially reverts the DMA API support that was recently merged
because it was causing performance regressions on older Tegra devices.
Unfortunately, the cache maintenance performed by dma_map_sg() and
dma_unmap_sg() causes performance to drop by a factor of 10.
The right solution for this would be to cache mappings for buffers per
consumer device, but that's a bit involved. Instead, we simply revert to
the old behaviour of sharing IOVA mappings when we know that devices can
do so (i.e. they share the same IOMMU domain).
Cc: <stable@vger.kernel.org> # v5.5
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
This contains a small set of mostly fixes and some minor improvements.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAl4ZGs0THHRyZWRpbmdA
bnZpZGlhLmNvbQAKCRDdI6zXfz6zoZUkEACo+fhKHTmkCz4W+4m90aEFL/VUi7Bn
lFmssk2E71WvxUSWfLYawuhi/bgtMLvPmTi1KTCpvdvr+ladF1f4/Vd6vuhMu7Ec
t2ZePYmRlpX+3PHW+V2fIRyBC9/6wSx2UEAbcMOijIPHlWDNdrxo26w0+af0dBJ3
b8oIv5q5jX0GzZDOCOsBngUwnVzCYthVz7SA+UXOwBvNPO61kjx8IA/IfBHAQ+HF
ZVpn//C7g2/wHq125kdrFUwAWQPvFMHOs1JPfXS9248kobkIabSp0XxLaj403cFU
WWa/no5cPkG+WRwaX5egvE2/8P3mDzu3ev9gkgpdDJxnn1YL656mPWO7jNqW4IJd
JD8kTk7ODh7KQvY3xdHD3MrHpHWHtHdOiFZtOygiW84W2yK5/g/LRd/oph/xP1RE
/h8eiRP/mK7VkVRmgbiZTIEQVp10RKH56XtVd1Cf3eFNkig1Q8MLarwy2rhClyJO
7cJZKoOqRfx0aMYipxx5+YWDXrCVwtNrQKTroJ27ycre3ykv+NAoeHeei9ie81Rg
XZeCF3OuvVNXoUPQRS3XYkH70CTbdDTpkwymctcfwWOerhn65QAHvCSNR91ezmNq
YwC7CkS6ymeG9UhSoCvNoLa1adfeztRCwg8RkKdHgHFtHNNQuyRlp4xNyrJNgoCO
YX5dApw++iTOwQ==
=OBEv
-----END PGP SIGNATURE-----
Merge tag 'drm/tegra/for-5.6-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v5.6-rc1
This contains a small set of mostly fixes and some minor improvements.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200111004835.2412858-1-thierry.reding@gmail.com
platform_get_irq() will call dev_err() itself on failure,
so there is no need for the driver to also do this.
This is detected by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra DRM driver heavily relies on the implementations for runtime
suspend/resume to be called at specific times. Unfortunately, there are
some cases where that doesn't work. One example is if the user disables
runtime PM for a given subdevice. Another example is that the PM core
acquires a reference to runtime PM during system sleep, effectively
preventing devices from going into low power modes. This is intentional
to avoid nasty race conditions, but it also causes system sleep to not
function properly on all Tegra systems.
Fix this by not implementing runtime PM at all. Instead, a minimal,
reference-counted suspend/resume infrastructure is added to the host1x
bus. This has the benefit that it can be used regardless of the system
power state (or any transitions we might be in), or whether or not the
user allows runtime PM.
Atomic modesetting guarantees that these functions will end up being
called at the right point in time, so the pitfalls for the more generic
runtime PM do not apply here.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Rename the host1x clients' parent to "host" because that more closely
describes what it is. The parent can be confused with the parent device
in terms of the device hierarchy. Subsequent patches will add a new
member that refers to the parent in that hierarchy.
Signed-off-by: Thierry Reding <treding@nvidia.com>
UAPI Changes:
- Add support for DMA-BUF HEAPS.
Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.
Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is steping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializating.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.
Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 suport to komeda.
- Remove globle copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer deference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl34lkkACgkQ/lWMcqZw
E8M76g//WRYl9fWnV063s44FBVJYjGxaus0vQJSGidaPCIE6Ep6TNjXp8DVzV82M
HR79P9glL02DC9B8pflioNNXdIRGSVk/FJcKVB2seFAqEFCAknvWDM/X/y+mOUpp
fUeFl+Znlwx3YlM8f4Qujdbm+CbTewfbya4VAWeWd8XG2V8jfq5cmODPPlUMNenZ
J6Ja+W3ph741uSIfAKaP69LVJgOcuUjXINE4SWhRk/i5QF3GIRej/A7ZjWGLQ/t2
2zUUF7EiCzhPomM40H3ddKtXb4ZjNJuc5pOD4GpxR8ciNbe2gUOHEZ5aenwYBdsU
5MwbxNKyMbKXATtn3yv3fSc4jH3DtmEKpmovONeO8ZDBrQBnxeYa3tQvfkNghA2f
acoZMzYUImV+ft6DMIgpXppASvo7mQYDAbLPOGEJ9E44AL4UP00jesEjnK5FOHSR
3BEzGUnK/6QL5zFNPni8YZQ8dan4jDIno1mqIV+cQ4WCGlaKckzIWO6243Bf13b/
kROSJpgWkiK6Ngq0ofhD0MHyT/m1QnqUzWRKTJhRtPflSWRBsDZqWCQ5Vx1QlNIE
/HfTNbTpXWwa+5wXbbB8TkDw5t9cQGnR+QcrEd9HgoIec7B5Re8rx9i0TJAT4N05
03RCQCecSfD8gwKd2wgaFIpFGRl9lTdLYSpffSmyL2X5a20lZhM=
=b15X
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2019-12-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.6:
UAPI Changes:
- Add support for DMA-BUF HEAPS.
Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.
Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is steping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializating.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.
Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 suport to komeda.
- Remove globle copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer deference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
A few reasons to drop kmap:
- For native objects all we do is look at obj->vaddr anyway, so might
as well not call functions for every page.
- Reloc-processing on dma-buf is ... questionable.
- Plus most dma-buf that bother kernel cpu mmaps give you at least
vmap, much less kmaps. And all the ones relevant for arm-soc are
again doing a obj->vaddr game anyway, there's no real kmap going on
on arm it seems.
Plus this seems to be the only real in-tree user of dma_buf_kmap, and
I'd like to get rid of that.
Reviewed-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-tegra@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20191118103536.17675-2-daniel.vetter@ffwll.ch
Currently configurations can be generated where IOMMU_SUPPORT is
disabled but IOMMU_IOVA is built as a module and HOST1X as built-in. In
such a case, the symbols guarded by IOMMU_IOVA will not be available
when linking the host1x driver and cause a linking failure.
Simplify this by unconditionally selecting IOMMU_IOVA, which makes sure
that it will be forced to =y if HOST1X=y. Technically we can now get
IOMMU_IOVA code built-in even if we don't use it (host1x only uses it
when IOMMU_SUPPORT is also enabled), but such configuration are of a
mostly academic nature. In all practical configurations we want IOMMU
support anyway.
Signed-off-by: Thierry Reding <treding@nvidia.com>
If the Tegra DRM clients are backed by an IOMMU, push buffers are likely
to be allocated beyond the 32-bit boundary if sufficient system memory
is available. This is problematic on earlier generations of Tegra where
host1x supports a maximum of 32 address bits for the GATHER opcode. More
recent versions of Tegra (Tegra186 and later) have a wide variant of the
GATHER opcode, which allows addressing up to 64 bits of memory.
If host1x itself is behind an IOMMU as well this doesn't matter because
the IOMMU's input address space is restricted to 32 bits on generations
without support for wide GATHER opcodes.
However, if host1x is not behind an IOMMU, it won't be able to process
push buffers beyond the 32-bit boundary on Tegra generations that don't
support wide GATHER opcodes. Restrict the DMA mask to 32 bits on these
generations prevents buffers from being allocated from beyond the 32-bit
boundary.
Signed-off-by: Thierry Reding <treding@nvidia.com>
If host1x_bo_pin() returns an SG table, create a DMA mapping for the
buffer. For buffers that the host1x client has already mapped itself,
host1x_bo_pin() returns NULL and the existing DMA address is used.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Currently when the gather buffers are copied, they are copied to a
buffer that is allocated for the host1x client that wants to execute the
command streams in the buffers. However, the gather buffers will be read
by the host1x device, which causes SMMU faults if the DMA API is backed
by an IOMMU.
Fix this by allocating the gather buffer copy for the host1x device,
which makes sure that it will be mapped into the host1x's IOVA space if
the DMA API is backed by an IOMMU.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The debugfs files created for host1x are never removed, causing these
files to be left dangling in debugfs. This results in a crash when any
of these files are accessed after the host1x driver has been removed,
as well as a failure to create the debugfs entries when they are added
again on driver probe.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The host1x_bo_pin() and host1x_bo_unpin() APIs are used to pin and unpin
buffers during host1x job submission. Pinning currently returns the SG
table and the DMA address (an IOVA if an IOMMU is used or a physical
address if no IOMMU is used) of the buffer. The DMA address is only used
for buffers that are relocated, whereas the host1x driver will map
gather buffers into its own IOVA space so that they can be processed by
the CDMA engine.
This approach has a couple of issues. On one hand it's not very useful
to return a DMA address for the buffer if host1x doesn't need it. On the
other hand, returning the SG table of the buffer is suboptimal because a
single SG table cannot be shared for multiple mappings, because the DMA
address is stored within the SG table, and the DMA address may be
different for different devices.
Subsequent patches will move the host1x driver over to the DMA API which
doesn't work with a single shared SG table. Fix this by returning a new
SG table each time a buffer is pinned. This allows the buffer to be
referenced by multiple jobs for different engines.
Change the prototypes of host1x_bo_pin() and host1x_bo_unpin() to take a
struct device *, specifying the device for which the buffer should be
pinned. This is required in order to be able to properly construct the
SG table. While at it, make host1x_bo_pin() return the SG table because
that allows us to return an ERR_PTR()-encoded error code if we need to,
or return NULL to signal that we don't need the SG table to be remapped
and can simply use the DMA address as-is. At the same time, returning
the DMA address is made optional because in the example of command
buffers, host1x doesn't need to know the DMA address since it will have
to create its own mapping anyway.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The host1x_cdma_wait_pushbuffer_space() function is not declared or
directly called from outside the file it is in, so make it static.
Fixes the following sparse warning:
drivers/gpu/host1x/cdma.c:235:5: warning: symbol 'host1x_cdma_wait_pushbuffer_space' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
A struct device doesn't carry much information that a channel might be
interested in, but the client very much does. Request channels for the
clients rather than their parent devices and store a pointer to them
in order to have that information available when needed.
Signed-off-by: Thierry Reding <treding@nvidia.com>
It's technically not required to explicitly initialize the fields that
will be zero by default, but it's easier to read these structures if
they are all initialized uniformly.
Signed-off-by: Thierry Reding <treding@nvidia.com>
host1x nor any its clients have any limitations on the DMA segment size,
so don't pretend that they do.
Signed-off-by: Thierry Reding <treding@nvidia.com>
This contains a couple of small improvements and cleanups for the Tegra
DRM driver.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAl0M8jsTHHRyZWRpbmdA
bnZpZGlhLmNvbQAKCRDdI6zXfz6zoU0MEACZhCSKLvinuCw2pWdsc2A5rZN3r0Mr
9TpdIKXYwcHO2G2gYC1mXv0wlp2pzIOr5y3VHLl1VboWXlPZjvgH0OiDtmwZk/vU
jjFOGZO5ZjwyVNAZ8SlQrpdBNIAm3wseVfrhVURIL+9dANDgXsxWiqTkgnEmVclZ
Kj2SpXfysTM9TJyXfQW8op6jKtP3NJ/IPEtTguNL1R9ho0phlcRYBUMEIYqtgBVE
aRsjrrbM27OGf+JFPY3C7bW90hzgVqZLeK9R4AAMPS8iZ91k9njlFZ20LI9Yk/P1
GBVyemiRtcG3owbli3NlfFzaeNjmM2PmcMZJsOyf6T+UH5juH2XZ/TV5O6i/jq0z
/8DWsuFdEMX5tdP0t4B3vnbGQTYMOo/6bRsZHceLXcpKFNuJcC6lY233/Fc3D1bw
gWm5ZHTs3SmPm2JoWmA54h2oiXU8/hGiPZGoUDHUTxf/h7DOGhK2hfaNdB6ZYXJu
be4yS5TidFlFKi911JuXblCDeFf0VPsOfemQtJW4OvKg9FD6WmRuehYPytJ8ifB1
dByXh5siOVMcyE0a2bsMPAvsxK6z4pPwyDz34AJUETJ0DSmSfWeEMSbCmpJwt6m5
35IiMossSULJXWjqtc5bcK1gmbYMJSEwj+Xe33O+H67THSgo4GLxqKdb++5QK5Ky
svluhQIjYYx3pw==
=T2hU
-----END PGP SIGNATURE-----
Merge tag 'drm/tegra/for-5.3-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v5.3-rc1
This contains a couple of small improvements and cleanups for the Tegra
DRM driver.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621150753.19550-1-thierry.reding@gmail.com
So there is no need to check for a value that can never happen. No need
to check the return value all anyway, as any debugfs call can take the
result of this function as an option just fine.
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-tegra@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Based on 1 normalized pattern(s):
this software is licensed under the terms of the gnu general public
license version 2 as published by the free software foundation and
may be copied distributed and modified under those terms this
program is distributed in the hope that it will be useful but
without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 285 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Logical devices created by the host1x bus infrastructure don't need to
be associated with a device tree node. Doing so will cause the driver
core to attempt to hook up IOMMU operations and fail because it is not
a real device.
However, for backwards-compatibility, we need to provide various OF_*
uevent variables that were previously provided by of_device_uevent() and
which are parsed by libdrm in userspace when querying the available
devices. Do this by implementing a uevent callback for the host1x bus.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Recent versions of the DMA API debug code have started to warn about
violations of the maximum DMA segment size. This is because the segment
size defaults to 64 KiB, which can easily be exceeded in large buffer
allocations such as used in DRM/KMS for framebuffers.
Technically the Tegra SMMU and ARM SMMU don't have a maximum segment
size (they map individual pages irrespective of whether they are
contiguous or not), so the choice of 4 MiB is a bit arbitrary here. The
maximum segment size is a 32-bit unsigned integer, though, so we can't
set it to the correct maximum size, which would be the size of the
aperture.
Signed-off-by: Thierry Reding <treding@nvidia.com>
When deferring probe, avoid logging a confusing error message. While at
it, make the error message more informational.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 228 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If SMMU support is not available, fall back to programming the bypass
stream ID (0x7f).
Fixes: de5469c21f ("gpu: host1x: Program the channel stream ID")
Suggested-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
[treding@nvidia.com: rebase this on top of a later build fix]
Signed-off-by: Thierry Reding <treding@nvidia.com>
In case the IOMMU API is not available compiling host1x fails with
the following error:
In file included from drivers/gpu/host1x/hw/host1x06.c:27:
drivers/gpu/host1x/hw/channel_hw.c: In function ‘host1x_channel_set_streamid’:
drivers/gpu/host1x/hw/channel_hw.c:118:30: error: implicit declaration of function
‘dev_iommu_fwspec_get’; did you mean ‘iommu_fwspec_free’? [-Werror=implicit-function-declaration]
struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent);
^~~~~~~~~~~~~~~~~~~~
iommu_fwspec_free
Fixes: de5469c21f ("gpu: host1x: Program the channel stream ID")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Currently gathers of a hung job are getting NOP'ed and a restarted CDMA
executes the NOP'ed gathers. There shouldn't be a reason to not restart
CDMA execution starting with a next job, avoiding the unnecessary churning
with gathers NOP'ing.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
There is a chance that the last job has been completed at the time of
CDMA timeout handler invocation. In this case there is no need to complete
the completed job.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Host1x doesn't have information about jobs inter-dependency, that is
something that will become available once host1x will get a proper
jobs scheduler implementation. Currently a hang job causes other unrelated
jobs to be canceled, that is a relic from downstream driver which is
irrelevant to upstream. Let's cancel only the hanging job and not to touch
other jobs in queue.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The host1x CDMA push buffer is terminated by a special opcode (RESTART)
that tells the CDMA to wrap around to the beginning of the push buffer.
To accomodate the RESTART opcode, an extra 4 bytes are allocated on top
of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words)
that are used for other commands passed to CDMA. This requires that two
memory pages are allocated, but most of the second page (4092 bytes) is
never used.
Decrease the number of slots to 511 so that the RESTART opcode fits
within the page. Adjust the push buffer wraparound code to take into
account push buffer sizes that are not a power of two.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The HOST1X_CHANNEL_DMAEND is an offset relative to the value written to
the HOST1X_CHANNEL_DMASTART register, but it is currently treated as an
absolute address. This can cause SMMU faults if the CDMA fetches past a
pushbuffer's IOMMU mapping.
Properly setting the DMAEND prevents the CDMA from fetching beyond that
address and avoid such issues. This is currently not observed because a
whole (almost) page of essentially scratch space absorbs any excessive
prefetching by CDMA. However, changing the number of slots in the push
buffer can trigger these SMMU faults.
Signed-off-by: Thierry Reding <treding@nvidia.com>
On Tegra186 and later, the ARM SMMU provides an input address space that
is 48 bits wide. However, memory clients can only address up to 40 bits.
If the geometry is used as-is, allocations of IOVA space can end up in a
region that is not addressable by the memory clients.
To fix this, restrict the IOVA space to the DMA mask of the host1x
device.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tegra186 and later support 40 bits of address space. Additional
registers need to be programmed to store the full 40 bits of push
buffer addresses.
Since command stream gathers can also reside in buffers in a 40-bit
address space, a new variant of the GATHER opcode is also introduced.
It takes two parameters: the first parameter contains the lower 32
bits of the address and the second parameter contains bits 32 to 39.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The CDMA push buffer can currently only handle opcodes that take a
single word parameter. However, the host1x implementation on Tegra186
and later supports opcodes that require multiple words as parameters.
Unfortunately the way the push buffer is structured, these wide opcodes
cannot simply be composed of two regular opcodes because that could
result in the wide opcode being split across the end of the push buffer
and the final RESTART opcode required to wrap the push buffer around
would break the wide opcode.
One way to fix this would be to remove the concept of slots to simplify
push buffer operations. However, that's not entirely trivial and should
be done in a separate patch. For now, simply use a different function
to push four-word opcodes into the push buffer. Technically only three
words are pushed, with the fourth word used as padding to preserve the
2-word alignment required by the slots abstraction. The fourth word is
always a NOP opcode.
Additional care must be taken when the end of the push buffer is
reached. If a four-word opcode doesn't fit into the push buffer without
being split by the boundary, NOP opcodes will be introduced and the new
wide opcode placed at the beginning of the push buffer.
Signed-off-by: Thierry Reding <treding@nvidia.com>
When processing command streams, make sure the host1x's stream ID is
programmed for the channel so that addresses are properly translated
through the SMMU.
Signed-off-by: Thierry Reding <treding@nvidia.com>
In order to enable the MMIO path stream ID protection provided by the
incarnation of host1x found in Tegra186 and later, the host1x must be
provided with the list of stream ID register offsets for each of its
clients. Some clients (such as VIC) have multiple stream ID registers
that are assumed to be contiguous. The host1x is programmed with the
base offset and a limit which provide the range of registers that the
host1x needs to monitor for writes.
Signed-off-by: Thierry Reding <treding@nvidia.com>
This new debugfs file represents the state of host1x bus devices,
specifying the list of subdevices and marking which ones have
successfully registered.
Signed-off-by: Thierry Reding <treding@nvidia.com>
In this usage, the two are completely equivalent, but the completion
documents better what is going on, and we generally try to avoid
semaphores these days.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The host1x hardware found on Tegra194 is mostly backwards compatible
with the version found on Tegra186, with the notable exceptions of the
increased number of syncpoints and mlocks. In addition, some rarely
used features such as syncpoint wait bases were dropped and some
registers had to move around to accomodate the increased number of
syncpoints.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The number of syncpoints on Tegra186 is 576 and therefore no longer fits
into 8 bits. Increase the size of the syncpoint ID field to 10 in order
to accomodate all syncpoints.
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The register region allocated per channel was decreased from 16384 bytes
to 256 bytes on Tegra186 and later. Resize the region to make sure every
channel (instead of only the first) is properly programmed.
Suggested-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Host1x is getting attached to an implicit IOMMU DMA domain if
CONFIG_ARM_DMA_USE_IOMMU=y. Since Host1x driver manages IOMMU by
itself, Host1x device must be detached from the implicit domain using
arch-specific IOMMU-API.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
All other assignments have a single space around the = sign, so remove
the spurious tab for consistency.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Only gather pins are mapped by the Host1x driver, regular BO relocations
are not. Check whether size of unpin isn't 0, otherwise IOVA allocation at
0x0 could be erroneously released.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Host1x's CDMA can't access the command buffers if IOMMU and Host1x
firewall are enabled in the kernels config because firewall doesn't map
the copied buffer into IOVA space. Fix this by skipping IOMMU
initialization if firewall is enabled as firewall merges sparse cmdbufs
into a single contiguous buffer and hence IOMMU isn't needed in this case.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbFzauAAoJEAx081l5xIa+jzwP/AwfTreH3XBmlLeMO8kwkAMW
AiH1u3KHrlitN1U85x90dC8hVuuV2Kv9pEXQo1me/TAL/QCb9VZYCSf2dHHkkMwD
1UdwxAQW7soBbTRo9k0zuVJpRGSYIiYfqDh3L6KmTC3UF0tUJm53VOWCLG/DNrtn
XhEnGnlCUNbABZMMpavEvlwxPtvaYFlp6M+MmwRPIx32SrPJ3vSaBzxwWxOaFmjU
xiRdK+GpAbCV8yGeCCSkDgEe/TdWGUhZxoC9dLb0H9ex7ip8uZ0W4D+VTHPFrhQX
6nCpqUbp7BQTsbVSd1pAVsAv45scmSgWbKcqfC0NKSVcsHcJZBR0tQOF9OvnGZcf
o1Hv/beTqJ++IcG2rEIwyJTGxAGfZ0YSb0evTC9VcszaYo+b3+G283bdztIjzDeS
0QCTeLHYbZRHPITWVULNpMWy3TkJv32IdFhQfYSnD8/OGQIxLNhh4FFOtHnOmxSF
N8dnzOLKXhXMo/NgOL+UMNnbgLqIyOtEXCPDLuOQJNv/SOp8662m/A0yRjQNR6M2
gsPmR7dxQIwwJMyqrkLDOF411ABZohulquYgwLgG938MRPmTpPWOR72PtpGF4hAW
HLg+3HHBd1N/A1mlJUMAbUn2eMUACZBUIycE9u+U/geRgve/OQnzJH/FKGP2EJ4R
pf6CruEva+6GRR5GVzuM
=twst
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2018-06-06-1' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"This starts to support NVIDIA volta hardware with nouveau, and adds
amdgpu support for the GPU in the Kabylake-G (the intel + radeon
single package chip), along with some initial Intel icelake enabling.
Summary:
New Drivers:
- v3d - driver for broadcom V3D V3.x+ hardware
- xen-front - XEN PV display frontend
core:
- handle zpos normalization in the core
- stop looking at legacy pointers in atomic paths
- improved scheduler documentation
- improved aspect ratio validation
- aspect ratio support for 64:27 and 256:135
- drop unused control node code.
i915:
- Icelake (ICL) enabling
- GuC/HuC refactoring
- PSR/PSR2 enabling and fixes
- DPLL management refactoring
- DP MST fixes
- NV12 enabling
- HDCP improvements
- GEM/Execlist/reset improvements
- GVT improvements
- stolen memory first 4k fix
amdgpu:
- Vega 20 support
- VEGAM support (Kabylake-G)
- preOS scanout buffer reservation
- power management gfxoff support for raven
- SR-IOV fixes
- Vega10 power profiles and clock voltage control
- scatter/gather display support on CZ/ST
amdkfd:
- GFX9 dGPU support
- userptr memory mapping
nouveau:
- major refactoring for Volta GV100 support
tda998x:
- HDMI i2c CEC support
etnaviv:
- removed unused logging code
- license text cleanups
- MMU handling improvements
- timeout fence fix for 50 days uptime
tegra:
- IOMMU support in gr2d/gr3d drivers
- zpos support
vc4:
- syncobj support
- CTM, plane alpha and async cursor support
analogix_dp:
- HPD and aux chan fixes
sun4i:
- MIPI DSI support
tilcdc:
- clock divider fixes for OMAP-l138 LCDK board
rcar-du:
- R8A77965 support
- dma-buf fences fixes
- hardware indexed crtc/du group handling
- generic zplane property support
atmel-hclcdc:
- generic zplane property support
mediatek:
- use generic video mode function
exynos:
- S5PV210 FIMD variant support
- IPP v2 framework
- more HW overlays support"
* tag 'drm-next-2018-06-06-1' of git://anongit.freedesktop.org/drm/drm: (1286 commits)
drm/amdgpu: fix 32-bit build warning
drm/exynos: fimc: signedness bug in fimc_setup_clocks()
drm/exynos: scaler: fix static checker warning
drm/amdgpu: Use dev_info() to report amdkfd is not supported for this ASIC
drm/amd/display: Remove use of division operator for long longs
drm/amdgpu: Update GFX info structure to match what vega20 used
drm/amdgpu/pp: remove duplicate assignment
drm/sched: add rcu_barrier after entity fini
drm/amdgpu: move VM BOs on LRU again
drm/amdgpu: consistenly use VM moved flag
drm/amdgpu: kmap PDs/PTs in amdgpu_vm_update_directories
drm/amdgpu: further optimize amdgpu_vm_handle_moved
drm/amdgpu: cleanup amdgpu_vm_validate_pt_bos v2
drm/amdgpu: rework VM state machine lock handling v2
drm/amdgpu: Add runtime VCN PG support
drm/amdgpu: Enable VCN static PG by default on RV
drm/amdgpu: Add VCN static PG support on RV
drm/amdgpu: Enable VCN CG by default on RV
drm/amdgpu: Add static CG control for VCN on RV
drm/exynos: Fix default value for zpos plane property
...
The number of words and the offset in a gather don't need to be
explicitly sized, so make them unsigned int instead.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
All other array variables use a plural, and this is the only one using
the *array suffix. This is confusing, so rename it for consistency.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Functions taking a pointer to a host1x syncpoint as an argument don't
need to specify a pointer to a host1x instance because it can be
obtained from the syncpoint.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use unsigned int where possible and don't unnecessarily initialize the
loop variable.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Rather than storing some identifier derived from the application
context that can't be used concretely anywhere, store a pointer to the
client directly so that accesses can be made directly through that
client object.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The job submission userspace ABI doesn't support this and there are no
plans to implement it, so all of this code is dead and can be removed.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The compiler is complaining with the following errors:
drivers/gpu/host1x/cdma.c:94:48: error:
passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type
[-Werror=incompatible-pointer-types]
drivers/gpu/host1x/cdma.c:113:48: error:
passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type
[-Werror=incompatible-pointer-types]
The expected pointer type of the third argument to dma_alloc_wc() is
dma_addr_t but phys_addr_t is passed.
Change the phys member of struct push_buffer to be dma_addr_t so that we
pass the correct type to dma_alloc_wc().
Also check pb->mapped for non-NULL in the destroy function as that is the
right way of checking if dma_alloc_wc() was successful.
Signed-off-by: Emil Goode <emil.fsw@goode.io>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The IOVA API uses a memory cache to allocate IOVA nodes from. To make
sure that this cache is available, obtain a reference to it and release
the reference when the cache is no longer needed.
On 64-bit ARM this is hidden by the fact that the DMA mapping API gets
that reference and never releases it. On 32-bit ARM, this is papered
over by the Tegra DRM driver (the sole user of the host1x API requiring
the cache) acquiring a reference to the IOVA cache for its own purposes.
However, there may be additional users of this API in the future, so fix
this upfront to avoid surprises.
Fixes: 404bfb78da ("gpu: host1x: Add IOMMU support")
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
If IOVA allocation or IOMMU mapping fails, dma_free_wc() is invoked with
size=0 because of a typo, that triggers "kernel BUG at mm/vmalloc.c:124!".
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
With each bus implementing its own DMA configuration callback, there is no
need for bus to explicitly set the force_dma flag. Modify the
of_dma_configure function to accept an input parameter which specifies if
implicit DMA configuration is required when it is not described by the
firmware.
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI parts
Reviewed-by: Rob Herring <robh@kernel.org>
[hch: tweaked the changelog a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
ACPI/OF support for configuration of DMA is a bus specific aspect, and
thus should be configured by the bus. Introduces a 'dma_configure' bus
method so that busses can control their DMA capabilities.
Also update the PCI, Platform, ACPI and host1x buses to use the new
method.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI parts
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[hch: simplified host1x_dma_configure based on a comment from Thierry,
rewrote changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Use IOMMU groups to attach the host1x device to its IOMMU domain. This
is not strictly necessary because the domain isn't shared with any other
device, but it makes the code consistent with how IOMMU is handled in
other drivers and provides an easy way to detect when no IOMMU has been
attached via device tree.
Signed-off-by: Thierry Reding <treding@nvidia.com>
When an error happens during the initialization of one of the sub-
devices, make sure to properly cleanup all sub-devices that have been
initialized up to that point.
Signed-off-by: Thierry Reding <treding@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaCm8RAAoJEAx081l5xIa+zX0QAJSm31kCG3vdw2CNiRx25L3q
3hcsEOgAjVJ9FQVGKFWjzb8TK35tSqtNx5kWIj0VGaIfBE5Bdg5SLLgKKUYas8rY
4LaphqICq2uxu2BNa2tpiar/sHhAnuozwQ4czpVWXzlaISnb9yYzRl7gMuyUVGkx
+Gih5VUhLmQC0HsRTLJ3vaZQoUsLAl2gAjKcWa1bx57j2S+iKOPfsLaq7VYo+y1I
Njc+iSGqMhJzRLXVkxL2lQKaslp7R38Bbh5K4Kvyjkm4Aq7zErOF6irpOXKMcrGl
mwnr89vf1G9thjikrBaXpKnuvdbWYveoN/ORMlTdCfxkFnChHLnm3bd7NJ49RXDN
Hv/Iq9YYjmZ9GTatxnx7lWtmXnZXC5he1yn1JAuz/yt7/0b/Wx+Mu/wEpBXYNFTd
1AZdD586i+AmPo3yDkqH9nBu8JC0W0AnS9VZma4LVvZOP2UfJmj5Im1CLHItbGDN
FnUCkwyD/lJUUk+WgT+w/GOMJgmFHDiFFl4tFtYVVjrUirpCFVguSKG9xuv6tT8P
8iRsoP7RrcmDN9ojN2SEHwcpsAv3HnKkDv+9+GIbWnrGsSbCPq8Qm+JDSvf4h22I
K5lwNpJrcpSKI+q10L7w2xliTBwb98sJkWGA/rssomrdBOWteGZAyqFRYAVgQ+mJ
x/nJurIqQYh2KQN9+uLG
=xVV2
-----END PGP SIGNATURE-----
Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"This is the main drm pull request for v4.15.
Core:
- Atomic object lifetime fixes
- Atomic iterator improvements
- Sparse/smatch fixes
- Legacy kms ioctls to be interruptible
- EDID override improvements
- fb/gem helper cleanups
- Simple outreachy patches
- Documentation improvements
- Fix dma-buf rcu races
- DRM mode object leasing for improving VR use cases.
- vgaarb improvements for non-x86 platforms.
New driver:
- tve200: Faraday Technology TVE200 block.
This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
the StorLink SL3516 (later Cortina Systems CS3516) as well as the
Grain Media GM8180.
New bridges:
- SiI9234 support
New panels:
- S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
LT089AC19000, Innolux AT043TN24
i915:
- Remove Coffeelake from alpha support
- Cannonlake workarounds
- Infoframe refactoring for DisplayPort
- VBT updates
- DisplayPort vswing/emph/buffer translation refactoring
- CCS fixes
- Restore GPU clock boost on missed vblanks
- Scatter list updates for userptr allocations
- Gen9+ transition watermarks
- Display IPC (Isochronous Priority Control)
- Private PAT management
- GVT: improved error handling and pci config sanitizing
- Execlist refactoring
- Transparent Huge Page support
- User defined priorities support
- HuC/GuC firmware refactoring
- DP MST fixes
- eDP power sequencing fixes
- Use RCU instead of stop_machine
- PSR state tracking support
- Eviction fixes
- BDW DP aux channel timeout fixes
- LSPCON fixes
- Cannonlake PLL fixes
amdgpu:
- Per VM BO support
- Powerplay cleanups
- CI powerplay support
- PASID mgr for kfd
- SR-IOV fixes
- initial GPU reset for vega10
- Prime mmap support
- TTM updates
- Clock query interface for Raven
- Fence to handle ioctl
- UVD encode ring support on Polaris
- Transparent huge page DMA support
- Compute LRU pipe tweaks
- BO flag to allow buffers to opt out of implicit sync
- CTX priority setting API
- VRAM lost infrastructure plumbing
qxl:
- fix flicker since atomic rework
amdkfd:
- Further improvements from internal AMD tree
- Usermode events
- Drop radeon support
nouveau:
- Pascal temperature sensor support
- Improved BAR2 handling
- MMU rework to support Pascal MMU
exynos:
- Improved HDMI/mixer support
- HDMI audio interface support
tegra:
- Prep work for tegra186
- Cleanup/fixes
msm:
- Preemption support for a5xx
- Display fixes for 8x96 (snapdragon 820)
- Async cursor plane fixes
- FW loading rework
- GPU debugging improvements
vc4:
- Prep for DSI panels
- fix T-format tiling scanout
- New madvise ioctl
Rockchip:
- LVDS support
omapdrm:
- omap4 HDMI CEC support
etnaviv:
- GPU performance counters groundwork
sun4i:
- refactor driver load + TCON backend
- HDMI improvements
- A31 support
- Misc fixes
udl:
- Probe/EDID read fixes.
tilcdc:
- Misc fixes.
pl111:
- Support more variants
adv7511:
- Improve EDID handling.
- HDMI CEC support
sii8620:
- Add remote control support"
* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
drm/rockchip: analogix_dp: Use mutex rather than spinlock
drm/mode_object: fix documentation for object lookups.
drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
drm/i915: Move init_clock_gating() back to where it was
drm/i915: Prune the reservation shared fence array
drm/i915: Idle the GPU before shinking everything
drm/i915: Lock llist_del_first() vs llist_del_all()
drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
drm/i915: Disable lazy PPGTT page table optimization for vGPU
drm/i915/execlists: Remove the priority "optimisation"
drm/i915: Filter out spurious execlists context-switch interrupts
drm/amdgpu: use irq-safe lock for kiq->ring_lock
drm/amdgpu: bypass lru touch for KIQ ring submission
drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
drm/amd/powerplay: initialize a variable before using it
drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
drm/rockchip: add CONFIG_OF dependency for lvds
...
- turn dma_cache_sync into a dma_map_ops instance and remove
implementation that purely are dead because the architecture
doesn't support noncoherent allocations
- add a flag for busses that need DMA configuration (Robin Murphy)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAloLSrYLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYOMuQ//XXD94uNPYavrgXzGsAtg+I+LEm+xyk4T0dX5fxfj
amXX49MHoGemjsBgzJlkQMMFqwDEdkKyEuFnEuy6OeowYCyD6zW0MJ3MwP9OosNJ
PNTdGZIfSvxPYEW8cR9AdK3iQ2loMBZnYhd+O/oVjSugULLW2DNa7r2VRktcCKoh
8Ob/8gL6Y9xEYJBRszhrBwKTa/hU8IThxxozBFzN7I3LIKyFboSTcwXGLAHow43g
4anCTjWTaDcoU2JwY6UTRKRRTV+gD0ZRcsZfd8lNNb5rtMVZkBVOHbF14SMAmw1r
kSgRcU3+WIFPhK/8wBYqtGZZGnOgFBTHVeqow3AdS728pBWlWl8niTK0DiIgCd3m
qzScF6SqfN1bCZkZAy8FUV2l0DPYKS6lvyNkf00Eb2W/f6LEqAcjCi2QDDxRfaw+
Vm97nPUiM+uXNy/6KtAy6ChdprSqx12/edXPp7Y3H2rS/+Dmr6exeix+wb7QUN8W
JI7ZRHo4JLaJZk/XrZtGX/6jnN1Jo7vfApQOmYDY7kE1iGtOU/LQQj8gcZRVQxML
4soN6ivSmZX2k03LabWHpYQ8QiyCSYChLC+Az7rQH47LDLeu1IdTJu6orpXpaxyo
ymzEWlHbmF7mE66X4g/Up/eAYk2YLUA3rKLGVjAIaWDBzHftSFg5EaAnqMADC1G2
hSo=
=ALJf
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- turn dma_cache_sync into a dma_map_ops instance and remove
implementation that purely are dead because the architecture doesn't
support noncoherent allocations
- add a flag for busses that need DMA configuration (Robin Murphy)
* tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: turn dma_cache_sync into a dma_map_ops method
sh: make dma_cache_sync a no-op
xtensa: make dma_cache_sync a no-op
unicore32: make dma_cache_sync a no-op
powerpc: make dma_cache_sync a no-op
mn10300: make dma_cache_sync a no-op
microblaze: make dma_cache_sync a no-op
ia64: make dma_cache_sync a no-op
frv: make dma_cache_sync a no-op
x86: make dma_cache_sync a no-op
floppy: consolidate the dummy fd_cacheflush definition
drivers: flag buses which demand DMA configuration
* Enforce MSI multiple IRQ alignment in AMD IOMMU
* VT-d PASID error handling fixes
* Add r8a7795 IPMMU support
* Manage runtime PM links on exynos at {add,remove}_device callbacks
* Fix Mediatek driver name to avoid conflict
* Add terminate support to qcom fault handler
* 64-bit IOVA optimizations
* Simplfy IOVA domain destruction, better use of rcache, and
skip anchor nodes on copy
* Convert to IOMMU TLB sync API in io-pgtable-arm{-v7s}
* Drop command queue lock when waiting for CMD_SYNC completion on
ARM SMMU implementations supporting MSI to cacheable memory
* iomu-vmsa cleanup inspired by missed IOTLB sync callbacks
* Fix sleeping lock with preemption disabled for RT
* Dual MMU support for TI DRA7xx DSPs
* Optional flush option on IOVA allocation avoiding overhead when
caller can try other options
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJaCgbBAAoJECObm247sIsiUp0P/16GwExsZFKqD0B7XBiBq+hZ
Q03XiPEnfnMBt05bw4dH8lnfpDPR1+6I3OzrxByc3VurfsBZ6IUh6oXPREcwdkNE
+iyS7/1RlXd7EJGsaY91fiL1Gm7iPQS/MJ50GcL288gdF68cAHygDoLPR0r1bHBC
PzjLhmgSYcqYQwRgnoclUe4/sNEIWWu3Z0trltNmiQ9eFEXW4WCc1/OzCoAmfxY7
EPiKynP+vCBTvzZJ/7m4oBpfciO/WuF8KhnNDjmL/lKGAQGGChiUyQq77pkdaUzm
6MjGfLAzO1xJ3DYM1ozDGBnVXR3AlDE2f+MlveT9bwVHkFLB3WF5atb70Cx70Zju
Lp2ZC2sm42AlHN0WqLGU9Q1pGPhZwXGkoF5kAxcsSh1rWC+bjfjtelpIg9PLngMi
rBAgRLLveGMM3jQcWv4259dZkByJOyFurQwYzHToacj+bE0hGzLv5cFWS/AD8kIE
rdxFBdNWlMARDWoJO+jYOzEr/QTDXovLgWsNvS2C2QDfmdXL1s0yqjdU9ahC3wWJ
pi//PydoXoyuEbN3M04ccjDNCZLX+JYIvvaZAxF5gGVt99VCKhf2Rw2wtLPviPNN
K+8NmUp1xAZX7Ak47HDNHIS3ZVB6CgP5pmoBvqZW1k5yyf5kvbvi2axQta/cyr6a
4ey4GeRpF+iad1n1U/WX
=o4qF
-----END PGP SIGNATURE-----
Merge tag 'iommu-v4.15-rc1' of git://github.com/awilliam/linux-vfio
Pull IOMMU updates from Alex Williamson:
"As Joerg mentioned[1], he's out on paternity leave through the end of
the year and I'm filling in for him in the interim:
- Enforce MSI multiple IRQ alignment in AMD IOMMU
- VT-d PASID error handling fixes
- Add r8a7795 IPMMU support
- Manage runtime PM links on exynos at {add,remove}_device callbacks
- Fix Mediatek driver name to avoid conflict
- Add terminate support to qcom fault handler
- 64-bit IOVA optimizations
- Simplfy IOVA domain destruction, better use of rcache, and skip
anchor nodes on copy
- Convert to IOMMU TLB sync API in io-pgtable-arm{-v7s}
- Drop command queue lock when waiting for CMD_SYNC completion on ARM
SMMU implementations supporting MSI to cacheable memory
- iomu-vmsa cleanup inspired by missed IOTLB sync callbacks
- Fix sleeping lock with preemption disabled for RT
- Dual MMU support for TI DRA7xx DSPs
- Optional flush option on IOVA allocation avoiding overhead when
caller can try other options
[1] https://lkml.org/lkml/2017/10/22/72"
* tag 'iommu-v4.15-rc1' of git://github.com/awilliam/linux-vfio: (54 commits)
iommu/iova: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->fq
iommu/mediatek: Fix driver name
iommu/ipmmu-vmsa: Hook up r8a7795 DT matching code
iommu/ipmmu-vmsa: Allow two bit SL0
iommu/ipmmu-vmsa: Make IMBUSCTR setup optional
iommu/ipmmu-vmsa: Write IMCTR twice
iommu/ipmmu-vmsa: IPMMU device is 40-bit bus master
iommu/ipmmu-vmsa: Make use of IOMMU_OF_DECLARE()
iommu/ipmmu-vmsa: Enable multi context support
iommu/ipmmu-vmsa: Add optional root device feature
iommu/ipmmu-vmsa: Introduce features, break out alias
iommu/ipmmu-vmsa: Unify ipmmu_ops
iommu/ipmmu-vmsa: Clean up struct ipmmu_vmsa_iommu_priv
iommu/ipmmu-vmsa: Simplify group allocation
iommu/ipmmu-vmsa: Unify domain alloc/free
iommu/ipmmu-vmsa: Fix return value check in ipmmu_find_group_dma()
iommu/vt-d: Clear pasid table entry when memory unbound
iommu/vt-d: Clear Page Request Overflow fault bit
iommu/vt-d: Missing checks for pasid tables if allocation fails
iommu/amd: Limit the IOVA page range to the specified addresses
...
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function actually doesn't sleep in the version that was merged.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The disassembler for debug dumps was missing some newer host1x opcodes.
Add disassembly support for these.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The host1x driver prints out "disassembly" dumps of the command FIFO
and gather contents on submission timeouts. However, the output has
been quite difficult to read with unnecessary newlines and occasional
missing parentheses.
Fix these problems by using pr_cont to remove unnecessary newlines
and by fixing other small issues.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The gather filter is a feature present on Tegra124 and newer where the
hardware prevents GATHERed command buffers from executing commands
normally reserved for the CDMA pushbuffer which is maintained by the
kernel driver.
This commit enables the gather filter on all supporting hardware.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Since Tegra186 the Host1x hardware allows syncpoints to be assigned to
specific channels, preventing any other channels from incrementing
them.
Enable this feature where available and assign syncpoints to channels
when submitting a job. Syncpoints are currently never unassigned from
channels since that would require extra work and is unnecessary with
the current channel allocation model.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
of_dma_configure() now checks the device's bus before configuring it, so
we need to set the device's bus before calling.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add support for the implementation of Host1x present on the Tegra186.
The register space has been shuffled around a little bit, requiring
addition of some chip-specific code sections. Tegra186 also adds
several new features, most importantly the hypervisor, but those are
not yet supported with this commit.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Rather than request syncpoints for a struct device *, request them for a
struct host1x_client *. This is important because subsequent patches are
going to break the assumption that host1x will always be the parent for
devices requesting a syncpoint. It's also a more natural choice because
host1x clients are really the only ones that will know how to deal with
syncpoints.
Note that host1x clients are always guaranteed to be children of host1x,
regardless of their location in the device tree.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Avoid some boilerplate by calling of_device_get_match_data() instead of
open-coding the equivalent in the driver.
While at it, shuffle around some code to avoid unnecessary local
variables.
Signed-off-by: Thierry Reding <treding@nvidia.com>
We do not want the common dma_configure() pathway to apply
indiscriminately to all devices, since there are plenty of buses which
do not have DMA capability, and if their child devices were used for
DMA API calls it would only be indicative of a driver bug. However,
there are a number of buses for which DMA is implicitly expected even
when not described by firmware - those we whitelist with an automatic
opt-in to dma_configure(), assuming that the DMA address space and the
physical address space are equivalent if not otherwise specified.
Commit 7232888366 ("of: restrict DMA configuration") introduced a
short-term fix by comparing explicit bus types, but this approach is far
from pretty, doesn't scale well, and fails to cope at all with bus
drivers which may be built as modules, like host1x. Let's refine things
by making that opt-in a property of the bus type, which neatly addresses
those problems and lets the decision of whether firmware description of
DMA capability should be optional or mandatory stay internal to the bus
drivers themselves.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Now that the cached node optimisation can apply to all allocations, the
couple of users which were playing tricks with dma_32bit_pfn in order to
benefit from it can stop doing so. Conversely, there is also no need for
all the other users to explicitly calculate a 'real' 32-bit PFN, when
init_iova_domain() can happily do that itself from the page granularity.
CC: Thierry Reding <thierry.reding@gmail.com>
CC: Jonathan Hunter <jonathanh@nvidia.com>
CC: David Airlie <airlied@linux.ie>
CC: Sudeep Dutt <sudeep.dutt@intel.com>
CC: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
[rm: use iova_shift(), rewrote commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This contains a couple of fixes and improvements for host1x, with some
preparatory work for Tegra186 support.
The remainder is cleanup and minor bugfixes for Tegra DRM along with
enhancements to debuggability.
There have also been some enhancements to the kernel interfaces for
host1x job submissions and support for mmap'ing PRIME buffers directly,
all of which get the interfaces very close to ready for serious work.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAlmXBm4THHRyZWRpbmdA
bnZpZGlhLmNvbQAKCRDdI6zXfz6zoYDJD/9ILmkCZPmv/LnYjQ1vivHN2cwboj2K
0yM6fwHE2JrvavKaAt1DDxZElxzL4Gcihcg6ALIc0/RH9S82nz7eNMkwLgfLLPHl
PFCv/rGkFkyQpqOXvUmLYBGL7uNo3GDfT3fE5fJ9ZfRWFLBaI6DgT6flzTZBsd0T
vC+mExHBdobu9bsyR95NpHGjDPQoUK//m35p+vZixnyjhCbDM/+qKA/iSUE5kaEY
S2huX/Gzl8jbi2d2Ax9dz905gYrDKl6y5qlGA3BGKpTPxOd/kYtc2eClSRHc1hno
WT/9yFeTFyXjxarpY9nAT2NNfVn+crvB3vBwP97I6YKszBbGUOHGYqhBy7r0W+Fl
MqMvlqTJgN4OlVL7pGiCkeDUKZyJ697EZqNdeYREiKwPtsmSiZcnxxk5BcTEzXBX
cF0udAVfEK8MekjPDz1CWGbH2uMuXxsH+7VTKt3avVYlN8J9rIhZv9hGK6g/znyd
4N4eyzDxRtChhAcin1fQJosAzc8oTSEE21WQW2D8vme+t0Yx9Oiy7BG5uj+yLruu
0/l0TUEyyDozg2doBsnDzJdCFzcHZjo4fClYfZu/Ficwb9eEDOx85eif+rGEOclO
ickwuGEOAjKuyrz4T0fkd6j3aMYUVRmXZ3L0gFD68jUKel00zjSpcL/JfEihwvd/
Nus3MYLH+IrFKQ==
=ZaxL
-----END PGP SIGNATURE-----
Merge tag 'drm/tegra/for-4.14-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v4.14-rc1
This contains a couple of fixes and improvements for host1x, with some
preparatory work for Tegra186 support.
The remainder is cleanup and minor bugfixes for Tegra DRM along with
enhancements to debuggability.
There have also been some enhancements to the kernel interfaces for
host1x job submissions and support for mmap'ing PRIME buffers directly,
all of which get the interfaces very close to ready for serious work.
* tag 'drm/tegra/for-4.14-rc1' of git://anongit.freedesktop.org/tegra/linux: (21 commits)
drm/tegra: Prevent BOs from being freed during job submission
drm/tegra: gem: Implement mmap() for PRIME buffers
drm/tegra: Support render node
drm/tegra: sor: Trace register accesses
drm/tegra: dpaux: Trace register accesses
drm/tegra: dsi: Trace register accesses
drm/tegra: hdmi: Trace register accesses
drm/tegra: dc: Trace register accesses
drm/tegra: sor: Use unsigned int for register offsets
drm/tegra: hdmi: Use unsigned int for register offsets
drm/tegra: dsi: Use unsigned int for register offsets
drm/tegra: dpaux: Use unsigned int for register offsets
drm/tegra: dc: Use unsigned int for register offsets
drm/tegra: Fix NULL deref in debugfs/iova
drm/tegra: switch to drm_*_get(), drm_*_put() helpers
drm/tegra: Set MODULE_FIRMWARE for the VIC
drm/tegra: Add CONFIG_OF dependency
gpu: host1x: Support sub-devices recursively
gpu: host1x: fix error return code in host1x_probe()
gpu: host1x: Fix bitshift/mask multipliers
...