OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	a13de74e47	IOMMU Updates for Linux v6.3: Including: - Consolidate iommu_map/unmap functions. There have been blocking and atomic variants so far, but that was problematic as this approach does not scale with required new variants which just differ in the GFP flags used. So Jason consolidated this back into single functions that take a GFP parameter. This has the potential to cause conflicts with other trees, as they introduce new call-sites for the changed functions. I offered them to pull in the branch containing these changes and resolve it, but I am not sure everyone did that. The conflicts this caused with upstream up to v6.2-rc8 are resolved in the final merge commit. - Retire the detach_dev() call-back in iommu_ops - Arm SMMU updates from Will: - Device-tree binding updates: * Cater for three power domains on SM6375 * Document existing compatible strings for Qualcomm SoCs * Tighten up clocks description for platform-specific compatible strings - Enable Qualcomm workarounds for some additional platforms that need them - Intel VT-d updates from Lu Baolu: - Add Intel IOMMU performance monitoring support - Set No Execute Enable bit in PASID table entry - Two performance optimizations - Fix PASID directory pointer coherency - Fix missed rollbacks in error path - Cleanups - Apple t8110 DART support - Exynos IOMMU: - Implement better fault handling - Error handling fixes - Renesas IPMMU: - Add device tree bindings for r8a779g0 - AMD IOMMU: - Various fixes for handling on SNP-enabled systems and handling of faults with unknown request-ids - Cleanups and other small fixes - Various other smaller fixes and cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmP0hDwACgkQK/BELZcB GuM43RAA0YieShO+X0h6TFGfbK0zVoPd91giZehWBv9rHK7pP4iY8UEtBLBWGx/t CId4t98mmKmC212zz8QxrwAEzyTIRY+2t1yrpG2aVkoTYk8inMb07TU37wganh3O T0QccXN+9b2BS4k8yro5f3uX0d/C1JQVcMowwr53VMb/e73huqP1VTbz06/CIWMH DUhVRCzmNhSvoUOT5n7g6+ZDH+pot8WPZbtHV7FowEsmPCRc7Fj8kXyI9FEwKwrZ hIV5Y+6Lej8nQScgbO8MfblJym3VrBoSoM4GY2w0L0rjQw6m+Xtea5rT0W39YVWy YpiscLTL8TIMPP9zK1dXVygTaABK4J2iWmheHPkpKXIhK0iuH3Dke0Do5p6DNITj 7J2YlaNEB480D5hvNBKsbbGHavgGPT8m529Sz0R7mSC7omRzqiG5Vsb46IXL+2bc 92ojjYNfXb6OCtagIr2LMBLZRL2JCODqF1dUmyZfA8GKOHLP5kZXoMM+sZbQ2aUL 1LOxRZVx+tlb9V4VaH1ZSs/6eM+HLDzjtHeu3PoWYf6mW4AEt4S/yl9SKAkGdBqt jCUErmYB1nU/eefqG1jhWRpQeJabcT3Oe30NZru1pfMoREThhjbAACw1JxWtoe1X ipGpV6lAP7tQUGuRk3/9O1lNqElJuNwC5lVTjS4FJ38vYQhQbao= =ZaZV -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Consolidate iommu_map/unmap functions. There have been blocking and atomic variants so far, but that was problematic as this approach does not scale with required new variants which just differ in the GFP flags used. So Jason consolidated this back into single functions that take a GFP parameter. - Retire the detach_dev() call-back in iommu_ops - Arm SMMU updates from Will: - Device-tree binding updates: - Cater for three power domains on SM6375 - Document existing compatible strings for Qualcomm SoCs - Tighten up clocks description for platform-specific compatible strings - Enable Qualcomm workarounds for some additional platforms that need them - Intel VT-d updates from Lu Baolu: - Add Intel IOMMU performance monitoring support - Set No Execute Enable bit in PASID table entry - Two performance optimizations - Fix PASID directory pointer coherency - Fix missed rollbacks in error path - Cleanups - Apple t8110 DART support - Exynos IOMMU: - Implement better fault handling - Error handling fixes - Renesas IPMMU: - Add device tree bindings for r8a779g0 - AMD IOMMU: - Various fixes for handling on SNP-enabled systems and handling of faults with unknown request-ids - Cleanups and other small fixes - Various other smaller fixes and cleanups * tag 'iommu-updates-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (71 commits) iommu/amd: Skip attach device domain is same as new domain iommu: Attach device group to old domain in error path iommu/vt-d: Allow to use flush-queue when first level is default iommu/vt-d: Fix PASID directory pointer coherency iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode iommu/vt-d: Fix error handling in sva enable/disable paths iommu/amd: Improve page fault error reporting iommu/amd: Do not identity map v2 capable device when snp is enabled iommu: Fix error unwind in iommu_group_alloc() iommu/of: mark an unused function as __maybe_unused iommu: dart: DART_T8110_ERROR range should be 0 to 5 iommu/vt-d: Enable IOMMU perfmon support iommu/vt-d: Add IOMMU perfmon overflow handler support iommu/vt-d: Support cpumask for IOMMU perfmon iommu/vt-d: Add IOMMU perfmon support iommu/vt-d: Support Enhanced Command Interface iommu/vt-d: Retrieve IOMMU perfmon capability information iommu/vt-d: Support size of the register set in DRHD iommu/vt-d: Set No Execute Enable bit in PASID table entry iommu/vt-d: Remove sva from intel_svm_dev ...	2023-02-24 13:40:13 -08:00
Mikko Perttunen	c24973ed79	gpu: host1x: Implement job tracking using DMA fences In anticipation of removal of the intr API, implement job tracking using DMA fences instead. The main two things about this are making cdma_update schedule the work since fence completion can now be called from interrupt context, and some complication in ensuring the callback is not running when we free the fence. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-26 15:55:38 +01:00
Jason Gunthorpe	1369459b2e	iommu: Add a gfp parameter to iommu_map() The internal mechanisms support this, but instead of exposting the gfp to the caller it wrappers it into iommu_map() and iommu_map_atomic() Fix this instead of adding more variants for GFP_KERNEL_ACCOUNT. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/1-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-01-25 11:52:00 +01:00
Mikko Perttunen	8c92243d9e	gpu: host1x: Generalize host1x_cdma_push_wide() host1x_cdma_push_wide() had the assumptions that the last parameter word was a NOP opcode, and that NOP opcodes could be used in all situations. Neither are true with the new job opcode sequence, so adjust the function to not have these assumptions, and instead place an early RESTART opcode when necessary to jump back to the beginning of the pushbuffer. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 17:36:26 +02:00
Mikko Perttunen	0ae4ae9158	gpu: host1x: Use RESTART_W to skip timed out jobs on Tegra186+ When MLOCK enforcement is enabled, the 0-word write currently done is rejected by the hardware outside of an MLOCK region. As such, on these chips, which also have the newer, more convenient RESTART_W opcode, use that instead to skip over the timed out job. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:53 +02:00
Mikko Perttunen	c78f837ae3	gpu: host1x: Add no-recovery mode Add a new property for jobs to enable or disable recovery i.e. CPU increments of syncpoints to max value on job timeout. This allows for a more solid model for hanged jobs, where userspace doesn't need to guess if a syncpoint increment happened because the job completed, or because job timeout was triggered. On job timeout, we stop the channel, NOP all future jobs on the channel using the same syncpoint, mark the syncpoint as locked and resume the channel from the next job, if any. The future jobs are NOPed, since because we don't do the CPU increments, the value of the syncpoint is no longer synchronized, and any waiters would become confused if a future job incremented the syncpoint. The syncpoint is marked locked to ensure that any future jobs cannot increment the syncpoint either, until the application has recognized the situation and reallocated the syncpoint. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-08-10 14:40:23 +02:00
Mikko Perttunen	2aed4f5ab0	gpu: host1x: Cleanup and refcounting for syncpoints Add reference counting for allocated syncpoints to allow keeping them allocated while jobs are referencing them. Additionally, clean up various places using syncpoint IDs to use host1x_syncpt pointers instead. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-03-31 17:42:13 +02:00
Ben Dooks (Codethink)	33904487f1	gpu: host1x: Make host1x_cdma_wait_pushbuffer_space() static The host1x_cdma_wait_pushbuffer_space() function is not declared or directly called from outside the file it is in, so make it static. Fixes the following sparse warning: drivers/gpu/host1x/cdma.c:235:5: warning: symbol 'host1x_cdma_wait_pushbuffer_space' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Thierry Reding <treding@nvidia.com>	2019-10-28 11:18:34 +01:00
Thomas Gleixner	9952f6918d	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 228 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:29:52 -07:00
Dmitry Osipenko	79930bafe2	gpu: host1x: Continue CDMA execution starting with a next job Currently gathers of a hung job are getting NOP'ed and a restarted CDMA executes the NOP'ed gathers. There shouldn't be a reason to not restart CDMA execution starting with a next job, avoiding the unnecessary churning with gathers NOP'ing. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2019-02-07 18:34:25 +01:00
Dmitry Osipenko	5d6f043685	gpu: host1x: Don't complete a completed job There is a chance that the last job has been completed at the time of CDMA timeout handler invocation. In this case there is no need to complete the completed job. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2019-02-07 18:30:58 +01:00
Dmitry Osipenko	e8bad65938	gpu: host1x: Cancel only job that actually got stuck Host1x doesn't have information about jobs inter-dependency, that is something that will become available once host1x will get a proper jobs scheduler implementation. Currently a hang job causes other unrelated jobs to be canceled, that is a relic from downstream driver which is irrelevant to upstream. Let's cancel only the hanging job and not to touch other jobs in queue. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2019-02-07 18:30:24 +01:00
Thierry Reding	e1f338c0f8	gpu: host1x: Optimize CDMA push buffer memory usage The host1x CDMA push buffer is terminated by a special opcode (RESTART) that tells the CDMA to wrap around to the beginning of the push buffer. To accomodate the RESTART opcode, an extra 4 bytes are allocated on top of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words) that are used for other commands passed to CDMA. This requires that two memory pages are allocated, but most of the second page (4092 bytes) is never used. Decrease the number of slots to 511 so that the RESTART opcode fits within the page. Adjust the push buffer wraparound code to take into account push buffer sizes that are not a power of two. Signed-off-by: Thierry Reding <treding@nvidia.com>	2019-02-07 18:28:59 +01:00
Thierry Reding	5a5fccbd8c	gpu: host1x: Introduce support for wide opcodes The CDMA push buffer can currently only handle opcodes that take a single word parameter. However, the host1x implementation on Tegra186 and later supports opcodes that require multiple words as parameters. Unfortunately the way the push buffer is structured, these wide opcodes cannot simply be composed of two regular opcodes because that could result in the wide opcode being split across the end of the push buffer and the final RESTART opcode required to wrap the push buffer around would break the wide opcode. One way to fix this would be to remove the concept of slots to simplify push buffer operations. However, that's not entirely trivial and should be done in a separate patch. For now, simply use a different function to push four-word opcodes into the push buffer. Technically only three words are pushed, with the fourth word used as padding to preserve the 2-word alignment required by the slots abstraction. The fourth word is always a NOP opcode. Additional care must be taken when the end of the push buffer is reached. If a four-word opcode doesn't fit into the push buffer without being split by the boundary, NOP opcodes will be introduced and the new wide opcode placed at the beginning of the push buffer. Signed-off-by: Thierry Reding <treding@nvidia.com>	2019-02-07 18:28:35 +01:00
Arnd Bergmann	0747a672a3	gpu: host1x: Use completion instead of semaphore In this usage, the two are completely equivalent, but the completion documents better what is going on, and we generally try to avoid semaphores these days. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thierry Reding <treding@nvidia.com>	2019-02-04 08:35:54 +01:00
Thierry Reding	bf3d41ccab	gpu: host1x: Store pointer to client in jobs Rather than storing some identifier derived from the application context that can't be used concretely anywhere, store a pointer to the client directly so that accesses can be made directly through that client object. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-18 21:50:24 +02:00
Emil Goode	2f8a6da866	gpu: host1x: Fix compiler errors by converting to dma_addr_t The compiler is complaining with the following errors: drivers/gpu/host1x/cdma.c:94:48: error: passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type [-Werror=incompatible-pointer-types] drivers/gpu/host1x/cdma.c:113:48: error: passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type [-Werror=incompatible-pointer-types] The expected pointer type of the third argument to dma_alloc_wc() is dma_addr_t but phys_addr_t is passed. Change the phys member of struct push_buffer to be dma_addr_t so that we pass the correct type to dma_alloc_wc(). Also check pb->mapped for non-NULL in the destroy function as that is the right way of checking if dma_alloc_wc() was successful. Signed-off-by: Emil Goode <emil.fsw@goode.io> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-18 11:23:48 +02:00
Dmitry Osipenko	27db6a0073	gpu: host1x: Fix dma_free_wc() argument in the error path If IOVA allocation or IOMMU mapping fails, dma_free_wc() is invoked with size=0 because of a typo, that triggers "kernel BUG at mm/vmalloc.c:124!". Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-17 17:44:48 +02:00
Mikko Perttunen	404bfb78da	gpu: host1x: Add IOMMU support Add support for the Host1x unit to be located behind an IOMMU. This is required when gather buffers may be allocated non-contiguously in physical memory, as can be the case when TegraDRM is also using the IOMMU. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-04-05 18:11:43 +02:00
Thierry Reding	0b8070d12e	gpu: host1x: Whitespace cleanup for readability Insert a number of blank lines in places where they increase readability of the code. Also collapse various variable declarations to shorten some functions and finally rewrite some code for readability. Signed-off-by: Thierry Reding <treding@nvidia.com>	2016-06-23 11:59:30 +02:00
Thierry Reding	6df633d0dc	gpu: host1x: Fix a couple of checkpatch warnings Fix a couple of occurrences where no blank line was used to separate variable declarations from code or where block comments were wrongly formatted. Signed-off-by: Thierry Reding <treding@nvidia.com>	2016-06-23 11:59:28 +02:00
Thierry Reding	ebb2475c47	gpu: host1x: cdma: Drop unnecessary local variable The local 'pos' variable doesn't serve any purpose other than being a shortcut for pb->pos, but the result doesn't remove much, so simply drop the local variable. Signed-off-by: Thierry Reding <treding@nvidia.com>	2016-06-23 11:59:27 +02:00
Luis R. Rodriguez	f6e45661f9	dma, mm/pat: Rename dma__writecombine() to dma__wc() Rename dma__writecombine() to dma__wc(), so that the naming is coherent across the various write-combining APIs. Keep the old names for compatibility for a while, these can be removed at a later time. A guard is left to enable backporting of the rename, and later remove of the old mapping defines seemlessly. Build tested successfully with allmodconfig. The following Coccinelle SmPL patch was used for this simple transformation: @ rename_dma_alloc_writecombine @ expression dev, size, dma_addr, gfp; @@ -dma_alloc_writecombine(dev, size, dma_addr, gfp) +dma_alloc_wc(dev, size, dma_addr, gfp) @ rename_dma_free_writecombine @ expression dev, size, cpu_addr, dma_addr; @@ -dma_free_writecombine(dev, size, cpu_addr, dma_addr) +dma_free_wc(dev, size, cpu_addr, dma_addr) @ rename_dma_mmap_writecombine @ expression dev, vma, cpu_addr, dma_addr, size; @@ -dma_mmap_writecombine(dev, vma, cpu_addr, dma_addr, size) +dma_mmap_wc(dev, vma, cpu_addr, dma_addr, size) We also keep the old names as compatibility helpers, and guard against their definition to make backporting easier. Generated-by: Coccinelle SmPL Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: benh@kernel.crashing.org Cc: bhelgaas@google.com Cc: bp@suse.de Cc: dan.j.williams@intel.com Cc: daniel.vetter@ffwll.ch Cc: dhowells@redhat.com Cc: julia.lawall@lip6.fr Cc: konrad.wilk@oracle.com Cc: linux-fbdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: luto@amacapital.net Cc: mst@redhat.com Cc: tomi.valkeinen@ti.com Cc: toshi.kani@hp.com Cc: vinod.koul@intel.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1453516462-4844-1-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-09 14:57:51 +01:00
Thierry Reding	0169b93f44	gpu: host1x: Make mapped field of push buffers void * This reduces the amount of casting that needs to be done to get rid of annoying warnings on 64-bit builds. Signed-off-by: Thierry Reding <treding@nvidia.com>	2014-11-13 16:11:35 +01:00
Thierry Reding	35d747a81d	gpu: host1x: Expose syncpt and channel functionality Expose the buffer objects, syncpoint and channel functionality in the public public header so that drivers can use them. Signed-off-by: Thierry Reding <treding@nvidia.com>	2013-10-31 09:20:11 +01:00
Terje Bergstrom	6236451d83	gpu: host1x: Add debug support Add support for host1x debugging. Adds debugfs entries, and dumps channel state to UART in case of stuck job. Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>	2013-04-22 12:32:46 +02:00
Terje Bergstrom	6579324a41	gpu: host1x: Add channel support Add support for host1x client modules, and host1x channels to submit work to the clients. Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>	2013-04-22 12:32:43 +02:00

27 Commits