The SID configuration requirement for Modem on SC7280 is similar to the
ones found on SC7180/SDM845 SoCs. So, add the SC7280 modem compatible to
the list to defer the programming of the modem SIDs to the kernel.
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
Reviewed-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/1631886935-14691-5-git-send-email-sibis@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
Now that SCM can be a loadable module, we have to add another
dependency to avoid link failures when ipa or adreno-gpu are
built-in:
aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_probe':
ipa_main.c:(.text+0xfc4): undefined reference to `qcom_scm_is_available'
ld.lld: error: undefined symbol: qcom_scm_is_available
>>> referenced by adreno_gpu.c
>>> gpu/drm/msm/adreno/adreno_gpu.o:(adreno_zap_shader_load) in archive drivers/built-in.a
This can happen when CONFIG_ARCH_QCOM is disabled and we don't select
QCOM_MDT_LOADER, but some other module selects QCOM_SCM. Ideally we'd
use a similar dependency here to what we have for QCOM_RPROC_COMMON,
but that causes dependency loops from other things selecting QCOM_SCM.
This appears to be an endless problem, so try something different this
time:
- CONFIG_QCOM_SCM becomes a hidden symbol that nothing 'depends on'
but that is simply selected by all of its users
- All the stubs in include/linux/qcom_scm.h can go away
- arm-smccc.h needs to provide a stub for __arm_smccc_smc() to
allow compile-testing QCOM_SCM on all architectures.
- To avoid a circular dependency chain involving RESET_CONTROLLER
and PINCTRL_SUNXI, drop the 'select RESET_CONTROLLER' statement.
According to my testing this still builds fine, and the QCOM
platform selects this symbol already.
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Alex Elder <elder@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Add compatible for QCM2290 SMMU to use the Qualcomm Technologies, Inc.
specific implementation.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/1633096832-7762-1-git-send-email-loic.poulain@linaro.org
[will: Sort by alphabetical order, add commit message]
Signed-off-by: Will Deacon <will@kernel.org>
1. Build command CMD_SYNC cannot fail. So the return value can be ignored.
2. The arm_smmu_cmdq_build_cmd() almost never fails, the addition of
"unlikely()" can optimize the instruction pipeline.
3. Check the return value in arm_smmu_cmdq_batch_add().
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210818080452.2079-2-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Pre-zeroing the batched commands structure is inefficient, as individual
commands are zeroed later in arm_smmu_cmdq_build_cmd(). Therefore, only
the member 'num' needs to be initialized to 0.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20210817113411.1962-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Add the missing unlock before return from function arm_smmu_device_group()
in the error handling case.
Fixes: b1a1347912 ("iommu/arm-smmu: Fix race condition during iommu_group creation")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210820074949.1946576-1-yangyingliang@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
IO_PGTABLE_QUIRK_NON_STRICT was never a very comfortable fit, since it's
not a quirk of the pagetable format itself. Now that we have a more
appropriate way to convey non-strict unmaps, though, this last of the
non-quirk quirks can also go, and with the flush queue code also now
enforcing its own ordering we can have a lovely cleanup all round.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/155b5c621cd8936472e273a8b07a182f62c6c20d.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pre-zeroing the batched commands structure is inefficient, as individual
commands are zeroed later in arm_smmu_cmdq_build_cmd(). The size is quite
large and commonly most commands won't even be used:
struct arm_smmu_cmdq_batch cmds = {};
345c: 52800001 mov w1, #0x0 // #0
3460: d2808102 mov x2, #0x408 // #1032
3464: 910143a0 add x0, x29, #0x50
3468: 94000000 bl 0 <memset>
Stop pre-zeroing the complete structure and only zero the num member.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1628696966-88386-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
When SMMU_GERROR.CMDQP_ERR is different to SMMU_GERRORN.CMDQP_ERR, it
indicates that one or more errors have been encountered on a command queue
control page interface. We need to traverse all ECMDQs in that control
page to find all errors. For each ECMDQ error handling, it is much the
same as the CMDQ error handling. This common processing part is extracted
as a new function __arm_smmu_cmdq_skip_err().
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-5-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
One SMMU has only one normal CMDQ. Therefore, this CMDQ is used regardless
of the core on which the command is inserted. It can be referenced
directly through "smmu->cmdq". However, one SMMU has multiple ECMDQs, and
the ECMDQ used by the core on which the command insertion is executed may
be different. So the helper function arm_smmu_get_cmdq() is added, which
returns the CMDQ/ECMDQ that the current core should use. Currently, the
code that supports ECMDQ is not added. just simply returns "&smmu->cmdq".
Many subfunctions of arm_smmu_cmdq_issue_cmdlist() use "&smmu->cmdq" or
"&smmu->cmdq.q" directly. To support ECMDQ, they need to call the newly
added function arm_smmu_get_cmdq() instead.
Note that normal CMDQ is still required until ECMDQ is available.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-4-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The obvious key to the performance optimization of commit 587e6c10a7
("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is
to allow multiple cores to insert commands in parallel after a brief mutex
contention.
Obviously, inserting as many commands at a time as possible can reduce the
number of times the mutex contention participates, thereby improving the
overall performance. At least it reduces the number of calls to function
arm_smmu_cmdq_issue_cmdlist().
Therefore, function arm_smmu_cmdq_issue_cmd_with_sync() is added to insert
the 'cmd+sync' commands at a time.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-3-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The obvious key to the performance optimization of commit 587e6c10a7
("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is
to allow multiple cores to insert commands in parallel after a brief mutex
contention.
Obviously, inserting as many commands at a time as possible can reduce the
number of times the mutex contention participates, thereby improving the
overall performance. At least it reduces the number of calls to function
arm_smmu_cmdq_issue_cmdlist().
Therefore, use command queue batching helpers to insert multiple commands
at a time.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-2-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Currently for iommu_unmap() of large scatter-gather list with page size
elements, the majority of time is spent in flushing of partial walks in
__arm_lpae_unmap() which is a VA based TLB invalidation invalidating
page-by-page on iommus like arm-smmu-v2 (TLBIVA).
For example: to unmap a 32MB scatter-gather list with page size elements
(8192 entries), there are 16->2MB buffer unmaps based on the pgsize (2MB
for 4K granule) and each of 2MB will further result in 512 TLBIVAs (2MB/4K)
resulting in a total of 8192 TLBIVAs (512*16) for 16->2MB causing a huge
overhead.
On qcom implementation, there are several performance improvements for
TLB cache invalidations in HW like wait-for-safe (for realtime clients
such as camera and display) and few others to allow for cache
lookups/updates when TLBI is in progress for the same context bank.
So the cost of over-invalidation is less compared to the unmap latency
on several usecases like camera which deals with large buffers. So,
ASID based TLB invalidations (TLBIASID) can be used to invalidate the
entire context for partial walk flush thereby improving the unmap
latency.
For this example of 32MB scatter-gather list unmap, this change results
in just 16 ASID based TLB invalidations (TLBIASIDs) as opposed to 8192
TLBIVAs thereby increasing the performance of unmaps drastically.
Test on QTI SM8150 SoC for 10 iterations of iommu_{map_sg}/unmap:
(average over 10 iterations)
Before this optimization:
size iommu_map_sg iommu_unmap
4K 2.067 us 1.854 us
64K 9.598 us 8.802 us
1M 148.890 us 130.718 us
2M 305.864 us 67.291 us
12M 1793.604 us 390.838 us
16M 2386.848 us 518.187 us
24M 3563.296 us 775.989 us
32M 4747.171 us 1033.364 us
After this optimization:
size iommu_map_sg iommu_unmap
4K 1.723 us 1.765 us
64K 9.880 us 8.869 us
1M 155.364 us 135.223 us
2M 303.906 us 5.385 us
12M 1786.557 us 21.250 us
16M 2391.890 us 27.437 us
24M 3570.895 us 39.937 us
32M 4755.234 us 51.797 us
Real world data also shows big difference in unmap performance as below:
There were reports of camera frame drops because of high overhead in
iommu unmap without this optimization because of frequent unmaps issued
by camera of about 100MB/s taking more than 100ms thereby causing frame
drops.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Link: https://lore.kernel.org/r/20210811160426.10312-1-saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
When two devices with same SID are getting probed concurrently through
iommu_probe_device(), the iommu_group sometimes is getting allocated more
than once as call to arm_smmu_device_group() is not protected for
concurrency. Furthermore, it leads to each device holding a different
iommu_group and domain pointer, separate IOVA space and only one of the
devices' domain is used for translations from IOMMU. This causes accesses
from other device to fault or see incorrect translations.
Fix this by protecting iommu_group allocation from concurrency in
arm_smmu_device_group().
Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
Link: https://lore.kernel.org/r/1628570641-9127-3-git-send-email-amhetre@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
Some clocks for SMMU can have parent as XO such as gpu_cc_hub_cx_int_clk
of GPU SMMU in QTI SC7280 SoC and in order to enter deep sleep states in
such cases, we would need to drop the XO clock vote in unprepare call and
this unprepare callback for XO is in RPMh (Resource Power Manager-Hardened)
clock driver which controls RPMh managed clock resources for new QTI SoCs.
Given we cannot have a sleeping calls such as clk_bulk_prepare() and
clk_bulk_unprepare() in arm-smmu runtime pm callbacks since the iommu
operations like map and unmap can be in atomic context and are in fast
path, add this prepare and unprepare call to drop the XO vote only for
system pm callbacks since it is not a fast path and we expect the system
to enter deep sleep states with system pm as opposed to runtime pm.
This is a similar sequence of clock requests (prepare,enable and
disable,unprepare) in arm-smmu probe and remove.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Co-developed-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Link: https://lore.kernel.org/r/20210810064808.32486-1-saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
Implement the map_pages() callback for ARM SMMUV3 driver to allow calls
from iommu_map to map multiple pages of the same size in one call.
Also remove the map() callback for the ARM SMMUV3 driver as it will no
longer be used.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1627697831-158822-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Implement the unmap_pages() callback for ARM SMMUV3 driver to allow calls
from iommu_unmap to unmap multiple pages of the same size in one call.
Also remove the unmap() callback for the ARM SMMUV3 driver as it will
no longer be used.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1627697831-158822-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Members of struct "llq" will be zero-inited, apart from member max_n_shift.
But we write llq.val straight after the init, so it was pointless to zero
init those other members. As such, separately init member max_n_shift
only.
In addition, struct "head" is initialised to "llq" only so that member
max_n_shift is set. But that member is never referenced for "head", so
remove any init there.
Removing these initializations is seen as a small performance optimisation,
as this code is (very) hot path.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1624293394-202509-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
If people are going to insist on calling iommu_iova_to_phys()
pointlessly and expecting it to work, we can at least do ourselves a
favour by handling those cases in the core code, rather than repeatedly
across an inconsistent handful of drivers.
Since all the existing drivers implement the internal callback, and any
future ones are likely to want to work with iommu-dma which relies on
iova_to_phys a fair bit, we may as well remove that currently-redundant
check as well and consider it mandatory.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/f564f3f6ff731b898ff7a898919bf871c2c7745a.1626354264.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Implement the map_pages() callback for the ARM SMMU driver
to allow calls from iommu_map to map multiple pages of
the same size in one call. Also, remove the map() callback
for the ARM SMMU driver, as it will no longer be used.
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/1623850736-389584-16-git-send-email-quic_c_gdjako@quicinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Implement the unmap_pages() callback for the ARM SMMU driver
to allow calls from iommu_unmap to unmap multiple pages of
the same size in one call. Also, remove the unmap() callback
for the SMMU driver, as it will no longer be used.
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/1623850736-389584-15-git-send-email-quic_c_gdjako@quicinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Hi Linus,
Please, pull the following patches that fix many fall-through
warnings when building with Clang and -Wimplicit-fallthrough.
This pull-request also contains the patch for Makefile that enables
-Wimplicit-fallthrough for Clang, globally.
It's also important to notice that since we have adopted the use of
the pseudo-keyword macro fallthrough; we also want to avoid having
more /* fall through */ comments being introduced. Notice that contrary
to GCC, Clang doesn't recognize any comments as implicit fall-through
markings when the -Wimplicit-fallthrough option is enabled. So, in
order to avoid having more comments being introduced, we have to use
the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang,
will cause a warning in case a code comment is intended to be used
as a fall-through marking. The patch for Makefile also enforces this.
We had almost 4,000 of these issues for Clang in the beginning,
and there might be a couple more out there when building some
architectures with certain configurations. However, with the
recent fixes I think we are in good shape and it is now possible
to enable -Wimplicit-fallthrough for Clang. :)
Thanks!
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmDvPV0ACgkQRwW0y0cG
2zGJ1xAAsbSN8I+ESxFCKi4UwQC71988isb0rHGL6rQPBbEYD2bbI+82aGu6WH8z
2ZVQZa5x3tUoIsqR2GMK5Npn1dFvuNWGZkYzlgjAp+FgzaULCLUc4NW9DEWU/Kbd
UBz8ilXL7tUAlDalUYjXAVVTxMnRTZ+BDCbEmIdRZ5X/pMlD2xHeMGG4kh8/cHjJ
+mqFwCFPZoyLJzPBnm37OEfIr8JxrLZAo1gfmz5sxfmBdYDY3hjV/BVgP8h0bkY9
ITaq0LMEAPMXvU+4DP3DxuqzRr9COYMcddmbaWwsXPd7UAuoXwGLvqx7JE0QqY3g
J80w4/hXqEa4o4/fUAsfjYEQoNTEnhl0iGskBJD2xy5Lj17/m4aOSL44ibr2Ag67
yfJPWi9tooYELEGNobPVHPuqk4ts6amKTOKqHWMBEcTEOIGzaGWPEjRoyaMivfZ3
G3ZbEPEYERcJRizm63UbciLeslGsxb6tdYQEy5VI8Wa7caz9Ci9HT+jjS7Vm2fuq
1zg+vIgwuSxvwxrSV/PDTRcQm9EM/MoRL0D4jbr7gtYD6KNwH8Vo6M21CoaB9gH3
FgMiCT4zy9KiONjVDJKkjtzuk+9OwpI5AAg0/fAQ4Ak9TzUN3Xfg7HpytETCIJ0Z
Ii3Jqwe+3IRTNq94ZTIiX/0hT7i5GsPxUfPzI2vwttCJn9ctUcM=
=XVNN
-----END PGP SIGNATURE-----
Merge tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull fallthrough fixes from Gustavo Silva:
"This fixes many fall-through warnings when building with Clang and
-Wimplicit-fallthrough, and also enables -Wimplicit-fallthrough for
Clang, globally.
It's also important to notice that since we have adopted the use of
the pseudo-keyword macro fallthrough, we also want to avoid having
more /* fall through */ comments being introduced. Contrary to GCC,
Clang doesn't recognize any comments as implicit fall-through markings
when the -Wimplicit-fallthrough option is enabled.
So, in order to avoid having more comments being introduced, we use
the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang,
will cause a warning in case a code comment is intended to be used as
a fall-through marking. The patch for Makefile also enforces this.
We had almost 4,000 of these issues for Clang in the beginning, and
there might be a couple more out there when building some
architectures with certain configurations. However, with the recent
fixes I think we are in good shape and it is now possible to enable
the warning for Clang"
* tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
Makefile: Enable -Wimplicit-fallthrough for Clang
powerpc/smp: Fix fall-through warning for Clang
dmaengine: mpc512x: Fix fall-through warning for Clang
usb: gadget: fsl_qe_udc: Fix fall-through warning for Clang
powerpc/powernv: Fix fall-through warning for Clang
MIPS: Fix unreachable code issue
MIPS: Fix fall-through warnings for Clang
ASoC: Mediatek: MT8183: Fix fall-through warning for Clang
power: supply: Fix fall-through warnings for Clang
dmaengine: ti: k3-udma: Fix fall-through warning for Clang
s390: Fix fall-through warnings for Clang
dmaengine: ipu: Fix fall-through warning for Clang
iommu/arm-smmu-v3: Fix fall-through warning for Clang
mmc: jz4740: Fix fall-through warning for Clang
PCI: Fix fall-through warning for Clang
scsi: libsas: Fix fall-through warning for Clang
video: fbdev: Fix fall-through warning for Clang
math-emu: Fix fall-through warning
cpufreq: Fix fall-through warning for Clang
drm/msm: Fix fall-through warning in msm_gem_new_impl()
...
QCOM IOMMU driver calls bus_set_iommu() for every IOMMU device controller,
what fails for the second and latter IOMMU devices. This is intended and
must be not fatal to the driver registration process. Also the cleanup
path should take care of the runtime PM state, what is missing in the
current patch. Revert relevant changes to the QCOM IOMMU driver until
a proper fix is prepared.
This partially reverts commit 249c9dc6aa.
Fixes: 249c9dc6aa ("iommu/arm: Cleanup resources in case of probe error path")
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210705065657.30356-1-m.szyprowski@samsung.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix the following fallthrough warning (arm64-randconfig with Clang):
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:382:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
- Reset controllers: Adding support for Microchip Sparx5 Switch.
- Memory controllers: ARM Primecell PL35x SMC memory controller
driver cleanups and improvements.
- i.MX SoC drivers: Power domain support for i.MX8MM and i.MX8MN.
- Rockchip: RK3568 power domains support + DT binding updates,
cleanups.
- Qualcomm SoC drivers: Amend socinfo with more SoC/PMIC details,
including support for MSM8226, MDM9607, SM6125 and SC8180X.
- ARM FFA driver: "Firmware Framework for ARMv8-A", defining
management interfaces and communication (including bus model)
between partitions both in Normal and Secure Worlds.
- Tegra Memory controller changes, including major rework to deal
with identity mappings at boot and integration with ARM SMMU
pieces.
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAmDokgYPHG9sb2ZAbGl4
b20ubmV0AAoJEIwa5zzehBx3looP/20uQAjRadPJFdV/B2mpZYqXMI4dIN9g7KJ1
6uEoaGurzYWQQreDXswQ5vFUcQfIudEJ9Im9IF+9BUsFQ2uvPTJ4I+HDN++WH70B
cIsmwwBr7Q4JUVP+O7T2WGtBY69jvHTpJrCCVtyHtwEyL4a1uyfelsAJXbxqaqis
w1lmXNkkSqx5c67H3maNNDRnbutyLL2gO0TYdiBapOcc5V03OYKNnMbDqRTddqyt
4UH4eYkFkNai8UJ476BXHU9ldlWzEkRBib/OKwF9k3oPj9W3kdQ/vd2IKK5a1fTX
jIbOPSRRC8K/9Bxn1KEtdoU0Yy+rlm3xd7DtQl5RyGTD+tHVq3dN55WjoXBY83Yh
r37y7uII9i09tPg5+APSX/jgodsIt4c46dKwvYuWXvB7ziomfsKxQiRanApJG6UX
qS5NCUrlfYWlL302JOTvEtDBePXXiXQ065GuRjM948WMnVzXwEKwYUakGhvXQWMS
jXCcOGW7GhnbY3+Ipn9chyhydHpKSxIb8oBk4cMRJU9jlN2GmjHgW8RMvT2WM6VF
1F8acyMvf6en5tV6f23cjbW+iIMTS5egKNfqi8tdjGVxbowypyJYzjYOhaqk6veJ
jHOmpglTXas0QD3ZRU7vGVlrvHqik8XyRsq3N9CQjVenRCbsQLKZRi1gTbIuspcR
rejqH3Fs
=kPg8
-----END PGP SIGNATURE-----
Merge tag 'arm-drivers-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM driver updates from Olof Johansson:
- Reset controllers: Adding support for Microchip Sparx5 Switch.
- Memory controllers: ARM Primecell PL35x SMC memory controller driver
cleanups and improvements.
- i.MX SoC drivers: Power domain support for i.MX8MM and i.MX8MN.
- Rockchip: RK3568 power domains support + DT binding updates,
cleanups.
- Qualcomm SoC drivers: Amend socinfo with more SoC/PMIC details,
including support for MSM8226, MDM9607, SM6125 and SC8180X.
- ARM FFA driver: "Firmware Framework for ARMv8-A", defining management
interfaces and communication (including bus model) between partitions
both in Normal and Secure Worlds.
- Tegra Memory controller changes, including major rework to deal with
identity mappings at boot and integration with ARM SMMU pieces.
* tag 'arm-drivers-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (120 commits)
firmware: turris-mox-rwtm: add marvell,armada-3700-rwtm-firmware compatible string
firmware: turris-mox-rwtm: show message about HWRNG registration
firmware: turris-mox-rwtm: fail probing when firmware does not support hwrng
firmware: turris-mox-rwtm: report failures better
firmware: turris-mox-rwtm: fix reply status decoding function
soc: imx: gpcv2: add support for i.MX8MN power domains
dt-bindings: add defines for i.MX8MN power domains
firmware: tegra: bpmp: Fix Tegra234-only builds
iommu/arm-smmu: Use Tegra implementation on Tegra186
iommu/arm-smmu: tegra: Implement SID override programming
iommu/arm-smmu: tegra: Detect number of instances at runtime
dt-bindings: arm-smmu: Add Tegra186 compatible string
firmware: qcom_scm: Add MDM9607 compatible
soc: qcom: rpmpd: Add MDM9607 RPM Power Domains
soc: renesas: Add support to read LSI DEVID register of RZ/G2{L,LC} SoC's
soc: renesas: Add ARCH_R9A07G044 for the new RZ/G2L SoC's
dt-bindings: soc: rockchip: drop unnecessary #phy-cells from grf.yaml
memory: emif: remove unused frequency and voltage notifiers
memory: fsl_ifc: fix leak of private memory on probe failure
memory: fsl_ifc: fix leak of IO mapping on probe failure
...
Including:
- SMMU Updates from Will Deacon:
- SMMUv3: Support stalling faults for platform devices
- SMMUv3: Decrease defaults sizes for the event and PRI queues
- SMMUv2: Support for a new '->probe_finalize' hook, needed by Nvidia
- SMMUv2: Even more Qualcomm compatible strings
- SMMUv2: Avoid Adreno TTBR1 quirk for DB820C platform
- Intel VT-d updates from Lu Baolu:
- Convert Intel IOMMU to use sva_lib helpers in iommu core
- ftrace and debugfs supports for page fault handling
- Support asynchronous nested capabilities
- Various misc cleanups
- Support for new VIOT ACPI table to make the VirtIO IOMMU:
available on x86
- Add the amd_iommu=force_enable command line option to
enable the IOMMU on platforms where they are known to cause
problems
- Support for version 2 of the Rockchip IOMMU
- Various smaller fixes, cleanups and refactorings
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmDexqwACgkQK/BELZcB
GuOy/w//cr331yKDZDF8DSkWxUHYNXitBlW12nShgYselhlREb5mB1vmpOHuWKus
K++w7tWA19/qMs/p7yPoS+zCEA3xiZjO+OgyjxTzkxfaQD4GMmP7hK+ItRCXiz9E
6QZXLOqexniydpa+KEg3rewibcrJwgIpz6QHT8FBrMISiEPRUw5oLeytv6rNjPWx
WyBRNA+TjNvnybFbWp9gTdgWCshygJMv1WlU7ySZcH45Mq4VKxS4Kihe1fTLp38s
vBqfRuUHhYcuNmExgjBuK3y8dq7hU8XelKjSm2yvp9srGbhD0NFT1Ng7iZQj43bi
Eh2Ic8O9miBvm/uJ0ug6PGcEUcdfoHf/PIMqLZMRBj79+9RKxNBzWOAkBd2RwH3A
Wy98WdOsX4+3MB5EKznxnIQMMA5Rtqyy/gLDd5j4xZnL5Ha3j0oQ9WClD+2iMfpV
v150GXNOKNDNNjlzXulBxNYzUOK8KKxse9OPg8YDevZPEyVNPH2yXZ6xjoe7SYCW
FGhaHXdCfRxUk9lsQNtb23CNTKl7Qd6JOOJLZqAfpWzQjQu4wB4kfbnv0zEhMAvi
XxVqijV/XTWyVXe3HkzByHnqQxrn7YKbC/Ql/BMmk8kbwFvpJbFIaTLvRVsBnE2r
8Edn5Cjuz4CC83YPQ3EBQn85TpAaUNMDRMc1vz5NrQTqJTzYZag=
=IxNB
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- SMMU Updates from Will Deacon:
- SMMUv3:
- Support stalling faults for platform devices
- Decrease defaults sizes for the event and PRI queues
- SMMUv2:
- Support for a new '->probe_finalize' hook, needed by Nvidia
- Even more Qualcomm compatible strings
- Avoid Adreno TTBR1 quirk for DB820C platform
- Intel VT-d updates from Lu Baolu:
- Convert Intel IOMMU to use sva_lib helpers in iommu core
- ftrace and debugfs supports for page fault handling
- Support asynchronous nested capabilities
- Various misc cleanups
- Support for new VIOT ACPI table to make the VirtIO IOMMU
available on x86
- Add the amd_iommu=force_enable command line option to enable
the IOMMU on platforms where they are known to cause problems
- Support for version 2 of the Rockchip IOMMU
- Various smaller fixes, cleanups and refactorings
* tag 'iommu-updates-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
iommu/virtio: Enable x86 support
iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()
ACPI: Add driver for the VIOT table
ACPI: Move IOMMU setup code out of IORT
ACPI: arm64: Move DMA setup operations out of IORT
iommu/vt-d: Fix dereference of pointer info before it is null checked
iommu: Update "iommu.strict" documentation
iommu/arm-smmu: Check smmu->impl pointer before dereferencing
iommu/arm-smmu-v3: Remove unnecessary oom message
iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation
iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails
iommu/vt-d: Fix linker error on 32-bit
iommu/vt-d: No need to typecast
iommu/vt-d: Define counter explicitly as unsigned int
iommu/vt-d: Remove unnecessary braces
iommu/vt-d: Removed unused iommu_count in dmar domain
iommu/vt-d: Use bitfields for DMAR capabilities
iommu/vt-d: Use DEVICE_ATTR_RO macro
iommu/vt-d: Fix out-bounds-warning in intel/svm.c
iommu/vt-d: Add PRQ handling latency sampling
...
Add, via the adreno-smmu-priv interface, a way for the GPU to request
the SMMU to stall translation on faults, and then later resume the
translation, either retrying or terminating the current translation.
This will be used on the GPU side to "freeze" the GPU while we snapshot
useful state for devcoredump.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-5-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Add a callback in adreno-smmu-priv to read interesting SMMU
registers to provide an opportunity for a richer debug experience
in the GPU driver.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Call report_iommu_fault() to allow upper-level drivers to register their
own fault handlers.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Merge in support for the Arm SMMU '->probe_finalize()' implementation
callback, which is required to prevent early faults in conjunction with
Nvidia's memory controller.
* for-thierry/arm-smmu:
iommu/arm-smmu: Check smmu->impl pointer before dereferencing
iommu/arm-smmu: Implement ->probe_finalize()
Commit 0d97174aea ("iommu/arm-smmu: Implement ->probe_finalize()")
added a new optional ->probe_finalize callback to 'struct arm_smmu_impl'
but neglected to check that 'smmu->impl' is present prior to checking
if the new callback is present.
Add the missing check, which avoids dereferencing NULL when probing an
SMMU which doesn't require any implementation-specific callbacks:
| Unable to handle kernel NULL pointer dereference at virtual address
| 0000000000000070
|
| Call trace:
| arm_smmu_probe_finalize+0x14/0x48
| of_iommu_configure+0xe4/0x1b8
| of_dma_configure_id+0xf8/0x2d8
| pci_dma_configure+0x44/0x88
| really_probe+0xc0/0x3c0
Fixes: 0d97174aea ("iommu/arm-smmu: Implement ->probe_finalize()")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will@kernel.org>
Fixes scripts/checkpatch.pl warning:
WARNING: Possible unnecessary 'out of memory' message
Remove it can help us save a bit of memory.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210609125438.14369-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The reference counting issue happens in several exception handling paths
of arm_smmu_iova_to_phys_hard(). When those error scenarios occur, the
function forgets to decrease the refcount of "smmu" increased by
arm_smmu_rpm_get(), causing a refcount leak.
Fix this issue by jumping to "out" label when those error scenarios
occur.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/1623293391-17261-1-git-send-email-xiyuyang19@fudan.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
arm_smmu_rpm_get() invokes pm_runtime_get_sync(), which increases the
refcount of the "smmu" even though the return value is less than 0.
The reference counting issue happens in some error handling paths of
arm_smmu_rpm_get() in its caller functions. When arm_smmu_rpm_get()
fails, the caller functions forget to decrease the refcount of "smmu"
increased by arm_smmu_rpm_get(), causing a refcount leak.
Fix this issue by calling pm_runtime_resume_and_get() instead of
pm_runtime_get_sync() in arm_smmu_rpm_get(), which can keep the refcount
balanced in case of failure.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Link: https://lore.kernel.org/r/1623293672-17954-1-git-send-email-xiyuyang19@fudan.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
Tegra186 requires the same SID override programming as Tegra194 in order
to seamlessly transition from the firmware framebuffer to the Linux
framebuffer, so the Tegra implementation needs to be used on Tegra186
devices as well.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210603164632.1000458-7-thierry.reding@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
The secure firmware keeps some SID override registers set as passthrough
in order to allow devices such as the display controller to operate with
no knowledge of SMMU translations until an operating system driver takes
over. This is needed in order to seamlessly transition from the firmware
framebuffer to the OS framebuffer.
Upon successfully attaching a device to the SMMU and in the process
creating identity mappings for memory regions that are being accessed,
the Tegra implementation will call into the memory controller driver to
program the override SIDs appropriately.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210603164632.1000458-6-thierry.reding@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Parse the reg property in device tree and detect the number of instances
represented by a device tree node. This is subsequently needed in order
to support single-instance SMMUs with the Tegra implementation because
additional programming is needed to properly configure the SID override
registers in the memory controller.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210603164632.1000458-5-thierry.reding@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
The struct acpi_platform_list and function acpi_match_platform_list()
defined in include/linux/acpi.h are available only when CONFIG_ACPI is
enabled. Add protection to fix the build issues with !CONFIG_ACPI.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Link: https://lore.kernel.org/r/20210609015511.3955-1-shawn.guo@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
If device registration fails, remove sysfs attribute
and if setting bus callbacks fails, unregister the device
and cleanup the sysfs attribute.
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Link: https://lore.kernel.org/r/20210608164559.204023-1-ameynarkhede03@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Adreno(GPU) SMMU and APSS(Application Processor SubSystem) SMMU
both implement "arm,mmu-500" in some QTI SoCs and to run through
adreno smmu specific implementation such as enabling split pagetables
support, we need to match the "qcom,adreno-smmu" compatible first
before apss smmu or else we will be running apps smmu implementation
for adreno smmu and the additional features for adreno smmu is never
set. For ex: we have "qcom,sc7280-smmu-500" compatible for both apps
and adreno smmu implementing "arm,mmu-500", so the adreno smmu
implementation is never reached because the current sequence checks
for apps smmu compatible(qcom,sc7280-smmu-500) first and runs that
specific impl and we never reach adreno smmu specific implementation.
Suggested-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/c42181d313fdd440011541a28cde8cd10fffb9d3.1623155117.git.saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
The only place of_iommu.h is needed is in drivers/of/device.c. Remove it
from everywhere else.
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Yong Wu <yong.wu@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20210527193710.1281746-2-robh@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit d25f6ead16 ("iommu/arm-smmu-v3: Increase maximum size of queues")
expands the cmdq queue size to improve the success rate of concurrent
command queue space allocation by multiple cores. However, this extension
does not apply to evtq and priq, because for both of them, the SMMU driver
is the consumer. Instead, memory resources are wasted. Therefore, the
queue size of evtq and priq is restored to the original setting, one page.
Fixes: d25f6ead16 ("iommu/arm-smmu-v3: Increase maximum size of queues")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210531123553.9602-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
When a device or driver misbehaves, it is possible to receive DMA fault
events much faster than we can print them out, causing a lock up of the
system and inability to cancel the source of the problem. Ratelimit
printing of events to help recovery.
Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210531095648.118282-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>