iommu: arm-smmu-impl: Add sdm845 implementation hook
Add reset hook for sdm845 based platforms to turn off
the wait-for-safe sequence.
Understanding how wait-for-safe logic affects USB and UFS performance
on MTP845 and DB845 boards:
Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic
to address under-performance issues in real-time clients, such as
Display, and Camera.
On receiving an invalidation request, the SMMU forwards a SAFE request
to these clients and waits for the SAFE ack signal from real-time clients.
The SAFE signal from such clients is used to qualify the start of
invalidation.
This logic is controlled by chicken bits, one for each - MDP (display),
IFE0, and IFE1 (camera), that can be accessed only from secure software
on sdm845.
This configuration, however, degrades the performance of non-real time
clients, such as USB and UFS. This happens because, with wait-for-safe
logic enabled, the hardware tries to throttle non-real time clients while
waiting for SAFE ack signals from real-time clients.
On mtp845 and db845 devices, with wait-for-safe logic enabled by the
bootloaders, we see degraded performance of USB and UFS when the kernel
enables the smmu stage-1 translations for these clients.
Turning off this wait-for-safe logic from the kernel gets us back the perf
of USB and UFS devices until we revisit this when we start seeing perf
issues on display/camera on upstream supported SDM845 platforms.
The bootloaders on these boards implement secure monitor callbacks to
handle a specific command - QCOM_SCM_SVC_SMMU_PROGRAM with which the
logic can be toggled.
There are other boards such as cheza whose bootloaders don't enable this
logic. Such boards don't implement callbacks to handle the specific SCM
call so disabling this logic for such boards will be a no-op.
This change is inspired by the downstream change from Patrick Daly
to address performance issues with display and camera by handling
this wait-for-safe within separate io-pagetable ops to do TLB
maintenance. So a big thanks to him for the change and for all the
offline discussions.
Without this change the UFS reads are pretty slow:
$ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
10+0 records in
10+0 records out
10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
real 0m 22.39s
user 0m 0.00s
sys 0m 0.01s
With this change they are back to rock!
$ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
300+0 records in
300+0 records out
314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
real 0m 1.03s
user 0m 0.00s
sys 0m 0.54s
Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-09-20 16:04:29 +08:00
// SPDX-License-Identifier: GPL-2.0-only
/*
 * Copyright (c) 2019, The Linux Foundation. All rights reserved.
 */

#include <linux/acpi.h>
#include <linux/adreno-smmu-priv.h>
#include <linux/of_device.h>
#include <linux/qcom_scm.h>

#include "arm-smmu.h"

struct qcom_smmu {
	struct arm_smmu_device smmu;
	bool bypass_quirk;
	u8 bypass_cbndx;
	u32 stall_enabled;
};

static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu)
{
	return container_of(smmu, struct qcom_smmu, smmu);
}
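
The helper above relies on `struct arm_smmu_device` being embedded inside `struct qcom_smmu`, so the wrapper can be recovered from a pointer to the member. A minimal userspace sketch of that pattern follows. The `demo_*` types and the simplified `container_of` macro are illustrative stand-ins, not the kernel definitions (the kernel macro also does type checking).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct demo_smmu_device {
	int num_context_banks;
};

struct demo_qcom_smmu {
	int bypass_quirk;
	struct demo_smmu_device smmu;	/* embedded, deliberately not first */
	unsigned int stall_enabled;
};

/* Recover the wrapper from a pointer to its embedded member. */
static struct demo_qcom_smmu *to_demo_qcom_smmu(struct demo_smmu_device *smmu)
{
	assert(smmu != NULL);
	return container_of(smmu, struct demo_qcom_smmu, smmu);
}
```

Because the subtraction uses `offsetof`, the embedded member does not have to be the first field of the wrapper.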

static void qcom_adreno_smmu_write_sctlr(struct arm_smmu_device *smmu, int idx,
		u32 reg)
{
	struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);

	/*
	 * On the GPU device we want to process subsequent transactions after a
	 * fault to keep the GPU from hanging
	 */
	reg |= ARM_SMMU_SCTLR_HUPCF;

	if (qsmmu->stall_enabled & BIT(idx))
		reg |= ARM_SMMU_SCTLR_CFCFG;

	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
}

static void qcom_adreno_smmu_get_fault_info(const void *cookie,
		struct adreno_smmu_fault_info *info)
{
	struct arm_smmu_domain *smmu_domain = (void *)cookie;
	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
	struct arm_smmu_device *smmu = smmu_domain->smmu;

	info->fsr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSR);
	info->fsynr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR0);
	info->fsynr1 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR1);
	info->far = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_FAR);
	info->cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
	info->ttbr0 = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
	info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_CONTEXTIDR);
}

static void qcom_adreno_smmu_set_stall(const void *cookie, bool enabled)
{
	struct arm_smmu_domain *smmu_domain = (void *)cookie;
	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
	struct qcom_smmu *qsmmu = to_qcom_smmu(smmu_domain->smmu);

	if (enabled)
		qsmmu->stall_enabled |= BIT(cfg->cbndx);
	else
		qsmmu->stall_enabled &= ~BIT(cfg->cbndx);
}
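
The stall setting is kept as one bit per context bank in `stall_enabled`, and the SCTLR write hook consults that bit when composing the register value. The userspace sketch below models only that bookkeeping. The `DEMO_*` bit positions are placeholders rather than the real SCTLR layout, and the `demo_*` functions are hypothetical stand-ins for the driver callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BIT(n)		(UINT32_C(1) << (n))
/* Placeholder bit positions, NOT the real SCTLR field layout. */
#define DEMO_SCTLR_HUPCF	DEMO_BIT(0)
#define DEMO_SCTLR_CFCFG	DEMO_BIT(1)

/* Record whether stall-on-fault is wanted for context bank cbndx. */
static void demo_set_stall(uint32_t *stall_enabled, unsigned int cbndx,
			   bool enabled)
{
	assert(cbndx < 32);
	if (enabled)
		*stall_enabled |= DEMO_BIT(cbndx);
	else
		*stall_enabled &= ~DEMO_BIT(cbndx);
}

/* Compose the SCTLR value for bank idx from the recorded mask. */
static uint32_t demo_sctlr_value(uint32_t stall_enabled, unsigned int idx,
				 uint32_t reg)
{
	assert(idx < 32);
	reg |= DEMO_SCTLR_HUPCF;		/* always set for the GPU SMMU */
	if (stall_enabled & DEMO_BIT(idx))
		reg |= DEMO_SCTLR_CFCFG;	/* stall only where requested */
	return reg;
}
```

Note that in this model, as in the driver code above, changing the mask does not by itself touch hardware. The new setting takes effect the next time the SCTLR value for that bank is written.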

static void qcom_adreno_smmu_resume_translation(const void *cookie, bool terminate)
{
	struct arm_smmu_domain *smmu_domain = (void *)cookie;
	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
	struct arm_smmu_device *smmu = smmu_domain->smmu;
	u32 reg = 0;

	if (terminate)
		reg |= ARM_SMMU_RESUME_TERMINATE;

	arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg);
}

#define QCOM_ADRENO_SMMU_GPU_SID 0

static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
{
	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
	int i;

	/*
	 * The GPU will always use SID 0 so that is a handy way to uniquely
	 * identify it and configure it for per-instance pagetables
	 */
	for (i = 0; i < fwspec->num_ids; i++) {
		u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);

		if (sid == QCOM_ADRENO_SMMU_GPU_SID)
			return true;
	}

	return false;
}
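
The check above extracts the stream ID field from each firmware-provided ID and compares it against SID 0. A minimal userspace sketch of the same scan is shown below. It assumes the stream ID occupies the low 16 bits of each entry (an assumption about `ARM_SMMU_SMR_ID` that this file does not spell out), and the `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: stream ID in the low 16 bits of each ID word. */
#define DEMO_SMR_ID_MASK	UINT32_C(0xffff)
#define DEMO_GPU_SID		0

/* Return true if any entry carries the GPU stream ID. */
static bool demo_is_gpu_device(const uint32_t *ids, size_t num_ids)
{
	assert(ids != NULL || num_ids == 0);
	for (size_t i = 0; i < num_ids; i++) {
		uint16_t sid = (uint16_t)(ids[i] & DEMO_SMR_ID_MASK);

		if (sid == DEMO_GPU_SID)
			return true;
	}
	return false;
}
```

Masking before the comparison matters because the upper bits of an entry may carry other data, so comparing the raw word against 0 could miss a match.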

static const struct io_pgtable_cfg *qcom_adreno_smmu_get_ttbr1_cfg(
		const void *cookie)
{
	struct arm_smmu_domain *smmu_domain = (void *)cookie;
	struct io_pgtable *pgtable =
		io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
	return &pgtable->cfg;
}

/*
 * Local implementation to configure TTBR0 with the specified pagetable config.
 * The GPU driver will call this to enable TTBR0 when per-instance pagetables
 * are active
 */

static int qcom_adreno_smmu_set_ttbr0_cfg(const void *cookie,
		const struct io_pgtable_cfg *pgtbl_cfg)
{
	struct arm_smmu_domain *smmu_domain = (void *)cookie;
	struct io_pgtable *pgtable = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];

	/* The domain must have split pagetables already enabled */
	if (cb->tcr[0] & ARM_SMMU_TCR_EPD1)
		return -EINVAL;

	/* If the pagetable config is NULL, disable TTBR0 */
	if (!pgtbl_cfg) {
		/* Do nothing if it is already disabled */
		if ((cb->tcr[0] & ARM_SMMU_TCR_EPD0))
			return -EINVAL;

		/* Set TCR to the original configuration */
		cb->tcr[0] = arm_smmu_lpae_tcr(&pgtable->cfg);
		cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
	} else {
		u32 tcr = cb->tcr[0];

		/* Don't call this again if TTBR0 is already enabled */
		if (!(cb->tcr[0] & ARM_SMMU_TCR_EPD0))
			return -EINVAL;

		tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
		tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);

		cb->tcr[0] = tcr;
		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
		cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
	}

	arm_smmu_write_context_bank(smmu_domain->smmu, cb->cfg->cbndx);

	return 0;
}

static int qcom_adreno_smmu_alloc_context_bank(struct arm_smmu_domain *smmu_domain,
		struct arm_smmu_device *smmu,
		struct device *dev, int start)
{
	int count;

	/*
	 * Assign context bank 0 to the GPU device so the GPU hardware can
	 * switch pagetables
	 */
	if (qcom_adreno_smmu_is_gpu_device(dev)) {
		start = 0;
		count = 1;
	} else {
		start = 1;
		count = smmu->num_context_banks;
	}

	return __arm_smmu_alloc_bitmap(smmu->context_map, start, count);
}
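
The allocation policy above reserves context bank 0 for the GPU and hands every other client a bank from 1 upward. The userspace sketch below illustrates that policy under one stated assumption: that the bitmap helper claims the first free bit in the half-open range [start, end), which is how `count` is used here. `demo_alloc_bit` is a hypothetical stand-in, not `__arm_smmu_alloc_bitmap` itself, and it returns -1 where the kernel helper would return an errno.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: claim the first clear bit in [start, end), -1 if none. */
static int demo_alloc_bit(uint32_t *map, int start, int end)
{
	assert(start >= 0 && end <= 32);
	for (int idx = start; idx < end; idx++) {
		if (!(*map & (UINT32_C(1) << idx))) {
			*map |= UINT32_C(1) << idx;
			return idx;
		}
	}
	return -1;
}

/* Bank 0 is reserved for the GPU; everyone else starts at bank 1. */
static int demo_alloc_context_bank(uint32_t *map, int num_banks, bool is_gpu)
{
	if (is_gpu)
		return demo_alloc_bit(map, 0, 1);
	return demo_alloc_bit(map, 1, num_banks);
}
```

With four banks, a non-GPU client in this model never receives bank 0 even when it is free, and a second GPU request fails once bank 0 is taken.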

static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu)
{
	const struct device_node *np = smmu->dev->of_node;

	if (of_device_is_compatible(np, "qcom,msm8996-smmu-v2"))
		return false;

	return true;
}

static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
		struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
{
	struct adreno_smmu_priv *priv;

	smmu_domain->cfg.flush_walk_prefer_tlbiasid = true;

	/* Only enable split pagetables for the GPU device (SID 0) */
	if (!qcom_adreno_smmu_is_gpu_device(dev))
		return 0;

	/*
	 * All targets that use the qcom,adreno-smmu compatible string *should*
	 * be AARCH64 stage 1 but double check because the arm-smmu code assumes
	 * that is the case when the TTBR1 quirk is enabled
	 */
	if (qcom_adreno_can_do_ttbr1(smmu_domain->smmu) &&
	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) &&
	    (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
		pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;

	/*
	 * Initialize private interface with GPU:
	 */

	priv = dev_get_drvdata(dev);
	priv->cookie = smmu_domain;
	priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
	priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
	priv->get_fault_info = qcom_adreno_smmu_get_fault_info;
	priv->set_stall = qcom_adreno_smmu_set_stall;
	priv->resume_translation = qcom_adreno_smmu_resume_translation;

	return 0;
}

static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
	{ .compatible = "qcom,adreno" },
	{ .compatible = "qcom,mdp4" },
	{ .compatible = "qcom,mdss" },
	{ .compatible = "qcom,sc7180-mdss" },
	{ .compatible = "qcom,sc7180-mss-pil" },
	{ .compatible = "qcom,sc7280-mdss" },
	{ .compatible = "qcom,sc7280-mss-pil" },
	{ .compatible = "qcom,sc8180x-mdss" },
	{ .compatible = "qcom,sdm845-mdss" },
	{ .compatible = "qcom,sdm845-mss-pil" },
	{ }
};

static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
		struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
{
	smmu_domain->cfg.flush_walk_prefer_tlbiasid = true;

	return 0;
}

static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
{
	unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
	struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
	u32 reg;
	u32 smr;
	int i;

	/*
	 * With some firmware versions writes to S2CR of type FAULT are
	 * ignored, and writing BYPASS will end up written as FAULT in the
	 * register. Perform a write to S2CR to detect if this is the case and
	 * if so reserve a context bank to emulate bypass streams.
	 */
	reg = FIELD_PREP(ARM_SMMU_S2CR_TYPE, S2CR_TYPE_BYPASS) |
	      FIELD_PREP(ARM_SMMU_S2CR_CBNDX, 0xff) |
	      FIELD_PREP(ARM_SMMU_S2CR_PRIVCFG, S2CR_PRIVCFG_DEFAULT);
	arm_smmu_gr0_write(smmu, last_s2cr, reg);
	reg = arm_smmu_gr0_read(smmu, last_s2cr);
	if (FIELD_GET(ARM_SMMU_S2CR_TYPE, reg) != S2CR_TYPE_BYPASS) {
		qsmmu->bypass_quirk = true;
		qsmmu->bypass_cbndx = smmu->num_context_banks - 1;

		set_bit(qsmmu->bypass_cbndx, smmu->context_map);

		arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0);

		reg = FIELD_PREP(ARM_SMMU_CBAR_TYPE, CBAR_TYPE_S1_TRANS_S2_BYPASS);
		arm_smmu_gr1_write(smmu, ARM_SMMU_GR1_CBAR(qsmmu->bypass_cbndx), reg);
	}

	for (i = 0; i < smmu->num_mapping_groups; i++) {
		smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));

		if (FIELD_GET(ARM_SMMU_SMR_VALID, smr)) {
			/* Ignore valid bit for SMR mask extraction. */
			smr &= ~ARM_SMMU_SMR_VALID;
			smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
			smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
			smmu->smrs[i].valid = true;

			smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
			smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
			smmu->s2crs[i].cbndx = 0xff;
		}
	}

	return 0;
}

static void qcom_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx)
{
	struct arm_smmu_s2cr *s2cr = smmu->s2crs + idx;
	struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
	u32 cbndx = s2cr->cbndx;
	u32 type = s2cr->type;
	u32 reg;

	if (qsmmu->bypass_quirk) {
		if (type == S2CR_TYPE_BYPASS) {
			/*
			 * Firmware with quirky S2CR handling will substitute
			 * BYPASS writes with FAULT, so point the stream to the
			 * reserved context bank and ask for translation on the
			 * stream
			 */
			type = S2CR_TYPE_TRANS;
			cbndx = qsmmu->bypass_cbndx;
		} else if (type == S2CR_TYPE_FAULT) {
			/*
			 * Firmware with quirky S2CR handling will ignore FAULT
			 * writes, so trick it to write FAULT by asking for a
			 * BYPASS.
			 */
			type = S2CR_TYPE_BYPASS;
			cbndx = 0xff;
		}
	}

	reg = FIELD_PREP(ARM_SMMU_S2CR_TYPE, type) |
	      FIELD_PREP(ARM_SMMU_S2CR_CBNDX, cbndx) |
	      FIELD_PREP(ARM_SMMU_S2CR_PRIVCFG, s2cr->privcfg);
	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_S2CR(idx), reg);
}
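
The quirk handling above amounts to a small remapping of the requested stream type before the register is written. The userspace sketch below isolates that remapping so it can be read on its own. The `demo_*` enum and struct are hypothetical, and the enum values are illustrative rather than the hardware encodings.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stream types; the values are not the hardware encodings. */
enum demo_s2cr_type { DEMO_TRANS, DEMO_BYPASS, DEMO_FAULT };

struct demo_s2cr {
	enum demo_s2cr_type type;
	unsigned int cbndx;
};

/*
 * Translate the wanted S2CR setting into what must actually be written
 * when the firmware rewrites BYPASS as FAULT and ignores FAULT writes.
 */
static struct demo_s2cr demo_fixup_s2cr(struct demo_s2cr want,
					bool bypass_quirk,
					unsigned int bypass_cbndx)
{
	assert(bypass_cbndx <= 0xff);
	if (!bypass_quirk)
		return want;

	if (want.type == DEMO_BYPASS) {
		/* Emulate bypass through the reserved context bank. */
		want.type = DEMO_TRANS;
		want.cbndx = bypass_cbndx;
	} else if (want.type == DEMO_FAULT) {
		/* Writing BYPASS lands as FAULT on such firmware. */
		want.type = DEMO_BYPASS;
		want.cbndx = 0xff;
	}
	return want;
}
```

Translation requests pass through unchanged in both modes, so only the BYPASS and FAULT cases depend on the quirk flag.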

static int qcom_smmu_def_domain_type(struct device *dev)
{
	const struct of_device_id *match =
		of_match_device(qcom_smmu_client_of_match, dev);

	return match ? IOMMU_DOMAIN_IDENTITY : 0;
}

static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu)
{
	int ret;

	/*
	 * To address performance degradation in non-real time clients,
	 * such as USB and UFS, turn off wait-for-safe on sdm845 based boards,
	 * such as MTP and db845, whose firmwares implement secure monitor
	 * call handlers to turn on/off the wait-for-safe logic.
	 */
	ret = qcom_scm_qsmmu500_wait_safe_toggle(0);
	if (ret)
		dev_warn(smmu->dev, "Failed to turn off SAFE logic\n");

	return ret;
}

static int qcom_smmu500_reset(struct arm_smmu_device *smmu)
{
	const struct device_node *np = smmu->dev->of_node;

	arm_mmu500_reset(smmu);

	if (of_device_is_compatible(np, "qcom,sdm845-smmu-500"))
		return qcom_sdm845_smmu500_reset(smmu);

	return 0;
}

static const struct arm_smmu_impl qcom_smmu_impl = {
	.init_context = qcom_smmu_init_context,
	.cfg_probe = qcom_smmu_cfg_probe,
	.def_domain_type = qcom_smmu_def_domain_type,
	.reset = qcom_smmu500_reset,
	.write_s2cr = qcom_smmu_write_s2cr,
|
|
|
};

static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
	.init_context = qcom_adreno_smmu_init_context,
	.def_domain_type = qcom_smmu_def_domain_type,
	.reset = qcom_smmu500_reset,
	.alloc_context_bank = qcom_adreno_smmu_alloc_context_bank,
	.write_sctlr = qcom_adreno_smmu_write_sctlr,
};

static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
						const struct arm_smmu_impl *impl)
{
	struct qcom_smmu *qsmmu;

	/* Check to make sure qcom_scm has finished probing */
	if (!qcom_scm_is_available())
		return ERR_PTR(-EPROBE_DEFER);

	qsmmu = devm_krealloc(smmu->dev, smmu, sizeof(*qsmmu), GFP_KERNEL);
	if (!qsmmu)
		return ERR_PTR(-ENOMEM);

	qsmmu->smmu.impl = impl;

	return &qsmmu->smmu;
}
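The allocation pattern in qcom_smmu_create() can be illustrated outside the kernel. The following is a minimal user-space sketch, not the driver's code: it assumes the wrapper structure embeds the base structure as its first member (the definition of struct qcom_smmu is not shown here), and it swaps devm_krealloc() for plain realloc(). The names base_dev, wrapped_dev, wrap_dev and to_wrapped are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative only. */
struct base_dev {
	const char *impl_name;
	int id;
};

struct wrapped_dev {
	struct base_dev base;	/* assumed to be the first member */
	int extra_state;	/* implementation-private data */
};

/*
 * Grow an existing base_dev allocation into a wrapped_dev, keeping the
 * base contents, and hand back a pointer to the embedded base.
 * Returns NULL on allocation failure; the original stays valid in that case.
 */
static struct base_dev *wrap_dev(struct base_dev *dev, const char *impl_name)
{
	struct wrapped_dev *w = realloc(dev, sizeof(*w));

	if (!w)
		return NULL;

	/* realloc() leaves the newly added tail uninitialised. */
	w->extra_state = 0;
	w->base.impl_name = impl_name;
	return &w->base;
}

/* Recover the wrapper from the embedded base (container_of in the kernel). */
static struct wrapped_dev *to_wrapped(struct base_dev *dev)
{
	return (struct wrapped_dev *)((char *)dev -
				      offsetof(struct wrapped_dev, base));
}
```

Callers keep using the returned base pointer as before, while implementation hooks can reach the private fields through to_wrapped(). Any pointer to the old allocation must be treated as stale after the call, since the reallocation may move the object.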

static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
	{ .compatible = "qcom,msm8998-smmu-v2" },
	{ .compatible = "qcom,qcm2290-smmu-500" },
	{ .compatible = "qcom,sc7180-smmu-500" },
	{ .compatible = "qcom,sc7280-smmu-500" },
	{ .compatible = "qcom,sc8180x-smmu-500" },
	{ .compatible = "qcom,sdm630-smmu-v2" },
	{ .compatible = "qcom,sdm845-smmu-500" },
	{ .compatible = "qcom,sm6125-smmu-500" },
	{ .compatible = "qcom,sm6350-smmu-500" },
	{ .compatible = "qcom,sm8150-smmu-500" },
	{ .compatible = "qcom,sm8250-smmu-500" },
	{ .compatible = "qcom,sm8350-smmu-500" },
	{ .compatible = "qcom,sm8450-smmu-500" },
	{ }
};

#ifdef CONFIG_ACPI
static struct acpi_platform_list qcom_acpi_platlist[] = {
	{ "LENOVO", "CB-01 ", 0x8180, ACPI_SIG_IORT, equal, "QCOM SMMU" },
	{ "QCOM ", "QCOMEDK2", 0x8180, ACPI_SIG_IORT, equal, "QCOM SMMU" },
	{ }
};
#endif

struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
{
	const struct device_node *np = smmu->dev->of_node;

#ifdef CONFIG_ACPI
	if (np == NULL) {
		/* Match platform for ACPI boot */
		if (acpi_match_platform_list(qcom_acpi_platlist) >= 0)
			return qcom_smmu_create(smmu, &qcom_smmu_impl);
	}
#endif

	/*
	 * Do not change this order of implementation, i.e., first adreno
	 * smmu impl and then apss smmu since we can have both implementing
	 * arm,mmu-500 in which case we will miss setting adreno smmu specific
	 * features if the order is changed.
	 */
	if (of_device_is_compatible(np, "qcom,adreno-smmu"))
		return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);

	if (of_match_node(qcom_smmu_impl_of_match, np))
		return qcom_smmu_create(smmu, &qcom_smmu_impl);

	return smmu;
}
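The ordering requirement described in the comment inside qcom_smmu_impl_init() can be shown with a small user-space sketch. It is only an illustration under stated assumptions: the compatible lists in the usage below are made up for the example, has_compatible() and pick_impl() are hypothetical helpers, and simple string comparison stands in for of_device_is_compatible() and of_match_node().

```c
#include <stddef.h>
#include <string.h>

enum impl_kind { IMPL_GENERIC, IMPL_QCOM, IMPL_ADRENO };

/* Hypothetical helper: does the NULL-terminated list contain `want`? */
static int has_compatible(const char *const *compat, const char *want)
{
	for (; *compat; compat++)
		if (strcmp(*compat, want) == 0)
			return 1;
	return 0;
}

/* Pick an implementation, checking the most specific match first. */
static enum impl_kind pick_impl(const char *const *compat)
{
	/* Shortened stand-in for the SoC match table. */
	static const char *const qcom_table[] = {
		"qcom,sdm845-smmu-500",
		"qcom,sc7180-smmu-500",
		NULL,
	};
	size_t i;

	/* A node may also match the table below, so this test comes first. */
	if (has_compatible(compat, "qcom,adreno-smmu"))
		return IMPL_ADRENO;

	for (i = 0; qcom_table[i]; i++)
		if (has_compatible(compat, qcom_table[i]))
			return IMPL_QCOM;

	return IMPL_GENERIC;
}
```

With a list such as { "qcom,adreno-smmu", "qcom,sc7180-smmu-500", NULL }, pick_impl() returns IMPL_ADRENO. If the two checks were swapped, the same list would fall into the table match and return IMPL_QCOM, which is the failure mode the comment warns about.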