Add debug statistics collection support. The statistics is available
via debugfs in '/sys/kernel/debug/mc/stats', it shows percent of memory
controller utilization for each memory client. This information is
intended to help with debugging of memory performance issues, it already
was proven to be useful by helping to improve memory bandwidth management
of the display driver.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20210319130933.23261-1-digetx@gmail.com
Check whether memory client reset is already asserted in order to prevent
DMA-flush error on trying to re-assert an already asserted reset.
This becomes a problem once PMC GENPD is enabled to use memory resets
since GENPD will get a error and fail to toggle power domain. PMC GENPDs
can't be toggled safely without holding memory reset on Tegra and we're
about to fix this.
Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30
Tested-by: Nicolas Chauvet <kwizart@gmail.com> # PAZ00 T20
Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210119235210.13006-1-digetx@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Add modularization support to the Tegra30 EMC driver, which now can be
compiled as a loadable kernel module.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20201111011456.7875-8-digetx@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Add common SoC-agnostic ICC framework which turns Tegra Memory Controller
into a memory interconnection provider. This allows us to use interconnect
API for tuning of memory configurations.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Link: https://lore.kernel.org/r/20201104164923.21238-33-digetx@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
The platform_get_irq() prints error message telling that interrupt is
missing, hence there is no need to duplicated that message in the drivers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20201104164923.21238-31-digetx@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Multiple Tegra drivers need to retrieve Memory Controller and there is
duplication of the retrieval code among the drivers.
Add new devm_tegra_memory_controller_get() helper to remove the code's
duplication and to fix put_device() which was missed in the duplicated
code. Make EMC drivers to use the new helper.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20201104164923.21238-29-digetx@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
The Memory Controller registers definition is sparse and duplicated,
let's consolidate everything into a common place for consistency.
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Timing control debug features should be disabled at a boot time, but you
never now and hence it's better to disable them explicitly because some of
those features are crucial for the driver to do a proper thing.
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Introduce driver for the External Memory Controller (EMC) found on Tegra30
chips, it controls the external DRAM on the board. The purpose of this
driver is to program memory timing for external memory on the EMC clock
rate change.
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The memory controller on Tegra124 and later supports 34 or more address
bits. Advertise that by setting the DMA mask based on the number of the
address bits.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need for a memory barriers on reading/writing of register
values as we only care about the read/write order, hence let's use the
common helpers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Multiplying the Memory Controller clock rate by the tick count results
in an integer overflow and in result the truncated tick value is being
programmed into hardware, such that the GR3D memory client performance is
reduced by two times.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Some of Memory Controller registers are shadowed and require latching in
order to copy assembly state into the active, MC_EMEM_ARB_CFG is one of
these registers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Rename all occurrences of "terga" to "tegra". It's an easy typo to make
and a difficult one to spot.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Make all messages to start with a lower case and don't unnecessarily go
over 80 chars in the code.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Memory Controller driver never shared IRQ with any other driver and very
unlikely that it will. Hence there is no need to request IRQ sharing and
the corresponding flag can be dropped safely.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tegra20 doesn't have SMMU. Move out checking of the SMMU presence from
the SMMU driver into the Memory Controller driver. This change makes code
consistent in regards to how GART/SMMU presence checking is performed.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The device-tree binding has been changed. There is no separate GART device
anymore, it is squashed into the Memory Controller. Integrate GART module
with the MC in a way it is done for the SMMU on Tegra30+.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is no need to match device with the DT node since it was already
matched, use of_device_get_match_data() helper to get the match-data.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With the device tree binding changes, now Memory Controller has access to
GART registers. Hence it is now possible to read client ID on GART page
fault to get information about what memory client causes the fault.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The tegra20-mc device-tree binding has been changed, GART has been
squashed into Memory Controller and now the clock property is mandatory
for Tegra20, the DT compatible has been changed as well. Adapt driver to
the DT changes.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In preparation to remove the node name pointer from struct device_node,
convert printf users to use the %pOFn format specifier.
Cc: Roger Quadros <rogerq@ti.com>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: linux-omap@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-tegra@vger.kernel.org
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
The Reset Controller should be registered in the end of probe, otherwise
Memory Controller device goes away if IRQ requesting fails and the Reset
Controller stays registered. To avoid having to unwind the MC probing in
a case of SMMU probe failure, let's simply print the error message without
failing the MC probe. This allows us to just move the Reset Controller
registering before the SMMU registration, reducing code churning. Also
let's not fail MC probe in a case of Reset Controller registration failure
as it doesn't prevent the MC driver to work.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Memory Controller driver invokes SMMU driver registration and MC's
registers mapping is shared with SMMU. This mapping goes away if MC
driver probing fails after SMMU registration.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
In order to reset busy HW properly, memory controller needs to be
involved, otherwise it is possible to get corrupted memory or hang machine
if HW was reset during DMA. Introduce memory client 'hot reset' that will
be used for resetting of busy HW.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tegra30+ has some minor differences in registers / bits layout compared
to Tegra20. Let's squash Tegra20 driver into the common tegra-mc driver
in a preparation for the upcoming MC hot reset controls implementation,
avoiding code duplication.
Note that this currently doesn't report the value of MC_GART_ERROR_REQ
because it is located within the GART register area and cannot be safely
accessed from the MC driver (this happens to work only by accident). The
proper solution is to integrate the GART driver with the MC driver, much
like is done for the Tegra SMMU, but that is an invasive change and will
be part of a separate patch series.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Currently we are enabling handling of interrupts specific to Tegra124+
which happen to overlap with previous generations. Let's specify
interrupts mask per SoC generation for consistency and in a preparation
of squashing of Tegra20 driver into the common one that will enable
handling of GART faults which may be undesirable by newer generations.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The ISR reads interrupts-enable mask, but doesn't utilize it. Apply the
mask to the interrupt status and don't handle interrupts that MC driver
haven't asked for. Kernel would disable spurious MC IRQ and report the
error. This would happen only in a case of a very severe bug.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
for_each_child_of_node() performs an of_node_get() on each iteration, so
to break out of the loop an of_node_put() is required.
Found using Coccinelle. The semantic patch used for this is as follows:
// <smpl>
@@
expression e;
local idexpression n;
@@
for_each_child_of_node(..., n) {
... when != of_node_put(n)
when != e = n
(
return n;
|
+ of_node_put(n);
? return ...;
)
...
}
// </smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
for_each_child_of_node() performs an of_node_put() on each iteration, so
putting an of_node_put() before a continue results in a double put.
The semantic match that finds this problem is as follows
(http://coccinelle.lip6.fr):
// <smpl>
@@
expression root,e;
local idexpression child;
iterator name for_each_child_of_node;
@@
for_each_child_of_node(root, child) {
... when != of_node_get(child)
* of_node_put(child);
...
* continue;
}
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Recent versions of the Tegra MC hardware extend the size of the client
ID bitfield in the MC_ERR_STATUS register by one bit. While one could
simply extend the bitfield for older hardware, that would allow data
from reserved bits into the driver code, which is generally a bad idea
on principle. So this patch instead passes in the client ID mask from
from the per-SoC MC data.
There's no MC support for T210 (yet), but when that support winds up
in the kernel, the appropriate soc->client_id_mask value for that chip
will be 0xff.
Based on an original patch by David Ung <davidu@nvidia.com>.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Paul Walmsley <pwalmsley@nvidia.com>
Cc: Thierry Reding <treding@nvidia.com>
Cc: David Ung <davidu@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The EMC driver needs to know the number of external memory devices and
also needs to update the EMEM configuration based on the new rate of the
memory bus.
To know how to update the EMEM config, looks up the values of the burst
regs in the DT, for a given timing.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
As this interrupt is just for development purposes, as the TRM says, and
the sheer amount of interrupts fired can seriously disrupt userspace
when testing the lower frequencies supported by the EMC.
From the TRM:
"There is one performance warning type interrupt: ARBITRATION_EMEM. It
fires when the MC detects that a request has been pending in the Row
Sorter long enough to hit the DEADLOCK_PREVENTION_SLACK_THRESHOLD. In
addition to true performance problems, this interrupt may fire in
situations such as clock-change where the EMC backpressures pending
traffic for long periods of time. This interrupt helps developers
identify and debug performance issues and configuration issues."
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The memory controller on Tegra132 is very similar to the one found on
Tegra124. But the Denver CPUs don't have an outer cache, so dcache
maintenance is done slightly differently.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The memory controller on NVIDIA Tegra exposes various knobs that can be
used to tune the behaviour of the clients attached to it.
Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.
This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller. It is supported on Tegra30, Tegra114 and Tegra124
currently. Tegra20 has a GART instead.
The Tegra SMMU operates on memory clients and SWGROUPs. A memory client
is a unidirectional, special-purpose DMA master. A SWGROUP represents a
set of memory clients that form a logical functional unit corresponding
to a single device. Typically a device has two clients: one client for
read transactions and one client for write transactions, but there are
also devices that have only read clients, but many of them (such as the
display controllers).
Because there is no 1:1 relationship between memory clients and devices
the driver keeps a table of memory clients and the SWGROUPs that they
belong to per SoC. Note that this is an exception and due to the fact
that the SMMU is tightly integrated with the rest of the Tegra SoC. The
use of these tables is discouraged in drivers for generic IOMMU devices
such as the ARM SMMU because the same IOMMU could be used in any number
of SoCs and keeping such tables for each SoC would not scale.
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>