The AUX bounce buffer is allocated with API dma_alloc_coherent(), in the
low level's architecture code, e.g. for Arm64, it maps the memory with
the attribution "Normal non-cacheable"; this can be concluded from the
definition for pgprot_dmacoherent() in arch/arm64/include/asm/pgtable.h.
Later when access the AUX bounce buffer, since the memory mapping is
non-cacheable, it's low efficiency due to every load instruction must
reach out DRAM.
This patch changes to allocate pages with dma_alloc_noncoherent(), the
driver can access the memory via cacheable mapping; therefore, load
instructions can fetch data from cache lines rather than always read
data from DRAM, the driver can boost memory performance. After using
the cacheable mapping, the driver uses dma_sync_single_for_cpu() to
invalidate cacheline prior to read bounce buffer so can avoid read stale
trace data.
By measurement the duration for function tmc_update_etr_buffer() with
ftrace function_graph tracer, it shows the performance significant
improvement for copying 4MiB data from bounce buffer:
# echo tmc_etr_get_data_flat_buf > set_graph_notrace // avoid noise
# echo tmc_update_etr_buffer > set_graph_function
# echo function_graph > current_tracer
before:
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) | tmc_update_etr_buffer() {
...
2) # 8148.320 us | }
after:
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) | tmc_update_etr_buffer() {
...
2) # 2525.420 us | }
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210905032144.966766-1-leo.yan@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Commit 2f01c200d4 ("perf cs-etm: Remove callback cs_etm_find_snapshot()")
has removed the function cs_etm_find_snapshot() from the perf tool in the
user space, now CoreSight trace directly uses the perf common function
__auxtrace_mmap__read() to calcualte the head and size for AUX trace data
in snapshot mode.
This patch updates the comments in drivers to make them generic and not
stick to any specific function from perf tool.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20210912125748.2816606-3-leo.yan@linaro.org
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When enable the Arm CoreSight PMU event, the context for AUX ring buffer
is prepared in the structure perf_output_handle, and its field "head"
points the head of the AUX ring buffer and it is updated after filling
AUX trace data into buffer.
Current code uses an extra field etr_perf_buffer::head to maintain the
header for the AUX ring buffer which is not necessary; alternatively,
it's better to directly use perf_output_handle::head.
This patch removes the field etr_perf_buffer::head and directly uses
perf_output_handle::head for the head of AUX ring buffer.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210912125748.2816606-2-leo.yan@linaro.org
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Since a memory barrier is required between AUX trace data store and
aux_head store, and the AUX trace data is filled with memcpy(), it's
sufficient to use smp_wmb() so can ensure the trace data is visible
prior to updating aux_head.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210809111407.596077-3-leo.yan@linaro.org
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The current driver sets the write burst size initiated by TMC-ETR on
AXI bus to a fixed value of 16. Make this configurable by reading the
value specified in fwnode. If not specified, then default to 16.
Introduced a "max_burst_size" variable in tmc_drvdata structure to
facilitate this change.
Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20210901131049.1365367-3-tanmay@marvell.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When the ETR is used in perf mode with a larger buffer (configured
via sysfs or the default size of 1M) than the perf aux buffer size,
we end up inserting the barrier packet at the wrong offset, while
moving the offset forward. i.e, instead of the "new moved offset",
we insert it at the current hardware buffer offset. These packets
will not be visible as they are never copied and could lead to
corruption in the trace decoding side, as the decoder is not aware
that it needs to reset the decoding.
Fixes: ec13c78d7b ("coresight: tmc-etr: Add barrier packets when moving offset forward")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org
Reported-by: Al Grant <al.grant@arm.com>
Tested-by: Mike Leach <mike.leach@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20201208182651.1597945-2-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
alloc_pages_node() return should be checked before calling
dma_map_page() to make sure that valid page is mapped or
else it can lead to aborts as below:
Unable to handle kernel paging request at virtual address ffffffc008000000
Mem abort info:
<snip>...
pc : __dma_inv_area+0x40/0x58
lr : dma_direct_map_page+0xd8/0x1c8
Call trace:
__dma_inv_area
tmc_pages_alloc
tmc_alloc_data_pages
tmc_alloc_sg_table
tmc_init_etr_sg_table
tmc_alloc_etr_buf
tmc_enable_etr_sink_sysfs
tmc_enable_etr_sink
coresight_enable_path
coresight_enable
enable_source_store
dev_attr_store
sysfs_kf_write
Fixes: 99443ea19e ("coresight: Add generic TMC sg table framework")
Cc: stable@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mao Jinlong <jinlmao@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20201127175256.1092685-13-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make etr_catu_buf_ops static. Instead of directly accessing it in
etr_buf_ops[], add a function to let catu driver register the ops at
runtime. Break circular dependency between tmc-etr and catu drivers.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200928163513.70169-23-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implement a shutdown callback to ensure ETR hardware is
properly shutdown in reboot/shutdown path. This is required
for ETR which has SMMU address translation enabled like on
SC7180 SoC and few others. If the hardware is still accessing
memory after SMMU translation is disabled as part of SMMU
shutdown callback in system reboot or shutdown path, then
IOVAs(I/O virtual address) which it was using will go on the
bus as the physical addresses which might result in unknown
crashes (NoC/interconnect errors). So we make sure from this
shutdown callback that the ETR is shutdown before SMMU translation
is disabled and device_link in SMMU driver will take care of
ordering of shutdown callbacks such that SMMU shutdown callback
is not called before any of its consumer shutdown callbacks.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200716175746.3338735-13-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds barrier packets in the trace stream when the offset in the
data buffer needs to be moved forward. Otherwise the decoder isn't aware
of the break in the stream and can't synchronise itself with the trace
data.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Yabin Cui <yabinc@google.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20190829202842.580-18-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If less space is available in the perf ring buffer than the ETR buffer,
barrier packets inserted in the trace stream by tmc_sync_etr_buf() are
skipped over when the head of the buffer is moved forward, resulting in
traces that can't be decoded.
This patch decouples the process of syncing ETR buffers and the addition
of barrier packets in order to perform the latter once the offset in the
trace buffer has been properly computed.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20190829202842.580-17-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When tracing etm data of multiple threads on multiple cpus through
perf interface, each cpu has a unique etr_perf_buffer while sharing
the same etr device. There is no guarantee that the last cpu starts
etm tracing also stops last. This makes perf_data check fail.
Fix it by checking etr_buf instead of etr_perf_buffer.
Also move the code setting and clearing perf_buf to more suitable
places.
Fixes: 3147da92a8 ("coresight: tmc-etr: Allocate and free ETR memory buffers for CPU-wide scenarios")
Signed-off-by: Yabin Cui <yabinc@google.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190829202842.580-15-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TMC etr always copies all available data to perf aux buffer, which
may exceed the available space in perf aux buffer. It isn't suitable
for not-snapshot mode, because:
1) It may overwrite previously written data.
2) It may make the perf_event_mmap_page->aux_head report having more
or less data than the reality.
So change to only copy the latest data fitting the available space in
perf aux buffer.
Signed-off-by: Yabin Cui <yabinc@google.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190829202842.580-14-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have so far ignored the memory errors, assuming that we have perfect
hardware and driver. Let us handle the memory errors reported by the
TMC ETR in status and truncate the buffer.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[Removed ASCII smiley face from changelog]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190829202842.580-6-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We now use refcounts for the etr_buf users. Let us initialize it
while we allocate it.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190829202842.580-5-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on the following report from Smatch, fix the potential
NULL pointer dereference check.
The patch 743256e214e8: "coresight: tmc: Clean up device specific
data" from May 22, 2019, leads to the following Smatch complaint:
drivers/hwtracing/coresight/coresight-tmc-etr.c:625 tmc_etr_free_flat_buf()
warn: variable dereferenced before check 'flat_buf' (see line 623)
drivers/hwtracing/coresight/coresight-tmc-etr.c
622 struct etr_flat_buf *flat_buf = etr_buf->private;
623 struct device *real_dev = flat_buf->dev->parent;
^^^^^^^^^^
The patch introduces a new NULL check
624
625 if (flat_buf && flat_buf->daddr)
^^^^^^^^
but the existing code assumed it can be NULL.
626 dma_free_coherent(real_dev, flat_buf->size,
627 flat_buf->vaddr, flat_buf->daddr);
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190621175205.24551-3-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During a perf session we try to allocate buffers on the "node" associated
with the CPU the event is bound to. If it is not bound to a CPU, we
use the current CPU node, using smp_processor_id(). However this is unsafe
in a pre-emptible context and could generate the splats as below :
BUG: using smp_processor_id() in preemptible [00000000] code: perf/1743
caller is tmc_alloc_etr_buffer+0x1bc/0x1f0
CPU: 1 PID: 1743 Comm: perf Not tainted 5.1.0-rc6-147786-g116841e #344
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Feb 1 2019
Call trace:
dump_backtrace+0x0/0x150
show_stack+0x14/0x20
dump_stack+0x9c/0xc4
debug_smp_processor_id+0x10c/0x110
tmc_alloc_etr_buffer+0x1bc/0x1f0
etm_setup_aux+0x1c4/0x230
rb_alloc_aux+0x1b8/0x2b8
perf_mmap+0x35c/0x478
mmap_region+0x34c/0x4f0
do_mmap+0x2d8/0x418
vm_mmap_pgoff+0xd0/0xf8
ksys_mmap_pgoff+0x88/0xf8
__arm64_sys_mmap+0x28/0x38
el0_svc_handler+0xd8/0x138
el0_svc+0x8/0xc
Use NUMA_NO_NODE hint instead of using the current node for events
not bound to CPUs.
Fixes: 22f429f19c ("coresight: etm-perf: Add support for ETR backend")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: stable <stable@vger.kernel.org> # 4.20+
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190620221237.3536-3-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During a perf session we try to allocate buffers on the "node" associated
with the CPU the event is bound to. If it's not bound to a CPU, we use
the current CPU node, using smp_processor_id(). However this is unsafe
in a pre-emptible context and could generate the splats as below :
BUG: using smp_processor_id() in preemptible [00000000] code: perf/1743
caller is alloc_etr_buf.isra.6+0x80/0xa0
CPU: 1 PID: 1743 Comm: perf Not tainted 5.1.0-rc6-147786-g116841e #344
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Feb 1 2019
Call trace:
dump_backtrace+0x0/0x150
show_stack+0x14/0x20
dump_stack+0x9c/0xc4
debug_smp_processor_id+0x10c/0x110
alloc_etr_buf.isra.6+0x80/0xa0
tmc_alloc_etr_buffer+0x12c/0x1f0
etm_setup_aux+0x1c4/0x230
rb_alloc_aux+0x1b8/0x2b8
perf_mmap+0x35c/0x478
mmap_region+0x34c/0x4f0
do_mmap+0x2d8/0x418
vm_mmap_pgoff+0xd0/0xf8
ksys_mmap_pgoff+0x88/0xf8
__arm64_sys_mmap+0x28/0x38
el0_svc_handler+0xd8/0x138
el0_svc+0x8/0xc
Use NUMA_NO_NODE hint instead of using the current node for events
not bound to CPUs.
Fixes: 855ab61c16 ("coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf()")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190620221237.3536-2-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The platform specific information describes the connections and
the ports of a given coresigh device. This information is also
recorded in the coresight device as separate fields. Let us reuse
the original platform description to streamline the handling
of the data.
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In preparation to use a consistent device naming scheme,
clean up the device link tracking in replicator driver.
Use the "coresight" device instead of the "real" parent device
for all internal purposes. All other requests (e.g, power management,
DMA operations) must use the "real" device which is the parent device.
Since the CATU driver also uses the TMC-SG infrastructure, update
the callers to ensure they pass the appropriate device argument
for the tables.
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch avoids setting the truncated flag when operating in snapshot
mode since the trace buffer is expected to be truncated and discontinuous
from one snapshot to another. Moreover when the truncated flag is set
the perf core stops enabling the event, waiting for user space to consume
the data. In snapshot mode this is clearly not what we want since it
results in stale data.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Unify amongst sink drivers how the AUX ring buffer head is communicated
to user space. That way the same algorithm in user space can be used to
determine where the latest data is and how much of it to access.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds support for CPU-wide trace scenarios by making sure that
only the sources monitoring the same process have access to a common sink.
Because the sink is shared between sources, the first source to use the
sink switches it on while the last one does the cleanup. Any attempt to
modify the HW is overlooked for as long as more than one source is using
a sink.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch uses the PID of the process being traced to allocate and free
ETR memory buffers for CPU-wide scenarios. The implementation is tailored
to handle both N:1 and 1:1 source/sink HW topologies.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds reference counting to struct etr_buf so that, in CPU-wide
trace scenarios, shared buffers can be disposed of when no longer used.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In preparation to support CPU-wide trace scenarios, introduce the notion
of process ID to ETR devices. That way events monitoring the same process
can use the same etr_buf, allowing multiple CPUs to use the same sink.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Buffer allocation is different when dealing with per-thread and
CPU-wide sessions. In preparation to support CPU-wide trace scenarios
simplify things by keeping allocation functions for both type separate.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refactoring function tmc_etr_setup_perf_buf() so that it only deals
with the high level etr_perf_buffer, leaving the allocation of the
backend buffer (i.e etr_buf) to another function.
That way the backend buffer allocation function can decide if it wants
to reuse an existing buffer (CPU-wide trace scenarios) or simply create
a new one.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make struct perf_event available to sink buffer allocation functions in
order to use the pid they carry to allocate and free buffer memory along
with regimenting access to what source a sink can collect data for.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When disabling a sink the reference counter ensures the operation goes
through if nobody else is using it. As such if drvdata::mode is already
set do CS_MODE_DISABLED, it is an error and should be reported as such.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When operating in CPU-wide mode with an N:1 source/sink HW topology,
multiple CPUs can access a sink concurrently. As such reference counting
needs to happen when the device's spinlock is held to avoid racing with
other operations (start(), update(), stop()), such as:
session A Session B
----- -------
enable_sink
atomic_inc(refcount) = 1
...
atomic_dec(refcount) = 0 enable_sink
if (refcount == 0) disable_sink
atomic_inc()
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In preparation to handle device reference counting inside of the sink
drivers, add a return code to the sink::disable() operation so that
proper action can be taken if a sink has not been disabled.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clang points out a syntax error, as the etr_catu_buf_ops structure is
declared 'static' before the type is known:
In file included from drivers/hwtracing/coresight/coresight-tmc-etr.c:12:
drivers/hwtracing/coresight/coresight-catu.h:116:40: warning: tentative definition of variable with internal linkage has incomplete non-array type 'const struct etr_buf_operations' [-Wtentative-definition-incomplete-type]
static const struct etr_buf_operations etr_catu_buf_ops;
^
drivers/hwtracing/coresight/coresight-catu.h:116:21: note: forward declaration of 'struct etr_buf_operations'
static const struct etr_buf_operations etr_catu_buf_ops;
This seems worth fixing in the code, so replace pointer to the empty
constant structure with a NULL pointer. We need an extra NULL pointer
check here, but the result should be better object code otherwise,
avoiding the silly empty structure.
Fixes: 434d611cdd ("coresight: catu: Plug in CATU as a backend for ETR buffer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[Fixed line over 80 characters]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use CLAIM tags to make sure the device is available for use.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure we honor the errors in CATU device and abort the operation.
While at it, delay setting the etr_buf for the session until we are
sure that we are indeed enabling the ETR.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refactor the tmc-etr enable operation to make it easier to
handle errors in enabling the hardware. We need to make
sure that the buffer is compatible with the ETR. This
patch re-arranges to make the error handling easier, by
deferring the hardware enablement until all the errors
are checked. This also avoids turning the CATU on/off
during a sysfs read session.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add support for using TMC-ETR as backend for ETM perf tracing.
We use software double buffering at the moment. i.e, the TMC-ETR
uses a separate buffer than the perf ring buffer. The data is
copied to the perf ring buffer once a session completes.
The TMC-ETR would try to match the larger of perf ring buffer
or the ETR buffer size configured via sysfs, scaling down to
a minimum limit of 1MB.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In coresight perf mode, we need to prepare the sink before
starting a session, which is done via set_buffer call back.
We then proceed to enable the tracing. If we fail to start
the session successfully, we leave the sink configuration
unchanged. In order to make the operation atomic and to
avoid yet another call back to clear the buffer, we get
rid of the "set_buffer" call back and pass the buffer details
via enable() call back to the sink.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert component enable/disable messages from dev_info to dev_dbg.
When used with perf, the components in the paths are enabled/disabled
during each schedule of the run, which can flood the dmesg with these
messages. Moreover, they are only useful for debug purposes. So,
convert such messages to dev_dbg() which can be turned on as
needed.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the ETR now uses mode specific buffers, we can reliably
provide the trace data captured in sysfs mode, even when the ETR
is operating in PERF mode.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the ETR could be driven either by SYSFS or by perf, it
becomes complicated how we deal with the buffers used for each
of these modes. The ETR driver cannot simply free the current
attached buffer without knowing the provider (i.e, sysfs vs perf).
To solve this issue, we provide:
1) the driver-mode specific etr buffer to be retained in the drvdata
2) the etr_buf for a session should be passed on when enabling the
hardware, which will be stored in drvdata->etr_buf. This will be
replaced (not free'd) as soon as the hardware is disabled, after
necessary sync operation.
The advantages of this are :
1) The common code path doesn't need to worry about how to dispose
an existing buffer, if it is about to start a new session with a
different buffer, possibly in a different mode.
2) The driver mode can control its buffers and can get access to the
saved session even when the hardware is operating in a different
mode. (e.g, we can still access a trace buffer from a sysfs mode
even if the etr is now used in perf mode, without disrupting the
current session.)
Towards this, we introduce a sysfs specific data which will hold the
etr_buf used for sysfs mode of operation, controlled solely by the
sysfs mode handling code.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use ERR_CAT inlined function to replace the ERR_PTR(PTR_ERR). It
make the code more concise.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that we can use a CATU with a scatter gather table, add support
for the TMC ETR to make use of the connected CATU in translate mode.
This is done by adding CATU as new buffer mode.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add the initial support for Coresight Address Translation Unit, which
augments the TMC in Coresight SoC-600 by providing an improved Scatter
Gather mechanism. CATU is always connected to a single TMC-ETR and
converts the AXI address with a translated address (from a given SG
table with specific format). The CATU should be programmed in pass
through mode and enabled even if the ETR doesn't use the translation
by CATU.
This patch provides mechanism to enable/disable the CATU always in the
pass through mode.
We reuse the existing ports mechanism to link the TMC-ETR to the
connected CATU.
i.e, TMC-ETR:output_port0 -> CATU:input_port0
Reference manual for CATU component is avilable in version r2p0 of :
"Arm Coresight System-on-Chip SoC-600 Technical Reference Manual".
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We request for "CORESIGHT_BARRIER_PKT_SIZE" length and we should
be happy when we get that size.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The newly introduced code fails to build in some configurations
unless we include the right headers:
drivers/hwtracing/coresight/coresight-tmc-etr.c: In function 'tmc_free_table_pages':
drivers/hwtracing/coresight/coresight-tmc-etr.c:206:3: error: implicit declaration of function 'vunmap'; did you mean 'iounmap'? [-Werror=implicit-function-declaration]
Fixes: 79613ae8715a ("coresight: Add generic TMC sg table framework")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add the support for Scatter-Gather mode to the etr-buf layer.
Since we now have two different modes, we choose the backend
based on a set of conditions, documented in the code.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>