Upstream: commit b006847222623ac3cda8589d15379eac86a2bcb7
Conflicts: none
Backport-reason: mm: memcg: subtree stats flushing and thresholds
The workingset code flushes the stats in workingset_refault() to get
accurate stats of the eviction memcg. In preparation for more scoped
flushing and for passing the eviction memcg to the flush call, move the
call to workingset_test_recent(), where we have a pointer to the eviction
memcg.
The flush call is sleepable, and cannot be made in an rcu read section.
Hence, minimize the rcu read section by also moving it into
workingset_test_recent(). Furthermore, instead of holding the rcu read
lock throughout workingset_test_recent(), only hold it briefly to get a
ref on the eviction memcg. This allows us to make the flush call after we
get the eviction memcg.
As for workingset_refault(), nothing else there appears to be protected by
rcu. The memcg of the faulted folio (which is not necessarily the same as
the eviction memcg) is protected by the folio lock, which is held from all
callsites. Add a VM_BUG_ON() to make sure this doesn't change from under
us.
No functional change intended.
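For reference, a condensed sketch of the resulting pattern in
workingset_test_recent() (error handling trimmed, other details elided):

	rcu_read_lock();
	eviction_memcg = mem_cgroup_from_id(memcgid);
	if (!mem_cgroup_disabled() &&
	    (!eviction_memcg || !mem_cgroup_tryget(eviction_memcg))) {
		rcu_read_unlock();
		return false;
	}
	rcu_read_unlock();

	/*
	 * Flushing is sleepable, so do it outside the RCU read section;
	 * the ref keeps eviction_memcg alive across the flush.
	 */
	mem_cgroup_flush_stats_ratelimited();
	...
	mem_cgroup_put(eviction_memcg);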
Link: https://lkml.kernel.org/r/20231129032154.3710765-5-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Tested-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Wei Xu <weixugc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kairui Song <kasong@tencent.com>
Upstream: commit 8d59d2214c2362e7a9d185d80b613e632581af7b
Conflicts: none
Backport-reason: mm: memcg: subtree stats flushing and thresholds
A global counter for the magnitude of memcg stats update is maintained on
the memcg side to avoid invoking rstat flushes when the pending updates
are not significant. This avoids unnecessary flushes, which are not very
cheap even if there isn't a lot of stats to flush. It also avoids
unnecessary lock contention on the underlying global rstat lock.
Make this threshold per-memcg. The same scheme is followed where percpu
(now also per-memcg) counters are incremented in the update path, and only
propagated to per-memcg atomics when they exceed a certain threshold.
This provides two benefits: (a) On large machines with a lot of memcgs,
the global threshold can be reached relatively fast, so guarding the
underlying lock becomes less effective. Making the threshold per-memcg
avoids this.
(b) Having a global threshold makes it hard to do subtree flushes, as we
cannot reset the global counter except for a full flush. Per-memcg
counters removes this as a blocker from doing subtree flushes, which helps
avoid unnecessary work when the stats of a small subtree are needed.
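A condensed sketch of the resulting update path (following
memcg_rstat_updated() in the patch; MEMCG_CHARGE_BATCH is the existing
batching threshold):

	statc = this_cpu_ptr(memcg->vmstats_percpu);
	for (; statc; statc = statc->parent) {
		statc->stats_updates += abs(val);
		if (statc->stats_updates < MEMCG_CHARGE_BATCH)
			continue;
		/*
		 * Propagate the batched percpu updates to this memcg's
		 * atomic, then reset the percpu counter.
		 */
		atomic64_add(statc->stats_updates,
			     &statc->vmstats->stats_updates);
		statc->stats_updates = 0;
	}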
Nothing is free, of course. This comes at a cost: (a) A new per-cpu
counter per memcg, consuming NR_CPUS * NR_MEMCGS * 4 bytes (e.g. roughly
4 MB for 100 CPUs and 10k memcgs). The extra memory usage is
insignificant.
(b) More work on the update side, although in the common case it will only
be percpu counter updates. The amount of work scales with the number of
ancestors (i.e. tree depth). This is not a new concept: adding a cgroup
to the rstat tree involves a parent loop, and so does charging. Testing
results below show no significant regressions.
(c) The error margin in the stats for the system as a whole increases from
NR_CPUS * MEMCG_CHARGE_BATCH to NR_CPUS * MEMCG_CHARGE_BATCH * NR_MEMCGS.
This is probably fine because we have a similar per-memcg error in charges
coming from percpu stocks, and we have a periodic flusher that makes sure
we always flush all the stats every 2s anyway.
This patch was tested to make sure no significant regressions are
introduced on the update path as follows. The following benchmarks were
run in a cgroup that is 2 levels deep (/sys/fs/cgroup/a/b/):
(1) Running 22 instances of netperf on a 44 cpu machine with
hyperthreading disabled. All instances, as well as netserver, are run in
a level 2 cgroup:
# netserver -6
# netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K
Averaging 20 runs, the numbers are as follows:
Base: 40198.0 mbps
Patched: 38629.7 mbps (-3.9%)
The regression is minimal, especially for 22 instances in the same
cgroup sharing all ancestors (so updating the same atomics).
(2) will-it-scale page_fault tests. These tests (specifically
per_process_ops in the page_fault3 test) previously detected a 25.9%
regression for a change in the stats update path [1]. These are the
numbers from 10 runs (+ is good) on a machine with 256 cpus:
LABEL | MEAN | MEDIAN | STDDEV |
------------------------------+-------------+-------------+-------------
page_fault1_per_process_ops | | | |
(A) base | 270249.164 | 265437.000 | 13451.836 |
(B) patched | 261368.709 | 255725.000 | 13394.767 |
| -3.29% | -3.66% | |
page_fault1_per_thread_ops | | | |
(A) base | 242111.345 | 239737.000 | 10026.031 |
(B) patched | 237057.109 | 235305.000 | 9769.687 |
| -2.09% | -1.85% | |
page_fault1_scalability | | | |
(A) base | 0.034387 | 0.035168 | 0.0018283 |
(B) patched | 0.033988 | 0.034573 | 0.0018056 |
| -1.16% | -1.69% | |
page_fault2_per_process_ops | | | |
(A) base | 203561.836 | 203301.000 | 2550.764 |
(B) patched | 197195.945 | 197746.000 | 2264.263 |
| -3.13% | -2.73% | |
page_fault2_per_thread_ops | | | |
(A) base | 171046.473 | 170776.000 | 1509.679 |
(B) patched | 166626.327 | 166406.000 | 768.753 |
| -2.58% | -2.56% | |
page_fault2_scalability | | | |
(A) base | 0.054026 | 0.053821 | 0.00062121 |
(B) patched | 0.053329 | 0.05306 | 0.00048394 |
| -1.29% | -1.41% | |
page_fault3_per_process_ops | | | |
(A) base | 1295807.782 | 1297550.000 | 5907.585 |
(B) patched | 1275579.873 | 1273359.000 | 8759.160 |
| -1.56% | -1.86% | |
page_fault3_per_thread_ops | | | |
(A) base | 391234.164 | 390860.000 | 1760.720 |
(B) patched | 377231.273 | 376369.000 | 1874.971 |
| -3.58% | -3.71% | |
page_fault3_scalability | | | |
(A) base | 0.60369 | 0.60072 | 0.0083029 |
(B) patched | 0.61733 | 0.61544 | 0.009855 |
| +2.26% | +2.45% | |
All regressions seem to be minimal, and within the normal variance for the
benchmark. The fix for [1] assumes that 3% is noise (and there were no
further practical complaints), so hopefully this means that such
variations in these microbenchmarks do not reflect on practical workloads.
(3) I also ran stress-ng in a nested cgroup and did not observe any
obvious regressions.
[1] https://lore.kernel.org/all/20190520063534.GB19312@shao2-debian/
Link: https://lkml.kernel.org/r/20231129032154.3710765-4-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Wei Xu <weixugc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kairui Song <kasong@tencent.com>
Upstream: commit e0bf1dc859fdd08ef738824710770a30a8069433
Conflicts: resolved
Backport-reason: mm: memcg: subtree stats flushing and thresholds
The following patch will make use of those structs in the flushing code,
so move their definitions (and a few other dependencies) a little bit up
to reduce the diff noise in the following patch.
No functional change intended.
Link: https://lkml.kernel.org/r/20231129032154.3710765-3-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Tested-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Wei Xu <weixugc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kairui Song <kasong@tencent.com>
Upstream: commit 508bed884767a8eb394640bae9edcdf082816c43
Conflicts: none
Backport-reason: mm: memcg: subtree stats flushing and thresholds
Patch series "mm: memcg: subtree stats flushing and thresholds", v4.
This series attempts to address shortcomings in today's approach to memcg
stats flushing, namely occasionally stale or expensive stat reads. The
series does so by changing the threshold that we use to decide whether to
trigger a flush to be per memcg instead of global (patch 3), and then
changing flushing to be per memcg (i.e. subtree flushes) instead of
global (patch 5).
This patch (of 5):
flush_next_time is an inaccurate name. It's not the next time that
periodic flushing will happen, it's rather the next time that ratelimited
flushing can happen if the periodic flusher is late.
Simplify its semantics by just storing the timestamp of the last flush
instead, flush_last_time. Move the 2*FLUSH_TIME addition to
mem_cgroup_flush_stats_ratelimited(), and add a comment explaining it.
This way, all the ratelimiting semantics live in one place.
No functional change intended.
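The resulting helper, roughly:

	void mem_cgroup_flush_stats_ratelimited(void)
	{
		/* Only flush if the periodic flusher is one full cycle late */
		if (time_after64(jiffies_64,
				 READ_ONCE(flush_last_time) + 2UL * FLUSH_TIME))
			mem_cgroup_flush_stats();
	}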
Link: https://lkml.kernel.org/r/20231129032154.3710765-1-yosryahmed@google.com
Link: https://lkml.kernel.org/r/20231129032154.3710765-2-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Tested-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Chris Li <chrisl@kernel.org> (Google)
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Wei Xu <weixugc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kairui Song <kasong@tencent.com>
When a commit is cherry-picked from upstream, we should add a
Signed-off-by to the commit message. With the Signed-off-by info, we can
easily see who picked the commit and ask them why it was needed.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
Accept a few kinds of upstream references:
commit xxx upstream, or
Upstream commit xxx, or
[ Upstream commit xxx, or
Upstream commit:
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Acked-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Upstream commit 0d7ae06860753bb30b3731302b994da071120d00
Now that we support pinning a BPF timer to the current core, we should
test it with some selftests. This patch adds two new testcases to the
timer suite, which verify that a BPF timer, both with and without
BPF_F_TIMER_ABS, can be pinned to the calling core with BPF_F_TIMER_CPU_PIN.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hongbo Li <herberthbli@tencent.com>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20231004162339.200702-3-void@manifault.com
Upstream commit d6247ecb6c1e17d7a33317090627f5bfe563cbb2
BPF supports creating high resolution timers using bpf_timer_* helper
functions. Currently, only the BPF_F_TIMER_ABS flag is supported, which
specifies that the timeout should be interpreted as absolute time. It
would also be useful to be able to pin that timer to a core: for example,
if you wanted to make a subset of cores run without timer interrupts and
only have the timer be invoked on a single core.
This patch adds support for this with a new BPF_F_TIMER_CPU_PIN flag.
When specified, the HRTIMER_MODE_PINNED flag is passed to
hrtimer_start(). A subsequent patch will update selftests to validate.
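A minimal sketch of how a BPF program would use the new flag (the map
layout, section name, and timeout here are illustrative, not from the
patch):

	#include <vmlinux.h>
	#include <bpf/bpf_helpers.h>

	struct map_elem {
		struct bpf_timer timer;
	};

	struct {
		__uint(type, BPF_MAP_TYPE_ARRAY);
		__uint(max_entries, 1);
		__type(key, int);
		__type(value, struct map_elem);
	} timer_map SEC(".maps");

	static int timer_cb(void *map, int *key, struct map_elem *val)
	{
		/* fires in hrtimer context on the core that armed it */
		return 0;
	}

	SEC("tc")
	int arm_pinned_timer(struct __sk_buff *ctx)
	{
		int key = 0;
		struct map_elem *val = bpf_map_lookup_elem(&timer_map, &key);

		if (!val)
			return 0;
		bpf_timer_init(&val->timer, &timer_map, CLOCK_MONOTONIC);
		bpf_timer_set_callback(&val->timer, timer_cb);
		/* relative 1ms timeout, pinned to the calling core */
		bpf_timer_start(&val->timer, 1000000, BPF_F_TIMER_CPU_PIN);
		return 0;
	}

	char _license[] SEC("license") = "GPL";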
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hongbo Li <herberthbli@tencent.com>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20231004162339.200702-2-void@manifault.com
Upstream commit e23edc86b09df655bf8963bbcb16647adc787395
The name is a bit opaque - make it clear that this is about wakeup
preemption.
Also rename the ->check_preempt_curr() methods similarly.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Hongbo Li <herberthbli@tencent.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Upstream commit 82845683ca6a15fe8c7912c6264bb0e84ec6f5fb
Other scheduling classes already postfix their similar methods
with the class name.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Hongbo Li <herberthbli@tencent.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Upstream: no
The check-kabi script was copied from tkernel4, but it fails to run in tkernel5:
Traceback (most recent call last):
File "./scripts/check-kabi", line 143, in <module>
load_symvers(symvers,symvers_file)
File "./scripts/check-kabi", line 44, in load_symvers
checksum,symbol,directory,type = string.split(in_line)
ValueError: too many values to unpack
Update the script with the copy from dist/sources/check-kabi.
Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
Reviewed-by: Jianping Liu <frankjpliu@tencent.com>
Upstream: no
Provides kabi check/update/create commands for local users:
1. Check whether the TencentOS Kernel KABI is compatible
2. Update the TencentOS Kernel KABI file
3. Create the TencentOS Kernel KABI file
Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
Reviewed-by: Jianping Liu <frankjpliu@tencent.com>
To make it easy to get the config used in tkci, add kernel/configs/tkci.config;
users can run "make tencentconfig tkci.config" to generate the .config.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
Reviewed-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: aurelianliu <aurelianliu@tencent.com>
commit 762cad7 ("pci: delay 5s to proble multiple storage controllers")
aimed to order the SCSI device mount sequence. But NVMe devices do not
need this, and the delay instead increases the boot time of storage
servers. Therefore, bypass NVMe devices here.
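A sketch of the bypass inside the delay path added by commit 762cad7 (the
early return shown here is illustrative of the idea, not the exact diff):

	/*
	 * NVMe ("storage express" class) controllers have no mount-ordering
	 * requirement, so skip the 5s probe delay for them.
	 */
	if (pdev->class == PCI_CLASS_STORAGE_EXPRESS)
		return;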
Signed-off-by: Xinghui Li <korantli@tencent.com>
Signed-off-by: Samuel Liao <samuelliao@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
The virtio block device has no async probe path, so it doesn't need the
probe delay. This patch reduces kernel boot time by about 5s.
Signed-off-by: Xiaoming Gao <newtongao@tencent.com>
Signed-off-by: Liu Yu <allanyuliu@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
spec: add dependency on libtool to build on koji.
Signed-off-by: Sinong Chen <costinchen@tencent.com>
Reviewed-by: Jianping Liu <frankjpliu@tencent.com>
The AmpereOne SoC supports NMI interrupts, which requires enabling
CONFIG_ACPI_AGDI.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
Cloud gaming needs to run Android containers, so add the following configs:
CONFIG_ANDROID_BINDER_IPC=y
CONFIG_ANDROID_BINDERFS=y
CONFIG_ANDROID_BINDER_DEVICES="binder,hwbinder,vndbinder"
CONFIG_ANDROID_BINDER_IPC_SELFTEST=y
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
TK has some kernel modules (such as nvidia.ko) that are only used in the
public release version; split them into the modules-public subpackage.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
Modules in kernel*modules-removable-media*.rpm are the drivers for
removable media. Using them raises the attack risk, and they are not used
in the private release, only in the public release. So, rename it.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
TK4 has this subpackage, so try to be compatible, and use
filter-modules.sh instead, which does a depmod check after splitting to
avoid depmod failures after some modules are split into the subpackage.
Signed-off-by: Kairui Song <kasong@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
commit 16438b652b464ef7d0a877d31e93ab54338f6b0a upstream.
Add JSON files for AmpereOneX core PMU events and metrics.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20231201021550.1109196-4-ilkka@os.amperecomputing.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit be097997a273259f1723baac5463cf19d8564efa upstream.
It is possible for multiple vCPUs to fault on the same IPA and attempt
to resolve the fault. One of the page table walks will actually update
the PTE and the rest will return -EAGAIN per our race detection scheme.
KVM elides the TLB invalidation on the racing threads as the return
value is nonzero.
Before commit a12ab1378a ("KVM: arm64: Use local TLBI on permission
relaxation") KVM always used broadcast TLB invalidations when handling
permission faults, which had the convenient property of making the
stage-2 updates visible to all CPUs in the system. However now we do a
local invalidation, and TLBI elision leads to the vCPU thread faulting
again on the stale entry. Remember that the architecture permits the TLB
to cache translations that precipitate a permission fault.
Invalidate the TLB entry responsible for the permission fault if the
stage-2 descriptor has been relaxed, regardless of which thread actually
did the job.
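The shape of the fix in kvm_pgtable_stage2_relax_perms(), roughly (using
the nsh TLBI helper introduced by the local-TLBI series):

	ret = stage2_update_leaf_attrs(pgt, addr, 1, set, clr, NULL, &level,
				       KVM_PGTABLE_WALK_HANDLE_FAULT |
				       KVM_PGTABLE_WALK_SHARED);
	/* invalidate even if this walker lost the race (-EAGAIN) */
	if (!ret || ret == -EAGAIN)
		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa_nsh, pgt->mmu, addr, level);
	return ret;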
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230922223229.1608155-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit 909b583f81b5bb5a398d4580543f59b908a86ccc upstream.
Gavin reports of soft lockups on his Ampere Altra Max machine when
backing KVM guests with hugetlb pages. Upon further investigation, it
was found that the system is unable to keep up with parallel I-cache
invalidations done by KVM's stage-2 fault handler.
This is ultimately an implementation problem. I-cache maintenance
instructions are available at EL0, so nothing stops a malicious
userspace from hammering a system with CMOs and cause it to fall over.
"Fixing" this problem in KVM is nothing more than slapping a bandage
over a much deeper problem.
Anyway, the kernel already has a heuristic for limiting TLB
invalidations to avoid soft lockups. Reuse that logic to limit I-cache
CMOs done by KVM to map executable pages on systems without FEAT_DIC.
While at it, restructure __invalidate_icache_guest_page() to improve
readability and squeeze our new condition into the existing branching
structure.
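Roughly, the restructured helper (condensed, VPIPT special case elided;
__invalidate_icache_max_range() derives its bound from the DVM limit
introduced in the previous patch):

	static inline void __invalidate_icache_guest_page(void *va, size_t size)
	{
		/*
		 * Blow the whole I-cache if that is far more efficient than
		 * invalidating the range line by line.
		 */
		if (size > __invalidate_icache_max_range())
			icache_inval_all_pou();
		else
			icache_inval_pou((unsigned long)va,
					 (unsigned long)va + size);
	}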
Link: https://lore.kernel.org/kvmarm/20230904072826.1468907-1-gshan@redhat.com/
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230920080133.944717-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit ec1c3b9ff16082f880b304be40992568f4eee6a7 upstream.
Perhaps unsurprisingly, I-cache invalidations suffer from performance
issues similar to TLB invalidations on certain systems. TLB and I-cache
maintenance all result in DVM on the mesh, which is where the real
bottleneck lies.
Rename the heuristic to point the finger at DVM, such that it may be
reused for limiting I-cache invalidations.
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Gavin Shan <gshan@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230920080133.944717-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit 0abe7f61c28d62ee0530c31589e6ea209aa82cbd upstream.
Add ampere_cspmu to toctree in order to address the following warning
produced when building documents:
Documentation/admin-guide/perf/ampere_cspmu.rst: WARNING: document isn't included in any toctree
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20231011172250.5a6498e5@canb.auug.org.au/
Fixes: 53a810ad3c5c ("perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231012074103.3772114-1-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit 53a810ad3c5cde674cac71e629e6d10bfc9d838c upstream.
The Ampere SoC PMU follows the CoreSight PMU architecture. It uses
implementation-specific registers to filter events rather than the
PMEVFILTnR registers.
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20230913233941.9814-5-ilkka@os.amperecomputing.com
[will: Include linux/io.h in ampere_cspmu.c for writel()]
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit 647d5c5a9e7672e285f54f0e141ee759e69382f2 upstream.
Some platforms may use, e.g., a different filtering mechanism and, thus,
may need a different way to validate the events and group.
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230913233941.9814-4-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit 0a7603ab242e9bab530227cf0d0d344d4e334acc upstream.
The ARM CoreSight PMU architecture specification [1] defines the PMEVTYPER
and PMEVFILT* registers as optional in Chapter 2.1. Moreover, implementers
may choose to use PMIMPDEF* registers (offsets 0xD80-0xDFF) to filter the
events. Add support for those by adding an implementation-specific filter
callback function.
[1] https://developer.arm.com/documentation/ihi0091/latest
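A hypothetical backend sketch (all names and the callback signature here
are illustrative, not the driver's exact API): the backend supplies a
filter callback that programs its imp-def registers instead of PMEVFILTnR:

	/* illustrative: filter events via an imp-def register */
	static void foo_cspmu_set_ev_filter(struct arm_cspmu *cspmu,
					    const struct perf_event *event)
	{
		u32 filter = foo_decode_filter(event);	/* hypothetical */

		/* FOO_PMIMPDEF_FILTER: hypothetical offset in 0xD80-0xDFF */
		writel(filter, cspmu->base0 + FOO_PMIMPDEF_FILTER);
	}

	...
	cspmu->impl.ops.set_ev_filter = foo_cspmu_set_ev_filter;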
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230913233941.9814-3-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit 8c282414ca6209977cb6d6cc66470ca2d1e56bf6 upstream.
Split the 64-bit register accesses if 64-bit access is not supported
by the PMU.
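The usual pattern for such a split read (a sketch; supports_64bit is
illustrative, and the high word is re-read to catch a low-word rollover
between the two halves):

	u32 lo, hi;

	if (supports_64bit)
		return readq(addr);

	do {
		hi = readl(addr + 4);
		lo = readl(addr);
	} while (readl(addr + 4) != hi);

	return ((u64)hi << 32) | lo;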
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230913233941.9814-2-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
commit bfc653aa89cb05796d7b4e046600accb442c9b7a upstream.
The Arm CoreSight PMU driver consists of the main standard code and vendor
backend code. Both are currently built as a single module. This patch
adds a vendor registration API to separate the two and keep things
modular. The main driver requests each known backend module during
initialization and defers the device binding process. The backend module
then registers an init callback with the main driver and continues the
device driver binding process.
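Schematically (the function names here are illustrative; the exact
registration API may differ):

	/* backend module: hand the main driver an init callback */
	static int __init foo_cspmu_init(void)
	{
		return arm_cspmu_impl_register(&foo_cspmu_match); /* illustrative */
	}
	module_init(foo_cspmu_init);

	/* main driver: defer binding until the backend has registered */
	if (!impl_match_found(pdev))	/* illustrative */
		return -EPROBE_DEFER;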
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20230821231608.50911-1-bwicaksono@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
Add CONFIG_SLUB_DEBUG_ON=y only in debug.config; it will not affect the
release config.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
Linux 6.6 supports hard lockup detection on aarch64, so enable it. It is
useful for debugging spin deadlocks with IRQs disabled.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
CONFIG_DEBUG_LOCKDEP makes it easier to debug deadlocks, so enable it in
the debug kernel.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: aurelianliu <aurelianliu@tencent.com>
Disable CONFIG_LATENCYTOP: it degrades performance because multiple cores
contend for a global spinlock. However, disabling CONFIG_LATENCYTOP no
longer selects SCHEDSTATS and KALLSYMS_ALL by default, which means
data-section symbols can't be looked up. So enable KALLSYMS_ALL and
SCHEDSTATS after disabling CONFIG_LATENCYTOP.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: aurelianliu <aurelianliu@tencent.com>
In TK5, private and public kernel rpm will be the same, so disable
CONFIG_MODULE_SIG_FORCE, and using module.sig_enforce=1 kernel param
in private TS distro.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: aurelianliu <aurelianliu@tencent.com>
Update tencent.config by:
make tencentconfig
make savedefconfig
mv defconfig arch/x86/configs/tencent.config
Without any manual change.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: aurelianliu <aurelianliu@tencent.com>