Commit Graph

445 Commits

Author SHA1 Message Date
Peter Zijlstra bd27568117 perf: Rewrite core context handling
There have been various issues and limitations with the way perf uses
(task) contexts to track events. Most notable is the single hardware
PMU task context, which has resulted in a number of yucky things (both
proposed and merged).

Notably:
 - HW breakpoint PMU
 - ARM big.little PMU / Intel ADL PMU
 - Intel Branch Monitoring PMU
 - AMD IBS PMU
 - S390 cpum_cf PMU
 - PowerPC trace_imc PMU

*Current design:*

Currently we have a per task and per cpu perf_event_contexts:

  task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
       ^                                 |    ^     |           ^
       `---------------------------------'    |     `--> pmu ---'
                                              v           ^
                                         perf_event ------'

Each task has an array of pointers to a perf_event_context. Each
perf_event_context has a direct relation to a PMU and a group of
events for that PMU. The task related perf_event_context's have a
pointer back to that task.

Each PMU has a per-cpu pointer to a per-cpu perf_cpu_context, which
includes a perf_event_context, which again has a direct relation to
that PMU, and a group of events for that PMU.

The perf_cpu_context also tracks which task context is currently
associated with that CPU and includes a few other things like the
hrtimer for rotation etc.

Each perf_event is then associated with its PMU and one
perf_event_context.

*Proposed design:*

New design proposed by this patch reduce to a single task context and
a single CPU context but adds some intermediate data-structures:

  task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
       ^                           |   ^ ^
       `---------------------------'   | |
                                       | |    perf_cpu_pmu_context <--.
                                       | `----.    ^                  |
                                       |      |    |                  |
                                       |      v    v                  |
                                       | ,--> perf_event_pmu_context  |
                                       | |                            |
                                       | |                            |
                                       v v                            |
                                  perf_event ---> pmu ----------------'

With the new design, perf_event_context will hold all events for all
pmus in the (respective pinned/flexible) rbtrees. This can be achieved
by adding pmu to rbtree key:

  {cpu, pmu, cgroup, group_index}

Each perf_event_context carries a list of perf_event_pmu_context which
is used to hold per-pmu-per-context state. For example, it keeps track
of currently active events for that pmu, a pmu specific task_ctx_data,
a flag to tell whether rotation is required or not etc.

Additionally, perf_cpu_pmu_context is used to hold per-pmu-per-cpu
state like hrtimer details to drive the event rotation, a pointer to
perf_event_pmu_context of currently running task and some other
ancillary information.

Each perf_event is associated to it's pmu, perf_event_context and
perf_event_pmu_context.

Further optimizations to current implementation are possible. For
example, ctx_resched() can be optimized to reschedule only single pmu
events.

Much thanks to Ravi for picking this up and pushing it towards
completion.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221008062424.313-1-ravi.bangoria@amd.com
2022-10-27 20:12:16 +02:00
Linus Torvalds 1df046ab1c arm64 fixes:
- Cortex-A55 errata workaround (repeat TLBI).
 
 - AMPERE1 added to the Spectre-BHB affected list.
 
 - MTE fix to avoid setting PG_mte_tagged if no tags have been touched on
   a page.
 
 - Fixed typo in the SCTLR_EL1.SPINTMASK bit naming (the commit log has
   other typos).
 
 - perf: return value check in ali_drw_pmu_probe(),
   ALIBABA_UNCORE_DRW_PMU dependency on ACPI.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmNJrpAACgkQa9axLQDI
 XvFwWQ/+O71bVQPXf43p+O3LapX3IsOxDCWLTMjNcxVgSBSK+TPwtjXN7vIPZDvv
 Ibx4Y10vmHo8Copbs7C8USx+hGo7hzknk/s2zoeqJQX13WQkqpuAwTDshzMp60La
 nQoJXab3KapQ3UIPL5El/cbvAD9+DGJSiWdyvC8GBHwtWKWi1WDpSNFN3WMJm97P
 uQqERiWaf3XOI9BhsuOlCzQE5eemCllycdWoRBelCjIQByuo6SaDPEpTUZDICCPp
 f4Ji7U1hfORmXg/DJcjSJbtkSshVRqjhSAtAmP/sWUic7+kWGBiC+zJQ0PxwiNQH
 Bfryz90ETa/INA65hA1iC51lE7hvt1DKueZAMKjozxYSCSVAxUNonSkEOfKegPeU
 hLhTowmveryqxYGuQ75p5tZjdpvML0Sa/lx7p/GUEhaV77dca/EJ0B68x8WrBpO5
 TCsW3iDq2V+ErWgYL7n6nFoMhZQnNvq9jxhAPuJ8Y47ZkeQ8HcvooKLHUSDSjMk2
 f/7A7rUJh0piYf0FEPSjRBTO/HyPb1D90n1t2wJoCqwrICZ/mmWzVqua0fgmrbvS
 H33YQiSEIkwsfLktIIJRGknYgC0P/JALKlAQPAcmsd+njWsThXJ/WwwRrpvCZdMj
 9CVuDfhw7Ipt4Iz5Tg61lLDkzi7bPRqPpEKc8zzsI3nmY0KC/iA=
 =vjYu
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Cortex-A55 errata workaround (repeat TLBI)

 - AMPERE1 added to the Spectre-BHB affected list

 - MTE fix to avoid setting PG_mte_tagged if no tags have been touched
   on a page

 - Fixed typo in the SCTLR_EL1.SPINTMASK bit naming (the commit log has
   other typos)

 - perf: return value check in ali_drw_pmu_probe(),
   ALIBABA_UNCORE_DRW_PMU dependency on ACPI

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Add AMPERE1 to the Spectre-BHB affected list
  arm64: mte: Avoid setting PG_mte_tagged if no tags cleared or restored
  MAINTAINERS: rectify file entry in ALIBABA PMU DRIVER
  drivers/perf: ALIBABA_UNCORE_DRW_PMU should depend on ACPI
  drivers/perf: fix return value check in ali_drw_pmu_probe()
  arm64: errata: Add Cortex-A55 to the repeat tlbi list
  arm64/sysreg: Fix typo in SCTR_EL1.SPINTMASK
2022-10-14 12:38:03 -07:00
Linus Torvalds 498574970f RISC-V Patches for the 6.1 Merge Window, Part 2
* A handful of DT updates for the PolarFire SOC.
 * A fix to correct the handling of write-only mappings.
 * m{vetndor,arcd,imp}id is now in /proc/cpuinfo
 * The SiFive L2 cache controller support has been refactored to also
   support L3 caches.
 
 There's also a handful of fixes, cleanups and improvements throughout
 the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNJiegTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiaW+D/9nN7JMKD4KbED8MUgRVcs+TS4MQk5J
 QkjmAXme7w2H1+T5mxNWHk0QvEC6qMu2JQWHeott0ROYPkbXoNtuOOkzcsZhVaeb
 rjWH/WC5QhEeMPDc1qc0AmxVkOa937f2NkGtoEvhW6SMWvStHpefdOHo/ij106Re
 7wxkcj0fjgn/zPmDltRbWSMyFZWJec5DvZ3AB4NYHnc2ycr8Z7HAG08rjZkdkIjQ
 zYGsCkUtrk4qShikKK3cSelW6hH1/FdM+bAo0rzt1frmTw1FLaFOsZZjOaVYekzi
 l0jxsb8FRBWyKBFqcagukjZwFy6D+7Q+masb0cXY03eZpEVrroA4bHiPkuZ22Ms/
 7ol/I5FvTyrj2R4zd70ziYVF3usO78t5HC4AIQmFl25TNcQdYQd8X28yh12iG6QN
 Pa0lh/EOr5idaT2+TErhzRepICnK1Nj9y0H5TZxYljLAhH9j0d/8+Iw88vzJnPga
 vek7unZ3BzkooLVIpfpkT8vC94MA0hoP849MVFQDtZgZhPMbjIkN91HrDiw9ktV2
 SK9cuPndfrs5nW1WKu8F0cDziusbMHv4F51TPVRu/dFcIkspv+aLojAThgvcm42K
 55LgvDgLjo7P3PUghiDXUoPZXvL19t5Bnq2/E47rlSTvvGssIu/onZO6BkxybAm6
 BkHuchr8TGBWMg==
 =iCOr
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - DT updates for the PolarFire SOC

 - a fix to correct the handling of write-only mappings

 - m{vetndor,arcd,imp}id is now in /proc/cpuinfo

 - the SiFive L2 cache controller support has been refactored to also
   support L3 caches

 - misc fixes, cleanups and improvements throughout the tree

* tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
  MAINTAINERS: add RISC-V's patchwork
  RISC-V: Make port I/O string accessors actually work
  riscv: enable software resend of irqs
  RISC-V: Re-enable counter access from userspace
  riscv: vdso: fix NULL deference in vdso_join_timens() when vfork
  riscv: Add cache information in AUX vector
  soc: sifive: ccache: define the macro for the register shifts
  soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes
  soc: sifive: ccache: reduce printing on init
  soc: sifive: ccache: determine the cache level from dts
  soc: sifive: ccache: Rename SiFive L2 cache to Composable cache.
  dt-bindings: sifive-ccache: change Sifive L2 cache to Composable cache
  riscv: check for kernel config option in t-head memory types errata
  riscv: use BIT() marco for cpufeature probing
  riscv: use BIT() macros in t-head errata init
  riscv: drop some idefs from CMO initialization
  riscv: cleanup svpbmt cpufeature probing
  riscv: Pass -mno-relax only on lld < 15.0.0
  RISC-V: Avoid dereferening NULL regs in die()
  dt-bindings: riscv: add new riscv,isa strings for emulators
  ...
2022-10-14 11:21:11 -07:00
Palmer Dabbelt 5a5294fbe0
RISC-V: Re-enable counter access from userspace
These counters were part of the ISA when we froze the uABI, removing
them breaks userspace.

Link: https://lore.kernel.org/all/YxEhC%2FmDW1lFt36J@aurel32.net/
Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220928131807.30386-1-palmer@rivosinc.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13 11:18:39 -07:00
Linus Torvalds 3871d93b82 Perf events updates for v6.1:
- PMU driver updates:
 
      - Add AMD Last Branch Record Extension Version 2 (LbrExtV2)
        feature support for Zen 4 processors.
 
      - Extend the perf ABI to provide branch speculation information,
        if available, and use this on CPUs that have it (eg. LbrExtV2).
 
      - Improve Intel PEBS TSC timestamp handling & integration.
 
      - Add Intel Raptor Lake S CPU support.
 
      - Add 'perf mem' and 'perf c2c' memory profiling support on
        AMD CPUs by utilizing IBS tagged load/store samples.
 
      - Clean up & optimize various x86 PMU details.
 
  - HW breakpoints:
 
      - Big rework to optimize the code for systems with hundreds of CPUs and
        thousands of breakpoints:
 
         - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
 	  per-CPU rwsem that is read-locked during most of the key operations.
 
 	- Improve the O(#cpus * #tasks) logic in toggle_bp_slot()
 	  and fetch_bp_busy_slots().
 
 	- Apply micro-optimizations & cleanups.
 
   - Misc cleanups & enhancements.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/2pMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIMA/+J+MCEVTt9kwZeBtHoPX7iZ5gnq1+McoQ
 f6ALX19AO/ZSuA7EBA3cS3Ny5eyGy3ofYUnRW+POezu9CpflLW/5N27R2qkZFrWC
 A09B86WH676ZrmXt+oI05rpZ2y/NGw4gJxLLa4/bWF3g9xLfo21i+YGKwdOnNFpl
 DEdCVHtjlMcOAU3+on6fOYuhXDcYd7PKGcCfLE7muOMOAtwyj0bUDBt7m+hneZgy
 qbZHzDU2DZ5L/LXiMyuZj5rC7V4xUbfZZfXglG38YCW1WTieS3IjefaU2tREhu7I
 rRkCK48ULDNNJR3dZK8IzXJRxusq1ICPG68I+nm/K37oZyTZWtfYZWehW/d/TnPa
 tUiTwimabz7UUqaGq9ZptxwINcAigax0nl6dZ3EseeGhkDE6j71/3kqrkKPz4jth
 +fCwHLOrI3c4Gq5qWgPvqcUlUneKf3DlOMtzPKYg7sMhla2zQmFpYCPzKfm77U/Z
 BclGOH3FiwaK6MIjPJRUXTePXqnUseqCR8PCH/UPQUeBEVHFcMvqCaa15nALed8x
 dFi76VywR9mahouuLNq6sUNePlvDd2B124PygNwegLlBfY9QmKONg9qRKOnQpuJ6
 UprRJjLOOucZ/N/jn6+ShHkqmXsnY2MhfUoHUoMQ0QAI+n++e+2AuePo251kKWr8
 QlqKxd9PMQU=
 =LcGg
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:
 "PMU driver updates:

   - Add AMD Last Branch Record Extension Version 2 (LbrExtV2) feature
     support for Zen 4 processors.

   - Extend the perf ABI to provide branch speculation information, if
     available, and use this on CPUs that have it (eg. LbrExtV2).

   - Improve Intel PEBS TSC timestamp handling & integration.

   - Add Intel Raptor Lake S CPU support.

   - Add 'perf mem' and 'perf c2c' memory profiling support on AMD CPUs
     by utilizing IBS tagged load/store samples.

   - Clean up & optimize various x86 PMU details.

  HW breakpoints:

   - Big rework to optimize the code for systems with hundreds of CPUs
     and thousands of breakpoints:

      - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
        per-CPU rwsem that is read-locked during most of the key
        operations.

      - Improve the O(#cpus * #tasks) logic in toggle_bp_slot() and
        fetch_bp_busy_slots().

      - Apply micro-optimizations & cleanups.

  - Misc cleanups & enhancements"

* tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  perf/hw_breakpoint: Annotate tsk->perf_event_mutex vs ctx->mutex
  perf: Fix pmu_filter_match()
  perf: Fix lockdep_assert_event_ctx()
  perf/x86/amd/lbr: Adjust LBR regardless of filtering
  perf/x86/utils: Fix uninitialized var in get_branch_type()
  perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file
  perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR
  perf/x86/amd: Support PERF_SAMPLE_ADDR
  perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}
  perf/x86/amd: Support PERF_SAMPLE_DATA_SRC
  perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions
  perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO}
  perf/x86/uncore: Add new Raptor Lake S support
  perf/x86/cstate: Add new Raptor Lake S support
  perf/x86/msr: Add new Raptor Lake S support
  perf/x86: Add new Raptor Lake S support
  bpf: Check flags for branch stack in bpf_read_branch_records helper
  perf, hw_breakpoint: Fix use-after-free if perf_event_open() fails
  perf: Use sample_flags for raw_data
  perf: Use sample_flags for addr
  ...
2022-10-10 09:27:46 -07:00
Linus Torvalds 2e64066dab RISC-V Patches for the 6.1 Merge Window, Part 1
* Improvements to the CPU topology subsystem, which fix some issues
   where RISC-V would report bad topology information.
 * The default NR_CPUS has increased to XLEN, and the maximum
   configurable value is 512.
 * The CD-ROM filesystems have been enabled in the defconfig.
 * Support for THP_SWAP has been added for rv64 systems.
 
 There are also a handful of cleanups and fixes throughout the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNAWgwTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYicSiEACmuB9WuGZmAasKvmPgz7thyLqakg7/
 cE4YK0MxgJxkhsXzYSAv1Fn+WUfX7DSzhK4OOM5wEngAYul7QoFdc84MF0DYKO+E
 InjdOvVavzUsWYqETNCuMHPRK6xyzvfHCqqBDDxKHx5jUoicCQfFwJyHLw+cvouR
 7WSJoFdvOEV01QyN5Qw9bQp7ASx61ZZX1yE6OAPc2/EJlDEA2QSnjBAi4M+n2ZCx
 ZsQz+Dp9RfSU8/nIr13oGiL3Zm+kyXwdOS/8PaDqtrkyiGh6+vSeGqZZwRLVITP/
 oUxqGEgnn2eFBD1y8vjsQNWMLWoi9Av4746Fxr8CEHX+jX1cp9CCkU2OkkLxaFcv
 6XFtXPJIh/UjzVgPmjZxK+ArEX28QOM5IVyBFxsSl0dNtvyVqKpBXCV1RQ+fFHkO
 ntHF3ZxibqOn8ZJmziCn0nzWSOqugNTdAhD4dJAbl58RB/IQtQT0OnHpmpXCG3xh
 +/JBzy//xkr7u2HMqU69PzwPtWwZrENUV6jl5SHUDUoW8pySng2Pl4pbmTFqgWty
 JTfc5EdyWOWyshhoSCtK2//bnVFryl2ntwGr3LIZrZxkiUiOeYjn+C/YedXZIRob
 yy2CN+QanW/FXdIa4GMNeGc9sGDApd3/RtP+8L9mV1kWK6OE0EVskkI1UMCGXrIP
 5JoE1jLMVhjcKQ==
 =LJg6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Improvements to the CPU topology subsystem, which fix some issues
   where RISC-V would report bad topology information.

 - The default NR_CPUS has increased to XLEN, and the maximum
   configurable value is 512.

 - The CD-ROM filesystems have been enabled in the defconfig.

 - Support for THP_SWAP has been added for rv64 systems.

There are also a handful of cleanups and fixes throughout the tree.

* tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: enable THP_SWAP for RV64
  RISC-V: Print SSTC in canonical order
  riscv: compat: s/failed/unsupported if compat mode isn't supported
  RISC-V: Increase range and default value of NR_CPUS
  cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage
  perf: RISC-V: throttle perf events
  perf: RISC-V: exclude invalid pmu counters from SBI calls
  riscv: enable CD-ROM file systems in defconfig
  riscv: topology: fix default topology reporting
  arm64: topology: move store_cpu_topology() to shared code
2022-10-09 13:24:01 -07:00
Geert Uytterhoeven e08d07dd9f drivers/perf: ALIBABA_UNCORE_DRW_PMU should depend on ACPI
The Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver relies
solely on ACPI for matching.  Hence add a dependency on ACPI, to prevent
asking the user about this driver when configuring a kernel without ACPI
support.

Fixes: cf7b61073e ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/2a4407bb598285660fa5e604e56823ddb12bb0aa.1664285774.git.geert+renesas@glider.be
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-07 14:47:44 +01:00
Sun Ke ad0112f2d5 drivers/perf: fix return value check in ali_drw_pmu_probe()
In case of error, devm_ioremap_resource() returns ERR_PTR(),
and never returns NULL. The NULL test in the return value
check should be replaced with IS_ERR().

Fixes: cf7b61073e ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC")
Signed-off-by: Sun Ke <sunke32@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220924032127.313156-1-sunke32@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-07 14:47:38 +01:00
Linus Torvalds 18fd049731 arm64 updates for 6.1:
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.
 
 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation
   to match the actual kernel behaviour (zeroing the registers on syscall
   rather than "zeroed or preserved" previously).
 
 - More conversions to automatic system registers generation.
 
 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.
 
 - arm64 stacktrace cleanups and improvements.
 
 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.
 
 - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception
   handling, better EL1 undefs reporting.
 
 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.
 
 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.
 
 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).
 
 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.
 
 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include larger
   SVE and SME vector lengths in signal tests, various cleanups.
 
 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.
 
 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmM9W4cACgkQa9axLQDI
 XvEy3w/+LJ3KCFowWiz5gTAWikjv+UVssHjLMJixn47V7hsEFQ26Xnam/438rTMI
 kE95u6DHUpw2SMIxKzFRO7oI5cQtP+cWGwTtOUnjVO+U1oN+HqDOIbO9DbylWDcU
 eeeqMMmawMfTPuZrYklpOhXscsorbrKIvYBg7wHYOcwBYV3EPhWr89lwMvTVRuyJ
 qpX628KlkGMaBcONNhv3nS3qZcAOs0oHQCAVS4C8czLDL+vtJlumXUS3xr1Mqm72
 xtFe7sje8Djr2kZ8mzh0GbFiZEBoBD3F/l7ayq8gVRaVpToUt8sk36Stjs4LojF1
 6imuAfji/5TItkScq5KhGqj6MIugwp/eUVbRN74OLNTYx7msF1ZADNFQ+Q0UuY0H
 SYK13KvmOji0xjS8qAfhqrwNB79sk3fb+zF9LjETbdz4ZJCgg9gcFbSUTY0DvMfS
 MXZk/jVeB07olA8xYbjh0BRt4UV9xU628FPQzK5k7e4Nzl4jSvgtJZCZanfuVtjy
 /ZS1vbN8o7tQLBAlVnw+Exi/VedkKxkkMgm8tPKsMgERTFDx0Pc4Gs72hRpDnPWT
 MRbeCCGleAf3JQ5vF0coBDNOCEVvweQgShHOyHTz0GyhWXLCFx3RJICo5I4EIpps
 LLUk4JK0fO3LVrf1AEpu5ZP4+Sact0zfsH3gB7qyLPYFDmjDXD8=
 =jl3Z
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.

 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI
   documentation to match the actual kernel behaviour (zeroing the
   registers on syscall rather than "zeroed or preserved" previously).

 - More conversions to automatic system registers generation.

 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.

 - arm64 stacktrace cleanups and improvements.

 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.

 - Improve the reporting of EL1 exceptions: rework BTI and FPAC
   exception handling, better EL1 undefs reporting.

 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.

 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.

 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).

 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.

 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include
   larger SVE and SME vector lengths in signal tests, various cleanups.

 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.

 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
  arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
  arm64/kprobe: Optimize the performance of patching single-step slot
  arm64: defconfig: Add Coresight as module
  kselftest/arm64: Handle EINTR while reading data from children
  kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
  kselftest/arm64: Don't repeat termination handler for fp-stress
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: ftrace: fix module PLTs with mcount
  arm64: module: Remove unused plt_entry_is_initialized()
  arm64: module: Make plt_equals_entry() static
  arm64: fix the build with binutils 2.27
  kselftest/arm64: Don't enable v8.5 for MTE selftest builds
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
  kselftest/arm64: Fix typo in hwcap check
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64/sve: Add Perf extensions documentation
  ...
2022-10-06 11:51:49 -07:00
Linus Torvalds 9388076b4c ACPI updates for 6.1-rc1
- Reimplement acpi_get_pci_dev() using the list of physical devices
    associated with the given ACPI device object (Rafael Wysocki).
 
  - Rename ACPI device object reference counting functions (Rafael
    Wysocki).
 
  - Rearrange ACPI device object initialization code (Rafael Wysocki).
 
  - Drop parent field from struct acpi_device (Rafael Wysocki).
 
  - Extend the the int3472-tps68470 driver to support multiple consumers
    of a single TPS68470 along with the requisite framework-level
    support (Daniel Scally).
 
  - Filter out non-memory resources in is_memory(), add a helper
    function to find all memory type resources of an ACPI device object
    and use that function in 3 places (Heikki Krogerus).
 
  - Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
    model S5402ZA (Tamim Khan, Kellen Renshaw).
 
  - Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus).
 
  - Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
    Limonciello).
 
  - Clean up ACPI platform devices support code (Andy Shevchenko, John
    Garry).
 
  - Clean up ACPI bus management code (Andy Shevchenko, ye xingchen).
 
  - Add support for multiple DMA windows with different offsets to the
    ACPI device enumeration code and use it on LoongArch (Jianmin Lv).
 
  - Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko).
 
  - Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
    Limonciello).
 
  - Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
    parsing code (Liu Shixin).
 
  - Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
    invalid physical addresses (Hans de Goede).
 
  - Silence missing-declarations warning related to Apple device
    properties management (Lukas Wunner).
 
  - Disable frequency invariance in the CPPC library if registers used
    by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
 
  - Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
 
  - Fix Tx acknowledge in the PCC address space handler (Huisong Li).
 
  - Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
    Li).
 
  - Release resources on PCC address space setup failure path (Rafael
    Mendonca).
 
  - Remove unneeded result variables from APEI code (ye xingchen).
 
  - Print total number of records found during BERT log parsing (Dmitry
    Monakhov).
 
  - Drop support for 3 _OSI strings that should not be necessary any
    more and update documentation on custom _OSI strings so that adding
    new ones is not encouraged any more (Mario Limonciello).
 
  - Drop unneeded result variable from ec_write() (ye xingchen).
 
  - Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun
    Guo).
 
  - Reorder symbols to get rid of a few forward declarations in the ACPI
    fan driver (Uwe Kleine-König).
 
  - Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
    Norlander).
 
  - Add ARM DMA-330 controller to the supported list in the ACPI AMBA
    driver (Vijayenthiran Subramaniam).
 
  - Drop references to non-functional 01.org/linux-acpi web site from
    MAINTAINERS and Kconfig help texts (Rafael Wysocki).
 
  - Replace strlcpy() with unused retval with strscpy() in the ACPI
    support code (Wolfram Sang).
 
  - Do not initialize ret in main() in the pfrut utility (Shi junming).
 
  - Drop useless ACPI DSDT override documentation (Rafael Wysocki).
 
  - Fix a few typos and wording mistakes in the ACPI device enumeration
    documentation (Jean Delvare).
 
  - Introduce acpi_dev_uid_to_integer() to convert a _UID string into an
    integer value (Andy Shevchenko).
 
  - Use acpi_dev_uid_to_integer() in several places to unify _UID
    handling (Andy Shevchenko).
 
  - Drop unused pnpid32_to_pnpid() declaration from  PNP code (Gaosheng
    Cui).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OhkSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx/TkQALQ4TN451dPSj9jcYSNY6qZ/9b4P9Iym
 TmRf3wO3+IVZQ8JajeKKRuVKNsW3sC0RcFkJJVmgZkydJBr1Uui2L0ZLzi8axGNy
 RlbZm5NyBeFnlP0fA8Gb2iRMXVAUcRIx+RvZCulxxFmgQ8UhoU4wlVZWlEcko4TQ
 hGp++lJYcRHR1NbVLSXZhFvzopKLdhGL6vB1Awsjb/I7TVqn23+k4jVRV1DYkIQ7
 qgFM+Z7osRVZiVQbaPoOgdykeSa43qXu7Vgs7F/QeJuIiUYx59xDh0/WCJBxnuDM
 cHGiaNnvuJghKmCg43X8+joaHEH/jCFyvBVGfiSzRvjz03WOPRs1XztwdEiCi+py
 RcZGzrPaXmkCjNeytPRooiifyqm95HT7aMBN/aTvKBXDaGRrfPheXF+i2idl24HM
 NrHqMaa0+5qoDGHLUEaf5znlCHfS+3lwq6+lGVrq/UGf6B3cP+9HwOyevEW493JX
 4nuv69Y517moR9W3mBU8sAn5mUjshcka7pghRj7QnuoqRqWLbU3lIz8oUDHr84cI
 ixpIPvt2KlZ5UjnN9aqu/6k70JkJvy4SrKjnx4iqu03ePmMrRc0Hcpy7+VMlgumD
 tgN9aW+YDgy0/Z5QmO1MOvFodVmA5sX6+gnX1neAjuDdIo3LkJptlkO1fCx2jfQu
 cgPQk1CtPOos
 =xyUK
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "ACPI and PNP updates for 6.1-rc1.

  These rearrange the ACPI device object initialization code (to get rid
  of a redundant parent pointer from struct acpi_device among other
  things), unify the _UID handling, drop support for some _OSI strings
  that should not be necessary any more, add new IDs to support more
  hardware and some more quirks, fix a few issues and clean up code all
  over.

  Specifics:

   - Reimplement acpi_get_pci_dev() using the list of physical devices
     associated with the given ACPI device object (Rafael Wysocki)

   - Rename ACPI device object reference counting functions (Rafael
     Wysocki)

   - Rearrange ACPI device object initialization code (Rafael Wysocki)

   - Drop parent field from struct acpi_device (Rafael Wysocki)

   - Extend the the int3472-tps68470 driver to support multiple
     consumers of a single TPS68470 along with the requisite
     framework-level support (Daniel Scally)

   - Filter out non-memory resources in is_memory(), add a helper
     function to find all memory type resources of an ACPI device object
     and use that function in 3 places (Heikki Krogerus)

   - Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
     model S5402ZA (Tamim Khan, Kellen Renshaw)

   - Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus)

   - Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
     Limonciello)

   - Clean up ACPI platform devices support code (Andy Shevchenko, John
     Garry)

   - Clean up ACPI bus management code (Andy Shevchenko, ye xingchen)

   - Add support for multiple DMA windows with different offsets to the
     ACPI device enumeration code and use it on LoongArch (Jianmin Lv)

   - Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko)

   - Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
     Limonciello)

   - Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
     parsing code (Liu Shixin)

   - Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
     invalid physical addresses (Hans de Goede)

   - Silence missing-declarations warning related to Apple device
     properties management (Lukas Wunner)

   - Disable frequency invariance in the CPPC library if registers used
     by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton)

   - Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan)

   - Fix Tx acknowledge in the PCC address space handler (Huisong Li)

   - Use wait_for_completion_timeout() for PCC mailbox operations
     (Huisong Li)

   - Release resources on PCC address space setup failure path (Rafael
     Mendonca)

   - Remove unneeded result variables from APEI code (ye xingchen)

   - Print total number of records found during BERT log parsing (Dmitry
     Monakhov)

   - Drop support for 3 _OSI strings that should not be necessary any
     more and update documentation on custom _OSI strings so that adding
     new ones is not encouraged any more (Mario Limonciello)

   - Drop unneeded result variable from ec_write() (ye xingchen)

   - Remove the leftover struct acpi_ac_bl from the ACPI AC driver
     (Hanjun Guo)

   - Reorder symbols to get rid of a few forward declarations in the
     ACPI fan driver (Uwe Kleine-König)

   - Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
     Norlander)

   - Add ARM DMA-330 controller to the supported list in the ACPI AMBA
     driver (Vijayenthiran Subramaniam)

   - Drop references to non-functional 01.org/linux-acpi web site from
     MAINTAINERS and Kconfig help texts (Rafael Wysocki)

   - Replace strlcpy() with unused retval with strscpy() in the ACPI
     support code (Wolfram Sang)

   - Do not initialize ret in main() in the pfrut utility (Shi junming)

   - Drop useless ACPI DSDT override documentation (Rafael Wysocki)

   - Fix a few typos and wording mistakes in the ACPI device enumeration
     documentation (Jean Delvare)

   - Introduce acpi_dev_uid_to_integer() to convert a _UID string into
     an integer value (Andy Shevchenko)

   - Use acpi_dev_uid_to_integer() in several places to unify _UID
     handling (Andy Shevchenko)

   - Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
     Cui)"

* tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits)
  ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device()
  ACPI: LPSS: Replace loop with first entry retrieval
  ACPI: x86: s2idle: Add another ID to s2idle_dmi_table
  ACPI: x86: s2idle: Fix a NULL pointer dereference
  MAINTAINERS: Drop records pointing to 01.org/linux-acpi
  ACPI: Kconfig: Drop link to https://01.org/linux-acpi
  ACPI: docs: Drop useless DSDT override documentation
  ACPI: DPTF: Drop stale link from Kconfig help
  ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13
  ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7
  ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14
  ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE
  ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID
  ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt
  ACPI: x86: s2idle: Move _HID handling for AMD systems into structures
  platform/x86: int3472: Add board data for Surface Go2 IR camera
  platform/x86: int3472: Support multiple gpio lookups in board data
  platform/x86: int3472: Support multiple clock consumers
  ACPI: bus: Add iterator for dependent devices
  ACPI: scan: Add acpi_dev_get_next_consumer_dev()
  ...
2022-10-03 13:19:53 -07:00
Rafael J. Wysocki 4aa497ca10 Merge branch 'acpi-uid'
Merge ACPI _UID handling unification changes for 6.1-rc1:

 - Introduce acpi_dev_uid_to_integer() to convert a _UID string into an
   integer value (Andy Shevchenko).

 - Use acpi_dev_uid_to_integer() in several places to unify _UID
   handling (Andy Shevchenko).

* acpi-uid:
  efi/dev-path-parser: Refactor _UID handling to use acpi_dev_uid_to_integer()
  spi: pxa2xx: Refactor _UID handling to use acpi_dev_uid_to_integer()
  perf: qcom_l2_pmu: Refactor _UID handling to use acpi_dev_uid_to_integer()
  i2c: mlxbf: Refactor _UID handling to use acpi_dev_uid_to_integer()
  i2c: amd-mp2-plat: Refactor _UID handling to use acpi_dev_uid_to_integer()
  ACPI: x86: Refactor _UID handling to use acpi_dev_uid_to_integer()
  ACPI: LPSS: Refactor _UID handling to use acpi_dev_uid_to_integer()
  ACPI: utils: Add acpi_dev_uid_to_integer() helper to get _UID as integer
2022-10-03 20:09:22 +02:00
Rafael J. Wysocki 80487a37de Merge branch 'acpi-dev'
Merge changes regarding the management of ACPI device objects for
6.1-rc1:

 - Rename ACPI device object reference counting functions (Rafael
   Wysocki).

 - Rearrange ACPI device object initialization code (Rafael Wysocki).

 - Drop parent field from struct acpi_device (Rafael Wysocki).

 - Extend the the int3472-tps68470 driver to support multiple consumers
   of a single TPS68470 along with the requisite framework-level
   support (Daniel Scally).

* acpi-dev:
  platform/x86: int3472: Add board data for Surface Go2 IR camera
  platform/x86: int3472: Support multiple gpio lookups in board data
  platform/x86: int3472: Support multiple clock consumers
  ACPI: bus: Add iterator for dependent devices
  ACPI: scan: Add acpi_dev_get_next_consumer_dev()
  ACPI: property: Use acpi_dev_parent()
  ACPI: Drop redundant acpi_dev_parent() header
  ACPI: PM: Fix NULL argument handling in acpi_device_get/set_power()
  ACPI: Drop parent field from struct acpi_device
  ACPI: scan: Eliminate __acpi_device_add()
  ACPI: scan: Rearrange initialization of ACPI device objects
  ACPI: scan: Rename acpi_bus_get_parent() and rearrange it
  ACPI: Rename acpi_bus_get/put_acpi_device()
2022-09-30 20:05:16 +02:00
Catalin Marinas b23ec74cbd Merge branches 'for-next/doc', 'for-next/sve', 'for-next/sysreg', 'for-next/gettimeofday', 'for-next/stacktrace', 'for-next/atomics', 'for-next/el1-exceptions', 'for-next/a510-erratum-2658417', 'for-next/defconfig', 'for-next/tpidr2_el0' and 'for-next/ftrace', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
  arm64/sve: Add Perf extensions documentation
  perf: arm64: Add SVE vector granule register to user regs
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  docs: perf: Add description for Alibaba's T-Head PMU driver

* for-next/doc:
  : Documentation/arm64 updates
  arm64/sve: Document our actual ABI for clearing registers on syscall

* for-next/sve:
  : SVE updates
  arm64/sysreg: Add hwcap for SVE EBF16

* for-next/sysreg: (35 commits)
  : arm64 system registers generation (more conversions)
  arm64/sysreg: Fix a few missed conversions
  arm64/sysreg: Convert ID_AA64AFRn_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64DFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64FDR0_EL1 to automatic generation
  arm64/sysreg: Use feature numbering for PMU and SPE revisions
  arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names
  arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture
  arm64/sysreg: Add defintion for ALLINT
  arm64/sysreg: Convert SCXTNUM_EL1 to automatic generation
  arm64/sysreg: Convert TIPDR_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64PFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64PFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64MMFR2_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64MMFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64MMFR0_EL1 to automatic generation
  arm64/sysreg: Convert HCRX_EL2 to automatic generation
  arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 SME enumeration
  arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 BTI enumeration
  arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 fractional version fields
  arm64/sysreg: Standardise naming for MTE feature enumeration
  ...

* for-next/gettimeofday:
  : Use self-synchronising counter access in gettimeofday() (if FEAT_ECV)
  arm64: vdso: use SYS_CNTVCTSS_EL0 for gettimeofday
  arm64: alternative: patch alternatives in the vDSO
  arm64: module: move find_section to header

* for-next/stacktrace:
  : arm64 stacktrace cleanups and improvements
  arm64: stacktrace: track hyp stacks in unwinder's address space
  arm64: stacktrace: track all stack boundaries explicitly
  arm64: stacktrace: remove stack type from fp translator
  arm64: stacktrace: rework stack boundary discovery
  arm64: stacktrace: add stackinfo_on_stack() helper
  arm64: stacktrace: move SDEI stack helpers to stacktrace code
  arm64: stacktrace: rename unwind_next_common() -> unwind_next_frame_record()
  arm64: stacktrace: simplify unwind_next_common()
  arm64: stacktrace: fix kerneldoc comments

* for-next/atomics:
  : arm64 atomics improvements
  arm64: atomic: always inline the assembly
  arm64: atomics: remove LL/SC trampolines

* for-next/el1-exceptions:
  : Improve the reporting of EL1 exceptions
  arm64: rework BTI exception handling
  arm64: rework FPAC exception handling
  arm64: consistently pass ESR_ELx to die()
  arm64: die(): pass 'err' as long
  arm64: report EL1 UNDEFs better

* for-next/a510-erratum-2658417:
  : Cortex-A510: 2658417: remove BF16 support due to incorrect result
  arm64: errata: remove BF16 HWCAP due to incorrect result on Cortex-A510
  arm64: cpufeature: Expose get_arm64_ftr_reg() outside cpufeature.c
  arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space

* for-next/defconfig:
  : arm64 defconfig updates
  arm64: defconfig: Add Coresight as module
  arm64: Enable docker support in defconfig
  arm64: defconfig: Enable memory hotplug and hotremove config
  arm64: configs: Enable all PMUs provided by Arm

* for-next/tpidr2_el0:
  : arm64 ptrace() support for TPIDR2_EL0
  kselftest/arm64: Add coverage of TPIDR2_EL0 ptrace interface
  arm64/ptrace: Support access to TPIDR2_EL0
  arm64/ptrace: Document extension of NT_ARM_TLS to cover TPIDR2_EL0
  kselftest/arm64: Add test coverage for NT_ARM_TLS

* for-next/ftrace:
  : arm64 ftraces updates/fixes
  arm64: ftrace: fix module PLTs with mcount
  arm64: module: Remove unused plt_entry_is_initialized()
  arm64: module: Make plt_equals_entry() static
2022-09-30 09:17:57 +01:00
Peter Zijlstra a1ebcd5943 Linux 6.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMwwY4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGdlwH/0ESzdb6F9zYWwHR
 E08har56/IfwjOsn1y+JuHibpwUjzskLzdwIfI5zshSZAQTj5/UyC0P7G/wcYh/Z
 INh1uHGazmDUkx4O3lwuWLR+mmeUxZRWdq4NTwYDRNPMSiPInVxz+cZJ7y0aPr2e
 wii7kMFRHgXmX5DMDEwuHzehsJF7vZrp8zBu2DqzVUGnbwD50nPbyMM3H4g9mute
 fAEpDG0X3+smqMaKL+2rK0W/Av/87r3U8ZAztBem3nsCJ9jT7hqMO1ICcKmFMviA
 DTERRMwWjPq+mBPE2CiuhdaXvNZBW85Ds81mSddS6MsO6+Tvuzfzik/zSLQJxlBi
 vIqYphY=
 =NqG+
 -----END PGP SIGNATURE-----

Merge branch 'v6.0-rc7'

Merge upstream to get RAPTORLAKE_S

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-09-29 12:20:50 +02:00
Linus Torvalds a63f2e7cb1 arm64 fixes for -rc7
- Fix false positive "sleeping while atomic" warning resulting from
   the kPTI rework taking a mutex too early.
 
 - Fix possible overflow in AMU frequency calculation
 
 - Fix incorrect shift in CMN PMU driver which causes problems with
   newer versions of the IP
 
 - Reduce alignment of the CFI jump table to avoid huge kernel images
   and link errors with !4KiB page size configurations
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmMtrWwQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNJ/QCACnRX9+S83mixt+EEbqDMkCDqlKqpwYAP0a
 Fq7Yb4/iOqBCHY0n9of5SalpLc/ExAWDJiXoA0Y5g1E2hZMJnSvMx8aC6r92Ofdx
 uwx9PlWP6GhB7s1+kCNEHcSGHLDv4HT0nu/xbFHl64R+JwONeiB7tH3Mf9vXE05g
 As07ij7aLckMx19+SnezPawD55A6xKJ1KtoAF9NjWOuj79jJ7uTm9wCjAM29GPMT
 wC8axTaeqgHcI6fLqNpQ5cSQNHwN51f0PnPmj/fAeaJ8riAEubP1+ys4NNczrO0H
 uHQOnD0kArtjro/dNYZZZGa1u9BS+qYENgSwAl09cZ3r6orRrda4
 =EiMW
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "These are all very simple and self-contained, although the CFI
  jump-table fix touches the generic linker script as that's where the
  problematic macro lives.

   - Fix false positive "sleeping while atomic" warning resulting from
     the kPTI rework taking a mutex too early.

   - Fix possible overflow in AMU frequency calculation

   - Fix incorrect shift in CMN PMU driver which causes problems with
     newer versions of the IP

   - Reduce alignment of the CFI jump table to avoid huge kernel images
     and link errors with !4KiB page size configurations"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  vmlinux.lds.h: CFI: Reduce alignment of jump-table to function alignment
  perf/arm-cmn: Add more bits to child node address offset field
  arm64: topology: fix possible overflow in amu_fie_setup()
  arm64: mm: don't acquire mutex when rewriting swapper
2022-09-23 15:28:51 -07:00
James Clark cbb0c02caf perf: arm64: Add SVE vector granule register to user regs
Dwarf based unwinding in a function that pushes SVE registers onto
the stack requires the unwinder to know the length of the SVE register
to calculate the stack offsets correctly. This was added to the Arm
specific Dwarf spec as the VG pseudo register[1].

Add the vector length at position 46 if it's requested by userspace and
SVE is supported. If it's not supported then fail to open the event.

The vector length must be on each sample because it can be changed
at runtime via a prctl or ptrace call. Also by adding it as a register
rather than a separate attribute, minimal changes will be required in an
unwinder that already indexes into the register list.

[1]: https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20220901132658.1024635-2-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22 15:06:02 +01:00
Ilkka Koskinen 05d6f6d346 perf/arm-cmn: Add more bits to child node address offset field
CMN-600 uses bits [27:0] for child node address offset while bits [30:28]
are required to be zero.

For CMN-650, the child node address offset field has been increased
to include bits [29:0] while leaving only bit 30 set to zero.

Let's include the missing two bits and assume older implementations
comply with the spec and set bits [29:28] to 0.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Fixes: 60d1504070 ("perf/arm-cmn: Support new IP features")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220808195455.79277-1-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22 14:30:00 +01:00
Shuai Xue cf7b61073e drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver
support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4
DRAM and targets cloud computing and HPC.

Each PMU is registered as a device in /sys/bus/event_source/devices, and
users can select event to monitor in each sub-channel, independently. For
example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
sub-channels of the same channel in die 0. And the PMU device of die 1 is
prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.

Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt
shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and
MPAM drivers successfully, add IRQF_SHARED flag.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com>
Co-developed-by: Neng Chen <nengchen@linux.alibaba.com>
Signed-off-by: Neng Chen <nengchen@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220818031822.38415-3-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22 14:09:10 +01:00
Nathan Chancellor db74cd6337 arm64/sysreg: Fix a few missed conversions
After the conversion to automatically generating the ID_AA64DFR0_EL1
definition names, the build fails in a few different places because some
of the definitions were not changed to their new names along the way.
Update the names to resolve the build errors.

Fixes: c0357a73fa ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220919160928.3905780-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-21 09:24:29 +01:00
Andy Shevchenko 9cde62517f perf: qcom_l2_pmu: Refactor _UID handling to use acpi_dev_uid_to_integer()
ACPI utils provide acpi_dev_uid_to_integer() helper to extract _UID as
an integer. Use it instead of custom approach.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-19 18:34:42 +02:00
Linus Torvalds 22b2e2d6ab RISC-V Fixes for 6.0-rc5
* A pair of device tree fixes for the Polarfire SOC.
 * A fix to avoid overflowing the PMU counter array when firmware
   incorrectly reports the number of supported counters, which manifests
   on OpenSBI versions prior to 1.1.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmMbRnQTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiU1YEADJTiBZwv7C4yrDIBtVolTuNhO1IT5U
 Y9h4iWeJcg08oHoBJCCy731zd56/Kaiqoje99ou/G+wzs4Idze9dgElrFQY2bOE+
 JPCnJ6Vcc/F7jjrZfJziAyP3QxdSIVebljKynaRyzkCxCNbL7QI26PrdmqpaXPQv
 orpk4kZFdAzEK9SBnYL2L5cPn3387N9l+F7RcunmAM61L+YkPXKSftK5JDK9wv/6
 W7RjxODbNni7dRIZsGTjFUEnR60UKsAiCvzA3IszWIRJcPTkPULhvpO8M67Z87qa
 DLq1zx0dn+ZLBb2prn/9UHHpqJFgoi7nuvoaGtqnMuCUIADkuRGW/y7ZNaSb+KIK
 j0c4hDrQ/if7UhXVnWc8EHmsOhaLKBXh9f4NGL9CjSDOw8K3RvC1GvKr+g8jMah+
 5Gi8KfbMLK5z4IK2jzJZe7tYY5YcgieG/M7HCnZo7O3A9ao84DXSjSUXgU4sfYI5
 QeYO5A51xOA7ZlsKrK0hxOfGcZf1uwXA/ZyPTPloSRmVL42Ji1V1rgSGO6I4A493
 SUZEQ3nrEmmOemf8pE5t65IIRNQ3/sPWLZRqoBgtRn8pUJ0TqTxX+VPFAEAuGFqB
 g5LDr77+ek2kQSw1PU9wZmi/3WUpB0WhVCDJQgdc8RgZP9vOdUc9oh5gq+T2he+K
 17NsZsP+LrmMUQ==
 =86+k
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A pair of device tree fixes for the Polarfire SOC

 - A fix to avoid overflowing the PMU counter array when firmware
   incorrectly reports the number of supported counters, which manifests
   on OpenSBI versions prior to 1.1

* tag 'riscv-for-linus-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  perf: RISC-V: fix access beyond allocated array
  riscv: dts: microchip: use an mpfs specific l2 compatible
  dt-bindings: riscv: sifive-l2: add a PolarFire SoC compatible
2022-09-09 14:06:10 -04:00
Sergey Matyukevich 20e0fbab16
perf: RISC-V: fix access beyond allocated array
SBI firmware should report total number of firmware and hardware counters
including unused ones or special ones. In this case the kernel doesn't need
to make any assumptions about gaps in reported counters, e.g. excluded timer
counter. That was fixed in OpenSBI v1.1 by commit 3f66465fb6bf ("lib: pmu:
allow to use the highest available counter"). This kernel patch has no effect
if SBI firmware behaves correctly. However it eliminates access beyond the
allocated pmu_ctr_list if the kernel is used with OpenSBI older than v1.1.

Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")
Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220830155306.301714-2-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-08 13:50:25 -07:00
Sergey Matyukevich 096b52fd2b
perf: RISC-V: throttle perf events
Call perf_sample_event_took() to report time spent in overflow
interrupts. Perf core uses these measurements to throttle
perf events properly.

Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20220830155306.301714-4-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-08 13:34:58 -07:00
Sergey Matyukevich 1537bf26e2
perf: RISC-V: exclude invalid pmu counters from SBI calls
SBI firmware may not provide information for some counters in response
to SBI_EXT_PMU_COUNTER_GET_INFO call. Exclude such counters from the
subsequent SBI requests. For this purpose use global mask to keep track
of fully specified counters.

Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20220830155306.301714-3-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-08 13:34:50 -07:00
Anshuman Khandual 91207f6261 arm64/perf: Assert all platform event flags are within PERF_EVENT_FLAG_ARCH
Ensure all platform specific event flags are within PERF_EVENT_FLAG_ARCH.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: James Clark <james.clark@arm.com>
Link: https://lkml.kernel.org/r/20220907091924.439193-4-anshuman.khandual@arm.com
2022-09-07 21:54:01 +02:00
Linus Torvalds cf3488fa25 arm64 fixes for -rc4
- Fix two boot issues caused by the recent head.S rework when !KASLR
 
 - Fix calculation of crashkernel memory reservation
 
 - Fix bogus error check in PMU IRQ probing code
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmMRwE8QHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNP6lB/4m0NNn4PQJ5FmTBuNurwB+CvZyC/pl/exL
 reXG0hwI+ApX1xj6aD7tZ3tB+mi6EG0+3PBLuJ4KuSNecwwT/au5ZxAipsc+IRs0
 EI4loeItzTaevaVTyydsTMgXYAJlCHlMDxEI0QIW3/datH10bZ1GzhEDL/Ebflfu
 Mo4g0/+eysGmSHtj7kL5IU4bE+CCXnXycphgCLAn0qcdxuF/iZphu/GgB4Az6gSS
 9ks3HJkO+X1UVb8qK0prVDsalX2z5tTWbxQ7XACJ4KEVEiSqh7ei3B283cF+Zeip
 3PSg6gkPC5YZmLjv431B/pAycsx/x8KeXY0yDAadTRmOQYROMm0j
 =a7xW
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "It's a lot smaller than last week, with the star of the show being a
  couple of fixes to head.S addressing a boot regression introduced by
  the recent overhaul of that code in non-default configurations (i.e.
  KASLR disabled).

  The first of those two resolves the issue reported (and bisected) by
  Mikulus in the wait_on_bit() thread.

  Summary:

   - Fix two boot issues caused by the recent head.S rework when !KASLR

   - Fix calculation of crashkernel memory reservation

   - Fix bogus error check in PMU IRQ probing code"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: Reserve enough pages for the initial ID map
  perf/arm_pmu_platform: fix tests for platform_get_irq() failure
  arm64: head: Ignore bogus KASLR displacement on non-relocatable kernels
  arm64/kexec: Fix missing extra range for crashkres_low.
2022-09-02 10:32:30 -07:00
Yu Zhe 6bb0d64c10 perf/arm_pmu_platform: fix tests for platform_get_irq() failure
The platform_get_irq() returns negative error codes.  It can't actually
return zero.

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Link: https://lore.kernel.org/r/20220825011844.8536-1-yuzhe@nfschina.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-01 12:01:40 +01:00
Rafael J. Wysocki 62fcb99bdf ACPI: Drop parent field from struct acpi_device
The parent field in struct acpi_device is, in fact, redundant,
because the dev.parent field in it effectively points to the same
object and it is used by the driver core.

Accordingly, the parent field can be dropped from struct acpi_device
and for this purpose define acpi_dev_parent() to retrieve a parent
struct acpi_device pointer from the dev.parent field in struct
acpi_device.  Next, update all of the users of the parent field
in struct acpi_device to use acpi_dev_parent() instead of it and
drop it.

While at it, drop the ACPI_IS_ROOT_DEVICE() macro that is only used
in one place in a confusing way.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Punit Agrawal <punit.agrawal@bytedance.com>
2022-08-24 20:55:24 +02:00
Conor Dooley 96264230a6
perf: riscv legacy: fix kerneldoc comment warning
Fix the warning:
drivers/perf/riscv_pmu_legacy.c:76: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

Fixes: 9b3e150e31 ("RISC-V: Add a simple platform driver for RISC-V legacy perf")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220812143532.1962623-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-18 14:19:26 -07:00
Palmer Dabbelt 9801002f76
perf: riscv_pmu{,_sbi}: Miscallenous improvement & fixes
A series of mostly-independent fixes and cleanups for the RISC-V PMU
drivers.

Link: https://lore.kernel.org/lkml/CAAhSdy23vE8+HxU5Jxy2rBMjy3rBTrJt_4sriuROac_sEESSVw@mail.gmail.com/T/#m9de15aef1b65ae6155fa33ea1239578ef463c2a2

* palmer/riscv-pmu:
  RISC-V: Improve SBI definitions
  RISC-V: Move counter info definition to sbi header file
  RISC-V: Fix SBI PMU calls for RV32
  RISC-V: Update user page mapping only once during start
  RISC-V: Fix counter restart during overflow for RV32
2022-08-12 07:17:38 -07:00
Atish Patra 63ba67ebdf
RISC-V: Move counter info definition to sbi header file
Counter info encoding format is defined by the SBI specificaiton.
KVM implementation of SBI PMU extension will also leverage this definition.
Move the definition to common sbi header file from the sbi pmu driver.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20220711174632.4186047-5-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11 14:58:22 -07:00
Atish Patra 0209b5830b
RISC-V: Fix SBI PMU calls for RV32
Some of the SBI PMU calls does not pass 64bit arguments
correctly and not under RV32 compile time flags. Currently,
this doesn't create any incorrect results as RV64 ignores
any value in the additional register and qemu doesn't support
raw events.

Fix those SBI calls in order to set correct values for RV32.

Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220711174632.4186047-4-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11 14:58:18 -07:00
Atish Patra 133a6d1fe7
RISC-V: Update user page mapping only once during start
Currently, riscv_pmu_event_set_period updates the userpage mapping.
However, the caller of riscv_pmu_event_set_period should update
the userpage mapping because the counter can not be updated/started
from set_period function in counter overflow path.

Invoke the perf_event_update_userpage at the caller so that it
doesn't get invoked twice during counter start path.

Fixes: f5bfa23f57 ("RISC-V: Add a perf core library for pmu drivers")
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220711174632.4186047-3-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11 14:58:13 -07:00
Atish Patra acc1b919f4
RISC-V: Fix counter restart during overflow for RV32
Pass the upper half of the initial value of the counter correctly
for RV32.

Fixes: 4905ec2fb7 ("RISC-V: Add sscofpmf extension support")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220711174632.4186047-2-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11 14:58:07 -07:00
Anshuman Khandual 92f2b8bafa drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
The arm_spe_pmu driver will enable SYS_PMSCR_EL1.CX in order to add CONTEXT
packets into the traces, if the owner of the perf event runs with required
capabilities i.e CAP_PERFMON or CAP_SYS_ADMIN via perfmon_capable() helper.

The value of this bit is computed in the arm_spe_event_to_pmscr() function
but the check for capabilities happens in the pmu event init callback i.e
arm_spe_pmu_event_init(). This suggests that the value of the CX bit should
remain consistent for the duration of the perf session.

However, the function arm_spe_event_to_pmscr() may be called later during
the event start callback i.e arm_spe_pmu_start() when the "current" process
is not the owner of the perf session, hence the CX bit setting is currently
not consistent.

One way to fix this, is by caching the required value of the CX bit during
the initialization of the PMU event, so that it remains consistent for the
duration of the session. It uses currently unused 'event->hw.flags' element
to cache perfmon_capable() value, which can be referred during event start
callback to compute SYS_PMSCR_EL1.CX. This ensures consistent availability
of context packets in the trace as per event owner capabilities.

Drop BIT(SYS_PMSCR_EL1_CX_SHIFT) check in arm_spe_pmu_event_init(), because
now CX bit cannot be set in arm_spe_event_to_pmscr() with perfmon_capable()
disabled.

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Fixes: d5d9696b03 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Reported-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220714061302.2715102-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-19 18:50:09 +01:00
Liang He 491f10d08f perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
In pmu_sbi_setup_irqs(), we should call of_node_put() for the 'cpu'
when breaking out of for_each_of_cput_node() as its refcount will
be automatically increased and decreased during the iteration.

Fixes: 4905ec2fb7 ("RISC-V: Add sscofpmf extension support")
Signed-off-by: Liang He <windhl@126.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20220715130330.443363-1-windhl@126.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-19 18:40:31 +01:00
Guangbin Huang 66637ab137 drivers/perf: hisi: add driver for HNS3 PMU
HNS3(HiSilicon Network System 3) PMU is RCiEP device in HiSilicon SoC NIC,
supports collection of performance statistics such as bandwidth, latency,
packet rate and interrupt rate.

NIC of each SICL has one PMU device for it. Driver registers each PMU
device to perf, and exports information of supported events, filter mode of
each event, bdf range, hardware clock frequency, identifier and so on via
sysfs.

Each PMU device has its own registers of control, counters and interrupt,
and it supports 8 hardware events, each hardward event has its own
registers for configuration, counters and interrupt.

Filter options contains:
config       - select event
port         - select physical port of nic
tc           - select tc(must be used with port)
func         - select PF/VF
queue        - select queue of PF/VF(must be used with func)
intr         - select interrupt number(must be used with func)
global       - select all functions of IO DIE

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220628063419.38514-3-huangguangbin2@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-06 11:25:53 +01:00
Nikita Shubin 26fabd6d2f drivers/perf: riscv_pmu_sbi: perf format
Update driver to export formatting and event information to sysfs so it
can be used by the perf user space tools with the syntaxes:

perf stat -e cpu/event=0x05
perf stat -e cpu/event=0x05,firmware=0x1/

63-bit is used to distinguish hardware events from firmware. Firmware
events are defined by "RISC-V Supervisor Binary Interface
Specification".

perf stat -e cpu/event=0x05,firmware=0x1/

is equivalent to

perf stat -e r8000000000000005

Suggested-by: João Mário Domingos <joao.mario@tecnico.ulisboa.pt>
Signed-off-by: Nikita Shubin <n.shubin@yadro.com>
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/riscv-sbi.adoc
Link: https://lore.kernel.org/r/20220628114625.166665-2-nikita.shubin@maquefel.me
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-06 11:06:24 +01:00
Christophe JAILLET 0e35850b34 perf/arm-cci: Use the bitmap API to allocate bitmaps
Use devm_bitmap_zalloc() instead of hand-writing it.
It is less verbose and it improves the semantic.

While at it, use bitmap_zero() instead of hand-writing it.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/fbde85a5e8ae99b10a2115d8ea1e69320a62947f.1657084786.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-06 11:02:58 +01:00
Eric Lin e9a023f2b7 drivers/perf: riscv_pmu: Add riscv pmu pm notifier
Currently, when the CPU is doing suspend to ram, we don't
save pmu counter register and its content will be lost.

To ensure perf profiling is not affected by suspend to ram,
this patch is based on arm_pmu CPU_PM notifier and implements riscv
pmu pm notifier. In the pm notifier, we stop the counter and update
the counter value before suspend and start the counter after resume.

Signed-off-by: Eric Lin <eric.lin@sifive.com>
Link: https://lore.kernel.org/r/20220705091920.27432-1-eric.lin@sifive.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-06 10:57:30 +01:00
Chen Jun e500405dd1 perf: hisi: Extract hisi_pmu_init
Extract the initialization code of hisi_pmu->pmu into a function

Signed-off-by: Chen Jun <chenjun102@huawei.com>
Link: https://lore.kernel.org/r/20220516131601.48383-1-chenjun102@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 11:14:54 +01:00
Tanmay Jagdale f5ebeb138f perf/marvell_cn10k: Fix TAD PMU register offset
The existing offset of TAD_PRF and TAD_PFC registers are incorrect.
Hence, fix with the right register offsets.

Also, drop read of TAD_PRF register in tad_pmu_event_counter_start()
since we don't have to preserve any bit fields and always write
an updated value.

Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
Link: https://lore.kernel.org/r/20220614171356.773967-1-tanmay@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 13:21:38 +01:00
Christophe JAILLET 8e28e53f13 perf/marvell_cn10k: Remove useless license text when SPDX-License-Identifier is already used
An SPDX-License-Identifier is already in place. There is no need to
duplicate part of the corresponding license.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/4a8016a6da9cc6815cfa0f97ae8d3dd862797bda.1654936653.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 13:19:44 +01:00
Julia Lawall 9ba86a4746 perf/arm-cci: fix typo in comment
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220521111145.81697-67-Julia.Lawall@inria.fr
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 15:44:59 +01:00
keliu a336916b06 drivers/perf:Directly use ida_alloc()/free()
Use ida_alloc()/ida_free() instead of deprecated
ida_simple_get()/ida_simple_remove() .

Signed-off-by: keliu <liuke94@huawei.com>
Link: https://lore.kernel.org/r/20220519080127.147030-2-liuke94@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 15:44:45 +01:00
keliu 49785a7778 drivers/perf: Directly use ida_alloc()/free()
Use ida_alloc()/ida_free() instead of deprecated
ida_simple_get()/ida_simple_remove() .

Signed-off-by: keliu <liuke94@huawei.com>
Link: https://lore.kernel.org/r/20220519080127.147030-1-liuke94@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 15:44:45 +01:00
Linus Torvalds 143a6252e1 arm64 updates for 5.19:
- Initial support for the ARMv9 Scalable Matrix Extension (SME). SME
   takes the approach used for vectors in SVE and extends this to provide
   architectural support for matrix operations. No KVM support yet, SME
   is disabled in guests.
 
 - Support for crashkernel reservations above ZONE_DMA via the
   'crashkernel=X,high' command line option.
 
 - btrfs search_ioctl() fix for live-lock with sub-page faults.
 
 - arm64 perf updates: support for the Hisilicon "CPA" PMU for monitoring
   coherent I/O traffic, support for Arm's CMN-650 and CMN-700
   interconnect PMUs, minor driver fixes, kerneldoc cleanup.
 
 - Kselftest updates for SME, BTI, MTE.
 
 - Automatic generation of the system register macros from a 'sysreg'
   file describing the register bitfields.
 
 - Update the type of the function argument holding the ESR_ELx register
   value to unsigned long to match the architecture register size
   (originally 32-bit but extended since ARMv8.0).
 
 - stacktrace cleanups.
 
 - ftrace cleanups.
 
 - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
   avoid executable mappings in kexec/hibernate code, drop TLB flushing
   from get_clear_flush() (and rename it to get_clear_contig()),
   ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmKH19IACgkQa9axLQDI
 XvEFWg//bf0p6zjeNaOJmBbyVFsXsVyYiEaLUpFPUs3oB+81s2YZ+9i1rgMrNCft
 EIDQ9+/HgScKxJxnzWf68heMdcBDbk76VJtLALExbge6owFsjByQDyfb/b3v/bLd
 ezAcGzc6G5/FlI1IP7ct4Z9MnQry4v5AG8lMNAHjnf6GlBS/tYNAqpmj8HpQfgRQ
 ZbhfZ8Ayu3TRSLWL39NHVevpmxQm/bGcpP3Q9TtjUqg0r1FQ5sK/LCqOksueIAzT
 UOgUVYWSFwTpLEqbYitVqgERQp9LiLoK5RmNYCIEydfGM7+qmgoxofSq5e2hQtH2
 SZM1XilzsZctRbBbhMit1qDBqMlr/XAy/R5FO0GauETVKTaBhgtj6mZGyeC9nU/+
 RGDljaArbrOzRwMtSuXF+Fp6uVo5spyRn1m8UT/k19lUTdrV9z6EX5Fzuc4Mnhed
 oz4iokbl/n8pDObXKauQspPA46QpxUYhrAs10B/ELc3yyp/Qj3jOfzYHKDNFCUOq
 HC9mU+YiO9g2TbYgCrrFM6Dah2E8fU6/cR0ZPMeMgWK4tKa+6JMEINYEwak9e7M+
 8lZnvu3ntxiJLN+PrPkiPyG+XBh2sux1UfvNQ+nw4Oi9xaydeX7PCbQVWmzTFmHD
 q7UPQ8220e2JNCha9pULS8cxDLxiSksce06DQrGXwnHc1Ir7T04=
 =0DjE
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Initial support for the ARMv9 Scalable Matrix Extension (SME).

   SME takes the approach used for vectors in SVE and extends this to
   provide architectural support for matrix operations. No KVM support
   yet, SME is disabled in guests.

 - Support for crashkernel reservations above ZONE_DMA via the
   'crashkernel=X,high' command line option.

 - btrfs search_ioctl() fix for live-lock with sub-page faults.

 - arm64 perf updates: support for the Hisilicon "CPA" PMU for
   monitoring coherent I/O traffic, support for Arm's CMN-650 and
   CMN-700 interconnect PMUs, minor driver fixes, kerneldoc cleanup.

 - Kselftest updates for SME, BTI, MTE.

 - Automatic generation of the system register macros from a 'sysreg'
   file describing the register bitfields.

 - Update the type of the function argument holding the ESR_ELx register
   value to unsigned long to match the architecture register size
   (originally 32-bit but extended since ARMv8.0).

 - stacktrace cleanups.

 - ftrace cleanups.

 - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
   avoid executable mappings in kexec/hibernate code, drop TLB flushing
   from get_clear_flush() (and rename it to get_clear_contig()),
   ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (145 commits)
  arm64/sysreg: Generate definitions for FAR_ELx
  arm64/sysreg: Generate definitions for DACR32_EL2
  arm64/sysreg: Generate definitions for CSSELR_EL1
  arm64/sysreg: Generate definitions for CPACR_ELx
  arm64/sysreg: Generate definitions for CONTEXTIDR_ELx
  arm64/sysreg: Generate definitions for CLIDR_EL1
  arm64/sve: Move sve_free() into SVE code section
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: kdump: Do not allocate crash low memory if not needed
  arm64/sve: Generate ZCR definitions
  arm64/sme: Generate defintions for SVCR
  arm64/sme: Generate SMPRI_EL1 definitions
  arm64/sme: Automatically generate SMPRIMAP_EL2 definitions
  arm64/sme: Automatically generate SMIDR_EL1 defines
  arm64/sme: Automatically generate defines for SMCR
  ...
2022-05-23 21:06:11 -07:00
Robin Murphy c578121298 perf/arm-cmn: Decode CAL devices properly in debugfs
The debugfs code is lazy, and since it only keeps the bottom byte of
each connect_info register to save space, it also treats the whole thing
as the device_type since the other bits were reserved anyway. Upon
closer inspection, though, this is no longer true on newer IP versions,
so let's be good and decode the exact field properly. This should help
it not get confused when a Component Aggregation Layer is present (which
is already implied if Node IDs are found for both device addresses
represented by the next two lines of the table).

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6a13a6128a28cfe2eec6d09cf372a167ec9c3b65.1652274773.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-12 13:44:56 +01:00
Robin Murphy 3630b2a863 perf/arm-cmn: Fix filter_sel lookup
Carefully considering the bounds of an array is all well and good,
until you forget that that array also contains a NULL sentinel at
the end and dereference it. So close...

Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/bebba768156aa3c0757140457bdd0fec10819388.1652217788.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-11 10:20:42 +01:00
Tanmay Jagdale 33835e8dfb perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
Make sure to check the pmu type first and then check event->attr.disabled.
Doing so would avoid reading the disabled attribute of an event that is
not handled by TAD PMU.

Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
Link: https://lore.kernel.org/r/20220510102657.487539-1-tanmay@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-10 12:14:27 +01:00
Qi Liu 6b79738b6e drivers/perf: hisi: Add Support for CPA PMU
On HiSilicon Hip09 platform, there is a CPA (Coherency Protocol Agent) on
each SICL (Super IO Cluster) which implements packet format translation,
route parsing and traffic statistics.

CPA PMU has 8 PMU counters and interrupt is supported to handle counter
overflow. Let's support its driver under the framework of HiSilicon PMU
driver.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220415102352.6665-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:14:31 +01:00
Qi Liu 807907dae9 drivers/perf: hisi: Associate PMUs in SICL with CPUs online
If a PMU is in a SICL (Super IO cluster), it is not appropriate to
associate this PMU with a CPU die. So we associate it with all CPUs
online, rather than CPUs in the nearest SCCL.

As the firmware of Hip09 platform hasn't been published yet, change
of PMU driver will not influence backwards compatibility between
driver and firmware.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20220415102352.6665-2-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:14:31 +01:00
Shaokun Zhang 47a9ed88a4 drivers/perf: arm_spe: Expose saturating counter to 16-bit
In order to acquire more accurate latency, Armv8.8[1] has defined the
CountSize field to 16-bit saturating counters when it's 0b0011.

Let's support this new feature and expose its to user under sysfs.

[1] https://developer.arm.com/documentation/ddi0487/latest

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220429063307.63251-1-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:10:00 +01:00
Robin Murphy 23760a0144 perf/arm-cmn: Add CMN-700 support
Add the identifiers, events, and subtleties for CMN-700. Highlights
include yet more options for doubling up CHI channels, which finally
grows event IDs beyond 8 bits for XPs, and a new set of CML gateway
nodes adding support for CXL as well as CCIX, where the Link Agent is
now internal to the CMN mesh so we gain regular PMU events for that too.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/cf892baa0d0258ea6cd6544b15171be0069a083a.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:07:25 +01:00
Robin Murphy 65adf71398 perf/arm-cmn: Refactor occupancy filter selector
So far, DNs and HN-Fs have each had one event ralated to occupancy
trackers which are filtered by a separate field. CMN-700 raises the
stakes by introducing two more sets of HN-F events with corresponding
additional filter fields. Prepare for this by refactoring our filter
selection and tracking logic to account for multiple filter types
coexisting on the same node. This need not affect the uAPI, which can
just continue to encode any per-event filter setting in the "occupid"
config field, even if it's technically not the most accurate name for
some of them.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/1aa47ba0455b144c416537f6b0e58dc93b467a00.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:07:25 +01:00
Robin Murphy 8e504d93ac perf/arm-cmn: Add CMN-650 support
Add the identifiers and events for CMN-650, which slots into its
evolutionary position between CMN-600 and the 700-series products.
Imagine CMN-600 made bigger, and with most of the rough edges smoothed
off, but that then balanced out by some bonkers PMU functionality for
the new HN-P enhancement in CMN-650r2.

Most of the CXG events are actually common to newer revisions of CMN-600
too, so they're arguably a little late; oh well.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/b0adc5824db53f71a2b561c293e2120390106536.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:07:25 +01:00
Ren Yu 4b5b712909 perf: check return value of armpmu_request_irq()
When the function armpmu_request_irq() failed, goto err

Signed-off-by: Ren Yu <renyu@nfschina.com>
Link: https://lore.kernel.org/r/20220425100436.4881-1-renyu@nfschina.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:04:48 +01:00
Palmer Dabbelt c7a9dcea8e perf: RISC-V: Remove non-kernel-doc ** comments
This will presumably trip up some tools that try to parse the comments
as kernel doc when they're not.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 4905ec2fb7 ("RISC-V: Add sscofpmf extension support")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

--

These recently landed in for-next, but I'm trying to avoid rewriting
history as there's a lot in flight right now.

Reviewed-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220322220147.11407-1-palmer@rivosinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 13:48:30 +01:00
Rob Herring e5c23779f9 arm_pmu: Validate single/group leader events
In the case where there is only a cycle counter available (i.e.
PMCR_EL0.N is 0) and an event other than CPU cycles is opened, the open
should fail as the event can never possibly be scheduled. However, the
event validation when an event is opened is skipped when the group
leader is opened. Fix this by always validating the group leader events.

Reported-by: Al Grant <al.grant@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220408203330.4014015-1-robh@kernel.org
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2022-04-13 11:48:45 +01:00
Borislav Petkov d02b4dd84e perf/imx_ddr: Fix undefined behavior due to shift overflowing the constant
Fix:

  In file included from <command-line>:0:0:
  In function ‘ddr_perf_counter_enable’,
      inlined from ‘ddr_perf_irq_handler’ at drivers/perf/fsl_imx8_ddr_perf.c:651:2:
  ././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_729’ \
	declared with attribute error: FIELD_PREP: mask is not constant
    _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
...

See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory
details as to why it triggers with older gccs only.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220405151517.29753-10-bp@alien8.de
Signed-off-by: Will Deacon <will@kernel.org>
2022-04-08 14:17:57 +01:00
Geert Uytterhoeven 1d8e926a04 perf: MARVELL_CN10K_DDR_PMU should depend on ARCH_THUNDER
The Marvell CN10K DRAM Subsystem (DSS) performance monitor is only
present on Marvell CN10K SoCs.  Hence add a dependency on ARCH_THUNDER,
to prevent asking the user about this driver when configuring a kernel
without Cavium Thunder (incl. Marvell CN10K) SoC support,

Fixes: 68fa55f0e0 ("perf/marvell: cn10k DDR perf event core ownership")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/18bfd6e1bcf67db7ea656d684a8bbb68261eeb54.1648559364.git.geert+renesas@glider.be
Signed-off-by: Will Deacon <will@kernel.org>
2022-04-04 10:51:20 +01:00
Xiaomeng Tong 2012a9e279 perf: qcom_l2_pmu: fix an incorrect NULL check on list iterator
The bug is here:
	return cluster;

The list iterator value 'cluster' will *always* be set and non-NULL
by list_for_each_entry(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
is found.

To fix the bug, return 'cluster' when found, otherwise return NULL.

Cc: stable@vger.kernel.org
Fixes: 21bdbb7102 ("perf: add qcom l2 cache perf events driver")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Link: https://lore.kernel.org/r/20220327055733.4070-1-xiam0nd.tong@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-04-04 10:50:02 +01:00
Linus Torvalds aa5b537b0e RISC-V Patches for the 5.18 Merge Window, Part 1
* Support for Sv57-based virtual memory.
 * Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.
 * An improved memmove() implementation.
 * Support for the new Ssconfpmf and SBI PMU extensions, which allows for
   a much more useful perf implementation on RISC-V systems.
 * Support for restartable sequences.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmI96FcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQBFD/425+6xmoOru6Wiki3Ja0fqQToNrQyW
 IbmE/8AxUP7UxMvJSNzvQm8deXgklzvmegXCtnjwZZins971vMzzDSI83k/zn8I7
 m5thVC9z01BjodV+pvIp/44hS6FesolOLzkVHksX0Zh6h0iidrc34Qf5HrqvvNfN
 CZ/4K1+E9ig5r9qZp4WdvocCXj+FzwF/30GjKoW9vwA599CEG/dCo+TNN9GKD6XS
 k+xiUGwlIRA+kCLSPFCi7ev9XPr1tCmQB7uB8Igcvr7Y3mWl8HKfajQVXBnXNRC3
 ifbDxpx1elJiLPyf7Rza8jIDwDhLQdxBiwPgDgP9h9R4x0uF4efq8PzLzFlFmaE+
 9Z9thfykBb5dXYDFDje9bAOXvKnGk7Iqxdsz0qWo/ChEQawX1+11bJb0TNN8QTT9
 YvlQfUXgb1dmEcj5yG2uVE1Y8L7YNLRMsZU3W3FbmPJZoavSOuU4x0yCGeLyv597
 76af3nuBJ5v80Db97gu6St+HIACeevKflsZUf/8GS/p7d1DlvmrWzQUMEycxPTG9
 UZpZak58jh7AqQ9JbLnavhwmeacY50vpZOw6QHGAHSN+8daCPlOHDG7Ver7Z+kNj
 +srJ7iKMvLnnaEjGNgavfxdqTOme1gv4LWs/JdHYMkpphqVN92xBDJnhXTPRVZiQ
 0x39vK86qtB46A==
 =Omc6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for Sv57-based virtual memory.

 - Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.

 - An improved memmove() implementation.

 - Support for the new Ssconfpmf and SBI PMU extensions, which allows
   for a much more useful perf implementation on RISC-V systems.

 - Support for restartable sequences.

* tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits)
  rseq/selftests: Add support for RISC-V
  RISC-V: Add support for restartable sequence
  MAINTAINERS: Add entry for RISC-V PMU drivers
  Documentation: riscv: Remove the old documentation
  RISC-V: Add sscofpmf extension support
  RISC-V: Add perf platform driver based on SBI PMU extension
  RISC-V: Add RISC-V SBI PMU extension definitions
  RISC-V: Add a simple platform driver for RISC-V legacy perf
  RISC-V: Add a perf core library for pmu drivers
  RISC-V: Add CSR encodings for all HPMCOUNTERS
  RISC-V: Remove the current perf implementation
  RISC-V: Improve /proc/cpuinfo output for ISA extensions
  RISC-V: Do no continue isa string parsing without correct XLEN
  RISC-V: Implement multi-letter ISA extension probing framework
  RISC-V: Extract multi-letter extension names from "riscv, isa"
  RISC-V: Minimal parser for "riscv, isa" strings
  RISC-V: Correctly print supported extensions
  riscv: Fixed misaligned memory access. Fixed pointer comparison.
  MAINTAINERS: update riscv/microchip entry
  riscv: dts: microchip: add new peripherals to icicle kit device tree
  ...
2022-03-25 10:11:38 -07:00
Linus Torvalds 34af78c4e6 IOMMU Updates for Linux v5.18
Including:
 
 	- IOMMU Core changes:
 	  - Removal of aux domain related code as it is basically dead
 	    and will be replaced by iommu-fd framework
 	  - Split of iommu_ops to carry domain-specific call-backs
 	    separatly
 	  - Cleanup to remove useless ops->capable implementations
 	  - Improve 32-bit free space estimate in iova allocator
 
 	- Intel VT-d updates:
 	  - Various cleanups of the driver
 	  - Support for ATS of SoC-integrated devices listed in
 	    ACPI/SATC table
 
 	- ARM SMMU updates:
 	  - Fix SMMUv3 soft lockup during continuous stream of events
 	  - Fix error path for Qualcomm SMMU probe()
 	  - Rework SMMU IRQ setup to prepare the ground for PMU support
 	  - Minor cleanups and refactoring
 
 	- AMD IOMMU driver:
 	  - Some minor cleanups and error-handling fixes
 
 	- Rockchip IOMMU driver:
 	  - Use standard driver registration
 
 	- MSM IOMMU driver:
 	  - Minor cleanup and change to standard driver registration
 
 	- Mediatek IOMMU driver:
 	  - Fixes for IOTLB flushing logic
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmI8OUkACgkQK/BELZcB
 GuNz9xAAvlgEg3byMx1y6LY9IctVGyLsegweVGM4+m6XR7qvT5Llc1E2Yw4Gooe4
 EAceOihDKW2T9VnMlz9g/cG7Modrx60chcB22KKfxDXPl6yF3R89EMd7DE43T6n/
 KPrP9+EsBnI8QSXyYu9ZowioX4CYwWhWD0dKHKAwDvw0BWHHUJ4hTaoHqEoIqLdP
 vubeHziIok/g1sylSpJjTzV7r/Na8Q3TGcb/Mi5qC8uiyiyx40vtaduMGNW+/ToN
 EqOKszxPmHfHv/xf0IHo0eUZ2L/JAe0mAlZzOb09f5F2sXJrbC05jlmRaDmSjT+u
 iEc1r2By/0xo6iOuQC3wD6kTvwwO/ecpNYGhXYXdTbtLquYfL5PSXjRHEU9gf2BO
 i/llPDsnytPvm/hnmbi26ChNR6yrDPz5bkoCUl5mnB1jZcaZtIURN7cRlEPPZUWo
 62VDNdqWDB6AvALc1/SwYdJX/i5eaBf+niS7/BJ/IkLp2oJxFzrGsU8SRJFHNYsa
 zdFIUUoTw647Ul6derSpGzHow169/RwVKYPiXMsaA8/viPNjpBOtfg56abn1WLW6
 N4CtwNu6tt+sPfftFdFx2cDEMW2zpWg5doMddBfEx9FAk0HJ4WLZiTpaO2PxcLyd
 kCAsGHj+ViAZHINVKFV4nQN/V9yQtcIc4UPmSGJBtKCK3KUYujw=
 =bcqr
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - IOMMU Core changes:
      - Removal of aux domain related code as it is basically dead and
        will be replaced by iommu-fd framework
      - Split of iommu_ops to carry domain-specific call-backs separatly
      - Cleanup to remove useless ops->capable implementations
      - Improve 32-bit free space estimate in iova allocator

 - Intel VT-d updates:
      - Various cleanups of the driver
      - Support for ATS of SoC-integrated devices listed in ACPI/SATC
        table

 - ARM SMMU updates:
      - Fix SMMUv3 soft lockup during continuous stream of events
      - Fix error path for Qualcomm SMMU probe()
      - Rework SMMU IRQ setup to prepare the ground for PMU support
      - Minor cleanups and refactoring

 - AMD IOMMU driver:
      - Some minor cleanups and error-handling fixes

 - Rockchip IOMMU driver:
      - Use standard driver registration

 - MSM IOMMU driver:
      - Minor cleanup and change to standard driver registration

 - Mediatek IOMMU driver:
      - Fixes for IOTLB flushing logic

* tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits)
  iommu/amd: Improve amd_iommu_v2_exit()
  iommu/amd: Remove unused struct fault.devid
  iommu/amd: Clean up function declarations
  iommu/amd: Call memunmap in error path
  iommu/arm-smmu: Account for PMU interrupts
  iommu/vt-d: Enable ATS for the devices in SATC table
  iommu/vt-d: Remove unused function intel_svm_capable()
  iommu/vt-d: Add missing "__init" for rmrr_sanity_check()
  iommu/vt-d: Move intel_iommu_ops to header file
  iommu/vt-d: Fix indentation of goto labels
  iommu/vt-d: Remove unnecessary prototypes
  iommu/vt-d: Remove unnecessary includes
  iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO
  iommu/vt-d: Remove domain and devinfo mempool
  iommu/vt-d: Remove iova_cache_get/put()
  iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info()
  iommu/vt-d: Remove intel_iommu::domains
  iommu/mediatek: Always tlb_flush_all when each PM resume
  iommu/mediatek: Add tlb_lock in tlb_flush_all
  iommu/mediatek: Remove the power status checking in tlb flush all
  ...
2022-03-24 19:48:57 -07:00
Atish Patra 4905ec2fb7
RISC-V: Add sscofpmf extension support
The sscofpmf extension allows counter overflow and filtering for
programmable counters. Enable the perf driver to handle the overflow
interrupt. The overflow interrupt is a hart local interrupt.
Thus, per cpu overflow interrupts are setup as a child under the root
INTC irq domain.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21 15:01:09 -07:00
Atish Patra e999143459
RISC-V: Add perf platform driver based on SBI PMU extension
RISC-V SBI specification added a PMU extension that allows to configure
start/stop any pmu counter. The RISC-V perf can use most of the generic
perf features except interrupt overflow and event filtering based on
privilege mode which will be added in future.

It also allows to monitor a handful of firmware counters that can provide
insights into firmware activity during a performance analysis.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21 14:58:33 -07:00
Atish Patra 9b3e150e31
RISC-V: Add a simple platform driver for RISC-V legacy perf
The old RISC-V perf implementation allowed counting of only
cycle/instruction counters using perf. Restore that feature by implementing
a simple platform driver under a separate config to provide backward
compatibility. Any existing software stack will continue to work as it is.
However, it provides an easy way out in future where we can remove the
legacy driver.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21 14:58:25 -07:00
Atish Patra f5bfa23f57
RISC-V: Add a perf core library for pmu drivers
Implement a perf core library that can support all the essential perf
features in future. It can also accommodate any type of PMU implementation
in future. Currently, both SBI based perf driver and legacy driver
implemented uses the library. Most of the common perf functionalities
are kept in this core library wile PMU specific driver can implement PMU
specific features. For example, the SBI specific functionality will be
implemented in the SBI specific driver.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21 14:58:21 -07:00
Will Deacon 6676a42f1e perf/marvell: Fix !CONFIG_OF build for CN10K DDR PMU driver
When compiling the Marvell CN10K DDR PMU driver with CONFIG_OF=n, the
build fails:

  | drivers/perf/marvell_cn10k_ddr_pmu.c:723:35: error: 'cn10k_ddr_pmu_of_match' undeclared here (not in a function); did you mean 'cn10k_ddr_pmu_driver'?

Use `of_match_ptr()` to avoid referencing the non-existent match table
in this configuration.

Link: https://lore.kernel.org/r/202203091424.Vfe8J4W9-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-09 12:31:00 +00:00
Will Deacon 0162052214 Merge branch 'for-next/perf-m1' into for-next/perf
Support for the CPU PMUs on the Apple M1.

* for-next/perf-m1:
  drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
  drivers/perf: arm_pmu: Handle 47 bit counters
  irqchip/apple-aic: Move PMU-specific registers to their own include file
  arm64: dts: apple: Add t8303 PMU nodes
  arm64: dts: apple: Add t8103 PMU interrupt affinities
  irqchip/apple-aic: Wire PMU interrupts
  irqchip/apple-aic: Parse FIQ affinities from device-tree
  dt-bindings: apple,aic: Add affinity description for per-cpu pseudo-interrupts
  dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts
  dt-bindings: arm-pmu: Document Apple PMU compatible strings
2022-03-08 13:33:34 +00:00
Marc Zyngier a639027a1b drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
Add a new, weird and wonderful driver for the equally weird Apple
PMU HW. Although the PMU itself is functional, we don't know much
about the events yet, so this can be considered as yet another
random number generator...

Nonetheless, it can reliably count at least cycles and instructions
in the usually wonky big-little way. For anything else, it of course
supports raw event numbers.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 13:32:48 +00:00
Marc Zyngier 1280f12f56 drivers/perf: arm_pmu: Handle 47 bit counters
The current ARM PMU framework can only deal with 32 or 64bit counters.
Teach it about a 47bit flavour.

Yes, this is odd.

Reviewed-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 13:32:48 +00:00
Bharat Bhushan 68fa55f0e0 perf/marvell: cn10k DDR perf event core ownership
As DDR perf event counters are not per core, so they should be accessed
only by one core at a time. Select new core when previously owning core
is going offline.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20220211045346.17894-5-bbhushan2@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 11:17:37 +00:00
Bharat Bhushan 35a43326a9 perf/marvell: cn10k DDR perfmon event overflow handling
CN10k DSS h/w perfmon does not support event overflow interrupt, so
periodic timer is being used. Each event counter is 48bit, which in worst
case scenario can increment at maximum 5.6 GT/s. At this rate it may take
many hours to overflow these counters. Therefore polling period for
overflow is set to 100 sec, which can be changed using sysfs parameter.

Two fixed event counters starts counting from zero on overflow, so
overflow condition is when new count less than previous count. While
eight programmable event counters freezes at maximum value. Also individual
counter cannot be restarted, so need to restart all eight counters.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20220211045346.17894-4-bbhushan2@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 11:17:37 +00:00
Bharat Bhushan 7cf83e222b perf/marvell: CN10k DDR performance monitor support
Marvell CN10k DRAM Subsystem (DSS) supports eight event counters for
monitoring performance and software can program each counter to monitor
any of the defined performance event. Performance events are for
interface between the DDR controller and the PHY, interface between the
DDR Controller and the CHI interconnect, or within the DDR Controller.
Additionally DSS also supports two fixed performance event counters, one
for number of ddr reads and other for ddr writes.

This patch add basic support for these performance monitoring events
on CN10k.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20220211045346.17894-3-bbhushan2@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 11:17:37 +00:00
Robin Murphy 31fac56577 perf/arm-cmn: Update watchpoint format
From CMN-650 onwards, some of the fields in the watchpoint config
registers moved subtly enough to easily overlook. Watchpoint events are
still only partially supported on newer IPs - which in itself deserves
noting - but were not intended to become any *less* functional than on
CMN-600.

Fixes: 60d1504070 ("perf/arm-cmn: Support new IP features")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e1ce4c2f1e4f73ab1c60c3a85e4037cd62dd6352.1645727871.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 11:02:26 +00:00
Robin Murphy 205295c7e1 perf/arm-cmn: Hide XP PUB events for CMN-600
CMN-600 doesn't have XP events for the PUB channel, but we missed
the appropriate check to avoid exposing them.

Fixes: 60d1504070 ("perf/arm-cmn: Support new IP features")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/4c108d39a0513def63acccf09ab52b328f242aeb.1645727871.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 11:02:26 +00:00
Andy Shevchenko 8ddf4eff71 perf/smmuv3: Don't cast parameter in bit operations
While in this particular case it would not be a (critical) issue,
the pattern itself is bad and error prone in case somebody blindly
copies to their code.

Don't cast parameter to unsigned long pointer in the bit operations.
Instead copy to a local variable on stack of a proper type and use.

Note, new compilers might warn on this line for potential outbound access.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220209184758.56578-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15 16:51:26 +00:00
Yury Norov 95ed57c73b perf: replace bitmap_weight with bitmap_empty where appropriate
In some places, drivers/perf code calls bitmap_weight() to check if any
bit of a given bitmap is set. It's better to use bitmap_empty() in that
case because bitmap_empty() stops traversing the bitmap as soon as it
finds first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220210224933.379149-13-yury.norov@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15 14:38:57 +00:00
Rafael J. Wysocki 602c873eb5 perf: Replace acpi_bus_get_device()
Replace acpi_bus_get_device() that is going to be dropped with
acpi_fetch_acpi_dev().

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/10025610.nUPlyArG6x@kreacher
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08 15:14:53 +00:00
Will Deacon 8c0c56879d perf/marvell_cn10k: Fix unused variable warning when W=1 and CONFIG_OF=n
The kbuild helpfully reports that the Marvell CN10K TAD PMU driver emits
a warning when building with W=1 and CONFIG_OF=n:

  | >> drivers/perf/marvell_cn10k_tad_pmu.c:371:34: warning: unused variable 'tad_pmu_of_match' [-Wunused-const-variable]
       static const struct of_device_id tad_pmu_of_match[] = {

Guard the match table with CONFIG_OF to squash the warning.

Link: https://lore.kernel.org/r/202201292349.zRQLcDDD-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08 15:12:28 +00:00
Robin Murphy 6f75217b20 perf/arm-cmn: Make arm_cmn_debugfs static
Indeed our debugfs directory is driver-internal so should be static.

Link: https://lore.kernel.org/r/202202030812.II1K2ZXf-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/ca9248caaae69b5134f69e085fe78905dfe74378.1643911278.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08 14:57:15 +00:00
Geert Uytterhoeven e564518b07 perf: MARVELL_CN10K_TAD_PMU should depend on ARCH_THUNDER
The Marvell CN10K Last-Level cache Tag-and-data Units (LLC-TAD)
performance monitor is only present on Marvell CN10K SoCs.  Hence add a
dependency on ARCH_THUNDER, to prevent asking the user about this driver
when configuring a kernel without Cavium Thunder (incl. Marvell CN10K)
SoC support.

Fixes: 036a7584be ("drivers: perf: Add LLC-TAD perf counter support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/b4662a2c767d04cca19417e0c845edea2da262ad.1641995941.git.geert+renesas@glider.be
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08 14:54:18 +00:00
Lad Prabhakar adbb8a1ede perf/arm-ccn: Use platform_get_irq() to get the interrupt
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq().

Link: https://lore.kernel.org/r/20211224161334.31123-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08 14:25:35 +00:00
Linus Torvalds feb7a43de5 Rework of the MSI interrupt infrastructure:
Treewide cleanup and consolidation of MSI interrupt handling in
   preparation for further changes in this area which are necessary to:
 
   - address existing shortcomings in the VFIO area
 
   - support the upcoming Interrupt Message Store functionality which
     decouples the message store from the PCI config/MMIO space
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmHf+SETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobzGD/wNEFl5qQo5mNZ9thP6JSJFOItm7zMc
 2QgzCYOqNwAv4jL6Dqo+EHtbShYqDyWzKdKccgqNjmdIqgW8q7/fubN1OPzRsClV
 CZG997AsXDGXYlQcE3tXZjkeCWnWEE2AGLnygSkFV1K/r9ALAtFfTBJAWB+UD+Zc
 1P8Kxo0q0Jg+DQAMAA5bWfSSjo/Pmpr/1AFjY7+GA8BBeJJgWOyW7H1S+GYEWVOE
 RaQP81Sbd6x1JkopxkNqSJ/lbNJfnPJxi2higB56Y0OYn5CuSarYbZUM7oQ2V61t
 jN7pcEEvTpjLd6SJ93ry8WOcJVMTbccCklVfD0AfEwwGUGw2VM6fSyNrZfnrosUN
 tGBEO8eflBJzGTAwSkz1EhiGKna4o1NBDWpr0sH2iUiZC5G6V2hUDbM+0PQJhDa8
 bICwguZElcUUPOprwjS0HXhymnxghTmNHyoEP1yxGoKLTrwIqkH/9KGustWkcBmM
 hNtOCwQNqxcOHg/r3MN0KxttTASgoXgNnmFliAWA7XwseRpLWc95XPQFa5sptRhc
 EzwumEz17EW1iI5/NyZQcY+jcZ9BdgCqgZ9ECjZkyN4U+9G6iACUkxVaHUUs77jl
 a0ISSEHEvJisFOsOMYyFfeWkpIKGIKP/bpLOJEJ6kAdrUWFvlRGF3qlav3JldXQl
 ypFjPapDeB5guw==
 =vKzd
 -----END PGP SIGNATURE-----

Merge tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull MSI irq updates from Thomas Gleixner:
 "Rework of the MSI interrupt infrastructure.

  This is a treewide cleanup and consolidation of MSI interrupt handling
  in preparation for further changes in this area which are necessary
  to:

   - address existing shortcomings in the VFIO area

   - support the upcoming Interrupt Message Store functionality which
     decouples the message store from the PCI config/MMIO space"

* tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits)
  genirq/msi: Populate sysfs entry only once
  PCI/MSI: Unbreak pci_irq_get_affinity()
  genirq/msi: Convert storage to xarray
  genirq/msi: Simplify sysfs handling
  genirq/msi: Add abuse prevention comment to msi header
  genirq/msi: Mop up old interfaces
  genirq/msi: Convert to new functions
  genirq/msi: Make interrupt allocation less convoluted
  platform-msi: Simplify platform device MSI code
  platform-msi: Let core code handle MSI descriptors
  bus: fsl-mc-msi: Simplify MSI descriptor handling
  soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs()
  soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation
  NTB/msi: Convert to msi_on_each_desc()
  PCI: hv: Rework MSI handling
  powerpc/mpic_u3msi: Use msi_for_each-desc()
  powerpc/fsl_msi: Use msi_for_each_desc()
  powerpc/pasemi/msi: Convert to msi_on_each_dec()
  powerpc/cell/axon_msi: Convert to msi_on_each_desc()
  powerpc/4xx/hsta: Rework MSI handling
  ...
2022-01-13 09:05:29 -08:00
Dan Carpenter 2da56881a7 drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
The devm_ioremap() function does not return error pointers.  It returns
NULL.

Fixes: 036a7584be ("drivers: perf: Add LLC-TAD perf counter support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211217145907.GA16611@kili
Signed-off-by: Will Deacon <will@kernel.org>
2022-01-04 13:58:17 +00:00
Will Deacon 527a7f5252 perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
The kbuild robot reports that building the SMMUv3 PMU driver with
CONFIG_OF=n results in a warning for W=1 builds:

>> drivers/perf/arm_smmuv3_pmu.c:889:34: warning: unused variable 'smmu_pmu_of_match' [-Wunused-const-variable]
   static const struct of_device_id smmu_pmu_of_match[] = {
                                    ^

Guard the match table with #ifdef CONFIG_OF.

Link: https://lore.kernel.org/r/202201041700.01KZEzhb-lkp@intel.com
Fixes: 3f7be43561 ("perf/smmuv3: Add devicetree support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
2022-01-04 13:38:16 +00:00
Thomas Gleixner 8484567055 perf/smmuv3: Use msi_get_virq()
Let the core code fiddle with the MSI descriptor retrieval.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221815.029143589@linutronix.de
2021-12-16 22:16:41 +01:00
Will Deacon 1879a61f4a Merge branch 'for-next/perf-smmu' into for-next/perf
* for-next/perf-smmu:
  perf/smmuv3: Synthesize IIDR from CoreSight ID registers
  perf/smmuv3: Add devicetree support
  dt-bindings: Add Arm SMMUv3 PMCG binding
2021-12-14 13:42:17 +00:00
Will Deacon 8330904fed Merge branch 'for-next/perf-hisi' into for-next/perf
* for-next/perf-hisi:
  drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
  docs: perf: Add description for HiSilicon PCIe PMU driver
2021-12-14 13:42:05 +00:00
Will Deacon e73bc4fd78 Merge branch 'for-next/perf-cn10k' into for-next/perf
* for-next/perf-cn10k:
  dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings
  drivers: perf: Add LLC-TAD perf counter support
2021-12-14 13:41:58 +00:00
Qi Liu 8404b0fbc7 drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported
to sample bandwidth, latency, buffer occupation etc.

Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is
registered as a PMU in /sys/bus/event_source/devices, so users can
select target PMU, and use filter to do further sets.

Filtering options contains:
event     - select the event.
port      - select target Root Ports. Information of Root Ports are
            shown under sysfs.
bdf       - select requester_id of target EP device.
trig_len  - set trigger condition for starting event statistics.
trig_mode - set trigger mode. 0 means starting to statistic when bigger
            than trigger condition, and 1 means smaller.
thr_len   - set threshold for statistics.
thr_mode  - set threshold mode. 0 means count when bigger than threshold,
            and 1 means smaller.

Acked-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:30:26 +00:00
Bhaskara Budiredla 036a7584be drivers: perf: Add LLC-TAD perf counter support
This driver adds support for Last-level cache tag-and-data unit
(LLC-TAD) PMU that is featured in some of the Marvell's CN10K
infrastructure silicons.

The LLC is divided into 2N slices distributed across N Mesh tiles
in a single-socket configuration. The driver always configures the
same counter for all of the TADs. The user would end up effectively
reserving one of eight counters in every TAD to look across all TADs.
The occurrences of events are aggregated and presented to the user
at the end of an application run. The driver does not provide a way
for the user to partition TADs so that different TADs are used for
different applications.

The event counters are zeroed to start event counting to avoid any
rollover issues. TAD perf counters are 64-bit, so it's not currently
possible to overflow event counters at current mesh and core
frequencies.

To measure tad pmu events use perf tool stat command. For instance:

perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application>
perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application>

Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:23:01 +00:00
Robin Murphy df457ca973 perf/smmuv3: Synthesize IIDR from CoreSight ID registers
The SMMU_PMCG_IIDR register was not present in older revisions of the
Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of
fields from several PIDR registers, allowing us to present a
standardized identifier to userspace.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20211117144844.241072-4-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:52 +00:00
Jean-Philippe Brucker 3f7be43561 perf/smmuv3: Add devicetree support
Add device-tree support to the SMMUv3 PMCG driver.

Signed-off-by: Jay Chen <jkchen@linux.alibaba.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20211117144844.241072-3-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:52 +00:00
Robin Murphy a88fa6c28b perf/arm-cmn: Add debugfs topology info
In general, detailed performance analysis will require knoweldge of the
the SoC beyond the CMN itself - e.g. which actual CPUs/peripherals/etc.
are connected to each node. However for certain development and bringup
tasks it can be useful to have a quick overview of the CMN internal
topology to hand too. Add a debugfs file to map this out.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/159fd4d7e19fb3c8801a8cb64ee73ec50f55903c.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:28 +00:00
Robin Murphy b2fea780c9 perf/arm-cmn: Add CI-700 Support
Add the identifiers and events for the CI-700 coherent interconnect.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/28f566ab23a83733c6c9ef9414c010b760b4549c.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:28 +00:00
Robin Murphy 60d1504070 perf/arm-cmn: Support new IP features
The second generation of CMN IPs add new node types and significantly
expand the configuration space with options for extra device ports on
edge XPs, either plumbed into the regular DTM or with extra dedicated
DTMs to monitor them, plus larger (and smaller) mesh sizes. Add basic
support for pulling this new information out of the hardware, piping
it around as necessary, and handling (most of) the new choices.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e58b495bcc7deec3882be4bac910ed0bf6979674.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:28 +00:00
Robin Murphy 61ec1d8758 perf/arm-cmn: Demarcate CMN-600 specifics
In preparation for supporting newer CMN products, let's introduce a
means to differentiate the features and events which are specific to a
particular IP from those which remain common to the whole family. The
newer designs have also smoothed off some of the rough edges in terms
of discoverability, so separate out the parts of the flow which have
effectively now become CMN-600 quirks.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9f6368cdca4c821d801138939508a5bba54ccabb.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:28 +00:00
Robin Murphy 558a078070 perf/arm-cmn: Move group validation data off-stack
With the value of CMN_MAX_DTMS increasing significantly, our validation
data structure is set to get quite big. Technically we could pack it at
least twice as densely, since we only need around 19 bits of information
per DTM, but that makes the code even more mind-bogglingly impenetrable,
and even half of "quite big" may still be uncomfortably large for a
stack frame (~1KB). Just move it to an off-stack allocation instead.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0cabff2e5839ddc0979e757c55515966f65359e4.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:28 +00:00