Commit Graph

1089050 Commits

Author SHA1 Message Date
Michael Ellerman 750137ec6c Merge branch 'fixes' into topic/ppc-kvm
Merge our fixes branch. In particular this brings in the KVM TCE handling
fix, which is a prerequisite for a subsequent patch.
2022-05-19 00:43:04 +10:00
Fabiano Rosas 1d1cd0f12a KVM: PPC: Book3S HV: Initialize AMOR in nested entry
The hypervisor always sets AMOR to ~0, but let's ensure we're not
passing stale values around.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220425142151.1495142-1-farosas@linux.ibm.com
2022-05-19 00:28:49 +10:00
Bo Liu 15eb1b6afc KVM: PPC: Book3S HV: Use consistent type for return value of kvm_age_rmapp()
The return value type defined in the function kvm_age_rmapp() is
"bool", but the return value type defined in the implementation of the
function kvm_age_rmapp() is "int".

Change the return value type to "bool".

Signed-off-by: Bo Liu <liubo03@inspur.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220401065252.36472-1-liubo03@inspur.com
2022-05-18 23:34:39 +10:00
Xiaomeng Tong 300981abdd KVM: PPC: Book3S HV: fix incorrect NULL check on list iterator
The bug is here:
	if (!p)
		return ret;

The list iterator value 'p' will *always* be set and non-NULL by
list_for_each_entry(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element is found.

To fix the bug, use a new variable 'iter' as the list iterator, and keep
the old variable 'p' as a dedicated pointer to the found element.

Fixes: dfaa973ae9 ("KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220414062103.8153-1-xiam0nd.tong@gmail.com
2022-05-18 23:31:35 +10:00
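The iterator pattern described in 300981abdd can be sketched in userspace C. This is a minimal illustration, not the kernel code: the list type is a cut-down stand-in for the kernel's list_head, and list_for_each_entry_typed(), struct item, and find_item() are hypothetical names (the kernel macro infers the type with typeof and takes three arguments). It shows why the cursor can never be NULL and why a separate "found" pointer is needed.

```c
#include <stddef.h>

/* Cut-down stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical simplified iterator. The cursor 'pos' is always computed
 * from a list node, so it is never NULL, even for an empty list or after
 * a full pass without a match.
 */
#define list_for_each_entry_typed(pos, head, type, member)        \
	for (pos = container_of((head)->next, type, member);      \
	     &pos->member != (head);                              \
	     pos = container_of(pos->member.next, type, member))

struct item {
	int key;
	struct list_head node;
};

void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Returns the matching item, or NULL if no element has the given key. */
struct item *find_item(struct list_head *head, int key)
{
	struct item *iter;          /* cursor only, never tested for NULL */
	struct item *found = NULL;  /* set only when a match is seen */

	list_for_each_entry_typed(iter, head, struct item, node) {
		if (iter->key == key) {
			found = iter;
			break;
		}
	}
	return found;
}
```

Callers test the returned pointer rather than the cursor, which is the same split between 'iter' and 'p' that the commit message describes.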
Bagas Sanjaya d53c36e6c8 KVM: PPC: Book3S HV: remove extraneous asterisk from rm_host_ipi_action() comment
kernel test robot reported kernel-doc warning for rm_host_ipi_action():

   arch/powerpc/kvm/book3s_hv_rm_xics.c:887: warning: This comment starts with '/**', but isn't a kernel-doc comment.
    * Host Operations poked by RM KVM

Since the function is static, remove the extraneous (second) asterisk at
the head of function comment.

Fixes: 0c2a660624 ("KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVM")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/linux-doc/202204252334.Cd2IsiII-lkp@intel.com/
Link: https://lore.kernel.org/r/20220506070747.16309-1-bagasdotme@gmail.com
2022-05-18 23:27:24 +10:00
Nicholas Piggin 2852ebfa10 KVM: PPC: Book3S HV Nested: L2 LPCR should inherit L1 LPES setting
The L1 should not be able to adjust LPES mode for the L2. Setting LPES
if the L0 needs it clear would cause external interrupts to be sent to
L2 and missed by the L0.

Clearing LPES when it may be set, as typically happens with XIVE enabled,
could cause a performance issue despite having no native XIVE support in
the guest, because it will cause mediated interrupts for the L2 to be
taken in HV mode, which then have to be injected.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220303053315.1056880-7-npiggin@gmail.com
2022-05-13 21:34:33 +10:00
Nicholas Piggin 11681b79b1 KVM: PPC: Book3S HV Nested: L2 must not run with L1 xive context
The PowerNV L0 currently pushes the OS xive context when running a vCPU,
regardless of whether it is running a nested guest. The problem is that
xive OS ring interrupts will be delivered while the L2 is running.

At the moment, by default, the L2 guest runs with LPCR[LPES]=0, which
actually makes external interrupts go to the L0. That causes the L2 to
exit and the interrupt to be taken or injected into the L1, so in some
respects this behaves like an escalation. It's not clear whether this
was deliberate: there's no comment about it and the L1 is actually
allowed to clear LPES in the L2, so it's confusing at best.

When the L2 is running, the L1 is essentially in a ceded state with
respect to external interrupts (it can't respond to them directly and
won't get scheduled again absent some additional event). So the natural
way to solve this is when the L0 handles a H_ENTER_NESTED hypercall to
run the L2, have it arm the escalation interrupt and don't push the L1
context while running the L2.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220303053315.1056880-6-npiggin@gmail.com
2022-05-13 21:34:33 +10:00
Nicholas Piggin 42b4a2b347 KVM: PPC: Book3S HV P9: Split !nested case out from guest entry
The differences between nested and !nested will become larger in
later changes so split them out for readability.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220303053315.1056880-5-npiggin@gmail.com
2022-05-13 21:34:33 +10:00
Nicholas Piggin ad5ace91c5 KVM: PPC: Book3S HV P9: Move cede logic out of XIVE escalation rearming
Move the cede abort logic out of xive escalation rearming and into
the caller to prepare for handling a similar case with nested guest
entry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220303053315.1056880-4-npiggin@gmail.com
2022-05-13 21:34:33 +10:00
Nicholas Piggin 026728dc5d KVM: PPC: Book3S HV P9: Inject pending xive interrupts at guest entry
If there is a pending xive interrupt, inject it at guest entry (if
MSR[EE] is enabled) rather than take another interrupt when the guest
is entered. If xive is enabled then LPCR[LPES] is set so this behaviour
should be expected.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220303053315.1056880-3-npiggin@gmail.com
2022-05-13 21:34:33 +10:00
Nicholas Piggin f104df7d51 KVM: PPC: Book3S HV: Remove KVMPPC_NR_LPIDS
KVMPPC_NR_LPIDS no longer represents any size restriction on the
LPID space and can be removed. A CPU with more than 12 LPID bits
implemented will now be able to create more than 4095 guests.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123120043.3586018-7-npiggin@gmail.com
2022-05-13 21:33:34 +10:00
Nicholas Piggin 03a2e65f54 KVM: PPC: Book3S Nested: Use explicit 4096 LPID maximum
Rather than tie this to KVMPPC_NR_LPIDS which is becoming more dynamic,
fix it to 4096 (12-bits) explicitly for now.

kvmhv_get_nested() does not have to check against KVM_MAX_NESTED_GUESTS
because the L1 partition table registration hcall already did that, and
it checks against the partition table size.

This patch also puts all the partition table size calculations into the
same form, using 12 for the architected size field shift and 4 for the
shift corresponding to the partition table entry size.

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123120043.3586018-6-npiggin@gmail.com
2022-05-13 21:33:34 +10:00
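The size arithmetic mentioned in 03a2e65f54 can be illustrated with a short C sketch. It assumes, based only on the commit message, that the partition table size field is offset by a shift of 12 and that each entry is 16 bytes (shift of 4). The macro and function names are hypothetical and are not taken from the kernel source.

```c
#include <stdint.h>

/* All names below are hypothetical; the values come from the commit text. */
#define PATB_SIZE_FIELD_SHIFT 12 /* architected size field shift */
#define PATB_ENTRY_SHIFT       4 /* log2 of the partition table entry size */
#define NESTED_LPID_BITS      12 /* explicit 4096 LPID maximum */

/* Number of partition table entries described by a given size field. */
uint64_t patb_entries(unsigned int size_field)
{
	return UINT64_C(1)
	       << (size_field + PATB_SIZE_FIELD_SHIFT - PATB_ENTRY_SHIFT);
}

/* Nonzero if the table does not describe more LPIDs than the 12-bit limit. */
int patb_size_ok(unsigned int size_field)
{
	return size_field + PATB_SIZE_FIELD_SHIFT - PATB_ENTRY_SHIFT
	       <= NESTED_LPID_BITS;
}
```

Under these assumptions a size field of 4 describes exactly 4096 entries, which matches the 12-bit limit named in the commit subject.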
Nicholas Piggin c0f00a18e2 KVM: PPC: Book3S HV Nested: Change nested guest lookup to use idr
This removes the fixed sized kvm->arch.nested_guests array.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123120043.3586018-5-npiggin@gmail.com
2022-05-13 21:33:34 +10:00
Nicholas Piggin 6ba2a2924d KVM: PPC: Book3S HV: Use IDA allocator for LPID allocator
This removes the fixed-size lpid_inuse array.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123120043.3586018-4-npiggin@gmail.com
2022-05-13 21:33:33 +10:00
Nicholas Piggin 5d506f159b KVM: PPC: Book3S HV: Update LPID allocator init for POWER9, Nested
The LPID allocator init is changed to:
- use mmu_lpid_bits rather than hard-coding;
- use KVM_MAX_NESTED_GUESTS for nested hypervisors;
- not reserve the top LPID on POWER9 and newer CPUs.

The reserved LPID is made a POWER7/8-specific detail.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123120043.3586018-3-npiggin@gmail.com
2022-05-13 21:33:33 +10:00
Nicholas Piggin 18827eeef0 KVM: PPC: Remove kvmppc_claim_lpid
Removing kvmppc_claim_lpid simplifies the LPID allocator API, which makes
it easier to change the underlying implementation in a future patch.

The host LPID is always 0, so that can be a detail of the allocator. If
the allocator range is restricted, that can reserve LPIDs at the top of
the range. This allows kvmppc_claim_lpid to be removed.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123120043.3586018-2-npiggin@gmail.com
2022-05-13 21:33:33 +10:00
Nicholas Piggin 361234d7a1 KVM: PPC: Book3S HV P9: Optimise loads around context switch
It is better to get all loads for the register values in flight
before starting to switch LPID, PID, and LPCR because those
mtSPRs are expensive and serialising.

This also just tidies up the code for a potential future change
to the context switching sequence.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123114725.3549202-1-npiggin@gmail.com
2022-05-13 21:33:26 +10:00
Nicholas Piggin 861604614a KVM: PPC: Book3S HV: HFSCR[PREFIX] does not exist
This facility is controlled by FSCR only. Reserved bits should not be
set in the HFSCR register (although it's likely harmless as this
position would not be re-used, and the L0 is forgiving here too).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220122105639.3477407-1-npiggin@gmail.com
2022-05-13 21:33:19 +10:00
Alexander Graf ee8348496c KVM: PPC: Book3S PR: Enable MSR_DR for switch_mmu_context()
Commit 863771a28e ("powerpc/32s: Convert switch_mmu_context() to C")
moved the switch_mmu_context() to C. While in principle a good idea, it
meant that the function now uses the stack. The stack is not accessible
from real mode though.

So to keep calling the function, let's turn on MSR_DR while we call it.
That way, all pointer references to the stack are handled virtually.

In addition, make sure to save/restore r12 on the stack, as it may get
clobbered by the C function.

Fixes: 863771a28e ("powerpc/32s: Convert switch_mmu_context() to C")
Cc: stable@vger.kernel.org # v5.14+
Reported-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220510123717.24508-1-graf@amazon.com
2022-05-11 23:03:16 +10:00
Kajol Jain 348c713441 powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE
With CONFIG_FORTIFY_SOURCE enabled, string functions also perform
dynamic checks on string sizes, which can panic the kernel, for example
when an overflow is detected.

In papr_scm, the papr_scm_pmu_check_events() function uses stat->stat_id
with string operations to populate the nvdimm_events_map array. Since
stat_id is not NUL-terminated, the kernel panics at boot time when
CONFIG_FORTIFY_SOURCE is enabled.

Below are the logs of kernel panic:

  detected buffer overflow in __fortify_strlen
  ------------[ cut here ]------------
  kernel BUG at lib/string_helpers.c:980!
  Oops: Exception in kernel mode, sig: 5 [#1]
  NIP [c00000000077dad0] fortify_panic+0x28/0x38
  LR [c00000000077dacc] fortify_panic+0x24/0x38
  Call Trace:
  [c0000022d77836e0] [c00000000077dacc] fortify_panic+0x24/0x38 (unreliable)
  [c00800000deb2660] papr_scm_pmu_check_events.constprop.0+0x118/0x220 [papr_scm]
  [c00800000deb2cb0] papr_scm_probe+0x288/0x62c [papr_scm]
  [c0000000009b46a8] platform_probe+0x98/0x150

Fix this issue by using kmemdup_nul() to copy the content of
stat->stat_id directly to the nvdimm_events_map array.

mpe: stat->stat_id comes from the hypervisor, not userspace, so there is
no security exposure.

Fixes: 4c08d4bbc0 ("powerpc/papr_scm: Add perf interface support")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220505153451.35503-1-kjain@linux.ibm.com
2022-05-06 12:44:03 +10:00
Michael Ellerman 6d65028eb6 powerpc/vdso: Fix incorrect CFI in gettimeofday.S
As reported by Alan, the CFI (Call Frame Information) in the VDSO time
routines is incorrect since commit ce7d8056e3 ("powerpc/vdso: Prepare
for switching VDSO to generic C implementation.").

DWARF has a concept called the CFA (Canonical Frame Address), which on
powerpc is calculated as an offset from the stack pointer (r1). That
means when the stack pointer is changed there must be a corresponding
CFI directive to update the calculation of the CFA.

The current code is missing those directives for the changes to r1,
which prevents gdb from being able to generate a backtrace from inside
VDSO functions, eg:

  Breakpoint 1, 0x00007ffff7f804dc in __kernel_clock_gettime ()
  (gdb) bt
  #0  0x00007ffff7f804dc in __kernel_clock_gettime ()
  #1  0x00007ffff7d8872c in clock_gettime@@GLIBC_2.17 () from /lib64/libc.so.6
  #2  0x00007fffffffd960 in ?? ()
  #3  0x00007ffff7d8872c in clock_gettime@@GLIBC_2.17 () from /lib64/libc.so.6
  Backtrace stopped: frame did not save the PC

Alan helpfully describes some rules for correctly maintaining the CFI information:

  1) Every adjustment to the current frame address reg (ie. r1) must be
     described, and exactly at the instruction where r1 changes. Why?
     Because stack unwinding might want to access previous frames.

  2) If a function changes LR or any non-volatile register, the save
     location for those regs must be given. The CFI can be at any
     instruction after the saves up to the point that the reg is
     changed.
     (Exception: LR save should be described before a bl, not after.)

  3) If asynchronous unwind info is needed then restores of LR and
     non-volatile regs must also be described. The CFI can be at any
     instruction after the reg is restored up to the point where the
     save location is (potentially) trashed.

Fix the inability to backtrace by adding CFI directives describing the
changes to r1, ie. satisfying rule 1.

Also change the information for LR to point to the copy saved on the
stack, not the value in r0 that will be overwritten by the function
call.

Finally, add CFI directives describing the save/restore of r2.

With the fix gdb can correctly back trace and navigate up and down the stack:

  Breakpoint 1, 0x00007ffff7f804dc in __kernel_clock_gettime ()
  (gdb) bt
  #0  0x00007ffff7f804dc in __kernel_clock_gettime ()
  #1  0x00007ffff7d8872c in clock_gettime@@GLIBC_2.17 () from /lib64/libc.so.6
  #2  0x0000000100015b60 in gettime ()
  #3  0x000000010000c8bc in print_long_format ()
  #4  0x000000010000d180 in print_current_files ()
  #5  0x00000001000054ac in main ()
  (gdb) up
  #1  0x00007ffff7d8872c in clock_gettime@@GLIBC_2.17 () from /lib64/libc.so.6
  (gdb)
  #2  0x0000000100015b60 in gettime ()
  (gdb)
  #3  0x000000010000c8bc in print_long_format ()
  (gdb)
  #4  0x000000010000d180 in print_current_files ()
  (gdb)
  #5  0x00000001000054ac in main ()
  (gdb)
  Initial frame selected; you cannot go up.
  (gdb) down
  #4  0x000000010000d180 in print_current_files ()
  (gdb)
  #3  0x000000010000c8bc in print_long_format ()
  (gdb)
  #2  0x0000000100015b60 in gettime ()
  (gdb)
  #1  0x00007ffff7d8872c in clock_gettime@@GLIBC_2.17 () from /lib64/libc.so.6
  (gdb)
  #0  0x00007ffff7f804dc in __kernel_clock_gettime ()
  (gdb)

Fixes: ce7d8056e3 ("powerpc/vdso: Prepare for switching VDSO to generic C implementation.")
Cc: stable@vger.kernel.org # v5.11+
Reported-by: Alan Modra <amodra@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/r/20220502125010.1319370-1-mpe@ellerman.id.au
2022-05-04 22:12:10 +10:00
Haren Myneni 57831bfb5e powerpc/pseries/vas: Use QoS credits from the userspace
The user can change the QoS credits dynamically through the
management console interface, which notifies the OS via sysfs. After
the OS interface returns successfully, the management console
updates the hypervisor. Since the VAS capabilities in the
hypervisor are not yet updated when the OS gets the update,
the kernel uses the old total credits value from the
hypervisor. Fix this issue by using the new QoS credits
from userspace instead of depending on the VAS capabilities
from the hypervisor.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/76d156f8af1e03cc09369d68e0bfad0c40031bcc.camel@linux.ibm.com
2022-05-04 22:00:47 +10:00
Alexey Kardashevskiy bb82c57469 powerpc/perf: Fix 32bit compile
The "read_bhrb" global symbol is only called under CONFIG_PPC64 of
arch/powerpc/perf/core-book3s.c but it is compiled for both 32 and 64 bit
anyway (and LLVM fails to link this on 32bit).

This fixes it by moving bhrb.o to obj64 targets.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220421025756.571995-1-aik@ozlabs.ru
2022-04-21 23:26:47 +10:00
Athira Rajeev c6cc9a852f powerpc/perf: Fix power10 event alternatives
When scheduling a group of events, constraint checks are done to
make sure all events can go in a group. For example, one of the criteria
is that events in a group cannot use the same PMC. But the
platform-specific PMU supports alternative events for some of the event
codes. During perf_event_open(), if any event group doesn't match the
constraint check criteria, a further lookup is done to find an
alternative event.

By current design, the array of alternative events in the PMU code is
expected to be sorted by column 0. This is because in
find_alternative() the return criterion is based on an event code
comparison, i.e. "event < ev_alt[i][0]". This optimisation is there
since find_alternative() can be called multiple times. In the power10
PMU code, the alternative event array is not sorted properly and hence
finding an alternative event is broken.

To work with existing logic, fix the alternative event array to be
sorted by column 0 for power10-pmu.c

Results:

When an alternative event is not chosen although it could be, events
are multiplexed, i.e. time sliced, where they could actually run
concurrently.

For example, in power10 PM_INST_CMPL_ALT (0x00002) has an alternative
event, PM_INST_CMPL (0x500fa). Without the fix, if a group of events on
PMC1 to PMC4 is used along with PM_INST_CMPL_ALT, it will be time sliced
since all programmable PMCs are already consumed. But with the fix,
when it picks the alternative event on PMC5, all events run
concurrently.

Before:

 # perf stat -e r00002,r100fc,r200fa,r300fc,r400fc

 Performance counter stats for 'system wide':

         328668935      r00002               (79.94%)
          56501024      r100fc               (79.95%)
          49564238      r200fa               (79.95%)
               376      r300fc               (80.19%)
               660      r400fc               (79.97%)

       4.039150522 seconds time elapsed

With the fix, since the alternative event is chosen to run on PMC5,
events run concurrently.

After:

 # perf stat -e r00002,r100fc,r200fa,r300fc,r400fc

 Performance counter stats for 'system wide':

          23596607      r00002
           4907738      r100fc
           2283608      r200fa
               135      r300fc
               248      r400fc

       1.664671390 seconds time elapsed

Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220419114828.89843-2-atrajeev@linux.vnet.ibm.com
2022-04-21 23:25:33 +10:00
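The sorted-array requirement described in c6cc9a852f can be reproduced with a small C sketch. This is an illustration under stated assumptions, not the kernel's implementation: the function signature, MAX_ALT value, and both tables are hypothetical, and the tables mix the power10 and power9 event codes quoted in these commit messages purely to get two rows. Only the early-exit comparison "event < ev_alt[i][0]" is taken from the commit text.

```c
#include <stddef.h>

#define MAX_ALT 2 /* hypothetical: events per row in this sketch */

/*
 * Returns the row index containing 'event', or -1 if none is found.
 * The early exit is only correct when rows are sorted by column 0.
 */
int find_alternative(unsigned long event,
		     const unsigned long ev_alt[][MAX_ALT], size_t rows)
{
	size_t i, j;

	for (i = 0; i < rows; i++) {
		/* Stop as soon as the row's first event is larger. */
		if (event < ev_alt[i][0])
			return -1;
		for (j = 0; j < MAX_ALT && ev_alt[i][j]; j++)
			if (event == ev_alt[i][j])
				return (int)i;
	}
	return -1;
}

/* Rows sorted by column 0: every lookup below succeeds. */
const unsigned long sorted_alt[][MAX_ALT] = {
	{ 0x00002, 0x500fa },
	{ 0x3e054, 0x400f0 },
};

/* Same rows in the wrong order: 0x00002 hits the early exit on row 0. */
const unsigned long unsorted_alt[][MAX_ALT] = {
	{ 0x3e054, 0x400f0 },
	{ 0x00002, 0x500fa },
};
```

With the unsorted table, a lookup of 0x00002 stops at the first row because 0x00002 < 0x3e054, so the alternative is never found and the event group falls back to multiplexing.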
Athira Rajeev 0dcad700bb powerpc/perf: Fix power9 event alternatives
When scheduling a group of events, constraint checks are done to
make sure all events can go in a group. For example, one of the criteria
is that events in a group cannot use the same PMC. But the
platform-specific PMU supports alternative events for some of the event
codes. During perf_event_open(), if any event group doesn't match the
constraint check criteria, a further lookup is done to find an
alternative event.

By current design, the array of alternative events in the PMU code is
expected to be sorted by column 0. This is because in
find_alternative() the return criterion is based on an event code
comparison, i.e. "event < ev_alt[i][0]". This optimisation is there
since find_alternative() can be called multiple times. In the power9
PMU code, the alternative event array is not sorted properly and hence
finding alternative events is broken.

To work with existing logic, fix the alternative event array to be
sorted by column 0 for power9-pmu.c

Results:

With alternative events, multiplexing can be avoided. For example, in
power9 PM_LD_MISS_L1 (0x3e054) has an alternative event,
PM_LD_MISS_L1_ALT (0x400f0). This is an identical event which can be
programmed in a different PMC.

Before:

 # perf stat -e r3e054,r300fc

 Performance counter stats for 'system wide':

           1057860      r3e054              (50.21%)
               379      r300fc              (49.79%)

       0.944329741 seconds time elapsed

Since both the events are using PMC3 in this case, they are
multiplexed here.

After:

 # perf stat -e r3e054,r300fc

 Performance counter stats for 'system wide':

           1006948      r3e054
               182      r300fc

Fixes: 91e0bd1e62 ("powerpc/perf: Add PM_LD_MISS_L1 and PM_BR_2PATH to power9 event list")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220419114828.89843-1-atrajeev@linux.vnet.ibm.com
2022-04-21 23:25:22 +10:00
Alexey Kardashevskiy 26a62b750a KVM: PPC: Fix TCE handling for VFIO
The LoPAPR spec defines a guest-visible IOMMU with a variable page size.
Currently QEMU advertises 4K, 64K, 2MB and 16MB pages, and a Linux VM
picks the biggest (16MB). In the case of a passed-through PCI device,
there is a hardware IOMMU which does not support all of the above page
sizes: P8 cannot do 2MB and P9 cannot do 16MB. So for each emulated
16MB IOMMU page we may create several smaller mappings ("TCEs") in
the hardware IOMMU.

The code wrongly uses the emulated TCE index instead of hardware TCE
index in error handling. The problem is easier to see on POWER8 with
multi-level TCE tables (when only the first level is preallocated)
as hash mode uses real mode TCE hypercalls handlers.
The kernel starts using indirect tables when VMs get bigger than 128GB
(depending on the max page order).
The very first real mode hcall fails with H_TOO_HARD, as in real mode
we cannot allocate memory for TCEs (we can in virtual mode). On the way
out, the code attempts to clear hardware TCEs using emulated TCE
indexes, which corrupts random kernel memory because it_offset==1<<59
is subtracted from those indexes and the resulting index is out of the
TCE table bounds.

This fixes kvmppc_clear_tce() to use the correct TCE indexes.

While at it, this fixes TCE cache invalidation which uses emulated TCE
indexes instead of the hardware ones. This went unnoticed as 64bit DMA
is used these days and VMs map all RAM in one go and only then do DMA
and this is when the TCE cache gets populated.

Potentially this could slow down mapping, however normally 16MB
emulated pages are backed by 64K hardware pages so it is one write to
the "TCE Kill" per 256 updates which is not that bad considering the size
of the cache (1024 TCEs or so).

Fixes: ca1fc489cf ("KVM: PPC: Book3S: Allow backing bigger guest IOMMU pages with smaller physical pages")

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220420050840.328223-1-aik@ozlabs.ru
2022-04-21 17:07:58 +10:00
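The relationship between emulated and hardware TCE indexes in 26a62b750a can be made concrete with a small C sketch. The struct and function names are hypothetical, and the sketch assumes the emulated page is at least as large as the hardware page. It only illustrates the index arithmetic implied by the commit message (one 16MB emulated page backed by 256 hardware pages of 64K), not the actual kvmppc_clear_tce() code.

```c
#include <stdint.h>

/* Hypothetical helper type: a contiguous run of hardware TCE indexes. */
struct hw_tce_range {
	uint64_t first; /* first hardware TCE index */
	uint64_t count; /* number of hardware TCEs backing one emulated TCE */
};

/*
 * Map an emulated TCE index to the hardware TCEs that back it.
 * Assumes emu_page_shift >= hw_page_shift.
 */
struct hw_tce_range emulated_to_hw(uint64_t emu_index,
				   unsigned int emu_page_shift,
				   unsigned int hw_page_shift)
{
	unsigned int delta = emu_page_shift - hw_page_shift;
	struct hw_tce_range r;

	r.first = emu_index << delta;    /* scale the index up */
	r.count = UINT64_C(1) << delta;  /* subpages per emulated page */
	return r;
}
```

For a 16MB emulated page (shift 24) on 64K hardware pages (shift 16), emulated index 3 maps to 256 hardware entries starting at index 768. Using the emulated index 3 directly against the hardware table would touch the wrong entry, which is the class of bug the commit fixes.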
Michael Ellerman d2b9be1f4a powerpc/time: Always set decrementer in timer_interrupt()
This is a partial revert of commit 0faf20a1ad ("powerpc/64s/interrupt:
Don't enable MSR[EE] in irq handlers unless perf is in use").

Prior to that commit, we always set the decrementer in
timer_interrupt(), to clear the timer interrupt. Otherwise we could end
up continuously taking timer interrupts.

When high res timers are enabled there is no problem seen with leaving
the decrementer untouched in timer_interrupt(), because it will be
programmed via hrtimer_interrupt() -> tick_program_event() ->
clockevents_program_event() -> decrementer_set_next_event().

However with CONFIG_HIGH_RES_TIMERS=n or booting with highres=off, we
see a stall/lockup, because tick_nohz_handler() does not cause a
reprogram of the decrementer, leading to endless timer interrupts.
Example trace:

  [    1.898617][    T7] Freeing initrd memory: 2624K
  [   22.680919][    C1] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
  [   22.682281][    C1] rcu:     0-....: (25 ticks this GP) idle=073/0/0x1 softirq=10/16 fqs=1050
  [   22.682851][    C1]  (detected by 1, t=2102 jiffies, g=-1179, q=476)
  [   22.683649][    C1] Sending NMI from CPU 1 to CPUs 0:
  [   22.685252][    C0] NMI backtrace for cpu 0
  [   22.685649][    C0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc2-00185-g0faf20a1ad16 #145
  [   22.686393][    C0] NIP:  c000000000016d64 LR: c000000000f6cca4 CTR: c00000000019c6e0
  [   22.686774][    C0] REGS: c000000002833590 TRAP: 0500   Not tainted  (5.16.0-rc2-00185-g0faf20a1ad16)
  [   22.687222][    C0] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24000222  XER: 00000000
  [   22.688297][    C0] CFAR: c00000000000c854 IRQMASK: 0
  ...
  [   22.692637][    C0] NIP [c000000000016d64] arch_local_irq_restore+0x174/0x250
  [   22.694443][    C0] LR [c000000000f6cca4] __do_softirq+0xe4/0x3dc
  [   22.695762][    C0] Call Trace:
  [   22.696050][    C0] [c000000002833830] [c000000000f6cc80] __do_softirq+0xc0/0x3dc (unreliable)
  [   22.697377][    C0] [c000000002833920] [c000000000151508] __irq_exit_rcu+0xd8/0x130
  [   22.698739][    C0] [c000000002833950] [c000000000151730] irq_exit+0x20/0x40
  [   22.699938][    C0] [c000000002833970] [c000000000027f40] timer_interrupt+0x270/0x460
  [   22.701119][    C0] [c0000000028339d0] [c0000000000099a8] decrementer_common_virt+0x208/0x210

Possibly this should be fixed in the lowres timing code, but that would
be a generic change and could take some time and may not backport
easily, so for now make the programming of the decrementer unconditional
again in timer_interrupt() to avoid the stall/lockup.

Fixes: 0faf20a1ad ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use")
Reported-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Link: https://lore.kernel.org/r/20220420141657.771442-1-mpe@ellerman.id.au
2022-04-21 16:10:56 +10:00
Linus Torvalds ce522ba9ef Linux 5.18-rc2 2022-04-10 14:21:36 -10:00
Linus Torvalds 8b57b30461 Serial driver fix for 5.18-rc2
This is a single serial driver fix for a build issue that showed up due
 to changes that came in through the tty tree in 5.18-rc1 that were
 missed previously.  It resolves a build error with the mpc52xx_uart
 driver.
 
 It has been in linux-next this week with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'tty-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull serial driver fix from Greg KH:
 "This is a single serial driver fix for a build issue that showed up
  due to changes that came in through the tty tree in 5.18-rc1 that were
  missed previously. It resolves a build error with the mpc52xx_uart
  driver.

  It has been in linux-next this week with no reported problems"

* tag 'tty-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: serial: mpc52xx_uart: make rx/tx hooks return unsigned, part II.
2022-04-10 10:08:50 -10:00
Linus Torvalds 95aa17c36d Staging driver fix for 5.18-rc2
Here is a single staging driver fix for 5.18-rc2 that resolves an endian
 issue for the r8188eu driver.  It has been in linux-next all this week
 with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'staging-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix for 5.18-rc2 that resolves an
  endian issue for the r8188eu driver. It has been in linux-next all
  this week with no reported problems"

* tag 'staging-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: r8188eu: Fix PPPoE tag insertion on little endian systems
2022-04-10 10:04:30 -10:00
Linus Torvalds 33563138ac Driver core changes for 5.18-rc2
Here are 2 small driver core changes for 5.18-rc2.
 
 They are the final bits in the removal of the default_attrs field in
 struct kobj_type.  I had to wait until after 5.18-rc1 for all of the
 changes needed for this to come in through different development trees, and then
 one new user snuck in.  So this series has 2 changes:
 	- removal of the default_attrs field in the powerpc/pseries/vas
 	  code.  Change has been acked by the PPC maintainers to come
 	  through this tree
 	- removal of default_attrs from struct kobj_type now that all
 	  in-kernel users are removed.  This cleans up the kobject code
 	  a little bit and removes some duplicated functionality that
 	  confused people (now there is only one way to do default
 	  groups.)
 
 All of these have been in linux-next for all of this week with no
 reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here are two small driver core changes for 5.18-rc2.

  They are the final bits in the removal of the default_attrs field in
  struct kobj_type. I had to wait until after 5.18-rc1 for all of the
  changes to do this to come in through different development trees, and
  then one new user snuck in. So this series has two changes:

   - removal of the default_attrs field in the powerpc/pseries/vas code.

     The change has been acked by the PPC maintainers to come through
     this tree

   - removal of default_attrs from struct kobj_type now that all
     in-kernel users are removed.

     This cleans up the kobject code a little bit and removes some
     duplicated functionality that confused people (now there is only
     one way to do default groups)

  Both of these have been in linux-next for all of this week with no
  reported problems"

* tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kobject: kobj_type: remove default_attrs
  powerpc/pseries/vas: use default_groups in kobj_type
2022-04-10 09:55:09 -10:00
Linus Torvalds f58d3410c5 Char/Misc driver fix for 5.18-rc2
Here is a single driver fix for 5.18-rc2.  It resolves the build warning
 issue on 32bit systems in the habanalabs driver that came in during the
 5.18-rc1 merge cycle.
 
 It has been in linux-next for all this week with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYlK+5Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykTKQCgoOU9/+9EiS3crSlFzo24SdomjKAAoL1nQoN9
 2s4KLX25ynnincGifSr5
 =VVFs
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fix from Greg KH:
 "A single driver fix. It resolves the build warning issue on 32bit
  systems in the habanalabs driver that came in during the 5.18-rc1
  merge cycle.

  It has been in linux-next for all this week with no reported problems"

* tag 'char-misc-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  habanalabs: Fix test build failures
2022-04-10 09:52:46 -10:00
Linus Torvalds 4ea3c64252 powerpc fixes for 5.18 #2
- Fix KVM "lost kick" race, where an attempt to pull a vcpu out of the guest could be
    lost (or delayed until the next guest exit).
 
  - Disable SCV (system call vectored) when PR KVM guests could be run.
 
  - Fix KVM PR guests using SCV, by disallowing AIL != 0 for KVM PR guests.
 
  - Add a new KVM CAP to indicate if AIL == 3 is supported.
 
  - Fix a regression when hotplugging a CPU to a memoryless/cpuless node.
 
  - Make virt_addr_valid() stricter for 64-bit Book3E & 32-bit, which fixes crashes seen
    due to hardened usercopy.
 
  - Revert a change to max_mapnr which broke HIGHMEM.
 
 Thanks to: Christophe Leroy, Fabiano Rosas, Kefeng Wang, Nicholas Piggin, Srikar Dronamraju.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmJSzCYTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFqxD/98cokv9ZFbXoPApT0rbZo/5Re5GWGj
 IzSI4kuBI7j5oPqDdwusfF/pqKt+zFmr0fVsnhz2WYZ4gX4xr9B48OpmuIQvNNbx
 46gz4wWIPE2C9xVnOtU829DTOXfFoBOQo16TFzE8wfiLFx9M8gF2oogTzvF14LML
 +tbE2STL3ga6MGje8oZ3VOvvXrt9zrynTRt4W/SsfpkXvhQYRdGSPC2Rw6IkbN1k
 XDoFPt+vN9C+g6ItW7OzBrkMvCSYNxmsptWAA48zCqbGOawXomYoZyFTS7fooX5E
 nhGM9wAQGVNRlbnLgEtOAUv/Djz4yVz1gjR+4b7LF26AN3bd3CrQJ+whZJAAqw+G
 I6wtRZI6DrZ4UH5sfjsUQaOIT6DcGlt2MTidGmG2hY+XlanKgiLCdIisnxAMa4+x
 kBD1zqSuThPWgpryfKMex4r1WBZyZ27bcwQ9L9Z9GeCQN0V9cNfD8OHwyeKEuQEb
 hA941h2qq9bzzVL/wrDxVesRSzXRXoBed77RCL2YUYLonybW+mxijqbaWNVcqqB0
 Hr3/hhgq+0uYid5Ld9rxHnXl9yrJI9itakXNFU6dmzqZtQ7b4xaha21IME5zoIcJ
 DRkTWGnub0wjp2Re1rdJVpTDREP19k+gPu/dVJFNlW16SG4/Lhg1xOLTkRNQ+gnt
 Ayp4o27CPzoTJg==
 =uNqF
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix KVM "lost kick" race, where an attempt to pull a vcpu out of the
   guest could be lost (or delayed until the next guest exit).

 - Disable SCV (system call vectored) when PR KVM guests could be run.

 - Fix KVM PR guests using SCV, by disallowing AIL != 0 for KVM PR
   guests.

 - Add a new KVM CAP to indicate if AIL == 3 is supported.

 - Fix a regression when hotplugging a CPU to a memoryless/cpuless node.

 - Make virt_addr_valid() stricter for 64-bit Book3E & 32-bit, which
   fixes crashes seen due to hardened usercopy.

 - Revert a change to max_mapnr which broke HIGHMEM.

Thanks to Christophe Leroy, Fabiano Rosas, Kefeng Wang, Nicholas Piggin,
and Srikar Dronamraju.

* tag 'powerpc-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  Revert "powerpc: Set max_mapnr correctly"
  powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit
  KVM: PPC: Move kvmhv_on_pseries() into kvm_ppc.h
  powerpc/numa: Handle partially initialized numa nodes
  powerpc/64: Fix build failure with allyesconfig in book3s_64_entry.S
  KVM: PPC: Use KVM_CAP_PPC_AIL_MODE_3
  KVM: PPC: Book3S PR: Disallow AIL != 0
  KVM: PPC: Book3S PR: Disable SCV when AIL could be disabled
  KVM: PPC: Book3S HV P9: Fix "lost kick" race
2022-04-10 07:36:18 -10:00
Linus Torvalds 1519610b53 A set of interrupt chip driver fixes:
- A fix for a long standing bug in the ARM GICv3 redistributor polling
     which uses the wrong bit number to test.
 
   - Prevent translation of bogus ACPI table entries which map device
     interrupts into the IPI space on ARM GICs.
 
   - Don't write into the pending register of ARM GICV4 before the scan
     in hardware has completed.
 
   - A set of build and correctness fixes for the Qualcomm MPM driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmJSzCQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobYVD/9WP+8WMN+EUWtL4+fbSFswjvlXtSjY
 wURTtcu4Vfadoren5f6I7aEOmgJhC3iUY4R8u9ftvuKi5lh8M2TWVMyFDXfYQa/3
 qGlIHfVPQe3AtN/bY1nFnpkFWCVFhfA6EfKOOsBV6/lmP28gSytepc9B29rqfIQM
 d67GC2MeidSVpxM4AtS2Dguiq+v/MdDcA1P2oMFcpkYwsPSpPNHdrn8F1EbJyXfV
 O8bZ1stlPRaMTtLb7Dzkpo0JhW6okaLDHDAo2HiXB23PciW7JdzsALvWsidosBuG
 /b63YHJQqEEk/8sIZDlf84xOUkkOTA6sWpqdxVlzMA71jfMnIs/YEtIQ41paEwE2
 MEZPyygVnE8vPmjMjM8dypcQAK/IdAoqyWlbtlNfc+6BFvA6wMLTG+ipG9nEkAiI
 YvmEI4PUdDa2hrV2S/ExHGyyVhtXZBHT13YFHmspm8cHOkPtUjPSXhVztLv9oQHR
 yCD1rpqv/zYPVDXNXR6jG+idDkc1L/emFl/3X/vkabbt2bZs8DAJf03sdU8sbAl8
 2goG5JREJZI07bCSPdovi1xN8gPlHkZeiv3dFPN3r4Sghxp/H2G+YlMZSD/H9Nti
 YhH3zXgWKpfEUAaSGyNXl/6RkvUJ5+ZCOuZkZPJtmsn8ptXhb+Z2u4f69ZqlI0LE
 bkd0uATF50ZwBA==
 =SJ8P
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2022-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of interrupt chip driver fixes:

   - A fix for a long standing bug in the ARM GICv3 redistributor
     polling which uses the wrong bit number to test.

   - Prevent translation of bogus ACPI table entries which map device
     interrupts into the IPI space on ARM GICs.

   - Don't write into the pending register of ARM GICV4 before the scan
     in hardware has completed.

   - A set of build and correctness fixes for the Qualcomm MPM driver"

* tag 'irq-urgent-2022-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic, gic-v3: Prevent GSI to SGI translations
  irqchip/gic-v3: Fix GICR_CTLR.RWP polling
  irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling
  irqchip/irq-qcom-mpm: fix return value check in qcom_mpm_init()
  irq/qcom-mpm: Fix build error without MAILBOX
2022-04-10 07:25:49 -10:00
Linus Torvalds 9c6913b749 - Fix the MSI message data struct definition
- Use local labels in the exception table macros to avoid symbol
 conflicts with clang LTO builds
 
 - A couple of fixes to objtool checking of the relatively newly added
 SLS and IBT code
 
 - Rename a local var in the WARN* macro machinery to prevent shadowing
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJSwSkACgkQEsHwGGHe
 VUp6QQ//TGhL2xxLoN+7pYjIBDEDHJ3Oi0m6fOweqyQAZTYcm/rAPqd7hvoWVSoO
 YsLdWi9jeMwkzG0ItSm/qPVm/UvrViXwuQMdz4nDWqg2IPFIbhgNA3CKCIyPTio2
 WHp2NXvYyDnwPMr6xTTRndMDoxiwxMBnXf91pNwoU3toxw0GuUuXan0Y+GKnvx1A
 sqhbpWO27bAmhKb26wPw5soJVxBbSqx+1TbFVG0Sz/uwYQowMa+nfNg1DXF0sXyJ
 E/ssqBB6wjl7ANVbQsxBQHRzr/EksLVPwHHrlT8ga/5loin+VJ6mTBCPLgG7SMBE
 +R1fm79Bp/9KU194fcqhJ3pvnyJPi8hfizzCqNKnK871V8LRzC+jW0l3EdvASEXC
 sDj0XWsSFoWft9eAtMV11d641uVC4rLB90GyyzmWWrEw9BbxmasBgED6QBx9d+V6
 o1L4y58Tsz88HKzwd0PtBkeGDkvkA7xOx8ViG24IeLA0tcbixnfnATQdelQeWKqO
 4m3o1JU8ogJp9JCEBY7ZeXyStFjZMedM4U/V0akF6AKnpDuVfR3T5C68cYhoLKBu
 XU6Swf5sFHImNWp0+54HPnXhHj/uhuwj9YWCkxx/eXViwvVlxSdTdIQWa380EddN
 0KhOFLwLOdhha2+81FJc6vmkDHwiu6hlR38yqdGvdxZf/KPKjM0=
 =kMtP
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Fix the MSI message data struct definition

 - Use local labels in the exception table macros to avoid symbol
   conflicts with clang LTO builds

 - A couple of fixes to objtool checking of the relatively newly added
   SLS and IBT code

 - Rename a local var in the WARN* macro machinery to prevent shadowing

* tag 'x86_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/msi: Fix msi message data shadow struct
  x86/extable: Prefer local labels in .set directives
  x86,bpf: Avoid IBT objtool warning
  objtool: Fix SLS validation for kcov tail-call replacement
  objtool: Fix IBT tail-call detection
  x86/bug: Prevent shadowing in __WARN_FLAGS
  x86/mm/tlb: Revert retpoline avoidance approach
2022-04-10 07:12:27 -10:00
Linus Torvalds b51f86e990 - A couple of fixes to cgroup-related handling of perf events
- A couple of fixes to event encoding on Sapphire Rapids
 
 - Pass event caps of inherited events so that perf doesn't fail wrongly at fork()
 
 - Add support for a new Raptor Lake CPU
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJSvt0ACgkQEsHwGGHe
 VUpKBRAAtegwh4ilwoRM0LePH2TX752pREy+M1qfEUp/XyH3tF8VAixCAmIg7qlI
 IyjRX0AKDC1F08sM7/JmTf0M+hnl/oH2YPG8Q6p3igtfARvn+5bPZdSBpTAC9P5L
 QX3S2WVzv5X78IomIfENqbg5HyZP3IXeg7R7sqZhHbtoG54n5NEv/+aJl5HmHFTt
 gLTrXetL46OSMnLzKfd3hlJqCWSnTz1aGKgGX2cZy9ipI63+XrYMuNmiwJ+CrA3G
 pI98RmKnCPqV2rXij1GpVQNyG2aPR+VVZM3aaq6XBAmiNTaCfnvWbEBGhCkjaSgA
 UU7Y6D1Qxc0OZ1plcjhKc4l/W1oj8jqmG9nS6J2Xy4szdpZIdxBhlWq89xCrb9AC
 yIgKif2iVl7eMVKVG1Jq1u2wTwurBAamH73sCCNn8ndctBjicoM8pbtHMHxzceyZ
 w4Cff0yUNzHgPiqSHQRARw/CaUceL9kDoGzPeEQOR0A+27MpNulchts4HCtIvwzI
 yLIK1JFPHDrCACLTMuAhvov3EMTeoTIfc91eOZRjubRTPx7TxujaZHdP7N+R3nkk
 Giehc/l6IhFPhT8QACk0bziTVJ9in+Jx8pCnocGKuj80Uqs7Sq7swjlasy1Zoy7r
 x9Qzy1gZhPHnvPd6LWU4WyPa767D07DlG/zFdg+P3EeWa/3efdw=
 =ba3V
 -----END PGP SIGNATURE-----

Merge tag 'perf_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - A couple of fixes to cgroup-related handling of perf events

 - A couple of fixes to event encoding on Sapphire Rapids

 - Pass event caps of inherited events so that perf doesn't fail wrongly
   at fork()

 - Add support for a new Raptor Lake CPU

* tag 'perf_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Always set cpuctx cgrp when enable cgroup event
  perf/core: Fix perf_cgroup_switch()
  perf/core: Use perf_cgroup_info->active to check if cgroup is active
  perf/core: Don't pass task around when ctx sched in
  perf/x86/intel: Update the FRONTEND MSR mask on Sapphire Rapids
  perf/x86/intel: Don't extend the pseudo-encoding to GP counters
  perf/core: Inherit event_caps
  perf/x86/uncore: Add Raptor Lake uncore support
  perf/x86/msr: Add Raptor Lake CPU support
  perf/x86/cstate: Add Raptor Lake support
  perf/x86: Add Intel Raptor Lake support
2022-04-10 07:08:22 -10:00
Linus Torvalds 50c94de67c - Allow the compiler to optimize away unused percpu accesses and change
the local_lock_* macros back to inline functions
 
 - A couple of fixes to static call insn patching
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJStZ4ACgkQEsHwGGHe
 VUpUpA/8DHOMUQa7rM8z49ZWBV01HNVCLECTeeKshQBLyJfWc84MNOfdPbpgEGvY
 XE/eIZDnTMB5UKD0bfRqD+AQ0fXjl3NiLnJrdDZJqEQAiP/wGBswKNXMire8xPT8
 9MfaOKYWYPl0LY2uZBWVLcdC+lVe4kRGfhqAcl4LRx0ZSvMzgjcFy34NeXY8LlXD
 kFQJEzHa97CTROje54mtmXEt7Y5bxjxWwVTSyfEt0hJPGo1bJtJP6FaY01Muj+Xu
 h/OGNx3KLOYf9MqQC31caAwKgtUOptm8bTpvG3onaHg29qJgz2umKwONyOjYrUUn
 2PE3NREfMuKI38nf88pX+lOCs6/I1uVIjJPvAVJijIcuI1ZBXrfm26IP0lZ3LqG1
 h/9Y5gChiZPn1j90VnF4UCJUm4u3bYEAHqKIQgUdpcpUqX0NlxbDiXoYxJWfHnmB
 PBJ0PE7Vdo4MPK0n3BGVrzXAFeOyHsohAsKFijT8afRCMAOF/ebmVs/tI5NygFrK
 11e/U13/78iKkazZSxWew8vU3yXA39W5Rym7aPnhR2lWxvN+xQOjNTgZTxF9hUcZ
 6AcsaYJgHR7nD8SM7Y9+cwHWOWaDEdZMg9XSkgvyd1p0tHb4u+Ve/SQK7sA3j9q7
 ZmZyFSE1X3K+M1i+75rUSVmIEVM5cpfhodN89iRje/JIZ1KyRT8=
 =hSOc
 -----END PGP SIGNATURE-----

Merge tag 'locking_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Borislav Petkov:

 - Allow the compiler to optimize away unused percpu accesses and change
   the local_lock_* macros back to inline functions

 - A couple of fixes to static call insn patching

* tag 'locking_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "mm/page_alloc: mark pagesets as __maybe_unused"
  Revert "locking/local_lock: Make the empty local_lock_*() function a macro."
  x86/percpu: Remove volatile from arch_raw_cpu_ptr().
  static_call: Remove __DEFINE_STATIC_CALL macro
  static_call: Properly initialise DEFINE_STATIC_CALL_RET0()
  static_call: Don't make __static_call_return0 static
  x86,static_call: Fix __static_call_return0 for i386
2022-04-10 06:56:46 -10:00
Linus Torvalds 7136849ea9 - Use the correct static key checking primitive on the IRQ exit path
- Two fixes for the new forceidle balancer
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJSsskACgkQEsHwGGHe
 VUqHRg//UvcM+Ygrx17yhWzHAKi5Mm5sVxSbsY08WU9cQsAQTZ4c9I8Rs2OHkyKC
 8k0uQDxrrOeJRcuJoP87pRPqefvep+Gk1873KEvBRUe7+sQATGO6KMshoiPrnYO5
 kQzd98rK9vrNu5ZwZFWADnCAYco+5nlsyFVXkRY0ZXhDUvfInus2OLkAm4ULXrt1
 FVtO69QXDK3y42NzLSPDCoQyeM/bcCCts6wpR2WWHE2F9zD8tiM8A4DBNpu6Iker
 wli2la27V4U0236ar7Md2HwD8AkQUuOSGYh9JD5RBNZjJpoHKPNNIv35dI7r9yXz
 6/r52pM+idMmfV7MjDWg7cIyHgJKbfBJ54+ibvoqd3Gi5R9IZLJOXGEQaaPLaMT0
 7movvEm/NDzHbQDGKmPoiRr4PYinRKFN85zTuzirscbHInMkmshciHmK2TG9Qt9m
 2L5DG/LnA0EQkhFyrGoxXTgnZwGWpmpWu7tRZfFTUsjbri4CRGThmQYpl7tAEzF7
 TC60WA2RfYXaJgtguZJZfiHXSYfzQriXLd4Mj1WRv6FU+IKedZmAPgjYO2dKu++y
 We8ZOER8Ysy2lJR/DDQ0waDp5UrTarX/WCFzIWNLKcFLvgLEPKrfeO8AtFHDVUZd
 58g9DCn1Jed8ZEYwxpPYpbLVcqQ790oShlU+/EA+FwN5pehuJNU=
 =elMK
 -----END PGP SIGNATURE-----

Merge tag 'sched_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

 - Use the correct static key checking primitive on the IRQ exit path

 - Two fixes for the new forceidle balancer

* tag 'sched_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Fix compile error in dynamic_irqentry_exit_cond_resched()
  sched: Teach the forced-newidle balancer about CPU affinity limitation.
  sched/core: Fix forceidle balancing
2022-04-10 06:47:49 -10:00
Linus Torvalds 1862a69c91 perf tools fixes for v5.18: 1st batch
- Fix the clang command line option probing and remove some options to filter
   out, fixing the build with the latest clang versions.
 
 - Fix 'perf bench' futex and epoll benchmarks to deal with machines with more
   than 1K CPUs.
 
 - Fix 'perf test tsc' error message when not supported.
 
 - Remap perf ring buffer if there is no space for event, fixing perf usage
   in 32-bit ChromeOS.
 
 - Drop objdump stderr to avoid getting stuck waiting for stdout output in
   'perf annotate'.
 
 - Fix up garbled output by not showing unwind error messages when augmenting
   frame in best effort mode.
 
 - Fix perf's libperf_print callback, use the va_args eprintf() variant.
 
 - Sync vhost and arm64 cputype headers with the kernel sources.
 
 - Fix 'perf report --mem-mode' with ARM SPE.
 
 - Add missing external commands ('perf iostat', etc) to 'perf --list-cmds'.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYlIM+gAKCRCyPKLppCJ+
 J5l6AQCCY4co/6FBh8JMmMX4RVHAUriX0YfKTJfpeLU3nsiXPAD/TVqf1LOyYaPv
 /ZqJ8DwqvKr9nkUsf5kAOfPrDB/j/QQ=
 =0UV/
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.18-2022-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix the clang command line option probing and remove some options to
   filter out, fixing the build with the latest clang versions

 - Fix 'perf bench' futex and epoll benchmarks to deal with machines
   with more than 1K CPUs

 - Fix 'perf test tsc' error message when not supported

 - Remap perf ring buffer if there is no space for event, fixing perf
   usage in 32-bit ChromeOS

 - Drop objdump stderr to avoid getting stuck waiting for stdout output
   in 'perf annotate'

 - Fix up garbled output by not showing unwind error messages when
   augmenting frame in best effort mode

 - Fix perf's libperf_print callback, use the va_args eprintf() variant

 - Sync vhost and arm64 cputype headers with the kernel sources

 - Fix 'perf report --mem-mode' with ARM SPE

 - Add missing external commands ('iostat', etc) to 'perf --list-cmds'

* tag 'perf-tools-fixes-for-v5.18-2022-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf annotate: Drop objdump stderr to avoid getting stuck waiting for stdout output
  perf tools: Add external commands to list-cmds
  perf docs: Add perf-iostat link to manpages
  perf session: Remap buf if there is no space for event
  perf bench: Fix epoll bench to correct usage of affinity for machines with #CPUs > 1K
  perf bench: Fix futex bench to correct usage of affinity for machines with #CPUs > 1K
  perf tools: Fix perf's libperf_print callback
  perf: arm-spe: Fix perf report --mem-mode
  perf unwind: Don't show unwind error messages when augmenting frame pointer stack
  tools headers arm64: Sync arm64's cputype.h with the kernel sources
  perf test tsc: Fix error message when not supported
  perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
  perf python: Fix probing for some clang command line options
  tools build: Filter out options and warnings not supported by clang
  tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts
  tools include UAPI: Sync linux/vhost.h with the kernel sources
2022-04-09 18:45:10 -10:00
Linus Torvalds 94a4c2bb7a cxl + nvdimm fixes for v5.18-rc2
- Fix a compile error in the nvdimm unit tests
 
 - Fix a shadowed variable warning in the CXL PCI driver
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYlIY6QAKCRDfioYZHlFs
 Z0aeAQDDYcicYRhLZ3Ljbg6stitBIumpdVcKDHm4WkC9gbmB4QEArnXLpcHPWyAa
 zmgc1Yrp9gOnpNSRMog9Wc8NaR45KA8=
 =Ov5t
 -----END PGP SIGNATURE-----

Merge tag 'cxl+nvdimm-for-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull cxl and nvdimm fixes from Dan Williams:

 - Fix a compile error in the nvdimm unit tests

 - Fix a shadowed variable warning in the CXL PCI driver

* tag 'cxl+nvdimm-for-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  cxl/pci: Drop shadowed variable
  tools/testing/nvdimm: Fix security_init() symbol collision
2022-04-09 18:31:59 -10:00
Linus Torvalds fa3b895da8 gpio fixes for v5.18-rc2
- fix a race condition with consumers accessing the fields of GPIO IRQ chips
   before they're fully initialized
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmJR8OMACgkQEacuoBRx
 13LPfA//Qm/qAwlREBIVAhn/vdjAdLKM+JtMVRarI7V8RNQjEkuwbYFMissSMp44
 HxChhzaPMfmJh1kd0oa4t9GL34d83oI6Pa/vgqlcIYg5DjaeYD11wjiYKgE1Hsbs
 3t/s77pX3Swl0WNT/P7wl1nVjjbZsNZxS6tqCesvWCII5kQsTJSs4cuRDCYnxjgA
 3Xe1Dzs71c4ypSbXPJJ8LGxOi3Y2/fOG3M5Jc8MUO0CAY+B4byZopH5yaSurRWcO
 9rGQa7hfbgxfVkqpRgiFk9Vny/laoZQ7Hf1sTotXYjsOs5wa/mi8Zd6mu1X9/gYl
 Wr2g3VnpuFkfJSu3igxc+o2iwLD2fyxD/+4sIkVPFhvgX3Z0tmlK8yTRQcULAUre
 zk9eoAsDkJNNXh6wMUJ9no4S0mdSg77TAuJvBZTC727U8I4+xGem1PSjWc6WUW1n
 IoyRCBGgME5qllsCknFGvYBBLMtbv/UsCNc+0l/9lX20+At2pDH82eSX7keKK49z
 MmSEIvFtSHNpja0RXeA6byr0V5i4+eyNDnFenApXxx9h4EkC+s/dDjZU/hbTF0TJ
 NpcUJIU4BmXwl6WXVDLEddvQ3pvDH3mAQY8L3uPn5LLgZLRRlsfJisH1r1FThRFU
 A/bPbqsqEWCTgLo6lEZCN/WfOXoD1hbBLwWM/axpQVuD0WXt0RM=
 =faCF
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fix from Bartosz Golaszewski:

 - fix a race condition with consumers accessing the fields of GPIO IRQ
   chips before they're fully initialized

* tag 'gpio-fixes-for-v5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: Restrict usage of GPIO chip irq members before initialization
2022-04-09 18:17:43 -10:00
Thomas Gleixner 63ef1a8a07 irqchip fixes for 5.18, take #1
- Fix GICv3 polling for RWP in redistributors
 
 - Reject ACPI attempts to use SGIs on GIC/GICv3
 
 - Fix unpredictable behaviour when making a VPE non-resident
   with GICv4
 
 - A couple of fixes for the newly merged qcom-mpm driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmJRUIQPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDJsYP/A1wjOMvcK2HHRDdKXgc65sFrv0uI4kZw1yX
 UxxRp/nwYo7uuz3L/k7D4/C96nMERP2rjiF2Y4NURLEhgVlh3iNK/lQWb9eVsN04
 jX33c3qOeH5kNAy89uYv1PnNy/QjjIm3LFh0lSXCLEDK0CV3l+LKcuGDHVn8pP0B
 2rd/N+s0Hc/SL2W/cV+pGkoJK9RGfWojC6SalFjFD/n8jE2jBNcMGeeZERdNhuVP
 r/ZwMEtM248KFyCX04QKidDexk37HZmrgq1K+RlPX60zqu/hcsDO27zF/qhru+U+
 Kr8WPLHsrzCo//9b5o8s/WTZE9QXA7QmTGum2KHfQazDug7YSk5Fgh1t7f1dSVAw
 Yh7cNtWXqa63I4woBlUtR8AImnIbM/a66Qeim+RZRl8QI5IMndgcLQaNu/IeoqVs
 wUNVX32mnQ0ee3Von/ADSBL+4ibqxQiw11ZXZU8/1Dc+teaMBshVdcHrug+iU7e8
 ftjwyf9IUsiITTNrExm6CDkzL27hbwHoHAD3eDkNisqTyJYmyWvL36kab88NgCAg
 Moc9WH2XkM4fZv0/9OADdyPJUOLtuZw85/45O2ydGVvpvW3GVmo06xt6P70P3/rq
 x2aU6n7CpYRrFZn7IL1hY8HBUXS+s/Ya65B7/L63yngyq8xAslkYOwcTx/XjHpEP
 EQt3/t9f
 =W3KX
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-fixes-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip fixes from Marc Zyngier:

 - Fix GICv3 polling for RWP in redistributors

 - Reject ACPI attempts to use SGIs on GIC/GICv3

 - Fix unpredictable behaviour when making a VPE non-resident
   with GICv4

 - A couple of fixes for the newly merged qcom-mpm driver

Link: https://lore.kernel.org/lkml/20220409094229.267649-1-maz@kernel.org
2022-04-09 22:21:55 +02:00
Ian Rogers 940a445a90 perf annotate: Drop objdump stderr to avoid getting stuck waiting for stdout output
If objdump writes to stderr, it can block waiting for that output to be
read. Because perf doesn't read stderr, progress stops with perf waiting
for stdout output.
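
One common way to avoid this kind of stall is to point the child's stderr at /dev/null, so that only stdout ever needs draining. The following is a minimal sketch of that idea and not necessarily how the patch implements it; `run_capture_stdout` is a hypothetical helper, not a perf function.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0], capture its stdout into out (NUL-terminated) and discard
 * its stderr. Returns the number of bytes captured, or -1 on error.
 */
ssize_t run_capture_stdout(char *const argv[], char *out, size_t len)
{
	int pipefd[2];
	ssize_t n, total = 0;
	pid_t pid;

	assert(argv && argv[0] && out);
	if (len == 0 || pipe(pipefd) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: stdout goes to the pipe, stderr to /dev/null. */
		int devnull = open("/dev/null", O_WRONLY);

		if (devnull < 0 || dup2(pipefd[1], STDOUT_FILENO) < 0 ||
		    dup2(devnull, STDERR_FILENO) < 0)
			_exit(127);
		close(pipefd[0]);
		close(pipefd[1]);
		close(devnull);
		execvp(argv[0], argv);
		_exit(127);
	}

	/* Parent: only stdout has to be read; stderr cannot fill up. */
	close(pipefd[1]);
	while ((size_t)total < len - 1 &&
	       (n = read(pipefd[0], out + total, len - 1 - total)) > 0)
		total += n;
	out[total] = '\0';
	close(pipefd[0]);
	waitpid(pid, NULL, 0);
	return total;
}
```

With this arrangement a child that prints warnings to stderr still runs to completion, and the caller sees only what was written to stdout.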

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Lexi Shao <shaolexi@huawei.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220407230503.1265036-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:21:00 -03:00
Michael Petlan 3e6b43beb7 perf tools: Add external commands to list-cmds
The `perf --list-cmds` output prints only internal commands, although
there is no reason for that from users' perspective.

Adding the external commands to commands array with NULL function
pointer allows printing all perf commands while not changing the logic
of command handler selection.
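
The approach can be pictured with a small table-driven sketch. The struct, table, and function names below (`demo_cmd`, `demo_cmds`, `list_cmds`, `find_builtin`) are hypothetical and only illustrate the idea; they are not perf's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct demo_cmd {
	const char *name;
	int (*fn)(int argc, const char **argv);	/* NULL: external command */
};

static int cmd_report(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 0;
}

static const struct demo_cmd demo_cmds[] = {
	{ "report", cmd_report },
	{ "iostat", NULL },	/* listed, but never dispatched from here */
};

#define NR_CMDS (sizeof(demo_cmds) / sizeof(demo_cmds[0]))

/* Listing walks every entry, whether or not it has a handler. */
size_t list_cmds(char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf && len > 0);
	buf[0] = '\0';
	for (i = 0; i < NR_CMDS; i++) {
		int n = snprintf(buf + used, len - used, "%s%s",
				 i ? " " : "", demo_cmds[i].name);

		if (n < 0 || (size_t)n >= len - used)
			break;
		used += n;
	}
	return used;
}

/* Handler lookup skips entries whose function pointer is NULL. */
const struct demo_cmd *find_builtin(const char *name)
{
	size_t i;

	for (i = 0; i < NR_CMDS; i++)
		if (demo_cmds[i].fn && !strcmp(demo_cmds[i].name, name))
			return &demo_cmds[i];
	return NULL;
}
```

Because the lookup ignores NULL handlers, an external command falls through to whatever path already ran it before it appeared in the table.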

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-2-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:21:00 -03:00
Michael Petlan 0ff26efe92 perf docs: Add perf-iostat link to manpages
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:20:59 -03:00
Denis Nikitin bc21e74d47 perf session: Remap buf if there is no space for event
If a perf event doesn't fit into the remaining buffer space, return NULL to
remap buf and fetch the event again.

Keep the logic to error out on inadequate input from fuzzing.

This fixes perf failing on ChromeOS (with 32b userspace):

  $ perf report -v -i perf.data
  ...
  prefetch_event: head=0x1fffff8 event->header_size=0x30, mmap_size=0x2000000: fuzzed or compressed perf.data?
  Error:
  failed to process sample

Fixes: 57fc032ad6 ("perf session: Avoid infinite loop when seeing invalid header.size")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Denis Nikitin <denik@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220330031130.2152327-1-denik@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:20:59 -03:00
Linus Torvalds e1f700ebd6 SCSI fixes on 20220409
38 Patches, two adding support for new devices (ufs, mvsas), a major
 set of six fixes in lpfc plus a huge chunk removal in pmcraid to get
 rid of a driver specific ioctl and a major rework of aha152x to get
 rid of the scsi_pointer.  The rest are minor fixes and obvious changes
 including several spelling updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYlGRXSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSAgAP42HLBW
 4cL+Mff4MRKYgS3AVWjJevI3m2mnXB6NQ6Xe/QD/beI0Ppx5s7q6VQvDvU/wvVYI
 tfne0SAy8Bi6V82cjlI=
 =LJ3r
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:

 - add support for new devices (ufs, mvsas)

 - a major set of fixes in lpfc

 - get rid of a driver specific ioctl in pmcraid

 - a major rework of aha152x to get rid of the scsi_pointer.

 - minor fixes and obvious changes including several spelling updates.

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (36 commits)
  scsi: megaraid_sas: Target with invalid LUN ID is deleted during scan
  scsi: ufs: ufshpb: Fix a NULL check on list iterator
  scsi: sd: Clean up gendisk if device_add_disk() failed
  scsi: message: fusion: Remove redundant variable dmp
  scsi: mvsas: Add PCI ID of RocketRaid 2640
  scsi: sd: sd_read_cpr() requires VPD pages
  scsi: mpt3sas: Fail reset operation if config request timed out
  scsi: sym53c500_cs: Stop using struct scsi_pointer
  scsi: ufs: ufs-pci: Add support for Intel MTL
  scsi: mpt3sas: Fix mpt3sas_check_same_4gb_region() kdoc comment
  scsi: scsi_debug: Fix sdebug_blk_mq_poll() in_use_bm bitmap use
  scsi: bnx2i: Fix spelling mistake "mis-match" -> "mismatch"
  scsi: bnx2fc: Fix spelling mistake "mis-match" -> "mismatch"
  scsi: zorro7xx: Fix a resource leak in zorro7xx_remove_one()
  scsi: aic7xxx: Use standard PCI subsystem, subdevice defines
  scsi: ufs: qcom: Drop custom Android boot parameters
  scsi: core: sysfs: Remove comments that conflict with the actual logic
  scsi: hisi_sas: Remove stray fallthrough annotation
  scsi: virtio-scsi: Eliminate anonymous module_init & module_exit
  scsi: isci: Fix spelling mistake "doesnt" -> "doesn't"
  ...
2022-04-09 06:05:46 -10:00
Athira Rajeev 299687e18a perf bench: Fix epoll bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench epoll' testcase fails on systems with more than 1K CPUs.

Testcase: perf bench epoll all

Result snippet:
<<>>
Run summary [PID 106497]: 1399 threads monitoring on 64 file-descriptors for 8 secs.

perf: pthread_create: No such file or directory
<<>>

In the epoll benchmarks (ctl, wait), pthread_create is invoked in do_threads
from the respective bench_epoll_* function. Though the logs show a direct
failure from pthread_create, the actual failure is from
"sched_setaffinity" returning EINVAL (invalid argument).

This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.

Patch addresses this by fixing all the epoll benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.
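
The macro usage described above can be sketched as follows. This is an illustration under assumptions, not the benchmark code itself: `alloc_mask_with_cpu` is a hypothetical helper, and the CPU_*_S macros are the glibc ones made visible by `_GNU_SOURCE`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Allocate a CPU mask large enough for nr_cpus and set one CPU in it.
 * Stores the mask size in bytes in *sizep. Returns NULL on allocation
 * failure; the caller releases the mask with CPU_FREE().
 */
cpu_set_t *alloc_mask_with_cpu(int nr_cpus, int cpu, size_t *sizep)
{
	cpu_set_t *mask;
	size_t size;

	assert(sizep && cpu >= 0 && cpu < nr_cpus);

	mask = CPU_ALLOC(nr_cpus);
	if (!mask)
		return NULL;

	size = CPU_ALLOC_SIZE(nr_cpus);
	CPU_ZERO_S(size, mask);		/* dynamically sized masks start uninitialised */
	CPU_SET_S(cpu, size, mask);

	*sizep = size;
	return mask;
}
```

The returned mask and size would then be passed together to an affinity call such as sched_setaffinity(pid, size, mask), in place of a fixed sizeof(cpu_set_t).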

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Athira Rajeev c9c2a427dd perf bench: Fix futex bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench futex' testcase fails on systems with more than 1K CPUs.

Testcase: perf bench futex all

Failure snippet:
<<>>Running futex/hash benchmark...

perf: pthread_create: No such file or directory
<<>>

In all the futex benchmarks (i.e. hash, lock-api, requeue, wake,
wake-parallel), pthread_create is invoked in the respective bench_futex_*
function. Though the logs show a direct failure from pthread_create,
strace logs showed that the actual failure is from "sched_setaffinity"
returning EINVAL (invalid argument).

This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.

Patch addresses this by fixing all the futex benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Adrian Hunter aeee9dc53c perf tools: Fix perf's libperf_print callback
eprintf() does not expect va_list as the type of the 4th parameter.

Use veprintf() because it does.
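
The distinction is the usual one between a variadic function and its va_list-taking counterpart, as with snprintf() and vsnprintf(). A minimal standalone sketch follows; `log_fmt`, `log_vfmt`, `print_cb`, and `library_print` are hypothetical stand-ins and do not reproduce the signatures of perf's eprintf() and veprintf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* v-variant: consumes an already-started va_list, like vsnprintf(). */
int log_vfmt(char *buf, size_t len, const char *fmt, va_list ap)
{
	assert(buf && fmt);
	return vsnprintf(buf, len, fmt, ap);
}

/* Variadic front end: builds the va_list itself, then forwards it. */
int log_fmt(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = log_vfmt(buf, len, fmt, ap);
	va_end(ap);
	return ret;
}

/*
 * Print callback that receives a va_list. It has to call the v-variant;
 * log_fmt(buf, len, fmt, ap) would treat ap as one ordinary argument.
 */
int print_cb(char *buf, size_t len, const char *fmt, va_list ap)
{
	return log_vfmt(buf, len, fmt, ap);
}

/* Stand-in for a library that invokes the callback with a va_list. */
int library_print(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = print_cb(buf, len, fmt, ap);
	va_end(ap);
	return ret;
}
```

Passing a va_list where the callee expects individual variadic arguments compiles without complaint in many cases, which is why this class of bug tends to surface only as garbled output at run time.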

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 428dab813a ("libperf: Merge libperf_set_print() into libperf_init()")
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220408132625.2451452-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00