Commit Graph

11466 Commits

Author SHA1 Message Date
Linus Torvalds b066935bf8 ARM:
* Address some fallout of the locking rework, this time affecting
   the way the vgic is configured
 
 * Fix an issue where the page table walker frees a subtree and
   then proceeds with walking what it has just freed...
 
 * Check that a given PA donated to the guest is actually memory
   (only affecting pKVM)
 
 * Correctly handle MTE CMOs by Set/Way
 
 * Fix the reported address of a watchpoint forwarded to userspace
 
 * Fix the freeing of the root of stage-2 page tables
 
 * Stop creating spurious PMU events to perform detection of the
   default PMU and use the existing PMU list instead.
 
 x86:
 
 * Fix a memslot lookup bug in the NX recovery thread that could
    theoretically let userspace bypass the NX hugepage mitigation
 
 * Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
 
 * Account exit stats for fastpath VM-Exits that never leave the super
   tight run-loop
 
 * Fix an out-of-bounds bug in the optimized APIC map code, and add a
   regression test for the race.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmR7k1QUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNblwf/faUVOBMv7mQBGsGa7FNcmaNhYeIT
 U1k4pFNlo7dNNuNJrGdpo+sOGP5A8CRLNSVvlyjgCHF1Qc9gVtXNvZ9PnA6nAYmB
 qqvUz/TDw9/NLTlJEkbSs05B4am4yfd5pV6R/32jrPIbXOW++6ae2LpILS/NPBrB
 y0tGiVUJrO3zVXdBKa4PFmlO8jsXPmMEiicEJa5v2Boeo5SFyFfErw9zDNwSMsQc
 27bzbs3O2daXTNMFnwVCCpWUxt1EqWYUXGvBjsChAUI0K10F2/GW9f6YeFsGXqKI
 d8g1QuCukSt/CvN0pT+g/540mR6i0Azpek1myQfuCu2IhQ1jCJaSWOjoEw==
 =8VrO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Address some fallout of the locking rework, this time affecting the
     way the vgic is configured

   - Fix an issue where the page table walker frees a subtree and then
     proceeds with walking what it has just freed...

   - Check that a given PA donated to the guest is actually memory (only
     affecting pKVM)

   - Correctly handle MTE CMOs by Set/Way

   - Fix the reported address of a watchpoint forwarded to userspace

   - Fix the freeing of the root of stage-2 page tables

   - Stop creating spurious PMU events to perform detection of the
     default PMU and use the existing PMU list instead

  x86:

   - Fix a memslot lookup bug in the NX recovery thread that could
     theoretically let userspace bypass the NX hugepage mitigation

   - Fix a s/BLOCKING/PENDING bug in SVM's vNMI support

   - Account exit stats for fastpath VM-Exits that never leave the super
     tight run-loop

   - Fix an out-of-bounds bug in the optimized APIC map code, and add a
     regression test for the race"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Add test for race in kvm_recalculate_apic_map()
  KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
  KVM: x86: Account fastpath-only VM-Exits in vCPU stats
  KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
  KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
  KVM: arm64: Document default vPMU behavior on heterogeneous systems
  KVM: arm64: Iterate arm_pmus list to probe for default PMU
  KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()
  KVM: arm64: Populate fault info for watchpoint
  KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
  KVM: arm64: Handle trap of tagged Set/Way CMOs
  arm64: Add missing Set/Way CMO encodings
  KVM: arm64: Prevent unconditional donation of unmapped regions from the host
  KVM: arm64: vgic: Fix a comment
  KVM: arm64: vgic: Fix locking comment
  KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
  KVM: arm64: vgic: Fix a circular locking issue
2023-06-04 07:16:53 -04:00
Linus Torvalds 51f269a6ec Probes fixes for 6.4-rc4:
- Return NULL if the trace_probe list on trace_probe_event is empty.
 
 - selftests/ftrace: Choose testing symbol name for filtering feature
   from sample data instead of fixed symbol.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmR640AACgkQ2/sHvwUr
 PxugGgf/YwwocmUqiEtTukTB7fzoAjYyQXr0YaJM+DjeZXMqAJ4dl9tV1/AmAL4j
 iWtZd53aolTym/3P2VADfSc4xiyWjFdkYv7zRPjpqfMg3XsELJgshwz+12dmmMdx
 0uco1l2/Ge3JNPK6BuWaO3V44QjoPSgiRsmxxKLh5K7M9V5swL7fadoLtins1B0r
 TVVqdyEHQkZLTByexg7wHYd/ro+4lexv1yhvyP4rEmYRPDoR56eOF2zwcQMHPvaY
 qstdP2ce6m5rG0gp4TsY7oRkezb64y903hNQuumoU6VR9nI3IK4PZjuX5/xns2By
 G9mRaOqb02+UmP+HhX4QGmr92G9Vyw==
 =o07w
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - Return NULL if the trace_probe list on trace_probe_event is empty

 - selftests/ftrace: Choose testing symbol name for filtering feature
   from sample data instead of fixed symbol

* tag 'probes-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  selftests/ftrace: Choose target function for filter test from samples
  tracing/probe: trace_probe_primary_from_call(): checked list_first_entry
2023-06-03 08:23:16 -04:00
Masami Hiramatsu (Google) eb50d0f250 selftests/ftrace: Choose target function for filter test from samples
Since the event-filter-function.tc expects the 'exit_mmap()' directly
calls 'kmem_cache_free()', this is vulnerable to code modifications.

Choose the target function for the filter test from the sample
event data so that it can keep test running correctly even if the caller
function name will be changed.

Link: https://lore.kernel.org/linux-trace-kernel/167919441260.1922645.18355804179347364057.stgit@mhiramat.roam.corp.google.com/

Link: https://lore.kernel.org/all/CA+G9fYtF-XEKi9YNGgR=Kf==7iRb2FrmEC7qtwAeQbfyah-UhA@mail.gmail.com/
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Fixes: 7f09d639b8 ("tracing/selftests: Add test for event filtering on function name")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-03 15:48:22 +09:00
Michal Luczaj 47d2804bc9 KVM: selftests: Add test for race in kvm_recalculate_apic_map()
Keep switching between LAPIC_MODE_X2APIC and LAPIC_MODE_DISABLED during
APIC map construction to hunt for TOCTOU bugs in KVM.  KVM's optimized map
recalc makes multiple passes over the list of vCPUs, and the calculations
ignore vCPU's whose APIC is hardware-disabled, i.e. there's a window where
toggling LAPIC_MODE_DISABLED is quite interesting.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230602233250.1014316-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-06-02 17:21:06 -07:00
Linus Torvalds 714069daa5 Fairly standard-sized batch of fixes, accounting for the lack of
sub-tree submissions this week. The mlx5 IRQ fixes are notable,
 people were complaining about that. No fires burning.
 
 Current release - regressions:
 
  - eth: mlx5e:
    - multiple fixes for dynamic IRQ allocation
    - prevent encap offload when neigh update is running
 
  - eth: mana: fix perf regression: remove rx_cqes, tx_cqes counters
 
 Current release - new code bugs:
 
  - eth: mlx5e: DR, add missing mutex init/destroy in pattern manager
 
 Previous releases - always broken:
 
  - tcp: deny tcp_disconnect() when threads are waiting
 
  - sched: prevent ingress Qdiscs from getting installed in random
    locations in the hierarchy and moving around
 
  - sched: flower: fix possible OOB write in fl_set_geneve_opt()
 
  - netlink: fix NETLINK_LIST_MEMBERSHIPS length report
 
  - udp6: fix race condition in udp6_sendmsg & connect
 
  - tcp: fix mishandling when the sack compression is deferred
 
  - rtnetlink: validate link attributes set at creation time
 
  - mptcp: fix connect timeout handling
 
  - eth: stmmac: fix call trace when stmmac_xdp_xmit() is invoked
 
  - eth: amd-xgbe: fix the false linkup in xgbe_phy_status
 
  - eth: mlx5e:
    - fix corner cases in internal buffer configuration
    - drain health before unregistering devlink
 
  - usb: qmi_wwan: set DTR quirk for BroadMobi BM818
 
 Misc:
 
  - tcp: return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmR42fUACgkQMUZtbf5S
 IrtD7w//R4zkxkNpT/pTmBsbm4zA7MSttPEW07HPTBXcMBbJMPV3pEyo18Pezm/a
 kJLO+2mvNTz/S74WAU0H2M3ux2I+Uc/srRXroff52ttCeV7mO/OHPAYBna0PqDoq
 O5A+laSGxaq3ulmLJdbE0SMhrH4t6iCeelEFUw03q49XhEKQlfgSHW4+lws16ffE
 Togxy1Iip8PXCMYXqWb1Hc0MFqF+2MdPm8YVaIfYRuFrI56apdknywrKuHYwk7kl
 gsy8hKHRJDRQy5RjmHDgbsLm4RCr2abHOe2mYwtdGAXWfvdRUI4HGeFYtc/b5i32
 55WAkegBzZPVkql7OLQM1N0hZYjz7JdAV/MiT8Taf8kSJGmLNbQxgJxgF8t8xj8S
 9ddUGSuxOhKBU3jeen6zjenVVbKWXXHDHICS3T5j/rlGUeeUmfgmLyvZ3PfYQeHA
 gibfLsq2KQqYQDKESO97hfORXnBV+dP0xXPMNcjKLNx+NnmhmaCI4vVYPfLc7X02
 1XqApdfhR/CK2ytNTATDVQdP72xyhtuqRTuhWEYi0q5UYQamTCXt+mDG7DKXfCj5
 cyJOcFNOJQH14XKiiAxf0IkD2VimuPVYxuhebTu79YQTXSqf6sUw67pLJ7i8LhUW
 tQMM4qOCcIH7y1qJvHbUFDa/4fK/+HnnmC2H3LH/Q8wtpZLKgRU=
 =o77c
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Happy Wear a Dress Day.

  Fairly standard-sized batch of fixes, accounting for the lack of
  sub-tree submissions this week. The mlx5 IRQ fixes are notable, people
  were complaining about that. No fires burning.

  Current release - regressions:

   - eth: mlx5e:
      - multiple fixes for dynamic IRQ allocation
      - prevent encap offload when neigh update is running

   - eth: mana: fix perf regression: remove rx_cqes, tx_cqes counters

  Current release - new code bugs:

   - eth: mlx5e: DR, add missing mutex init/destroy in pattern manager

  Previous releases - always broken:

   - tcp: deny tcp_disconnect() when threads are waiting

   - sched: prevent ingress Qdiscs from getting installed in random
     locations in the hierarchy and moving around

   - sched: flower: fix possible OOB write in fl_set_geneve_opt()

   - netlink: fix NETLINK_LIST_MEMBERSHIPS length report

   - udp6: fix race condition in udp6_sendmsg & connect

   - tcp: fix mishandling when the sack compression is deferred

   - rtnetlink: validate link attributes set at creation time

   - mptcp: fix connect timeout handling

   - eth: stmmac: fix call trace when stmmac_xdp_xmit() is invoked

   - eth: amd-xgbe: fix the false linkup in xgbe_phy_status

   - eth: mlx5e:
      - fix corner cases in internal buffer configuration
      - drain health before unregistering devlink

   - usb: qmi_wwan: set DTR quirk for BroadMobi BM818

  Misc:

   - tcp: return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if
     user_mss set"

* tag 'net-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
  mptcp: fix active subflow finalization
  mptcp: add annotations around sk->sk_shutdown accesses
  mptcp: fix data race around msk->first access
  mptcp: consolidate passive msk socket initialization
  mptcp: add annotations around msk->subflow accesses
  mptcp: fix connect timeout handling
  rtnetlink: add the missing IFLA_GRO_ tb check in validate_linkmsg
  rtnetlink: move IFLA_GSO_ tb check to validate_linkmsg
  rtnetlink: call validate_linkmsg in rtnl_create_link
  ice: recycle/free all of the fragments from multi-buffer frame
  net: phy: mxl-gpy: extend interrupt fix to all impacted variants
  net: renesas: rswitch: Fix return value in error path of xmit
  net: dsa: mv88e6xxx: Increase wait after reset deactivation
  net: ipa: Use correct value for IPA_STATUS_SIZE
  tcp: fix mishandling when the sack compression is deferred.
  net/sched: flower: fix possible OOB write in fl_set_geneve_opt()
  sfc: fix error unwinds in TC offload
  net/mlx5: Read embedded cpu after init bit cleared
  net/mlx5e: Fix error handling in mlx5e_refresh_tirs
  net/mlx5: Ensure af_desc.mask is properly initialized
  ...
2023-06-01 17:29:18 -04:00
Matthieu Baerts 63212608a9 selftests: mptcp: userspace pm: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 259a834fad ("selftests: mptcp: functional tests for the userspace PM type")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:03 +02:00
Matthieu Baerts cf6f0fda7a selftests: mptcp: sockopt: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: dc65fe82fb ("selftests: mptcp: add packet mark test case")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:03 +02:00
Matthieu Baerts 9161f21c74 selftests: mptcp: simult flows: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 1a418cb8e8 ("mptcp: simult flow self-tests")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:03 +02:00
Matthieu Baerts 46565acdd2 selftests: mptcp: diag: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: df62f2ec3d ("selftests/mptcp: add diag interface tests")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts 715c78a82e selftests: mptcp: join: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: b08fbf2410 ("selftests: add test-cases for MPTCP MP_JOIN")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts 0f4955a40d selftests: mptcp: pm nl: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: eedbc68532 ("selftests: add PM netlink functional tests")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts d83013bdf9 selftests: mptcp: connect: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped". Note that this check can also
mark the test as failed if 'SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES' env
var is set to 1: by doing that, we can make sure a test is not being
skipped by mistake.

A new shared file is added here to be able to re-used the same check in
the different selftests we have.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 048d19d444 ("mptcp: add basic kselftest for mptcp")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts d328fe8706 selftests: mptcp: join: avoid using 'cmp --bytes'
BusyBox's 'cmp' command doesn't support the '--bytes' parameter.

Some CIs -- i.e. LKFT -- use BusyBox and have the mptcp_join.sh test
failing [1] because their 'cmp' command doesn't support this '--bytes'
option:

    cmp: unrecognized option '--bytes=1024'
    BusyBox v1.35.0 () multi-call binary.

    Usage: cmp [-ls] [-n NUM] FILE1 [FILE2]

Instead, 'head --bytes' can be used as this option is supported by
BusyBox. A temporary file is needed for this operation.

Because it is apparently quite common to use BusyBox, it is certainly
better to backport this fix to impacted kernels.

Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Link: https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.3-rc5-5-g148341f0a2f5/testrun/16088933/suite/kselftest-net-mptcp/test/net_mptcp_userspace_pm_sh/log [1]
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Linus Torvalds 8b817fded4 Tracing fixes:
User events:
 
  - Use long instead of int for storing the enable set/clear bit, as it was
    found that big endian machines could end up using the wrong bits.
 
  - Split allocating mm and attaching it. This keeps the allocation separate
    from the registration and avoids various races.
 
  - Remove RCU locking around pin_user_pages_remote() as that can schedule. The
    RCU protection is no longer needed with the above split of mm allocation and
    attaching.
 
  - Rename the "link" fields of the various structs to something more
    meaningful.
 
  - Add comments around user_event_mm struct usage and locking requirements.
 
 Timerlat tracer:
 
  - Fix missed wakeup of timerlat thread caused by the timerlat interrupt
    triggering when tracing is off. The timer interrupt handler needs to always
    wake up the timerlat thread regardless if tracing is enabled or not,
    otherwise, it will never wake up.
 
 Histograms:
 
  - Fix regression of breaking the "stacktrace" modifier for variables. That
    modifier cannot be used for values, but can be used for variables that are
    passed from one histogram to the next. This was broken when adding the
    restriction to values as the variable logic used the same code.
 
  - Rename the special field "stacktrace" to "common_stacktrace". Special fields
    (that are not actually part of the event, but can act just like event
    fields, like 'comm' and 'timestamp') should be prefixed with 'common_' for
    consistency. To keep backward compatibility, 'stacktrace' can still be used
    (as with the special field 'cpu'), but can be overridden if the event has a
    field called 'stacktrace'.
 
  - Update the synthetic event selftests to use the new name (synthetic events
    are created by histograms)
 
 Tracing bootup selftests:
 
  - Reorganize the code to keep artifacts of the selftests not compiled in when
    selftests are not configured.
 
  - Add various cond_resched() around the selftest code, as the softlock
    watchdog was triggering much more often. It appears that the kernel runs
    slower now with full debugging enabled.
 
  - While debugging ftrace with ftrace (using an instance ring buffer instead of
    the top level one), I found that the selftests were disabling prints to the
    debug instance. This should not happen, as the selftests only disable
    printing to the main buffer as the selftests examine the main buffer to see
    if it has what it expects, and prints can make the tests fail. Make the
    selftests only disable printing to the toplevel buffer, and leave the
    instance buffers alone.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZHQGJBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qu6hAQCJ1WebZUTJ/s7pFo36mXirLnrW4afB
 Ua6sALseqKNesgEAyhLmd2+sMeqmAbCCIUWtcWJb/Pod0jGOt0U8+cBxfw8=
 =PhaX
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:
 "User events:

   - Use long instead of int for storing the enable set/clear bit, as it
     was found that big endian machines could end up using the wrong
     bits.

   - Split allocating mm and attaching it. This keeps the allocation
     separate from the registration and avoids various races.

   - Remove RCU locking around pin_user_pages_remote() as that can
     schedule. The RCU protection is no longer needed with the above
     split of mm allocation and attaching.

   - Rename the "link" fields of the various structs to something more
     meaningful.

   - Add comments around user_event_mm struct usage and locking
     requirements.

  Timerlat tracer:

   - Fix missed wakeup of timerlat thread caused by the timerlat
     interrupt triggering when tracing is off. The timer interrupt
     handler needs to always wake up the timerlat thread regardless if
     tracing is enabled or not, otherwise, it will never wake up.

  Histograms:

   - Fix regression of breaking the "stacktrace" modifier for variables.
     That modifier cannot be used for values, but can be used for
     variables that are passed from one histogram to the next. This was
     broken when adding the restriction to values as the variable logic
     used the same code.

   - Rename the special field "stacktrace" to "common_stacktrace".

     Special fields (that are not actually part of the event, but can
     act just like event fields, like 'comm' and 'timestamp') should be
     prefixed with 'common_' for consistency. To keep backward
     compatibility, 'stacktrace' can still be used (as with the special
     field 'cpu'), but can be overridden if the event has a field called
     'stacktrace'.

   - Update the synthetic event selftests to use the new name (synthetic
     events are created by histograms)

  Tracing bootup selftests:

   - Reorganize the code to keep artifacts of the selftests not compiled
     in when selftests are not configured.

   - Add various cond_resched() around the selftest code, as the
     softlock watchdog was triggering much more often. It appears that
     the kernel runs slower now with full debugging enabled.

   - While debugging ftrace with ftrace (using an instance ring buffer
     instead of the top level one), I found that the selftests were
     disabling prints to the debug instance.

     This should not happen, as the selftests only disable printing to
     the main buffer as the selftests examine the main buffer to see if
     it has what it expects, and prints can make the tests fail.

     Make the selftests only disable printing to the toplevel buffer,
     and leave the instance buffers alone"

* tag 'trace-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Have function_graph selftest call cond_resched()
  tracing: Only make selftest conditionals affect the global_trace
  tracing: Make tracing_selftest_running/delete nops when not used
  tracing: Have tracer selftests call cond_resched() before running
  tracing: Move setting of tracing_selftest_running out of register_tracer()
  tracing/selftests: Update synthetic event selftest to use common_stacktrace
  tracing: Rename stacktrace field to common_stacktrace
  tracing/histograms: Allow variables to have some modifiers
  tracing/user_events: Document user_event_mm one-shot list usage
  tracing/user_events: Rename link fields for clarity
  tracing/user_events: Remove RCU lock while pinning pages
  tracing/user_events: Split up mm alloc and attach
  tracing/timerlat: Always wakeup the timerlat thread
  tracing/user_events: Use long vs int for atomic bit ops
2023-05-29 07:20:13 -04:00
Linus Torvalds 91a304340a gpio fixes for v6.4-rc4
- fix incorrect output in in-tree gpio tools
 - fix a shell coding issue in gpio-sim selftests
 - correctly set the permissions for debugfs attributes exposed by gpio-mockup
 - fix chip name and pin count in gpio-f7188x for one of the supported models
 - fix numberspace pollution when using dynamically and statically allocated
   GPIOs together
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmRws3YACgkQEacuoBRx
 13JHlhAAkyUNjtvkcTdO9buB2AXB3e/JH31MgSEEzoAi9kr54n70D+mtmSz5iX0X
 4a1KjZrd5Gd7nSGA3AHll7jxL1Rfa4vJsTID/khfhIPB0arGl7brZ0J2fItRUGqz
 oxAvpH3OpSWI+QWQMzp/NEZV0F5rfWBme93w6OqYItzGNr4LfP+3x49zPDUngOjp
 d3EsDzQbgxezYvrmWjPvZuZvk3RYud+dfU8bjvtnI6TJZqIxzePmcqEAFgqsnDJF
 VLEJQFPGpLaX7U+ZepcAQAs2s5ggk6Mb0JdGKp1w9hQ9w91TClBUgUURG6hzdLmZ
 w6/X3iGaFGaHugA4DmWhYd4FWE7H58dWazw/RVK4RM2ZGZ9cT8R8OVQlDsBhesyG
 tOTY4RJlU1DgMt66nlUvf8r3ECdm2ayvFq7qSN9xRDQ0W4Ti3+QU2+/re3s8jevR
 VYAo1BHtVai28+lCMjopK+fYBe8G4DqsGx9Uh3y0RxYIFvMKdo4EQF9Ku3yTEcq3
 3FNFO00KU5ktKgnx7z6bnO7+F3OvHUeSRh44U8O6pm/odfdkUn7CHjFVi2dM6yO5
 763UKwMZq9rDT2owdwljolIRECb8IAN0siBhhHYYW4oAFTD2w2vTQ/y2KYrxbntv
 CptwauQMM9vjzg9bxMP+4R5YVZhZhtJ4I1oa2kQx7eXIPBy/6OM=
 =R4dV
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix incorrect output in in-tree gpio tools

 - fix a shell coding issue in gpio-sim selftests

 - correctly set the permissions for debugfs attributes exposed by
   gpio-mockup

 - fix chip name and pin count in gpio-f7188x for one of the supported
   models

 - fix numberspace pollution when using dynamically and statically
   allocated GPIOs together

* tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio-f7188x: fix chip name and pin count on Nuvoton chip
  gpiolib: fix allocation of mixed dynamic/static GPIOs
  gpio: mockup: Fix mode of debugfs files
  selftests: gpio: gpio-sim: Fix BUG: test FAILED due to recent change
  tools: gpio: fix debounce_period_us output of lsgpio
2023-05-26 13:29:16 -07:00
Jakub Kicinski 0c615f1cc3 bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZG4AiAAKCRDbK58LschI
 g+xlAQCmefGbDuwPckZLnomvt6gl4bkIjs7kc1ySbG9QBnaInwD/WyrJaQIPijuD
 qziHPAyx+MEgPseFU1b7Le35SZ66IwM=
 =s4R1
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2023-05-24

We've added 19 non-merge commits during the last 10 day(s) which contain
a total of 20 files changed, 738 insertions(+), 448 deletions(-).

The main changes are:

1) Batch of BPF sockmap fixes found when running against NGINX TCP tests,
   from John Fastabend.

2) Fix a memleak in the LRU{,_PERCPU} hash map when bucket locking fails,
   from Anton Protopopov.

3) Init the BPF offload table earlier than just late_initcall,
   from Jakub Kicinski.

4) Fix ctx access mask generation for 32-bit narrow loads of 64-bit fields,
   from Will Deacon.

5) Remove a now unsupported __fallthrough in BPF samples,
   from Andrii Nakryiko.

6) Fix a typo in pkg-config call for building sign-file,
   from Jeremy Sowden.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, sockmap: Test progs verifier error with latest clang
  bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with drops
  bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer
  bpf, sockmap: Test shutdown() correctly exits epoll and recv()=0
  bpf, sockmap: Build helper to create connected socket pair
  bpf, sockmap: Pull socket helpers out of listen test for general use
  bpf, sockmap: Incorrectly handling copied_seq
  bpf, sockmap: Wake up polling after data copy
  bpf, sockmap: TCP data stall on recv before accept
  bpf, sockmap: Handle fin correctly
  bpf, sockmap: Improved check for empty queue
  bpf, sockmap: Reschedule is now done through backlog
  bpf, sockmap: Convert schedule_work into delayed_work
  bpf, sockmap: Pass skb ownership through read_skb
  bpf: fix a memory leak in the LRU and LRU_PERCPU hash maps
  bpf: Fix mask generation for 32-bit narrow loads of 64-bit fields
  samples/bpf: Drop unnecessary fallthrough
  bpf: netdev: init the offload table earlier
  selftests/bpf: Fix pkg-config call building sign-file
====================

Link: https://lore.kernel.org/r/20230524170839.13905-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 21:57:57 -07:00
Steven Rostedt (Google) f1aab36343 tracing/selftests: Update synthetic event selftest to use common_stacktrace
With the rename of the stacktrace field to common_stacktrace, update the
selftests to reflect this change. Copy the current selftest to test the
backward compatibility "stacktrace" keyword. Also the "requires" of that
test was incorrect, so it would never actually ran before. That is fixed
now.

Link: https://lore.kernel.org/linux-trace-kernel/20230523225402.55951f2f@rorschach.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-05-23 23:47:49 -04:00
John Fastabend f726e03564 bpf, sockmap: Test progs verifier error with latest clang
With a relatively recent clang (7090c10273119) and with this commit
to fix warnings in selftests (c8ed668593) that uses __sink(err)
to resolve unused variables. We get the following verifier error.

root@6e731a24b33a:/host/tools/testing/selftests/bpf# ./test_sockmap
libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied
libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; op = (int) skops->op;
0: (61) r2 = *(u32 *)(r1 +0)          ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; switch (op) {
1: (16) if w2 == 0x4 goto pc+5        ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (56) if w2 != 0x5 goto pc+15       ; R2_w=5
; lport = skops->local_port;
3: (61) r2 = *(u32 *)(r1 +68)         ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; if (lport == 10000) {
4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
; __sink(err);
18: (bc) w1 = w0
R0 !read_ok
processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
-- END PROG LOAD LOG --
libbpf: prog 'bpf_sockmap': failed to load: -13
libbpf: failed to load object 'test_sockmap_kern.bpf.o'
load_bpf_file: (-1) No such file or directory
ERROR: (-1) load bpf failed
libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied
libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; op = (int) skops->op;
0: (61) r2 = *(u32 *)(r1 +0)          ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; switch (op) {
1: (16) if w2 == 0x4 goto pc+5        ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (56) if w2 != 0x5 goto pc+15       ; R2_w=5
; lport = skops->local_port;
3: (61) r2 = *(u32 *)(r1 +68)         ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; if (lport == 10000) {
4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
; __sink(err);
18: (bc) w1 = w0
R0 !read_ok
processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
-- END PROG LOAD LOG --
libbpf: prog 'bpf_sockmap': failed to load: -13
libbpf: failed to load object 'test_sockhash_kern.bpf.o'
load_bpf_file: (-1) No such file or directory
ERROR: (-1) load bpf failed
libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied
libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; op = (int) skops->op;
0: (61) r2 = *(u32 *)(r1 +0)          ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; switch (op) {
1: (16) if w2 == 0x4 goto pc+5        ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (56) if w2 != 0x5 goto pc+15       ; R2_w=5
; lport = skops->local_port;
3: (61) r2 = *(u32 *)(r1 +68)         ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; if (lport == 10000) {
4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
; __sink(err);
18: (bc) w1 = w0
R0 !read_ok
processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
-- END PROG LOAD LOG --

To fix simply remove the err value because its not actually used anywhere
in the testing. We can investigate the root cause later. Future patch should
probably actually test the err value as well. Although if the map updates
fail they will get caught eventually by userspace.

Fixes: c8ed668593 ("selftests/bpf: fix lots of silly mistakes pointed out by compiler")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-15-john.fastabend@gmail.com
2023-05-23 16:11:27 +02:00
John Fastabend 80e24d2226 bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with drops
When BPF program drops pkts the sockmap logic 'eats' the packet and
updates copied_seq. In the PASS case where the sk_buff is accepted
we update copied_seq from recvmsg path so we need a new test to
handle the drop case.

Original patch series broke this resulting in

test_sockmap_skb_verdict_fionread:PASS:ioctl(FIONREAD) error 0 nsec
test_sockmap_skb_verdict_fionread:FAIL:ioctl(FIONREAD) unexpected ioctl(FIONREAD): actual 1503041772 != expected 256

After updated patch with fix.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-14-john.fastabend@gmail.com
2023-05-23 16:11:20 +02:00
John Fastabend bb516f98c7 bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer
A bug was reported where ioctl(FIONREAD) returned zero even though the
socket with a SK_SKB verdict program attached had bytes in the msg
queue. The result is programs may hang or more likely try to recover,
but use suboptimal buffer sizes.

Add a test to check that ioctl(FIONREAD) returns the correct number of
bytes.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-13-john.fastabend@gmail.com
2023-05-23 16:11:13 +02:00
John Fastabend 1fa1fe8ff1 bpf, sockmap: Test shutdown() correctly exits epoll and recv()=0
When session gracefully shutdowns epoll needs to wake up and any recv()
readers should return 0 not the -EAGAIN they previously returned.

Note we use epoll instead of select to test the epoll wake on shutdown
event as well.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-12-john.fastabend@gmail.com
2023-05-23 16:11:05 +02:00
John Fastabend 298970c8af bpf, sockmap: Build helper to create connected socket pair
A common operation for testing is to spin up a pair of sockets that are
connected. Then we can use these to run specific tests that need to
send data, check BPF programs and so on.

The sockmap_listen programs already have this logic lets move it into
the new sockmap_helpers header file for general use.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-11-john.fastabend@gmail.com
2023-05-23 16:10:58 +02:00
John Fastabend 4e02588d9a bpf, sockmap: Pull socket helpers out of listen test for general use
No functional change here we merely pull the helpers in sockmap_listen.c
into a header file so we can use these in other programs. The tests we
are about to add aren't really _listen tests so doesn't make sense
to add them here.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-10-john.fastabend@gmail.com
2023-05-23 16:10:50 +02:00
Po-Hsu Lin d226b1df36 selftests: fib_tests: mute cleanup error message
In the end of the test, there will be an error message induced by the
`ip netns del ns1` command in cleanup()

  Tests passed: 201
  Tests failed:   0
  Cannot remove namespace file "/run/netns/ns1": No such file or directory

This can even be reproduced with just `./fib_tests.sh -h` as we're
calling cleanup() on exit.

Redirect the error message to /dev/null to mute it.

V2: Update commit message and fixes tag.
V3: resubmit due to missing netdev ML in V2

Fixes: b60417a9f2 ("selftest: fib_tests: Always cleanup before exit")
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-19 08:52:11 +01:00
Linus Torvalds 1f594fe7c9 Networking fixes for 6.4-rc3, including fixes from can, xfrm,
bluetooth and netfilter.
 
 Current release - regressions:
 
   - ipv6: fix RCU splat in ipv6_route_seq_show()
 
   - wifi: iwlwifi: disable RFI feature
 
 Previous releases - regressions:
 
   - tcp: fix possible sk_priority leak in tcp_v4_send_reset()
 
   - tipc: do not update mtu if msg_max is too small in mtu negotiation
 
   - netfilter: fix null deref on element insertion
 
   - devlink: change per-devlink netdev notifier to static one
 
   - phylink: fix ksettings_set() ethtool call
 
   - wifi: mac80211: fortify the spinlock against deadlock by interrupt
 
   - wifi: brcmfmac: check for probe() id argument being NULL
 
   - eth: ice:
     - fix undersized tx_flags variable
     - fix ice VF reset during iavf initialization
 
   - eth: hns3: fix sending pfc frames after reset issue
 
 Previous releases - always broken:
 
   - xfrm: release all offloaded policy memory
 
   - nsh: use correct mac_offset to unwind gso skb in nsh_gso_segment()
 
   - vsock: avoid to close connected socket after the timeout
 
   - dsa: rzn1-a5psw: enable management frames for CPU port
 
   - eth: virtio_net: fix error unwinding of XDP initialization
 
   - eth: tun: fix memory leak for detached NAPI queue.
 
 Misc:
 
   - MAINTAINERS: sctp: move Neil to CREDITS
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRmEHoSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkXu0P/AnSeVtu2CYZSjyCQYvkKpdpbDOSlsee
 GOnG0jduOdJ+OabYyM6prg9JHp1t3QxxTVJs+Spc7Eh9EX+YHRwK5DNPhv3GQ7RI
 pSwQxwiEhLVXVjaEtqUo1Wf8JgCQJ+ThisSIDgQRnaHKQnrRIlbZRngwn6TwNFba
 kxBpCUn2RZcZZhL8xdVF4UbSbEeLEupN76rHiePBVKNG70QVRptX3Rd6EV6FxmTa
 EzAtfNh1r0p3BHW1YYgsbpy7PeXhKGhlVLqIld8h9r/y4hATsrQ2f7Bv0RNrVBDf
 f8r1bdG5E0K3V8AzFPpyOe304G3GAwV+V/wtvA3GRjiwmPUwzJOzaNnSgJZZ/sbq
 mnR4pCwJ4gDGnDxLa8hBh6+emQGB6LJJX0JTQY7vMPNmUwtIuEQc6tLxJ4DpXTzW
 psEndQPDA7yR/7pNoE7ax+8CKCxPvfiBRnV9sxzmPV6FcxWtzJeLQihOuOA4IB8i
 Ddhq2OYH+HCodTNOLWNyMSjk65O7Whee1O/YGiVW9+iUbKBUSBFatJ9fJjH4bXMT
 VRZZnhlFGGIMuZXhkL5+a4ZnomqfPRXNGJ/QBbB4Ty7CXr84mXb0SX5gW4qsLOBA
 YEuxiqD8Oej2Mrid4ypF5GmwBYLAf4CCZajGVii0yyz2hp69RvlQ0c5lA0WisQu3
 wvY2HDFGBsa+
 =PfWP
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can, xfrm, bluetooth and netfilter.

  Current release - regressions:

   - ipv6: fix RCU splat in ipv6_route_seq_show()

   - wifi: iwlwifi: disable RFI feature

  Previous releases - regressions:

   - tcp: fix possible sk_priority leak in tcp_v4_send_reset()

   - tipc: do not update mtu if msg_max is too small in mtu negotiation

   - netfilter: fix null deref on element insertion

   - devlink: change per-devlink netdev notifier to static one

   - phylink: fix ksettings_set() ethtool call

   - wifi: mac80211: fortify the spinlock against deadlock by interrupt

   - wifi: brcmfmac: check for probe() id argument being NULL

   - eth: ice:
      - fix undersized tx_flags variable
      - fix ice VF reset during iavf initialization

   - eth: hns3: fix sending pfc frames after reset issue

  Previous releases - always broken:

   - xfrm: release all offloaded policy memory

   - nsh: use correct mac_offset to unwind gso skb in nsh_gso_segment()

   - vsock: avoid to close connected socket after the timeout

   - dsa: rzn1-a5psw: enable management frames for CPU port

   - eth: virtio_net: fix error unwinding of XDP initialization

   - eth: tun: fix memory leak for detached NAPI queue.

  Misc:

   - MAINTAINERS: sctp: move Neil to CREDITS"

* tag 'net-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
  MAINTAINERS: skip CCing netdev for Bluetooth patches
  mdio_bus: unhide mdio_bus_init prototype
  bridge: always declare tunnel functions
  atm: hide unused procfs functions
  net: isa: include net/Space.h
  Revert "ARM: dts: stm32: add CAN support on stm32f746"
  netfilter: nft_set_rbtree: fix null deref on element insertion
  netfilter: nf_tables: fix nft_trans type confusion
  netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with CONFIG_NF_NAT
  net: wwan: t7xx: Ensure init is completed before system sleep
  net: selftests: Fix optstring
  net: pcs: xpcs: fix C73 AN not getting enabled
  net: wwan: iosm: fix NULL pointer dereference when removing device
  vlan: fix a potential uninit-value in vlan_dev_hard_start_xmit()
  mailmap: add entries for Nikolay Aleksandrov
  igb: fix bit_shift to be in [1..8] range
  net: dsa: mv88e6xxx: Fix mv88e6393x EPC write command offset
  cassini: Fix a memory leak in the error handling path of cas_init_one()
  tun: Fix memory leak for detached NAPI queue.
  can: kvaser_pciefd: Disable interrupts in probe error path
  ...
2023-05-18 08:52:14 -07:00
Linus Torvalds 4d6d4c7f54 linux-kselftest-fixes-6.4-rc3
This Kselftest fixes update for Linux 6.4-rc3 consists of:
 
 - sgx test fix for false negatives.
 - ftrace output is hard to parses and it masks inappropriate skips etc.
   This fix addresses the problems by integrating with kselftest runner.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmRlDzgACgkQCwJExA0N
 Qxzlfw//fc+Uq5hC2KUeVOiIAPYejEZ3+ToY+OUjaKvykWndMpYys5bCB68qERm8
 vtgzYOKm04ovvnyh2jXzxYTjd8439qeRxr0Eybqc5HjiVDhyA1jm/pcsy3EDBf1+
 5QGjgIN2fK9rdv7TmyMHL8pSxLvYQfe2zMjLtULuHZWtjzIY98hh3zjA9mdnn3pK
 f2zqEJzu2lysTiwkhDF/yspOa1TymVp1W6dCVvfcxr5wJOWYtZaZyOTYNiCI3gX6
 9Qy2fm4txPcsooNrwSEekXUIV669fet9vbQX5PF3pRDipxydarAAUUbTm1hOP4X+
 3YnKk9dx/Pg483Go/sGMhtf+/P3ZJlWj2YZSro+maULt8O1Jb8TGe1MB7myheiFc
 CiXj5duiG+WlwteF8WMsm58vjplgbnFAfyVZ+mumRhE1e4qLtjnc5YxSdNYoOHOF
 40wI1Nebfwofh/qW7/ipn5tx7bN29UUSXRd26q1T4qF3KwKYaaj720lcisS7kBR0
 gK8FFCdqroKilJ/Fj3e+AM3xqmlJxornj7EhQoAVtBZCsjN90Wyaw2gaCceABp+f
 lDAisdO00v/+fpVhWFjXy4Mww5t6MDsRxmE5gLGOS1QHZUV9/mV7H9e9gd9ix6mV
 GXYAeytvRrQQBCDEbOz1ax8SS3eA2MFVdbmMxHYn/gWVx6ClpL4=
 =DOlj
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - sgx test fix for false negatives

 - ftrace output is hard to parses and it masks inappropriate skips etc.
   This fix addresses the problems by integrating with kselftest runner

* tag 'linux-kselftest-fixes-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/ftrace: Improve integration with kselftest runner
  selftests/sgx: Add "test_encl.elf" to TEST_FILES
2023-05-17 11:16:36 -07:00
Benjamin Poirier 9ba9485b87 net: selftests: Fix optstring
The cited commit added a stray colon to the 'v' option. That makes the
option work incorrectly.

ex:
tools/testing/selftests/net# ./fib_nexthops.sh -v
(should enable verbose mode, instead it shows help text due to missing arg)

Fixes: 5feba47273 ("selftests: fib_nexthops: Make ping timeout configurable")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-17 13:01:06 +01:00
Jeremy Sowden 5f5486b620 selftests/bpf: Fix pkg-config call building sign-file
When building sign-file, the call to get the CFLAGS for libcrypto is
missing white-space between `pkg-config` and `--cflags`:

  $(shell $(HOSTPKG_CONFIG)--cflags libcrypto 2> /dev/null)

Removing the redirection of stderr, we see:

  $ make -C tools/testing/selftests/bpf sign-file
  make: Entering directory '[...]/tools/testing/selftests/bpf'
  make: pkg-config--cflags: No such file or directory
    SIGN-FILE sign-file
  make: Leaving directory '[...]/tools/testing/selftests/bpf'

Add the missing space.

Fixes: fc97590668 ("selftests/bpf: Add test for bpf_verify_pkcs7_signature() kfunc")
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://lore.kernel.org/bpf/20230426215032.415792-1-jeremy@azazel.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-05-14 14:59:44 -07:00
Andrea Mayer f97b8401e0 selftets: seg6: disable rp_filter by default in srv6_end_dt4_l3vpn_test
On some distributions, the rp_filter is automatically set (=1) by
default on a netdev basis (also on VRFs).
In an SRv6 End.DT4 behavior, decapsulated IPv4 packets are routed using
the table associated with the VRF bound to that tunnel. During lookup
operations, the rp_filter can lead to packet loss when activated on the
VRF.
Therefore, we chose to make this selftest more robust by explicitly
disabling the rp_filter during tests (as it is automatically set by some
Linux distributions).

Fixes: 2195444e09 ("selftests: add selftest for the SRv6 End.DT4 behavior")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-11 18:01:38 -07:00
Andrea Mayer 21a933c79a selftests: seg6: disable DAD on IPv6 router cfg for srv6_end_dt4_l3vpn_test
The srv6_end_dt4_l3vpn_test instantiates a virtual network consisting of
several routers (rt-1, rt-2) and hosts.
When the IPv6 addresses of rt-{1,2} routers are configured, the Deduplicate
Address Detection (DAD) kicks in when enabled in the Linux distros running
the selftests. DAD is used to check whether an IPv6 address is already
assigned in a network. Such a mechanism consists of sending an ICMPv6 Echo
Request and waiting for a reply.
As the DAD process could take too long to complete, it may cause the
failing of some tests carried out by the srv6_end_dt4_l3vpn_test script.

To make the srv6_end_dt4_l3vpn_test more robust, we disable DAD on routers
since we configure the virtual network manually and do not need any address
deduplication mechanism at all.

Fixes: 2195444e09 ("selftests: add selftest for the SRv6 End.DT4 behavior")
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-11 18:01:38 -07:00
Linus Torvalds 6e27831b91 Networking fixes for 6.4-rc2, including fixes from netfilter
Current release - regressions:
 
   - mtk_eth_soc: fix NULL pointer dereference
 
 Previous releases - regressions:
 
   - core:
     - skb_partial_csum_set() fix against transport header magic value
     - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
     - annotate sk->sk_err write from do_recvmmsg()
     - add vlan_get_protocol_and_depth() helper
 
   - netlink: annotate accesses to nlk->cb_running
 
   - netfilter: always release netdev hooks from notifier
 
 Previous releases - always broken:
 
   - core: deal with most data-races in sk_wait_event()
 
   - netfilter: fix possible bug_on with enable_hooks=1
 
   - eth: bonding: fix send_peer_notif overflow
 
   - eth: xpcs: fix incorrect number of interfaces
 
   - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb
 
   - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRcxawSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkv6wQAJgfOBlDAkZNKHzwtMuFiLxECeEMWY9h
 wJCyiq0qXnz9p5ZqjdmTmA8B+jUp9VkpgN5Z3lid5hXDfzDrvXL1KGZW4pc4ooz9
 GUzrp0EUzO5UsyrlZRS9vJ9mbCGN5M1ZWtWH93g8OzGJPRnLs0Q/Tr4IFTBVKzVb
 GmJPy/ZYWYDjnvx3BgewRDuYeH3Rt9lsIt4Pxq/E+D8W3ypvVM0m3GvrO5eEzMeu
 EfeilAdmJGJUufeoGguKt0hheqILS3kNCjQO25XS2Lq1OqetnR/wqTwXaaVxL2du
 Eb2ca7wKkihDpl2l8bQ3ss6vqM0HEpZ63Y2PJaNBS8ASdLsMq4n2L6j2JMfT8hWY
 RG3nJS7F2UFLyYmCJjNL1/I+Z9XeMyFKnHORzHK1dAkMlhd+8NauKWAxdjlxMbxX
 p1msyTl54bG0g6FrU/zAirCWNAAZYCPdZG/XvA/2Jj9mdy64OlGlv/QdJvfjcx+C
 L6nkwZfwXU7QUwKeeTfP8abte2SLrXIxkJrnNEAntPnFOSmd16+/yvQ8JVlbWTMd
 JugJrSAIxjOglIr/1fsnUuV+Ab+JDYQv/wkoyzvtcY2tjhTAHzgmTwwSfeYiCTJE
 rEbjyVvVgMcLTUIk/R9QC5/k6nX/7/KRDHxPOMBX4boOsuA0ARVjzt8uKRvv/7cS
 dRV98RwvCKvD
 =MoPD
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - mtk_eth_soc: fix NULL pointer dereference

  Previous releases - regressions:

   - core:
      - skb_partial_csum_set() fix against transport header magic value
      - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
      - annotate sk->sk_err write from do_recvmmsg()
      - add vlan_get_protocol_and_depth() helper

   - netlink: annotate accesses to nlk->cb_running

   - netfilter: always release netdev hooks from notifier

  Previous releases - always broken:

   - core: deal with most data-races in sk_wait_event()

   - netfilter: fix possible bug_on with enable_hooks=1

   - eth: bonding: fix send_peer_notif overflow

   - eth: xpcs: fix incorrect number of interfaces

   - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb

   - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register"

* tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits)
  af_unix: Fix data races around sk->sk_shutdown.
  af_unix: Fix a data race of sk->sk_receive_queue->qlen.
  net: datagram: fix data-races in datagram_poll()
  net: mscc: ocelot: fix stat counter register values
  ipvlan:Fix out-of-bounds caused by unclear skb->cb
  docs: networking: fix x25-iface.rst heading & index order
  gve: Remove the code of clearing PBA bit
  tcp: add annotations around sk->sk_shutdown accesses
  net: add vlan_get_protocol_and_depth() helper
  net: pcs: xpcs: fix incorrect number of interfaces
  net: deal with most data-races in sk_wait_event()
  net: annotate sk->sk_err write from do_recvmmsg()
  netlink: annotate accesses to nlk->cb_running
  kselftest: bonding: add num_grat_arp test
  selftests: forwarding: lib: add netns support for tc rule handle stats get
  Documentation: bonding: fix the doc of peer_notif_delay
  bonding: fix send_peer_notif overflow
  net: ethernet: mtk_eth_soc: fix NULL pointer dereference
  selftests: nft_flowtable.sh: check ingress/egress chain too
  selftests: nft_flowtable.sh: monitor result file sizes
  ...
2023-05-11 08:42:47 -05:00
Mirsad Todorovac 976d3c6778 selftests: gpio: gpio-sim: Fix BUG: test FAILED due to recent change
According to Mirsad the gpio-sim.sh test appears to FAIL in a wrong way
due to missing initialisation of shell variables:

 4.2. Bias settings work correctly
 cat: /sys/devices/platform/gpio-sim.0/gpiochip18/sim_gpio0/value: No such file or directory
 ./gpio-sim.sh: line 393: test: =: unary operator expected
 bias setting does not work
 GPIO gpio-sim test FAIL

After this change the test passed:

 4.2. Bias settings work correctly
 GPIO gpio-sim test PASS

His testing environment is AlmaLinux 8.7 on Lenovo desktop box with
the latest Linux kernel based on v6.2:

  Linux 6.2.0-mglru-kmlk-andy-09238-gd2980d8d8265 x86_64

Suggested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2023-05-11 14:41:45 +02:00
Jakub Kicinski cceac92678 netfilter pull request 23-05-10
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmRbUnQACgkQ1V2XiooU
 IOQNVw//eoCiid3J4TuNdzHaBHXUlZvln1n5Z6K5fIz5ytrY1T7oKC8Y9StkzzWR
 29ToFLOJt1iRAcgxsghWRIwUzNwpuUdqgd9cUZMHMQxT0BJItp6FXUql2+1LkF/I
 b/gnnb90zyE7lBS/VSRyOiqMiJlP+Som22d7Nn5k2KfTYEdXKwfzjsWAu3W3Sb0s
 Lv/MA9DE42qcwiZubmFmDtOtAunPJFZm3HgkcAVeXoNkBDrSfkvxLIMYG6VfFNhQ
 AkKMyzX293wpwVxfOuQfJr4QVlxAgOQUko+FqajoWMBtfA3yldZjJ8RC7c9Af9uI
 ciOP11vHBCG84KrTabC5kdqOcvadreDiM/oIvk57ztQhCr3e+po+vIz6Cv4p9I30
 m5GXfgbtMRl6hM2S5lrRc5fNRkYJHE4aNvesFTGaLpK3LogusH1E9mH5jdjnbU42
 TwkIe250qJJelNn9ZxS5Jt0BgyogNfeiA9lOXmaQBpYmwIahrjDf0g8z2I85nPhF
 PDukjSXsxi4uHwpSF5wFrlqkAPiEX4vC95uSUTVbbBgZrHhNJdGgu4FJSHRM4mVo
 6Awxk2O2bvcDpXEfTBdD4EJF71bZ/aH2i8ddU1oupIl88O01TWTtkO4SYHqMX+tC
 fPnpGzBN8KJvHgeFxO4p0v1oelv2RtiITv1gi7YJOnq/+Vd2yy0=
 =HmfP
 -----END PGP SIGNATURE-----

Merge tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter updates for net

The following patchset contains Netfilter fixes for net:

1) Fix UAF when releasing netnamespace, from Florian Westphal.

2) Fix possible BUG_ON when nf_conntrack is enabled with enable_hooks,
   from Florian Westphal.

3) Fixes for nft_flowtable.sh selftest, from Boris Sukholitko.

4) Extend nft_flowtable.sh selftest to cover integration with
   ingress/egress hooks, from Florian Westphal.

* tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: nft_flowtable.sh: check ingress/egress chain too
  selftests: nft_flowtable.sh: monitor result file sizes
  selftests: nft_flowtable.sh: wait for specific nc pids
  selftests: nft_flowtable.sh: no need for ps -x option
  selftests: nft_flowtable.sh: use /proc for pid checking
  netfilter: conntrack: fix possible bug_on with enable_hooks=1
  netfilter: nf_tables: always release netdev hooks from notifier
====================

Link: https://lore.kernel.org/r/20230510083313.152961-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-10 19:08:58 -07:00
Hangbin Liu 6cbe791c0f kselftest: bonding: add num_grat_arp test
TEST: num_grat_arp (active-backup miimon num_grat_arp 10)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 20)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 30)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 50)           [ OK ]

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:27:20 +01:00
Hangbin Liu b6d1599f8c selftests: forwarding: lib: add netns support for tc rule handle stats get
When run the test in netns, it's not easy to get the tc stats via
tc_rule_handle_stats_get(). With the new netns parameter, we can get
stats from specific netns like

  num=$(tc_rule_handle_stats_get "dev eth0 ingress" 101 ".packets" "-n ns")

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:27:20 +01:00
Florian Westphal 3acf8f6c14 selftests: nft_flowtable.sh: check ingress/egress chain too
Make sure flowtable interacts correctly with ingress and egress
chains, i.e. those get handled before and after flow table respectively.

Adds three more tests:
1. repeat flowtable test, but with 'ip dscp set cs3' done in
   inet forward chain.

Expect that some packets have been mangled (before flowtable offload
became effective) while some pass without mangling (after offload
succeeds).

2. repeat flowtable test, but with 'ip dscp set cs3' done in
   veth0:ingress.

Expect that all packets pass with cs3 dscp field.

3. same as 2, but use veth1:egress.  Expect the same outcome.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:07 +02:00
Boris Sukholitko 90ab51226d selftests: nft_flowtable.sh: monitor result file sizes
When running nft_flowtable.sh in VM on a busy server we've found that
the time of the netcat file transfers vary wildly.

Therefore replace hardcoded 3 second sleep with the loop checking for
a change in the file sizes. Once no change in detected we test the results.

Nice side effect is that we shave 1 second sleep in the fast case
(hard-coded 3 second sleep vs two 1 second sleeps).

Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:07 +02:00
Boris Sukholitko 1114803c2d selftests: nft_flowtable.sh: wait for specific nc pids
Doing wait with no parameters may interfere with some of the tests
having their own background processes.

Although no such test is currently present, the cleanup is useful
to rely on the nft_flowtable.sh for local development (e.g. running
background tcpdump command during the tests).

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:06 +02:00
Boris Sukholitko 0749d670d7 selftests: nft_flowtable.sh: no need for ps -x option
Some ps commands (e.g. busybox derived) have no -x option. For the
purposes of hash calculation of the list of processes this option is
inessential.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:05 +02:00
Boris Sukholitko 0a11073e8e selftests: nft_flowtable.sh: use /proc for pid checking
Some ps commands (e.g. busybox derived) have no -p option. Use /proc for
pid existence check.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:30:52 +02:00
Mark Brown dbcf76390e selftests/ftrace: Improve integration with kselftest runner
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.

Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.

This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-08 11:10:13 -06:00
Yi Lai 4b79f76920 selftests/sgx: Add "test_encl.elf" to TEST_FILES
The "test_encl.elf" file used by test_sgx is not installed in
INSTALL_PATH. Attempting to execute test_sgx causes false negative:

"
enclave executable open(): No such file or directory
main.c:188:unclobbered_vdso:Failed to load the test enclave.
"

Add "test_encl.elf" to TEST_FILES so that it will be installed.

Fixes: 2adcba79e6 ("selftests/x86: Add a selftest for SGX")
Signed-off-by: Yi Lai <yi1.lai@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-08 11:01:03 -06:00
Linus Torvalds f085df1be6 Disable building BPF based features by default for v6.4.
We need to better polish building with BPF skels, so revert back to
 making it an experimental feature that has to be explicitely enabled
 using BUILD_BPF_SKEL=1.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZFbCXwAKCRCyPKLppCJ+
 J7cHAP97erKY4hBXArjpfzcvpFmboh/oqhbTLntyIpS6TEnOyQEAyervAPGIjQYC
 DCo4foyXmOWn3dhNtK9M+YiRl3o2SgQ=
 =7G78
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "Third version of perf tool updates, with the build problems with with
  using a 'vmlinux.h' generated from the main build fixed, and the bpf
  skeleton build disabled by default.

  Build:

   - Require libtraceevent to build, one can disable it using
     NO_LIBTRACEEVENT=1.

     It is required for tools like 'perf sched', 'perf kvm', 'perf
     trace', etc.

     libtraceevent is available in most distros so installing
     'libtraceevent-devel' should be a one-time event to continue
     building perf as usual.

     Using NO_LIBTRACEEVENT=1 produces tooling that is functional and
     sufficient for lots of users not interested in those libtraceevent
     dependent features.

   - Allow Python support in 'perf script' when libtraceevent isn't
     linked, as not all features requires it, for instance Intel PT does
     not use tracepoints.

   - Error if the python interpreter needed for jevents to work isn't
     available and NO_JEVENTS=1 isn't set, preventing a build without
     support for JSON vendor events, which is a rare but possible
     condition. The two check error messages:

        $(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.)
        $(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.)

   - Make libbpf 1.0 the minimum required when building with out of
     tree, distro provided libbpf.

   - Use libsdtc++'s and LLVM's libcxx's __cxa_demangle, a portable C++
     demangler, add 'perf test' entry for it.

   - Make binutils libraries opt in, as distros disable building with it
     due to licensing, they were used for C++ demangling, for instance.

   - Switch libpfm4 to opt-out rather than opt-in, if libpfm-devel (or
     equivalent) isn't installed, we'll just have a build warning:

       Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev

   - Add a feature test for scandirat(), that is not implemented so far
     in musl and uclibc, disabling features that need it, such as
     scanning for tracepoints in /sys/kernel/tracing/events.

  perf BPF filters:

   - New feature where BPF can be used to filter samples, for instance:

      $ sudo ./perf record -e cycles --filter 'period > 1000' true
      $ sudo ./perf script
           perf-exec 2273949 546850.708501:       5029 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
           perf-exec 2273949 546850.708508:      32409 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
           perf-exec 2273949 546850.708526:     143369 cycles:  ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms])
           perf-exec 2273949 546850.708600:     372650 cycles:  ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms])
           perf-exec 2273949 546850.708791:     482953 cycles:  ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms])
                true 2273949 546850.709036:     501985 cycles:  ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms])
                true 2273949 546850.709292:     503065 cycles:      7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

   - In addition to 'period' (PERF_SAMPLE_PERIOD), the other
     PERF_SAMPLE_ can be used for filtering, and also some other sample
     accessible values, from tools/perf/Documentation/perf-record.txt:

        Essentially the BPF filter expression is:

        <term> <operator> <value> (("," | "||") <term> <operator> <value>)*

     The <term> can be one of:
        ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
        code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
        p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
        mem_dtlb, mem_blk, mem_hops

     The <operator> can be one of:
        ==, !=, >, >=, <, <=, &

     The <value> can be one of:
        <number> (for any term)
        na, load, store, pfetch, exec (for mem_op)
        l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
        na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
        remote (for mem_remote)
        na, locked (for mem_locked)
        na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
        na, by_data, by_addr (for mem_blk)
        hops0, hops1, hops2, hops3 (for mem_hops)

  perf lock contention:

   - Show lock type with address.

   - Track and show mmap_lock, siglock and per-cpu rq_lock with address.
     This is done for mmap_lock by following the current->mm pointer:

      $ sudo ./perf lock con -abl -- sleep 10
       contended   total wait     max wait     avg wait            address   symbol
       ...
           16344    312.30 ms      2.22 ms     19.11 us   ffff8cc702595640
           17686    310.08 ms      1.49 ms     17.53 us   ffff8cc7025952c0
               3     84.14 ms     45.79 ms     28.05 ms   ffff8cc78114c478   mmap_lock
            3557     76.80 ms     68.75 us     21.59 us   ffff8cc77ca3af58
               1     68.27 ms     68.27 ms     68.27 ms   ffff8cda745dfd70
               9     54.53 ms      7.96 ms      6.06 ms   ffff8cc7642a48b8   mmap_lock
           14629     44.01 ms     60.00 us      3.01 us   ffff8cc7625f9ca0
            3481     42.63 ms    140.71 us     12.24 us   ffffffff937906ac   vmap_area_lock
           16194     38.73 ms     42.15 us      2.39 us   ffff8cd397cbc560
              11     38.44 ms     10.39 ms      3.49 ms   ffff8ccd6d12fbb8   mmap_lock
               1      5.43 ms      5.43 ms      5.43 ms   ffff8cd70018f0d8
            1674      5.38 ms    422.93 us      3.21 us   ffffffff92e06080   tasklist_lock
             581      4.51 ms    130.68 us      7.75 us   ffff8cc9b1259058
               5      3.52 ms      1.27 ms    703.23 us   ffff8cc754510070
             112      3.47 ms     56.47 us     31.02 us   ffff8ccee38b3120
             381      3.31 ms     73.44 us      8.69 us   ffffffff93790690   purge_vmap_area_lock
             255      3.19 ms     36.35 us     12.49 us   ffff8d053ce30c80

   - Update default map size to 16384.

   - Allocate single letter option -M for --map-nr-entries, as it is
     proving being frequently used.

   - Fix struct rq lock access for older kernels with BPF's CO-RE
     (Compile once, run everywhere).

   - Fix problems found with MSAn.

  perf report/top:

   - Add inline information when using --call-graph=fp or lbr, as was
     already done to the --call-graph=dwarf callchain mode.

   - Improve the 'srcfile' sort key performance by really using an
     optimization introduced in 6.2 for the 'srcline' sort key that
     avoids calling addr2line for comparision with each sample.

  perf sched:

   - Make 'perf sched latency/map/replay' to use "sched:sched_waking"
     instead of "sched:sched_waking", consistent with 'perf record'
     since d566a9c2d4 ("perf sched: Prefer sched_waking event when it
     exists").

  perf ftrace:

   - Make system wide the default target for latency subcommand, run the
     following command then generate some network traffic and press
     control+C:

       # perf ftrace latency -T __kfree_skb
     ^C
         DURATION     |      COUNT | GRAPH                                          |
          0 - 1    us |         27 | #############                                  |
          1 - 2    us |         22 | ###########                                    |
          2 - 4    us |          8 | ####                                           |
          4 - 8    us |          5 | ##                                             |
          8 - 16   us |         24 | ############                                   |
         16 - 32   us |          2 | #                                              |
         32 - 64   us |          1 |                                                |
         64 - 128  us |          0 |                                                |
        128 - 256  us |          0 |                                                |
        256 - 512  us |          0 |                                                |
        512 - 1024 us |          0 |                                                |
          1 - 2    ms |          0 |                                                |
          2 - 4    ms |          0 |                                                |
          4 - 8    ms |          0 |                                                |
          8 - 16   ms |          0 |                                                |
         16 - 32   ms |          0 |                                                |
         32 - 64   ms |          0 |                                                |
         64 - 128  ms |          0 |                                                |
        128 - 256  ms |          0 |                                                |
        256 - 512  ms |          0 |                                                |
        512 - 1024 ms |          0 |                                                |
          1 - ...   s |          0 |                                                |
       #

  perf top:

   - Add --branch-history (LBR: Last Branch Record) option, just like
     already available for 'perf record'.

   - Fix segfault in thread__comm_len() where thread->comm was being
     used outside thread->comm_lock.

  perf annotate:

   - Allow configuring objdump and addr2line in ~/.perfconfig., so that
     you can use alternative binaries, such as llvm's.

  perf kvm:

   - Add TUI mode for 'perf kvm stat report'.

  Reference counting:

   - Add reference count checking infrastructure to check for use after
     free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs,
     more to come.

     To build with it use -DREFCNT_CHECKING=1 in the make command line
     to build tools/perf. Documented at:

       https://perf.wiki.kernel.org/index.php/Reference_Count_Checking

   - The above caught, for instance, fix, present in this series:

        - Fix maps use after put in 'perf test "Share thread maps"':

          'maps' is copied from leader, but the leader is put on line 79
          and then 'maps' is used to read the reference count below - so
          a use after put, with the put of maps happening within
          thread__put.

     Fixed by reversing the order of puts so that the leader is put
     last.

   - Also several fixes were made to places where reference counts were
     not being held.

   - Make this one of the tests in 'make -C tools/perf build-test' to
     regularly build test it and to make sure no direct access to the
     reference counted structs are made, doing that via accessors to
     check the validity of the struct pointer.

  ARM64:

   - Fix 'perf report' segfault when filtering coresight traces by
     sparse lists of CPUs.

   - Add support for 'simd' as a sort field for 'perf report', to show
     ARM's NEON SIMD's predicate flags: "partial" and "empty".

  arm64 vendor events:

   - Add N1 metrics.

  Intel vendor events:

   - Add graniterapids, grandridge and sierraforrest events.

   - Refresh events for: alderlake, aldernaken, broadwell, broadwellde,
     broadwellx, cascadelakx, haswell, haswellx, icelake, icelakex,
     jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids,
     silvermont, skylake, tigerlake and westmereep-dp

   - Refresh metrics for alderlake-n, broadwell, broadwellde,
     broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and
     skylakex.

  perf stat:

   - Implement --topdown using JSON metrics.

   - Add TopdownL1 JSON metric as a default if present, but disable it
     for now for some Intel hybrid architectures, a series of patches
     addressing this is being reviewed and will be submitted for v6.5.

   - Use metrics for --smi-cost.

   - Update topdown documentation.

  Vendor events (JSON) infrastructure:

   - Add support for computing and printing metric threshold values. For
     instance, here is one found in thesapphirerapids json file:

       {
           "BriefDescription": "Percentage of cycles spent in System Management Interrupts.",
           "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
           "MetricGroup": "smi",
           "MetricName": "smi_cycles",
           "MetricThreshold": "smi_cycles > 0.1",
           "ScaleUnit": "100%"
       },

   - Test parsing metric thresholds with the fake PMU in 'perf test
     pmu-events'.

   - Support for printing metric thresholds in 'perf list'.

   - Add --metric-no-threshold option to 'perf stat'.

   - Add rand (reverse and) and has_pmem (optane memory) support to
     metrics.

   - Sort list of input files to avoid depending on the order from
     readdir() helping in obtaining reproducible builds.

  S/390:

   - Add common metrics: - CPI (cycles per instruction), prbstate (ratio
     of instructions executed in problem state compared to total number
     of instructions), l1mp (Level one instruction and data cache misses
     per 100 instructions).

   - Add cache metrics for z13, z14, z15 and z16.

   - Add metric for TLB and cache.

  ARM:

   - Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE
     (Memory Tagging Extension) and MOPS (Memory Operations) load/store.

  Intel PT hardware tracing:

   - Add event type names UINTR (User interrupt delivered) and UIRET
     (Exiting from user interrupt routine), documented in table 32-50
     "CFE Packet Type and Vector Fields Details" in the Intel Processor
     Trace chapter of The Intel SDM Volume 3 version 078.

   - Add support for new branch instructions ERETS and ERETU.

   - Fix CYC timestamps after standalone CBR

  ARM CoreSight hardware tracing:

   - Allow user to override timestamp and contextid settings.

   - Fix segfault in dso lookup.

   - Fix timeless decode mode detection.

   - Add separate decode paths for timeless and per-thread modes.

  auxtrace:

   - Fix address filter entire kernel size.

  Miscellaneous:

   - Fix use-after-free and unaligned bugs in the PLT handling routines.

   - Use zfree() to reduce chances of use after free.

   - Add missing 0x prefix for addresses printed in hexadecimal in 'perf
     probe'.

   - Suppress massive unsupported target platform errors in the unwind
     code.

   - Fix return incorrect build_id size in elf_read_build_id().

   - Fix 'perf scripts intel-pt-events.py' IPC output for Python 2 .

   - Add missing new parameter in kfree_skb tracepoint to the python
     scripts using it.

   - Add 'perf bench syscall fork' benchmark.

   - Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in
     'perf mem'.

   - Fix wrong size expectation for perf test 'Setup struct
     perf_event_attr' caused by the patch adding
     perf_event_attr::config3.

   - Fix some spelling mistakes"

* tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits)
  Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"
  Revert "perf build: Warn for BPF skeletons if endian mismatches"
  perf metrics: Fix SEGV with --for-each-cgroup
  perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE
  perf stat: Separate bperf from bpf_profiler
  perf test record+probe_libc_inet_pton: Fix call chain match on x86_64
  perf test record+probe_libc_inet_pton: Fix call chain match on s390
  perf tracepoint: Fix memory leak in is_valid_tracepoint()
  perf cs-etm: Add fix for coresight trace for any range of CPUs
  perf build: Fix unescaped # in perf build-test
  perf unwind: Suppress massive unsupported target platform errors
  perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it
  perf script: Print raw ip instead of binary offset for callchain
  perf symbols: Fix return incorrect build_id size in elf_read_build_id()
  perf list: Modify the warning message about scandirat(3)
  perf list: Fix memory leaks in print_tracepoint_events()
  perf lock contention: Rework offset calculation with BPF CO-RE
  perf lock contention: Fix struct rq lock access
  perf stat: Disable TopdownL1 on hybrid
  perf stat: Avoid SEGV on counter->name
  ...
2023-05-07 11:32:18 -07:00
Linus Torvalds ed23734c23 Including fixes from netfilter.
Current release - regressions:
 
  - sched: act_pedit: free pedit keys on bail from offset check
 
 Current release - new code bugs:
 
  - pds_core:
   - Kconfig fixes (DEBUGFS and AUXILIARY_BUS)
   - fix mutex double unlock in error path
 
 Previous releases - regressions:
 
  - sched: cls_api: remove block_cb from driver_list before freeing
 
  - nf_tables: fix ct untracked match breakage
 
  - eth: mtk_eth_soc: drop generic vlan rx offload
 
  - sched: flower: fix error handler on replace
 
 Previous releases - always broken:
 
  - tcp: fix skb_copy_ubufs() vs BIG TCP
 
  - ipv6: fix skb hash for some RST packets
 
  - af_packet: don't send zero-byte data in packet_sendmsg_spkt()
 
  - rxrpc: timeout handling fixes after moving client call connection
    to the I/O thread
 
  - ixgbe: fix panic during XDP_TX with > 64 CPUs
 
  - igc: RMW the SRRCTL register to prevent losing timestamp config
 
  - dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621
 
  - r8152:
    - fix flow control issue of RTL8156A
    - fix the poor throughput for 2.5G devices
    - move setting r8153b_rx_agg_chg_indicate() to fix coalescing
    - enable autosuspend
 
  - ncsi: clear Tx enable mode when handling a Config required AEN
 
  - octeontx2-pf: macsec: fixes for CN10KB ASIC rev
 
 Misc:
 
  - 9p: remove INET dependency
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmRVeUIACgkQMUZtbf5S
 IrtTug/9Hhg/L0PTSwrfuGh4W1/cjheMWppNLkwyWQUiKG7FcZQ9vu9PxceE3VRu
 2fTqHyvgDMZ8jACovXObeda8z1+g3s/tIPaXELephBIjVlF/h3kG2OaIzlU4jDb4
 A4vklwf8eLbfyVBG22QgKl/I70zVMtnmnOo6c6CPuIOTcMPzslndFO9tB0nCg99F
 DCgCM1BBP1tz+OUch2rLnSzYcqkWqS49BhRk6dhYSliawUFU/5+1tDGDjwWolkfm
 0jqP9DjBOSpZKO8m7SpsUNz7NFRIfYErWZ+YebWbggNxj/6TRJTP83MM0tGoK1rE
 /mz2xpuOki59frlwVOAD6gb/qefjHUp21P4NA7bnhizxFlQL5MHpCeGQ9yLHBSmY
 9Q4ArJkM4jXQ0oDA2nII/pz+cDZGEWFGQ14WW3kYUb7WFmISH4I9OiA9i0TBW6OL
 r1Y/rqzkUvtKWzh9RpiAF9lsdHAm3SX9ES5RfMxzv0x886VOZR4jaMmokRDdPRzq
 0r2Oyj75b62+X0r44Fe22Pl/kPS/uh3642xo9h85aAv/EvhT9JNzMvomJm9d6tkb
 966I085AVbwxPAy+rl5SWyAq60EWDExNTjZvPv0mSMlmSsQ9iK5//xOF2Saw2zai
 /44zQ27tVGkCC44Ou5KmfJN3u4OrKkhcuyxtcDr9QeoOdKZRkMg=
 =9xND
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - regressions:

   - sched: act_pedit: free pedit keys on bail from offset check

  Current release - new code bugs:

   - pds_core:
      - Kconfig fixes (DEBUGFS and AUXILIARY_BUS)
      - fix mutex double unlock in error path

  Previous releases - regressions:

   - sched: cls_api: remove block_cb from driver_list before freeing

   - nf_tables: fix ct untracked match breakage

   - eth: mtk_eth_soc: drop generic vlan rx offload

   - sched: flower: fix error handler on replace

  Previous releases - always broken:

   - tcp: fix skb_copy_ubufs() vs BIG TCP

   - ipv6: fix skb hash for some RST packets

   - af_packet: don't send zero-byte data in packet_sendmsg_spkt()

   - rxrpc: timeout handling fixes after moving client call connection
     to the I/O thread

   - ixgbe: fix panic during XDP_TX with > 64 CPUs

   - igc: RMW the SRRCTL register to prevent losing timestamp config

   - dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621

   - r8152:
      - fix flow control issue of RTL8156A
      - fix the poor throughput for 2.5G devices
      - move setting r8153b_rx_agg_chg_indicate() to fix coalescing
      - enable autosuspend

   - ncsi: clear Tx enable mode when handling a Config required AEN

   - octeontx2-pf: macsec: fixes for CN10KB ASIC rev

  Misc:

   - 9p: remove INET dependency"

* tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop()
  pds_core: fix mutex double unlock in error path
  net/sched: flower: fix error handler on replace
  Revert "net/sched: flower: Fix wrong handle assignment during filter change"
  net/sched: flower: fix filter idr initialization
  net: fec: correct the counting of XDP sent frames
  bonding: add xdp_features support
  net: enetc: check the index of the SFI rather than the handle
  sfc: Add back mailing list
  virtio_net: suppress cpu stall when free_unused_bufs
  ice: block LAN in case of VF to VF offload
  net: dsa: mt7530: fix network connectivity with multiple CPU ports
  net: dsa: mt7530: fix corrupt frames using trgmii on 40 MHz XTAL MT7621
  9p: Remove INET dependency
  netfilter: nf_tables: fix ct untracked match breakage
  af_packet: Don't send zero-byte data in packet_sendmsg_spkt().
  igc: read before write to SRRCTL register
  pds_core: add AUXILIARY_BUS and NET_DEVLINK to Kconfig
  pds_core: remove CONFIG_DEBUG_FS from makefile
  ionic: catch failure from devlink_alloc
  ...
2023-05-05 19:12:01 -07:00
Linus Torvalds 15fb96a35d - Some DAMON cleanups from Kefeng Wang
- Some KSM work from David Hildenbrand, to make the PR_SET_MEMORY_MERGE
   ioctl's behavior more similar to KSM's behavior.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZFLsxAAKCRDdBJ7gKXxA
 jl8yAQCqjstPsOULf9QN0z4bGAUhY+Wj4ERz1jbKSIuhFCJWiQEAgQvgRXObKjmi
 OtUB0Ek4CMDCQzbyIQ1Bhp3kxi6+Jgs=
 =AbyC
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-05-03-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - Some DAMON cleanups from Kefeng Wang

 - Some KSM work from David Hildenbrand, to make the PR_SET_MEMORY_MERGE
   ioctl's behavior more similar to KSM's behavior.

[ Andrew called these "final", but I suspect we'll have a series fixing
  up the fact that the last commit in the dmapools series in the
  previous pull seems to have unintentionally just reverted all the
  other commits in the same series..   - Linus ]

* tag 'mm-stable-2023-05-03-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: hwpoison: coredump: support recovery from dump_user_range()
  mm/page_alloc: add some comments to explain the possible hole in __pageblock_pfn_to_page()
  mm/ksm: move disabling KSM from s390/gmap code to KSM code
  selftests/ksm: ksm_functional_tests: add prctl unmerge test
  mm/ksm: unmerge and clear VM_MERGEABLE when setting PR_SET_MEMORY_MERGE=0
  mm/damon/paddr: fix missing folio_sz update in damon_pa_young()
  mm/damon/paddr: minor refactor of damon_pa_mark_accessed_or_deactivate()
  mm/damon/paddr: minor refactor of damon_pa_pageout()
2023-05-04 13:09:43 -07:00
Jeremy Sowden de4773f023 selftests: netfilter: fix libmnl pkg-config usage
1. Don't hard-code pkg-config
2. Remove distro-specific default for CFLAGS
3. Use pkg-config for LDLIBS

Fixes: a50a88f026 ("selftests: netfilter: fix a build error on openSUSE")
Suggested-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-03 08:24:31 +02:00
David Hildenbrand 1150ea9338 selftests/ksm: ksm_functional_tests: add prctl unmerge test
Let's test whether setting PR_SET_MEMORY_MERGE to 0 after setting it to 1
will unmerge pages, similar to how setting MADV_UNMERGEABLE after setting
MADV_MERGEABLE would.

Link: https://lkml.kernel.org/r/20230422205420.30372-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Stefan Roesch <shr@devkernel.io>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-02 17:21:49 -07:00
Linus Torvalds c8c655c34e s390:
* More phys_to_virt conversions
 
 * Improvement of AP management for VSIE (nested virtualization)
 
 ARM64:
 
 * Numerous fixes for the pathological lock inversion issue that
   plagued KVM/arm64 since... forever.
 
 * New framework allowing SMCCC-compliant hypercalls to be forwarded
   to userspace, hopefully paving the way for some more features
   being moved to VMMs rather than be implemented in the kernel.
 
 * Large rework of the timer code to allow a VM-wide offset to be
   applied to both virtual and physical counters as well as a
   per-timer, per-vcpu offset that complements the global one.
   This last part allows the NV timer code to be implemented on
   top.
 
 * A small set of fixes to make sure that we don't change anything
   affecting the EL1&0 translation regime just after having having
   taken an exception to EL2 until we have executed a DSB. This
   ensures that speculative walks started in EL1&0 have completed.
 
 * The usual selftest fixes and improvements.
 
 KVM x86 changes for 6.4:
 
 * Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled,
   and by giving the guest control of CR0.WP when EPT is enabled on VMX
   (VMX-only because SVM doesn't support per-bit controls)
 
 * Add CR0/CR4 helpers to query single bits, and clean up related code
   where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return
   as a bool
 
 * Move AMD_PSFD to cpufeatures.h and purge KVM's definition
 
 * Avoid unnecessary writes+flushes when the guest is only adding new PTEs
 
 * Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations
   when emulating invalidations
 
 * Clean up the range-based flushing APIs
 
 * Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single
   A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle
   changed SPTE" overhead associated with writing the entire entry
 
 * Track the number of "tail" entries in a pte_list_desc to avoid having
   to walk (potentially) all descriptors during insertion and deletion,
   which gets quite expensive if the guest is spamming fork()
 
 * Disallow virtualizing legacy LBRs if architectural LBRs are available,
   the two are mutually exclusive in hardware
 
 * Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
   after KVM_RUN, similar to CPUID features
 
 * Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES
 
 * Apply PMU filters to emulated events and add test coverage to the
   pmu_event_filter selftest
 
 x86 AMD:
 
 * Add support for virtual NMIs
 
 * Fixes for edge cases related to virtual interrupts
 
 x86 Intel:
 
 * Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
   not being reported due to userspace not opting in via prctl()
 
 * Fix a bug in emulation of ENCLS in compatibility mode
 
 * Allow emulation of NOP and PAUSE for L2
 
 * AMX selftests improvements
 
 * Misc cleanups
 
 MIPS:
 
 * Constify MIPS's internal callbacks (a leftover from the hardware enabling
   rework that landed in 6.3)
 
 Generic:
 
 * Drop unnecessary casts from "void *" throughout kvm_main.c
 
 * Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
   size by 8 bytes on 64-bit kernels by utilizing a padding hole
 
 Documentation:
 
 * Fix goof introduced by the conversion to rST
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRNExkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNyjwf+MkzDael9y9AsOZoqhEZ5OsfQYJ32
 Im5ZVYsPRU2K5TuoWql6meIihgclCj1iIU32qYHa2F1WYt2rZ72rJp+HoY8b+TaI
 WvF0pvNtqQyg3iEKUBKPA4xQ6mj7RpQBw86qqiCHmlfNt0zxluEGEPxH8xrWcfhC
 huDQ+NUOdU7fmJ3rqGitCvkUbCuZNkw3aNPR8dhU8RAWrwRzP2hBOmdxIeo81WWY
 XMEpJSijbGpXL9CvM0Jz9nOuMJwZwCCBGxg1vSQq0xTfLySNMxzvWZC2GFaBjucb
 j0UOQ7yE0drIZDVhd3sdNslubXXU6FcSEzacGQb9aigMUon3Tem9SHi7Kw==
 =S2Hq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "s390:

   - More phys_to_virt conversions

   - Improvement of AP management for VSIE (nested virtualization)

  ARM64:

   - Numerous fixes for the pathological lock inversion issue that
     plagued KVM/arm64 since... forever.

   - New framework allowing SMCCC-compliant hypercalls to be forwarded
     to userspace, hopefully paving the way for some more features being
     moved to VMMs rather than be implemented in the kernel.

   - Large rework of the timer code to allow a VM-wide offset to be
     applied to both virtual and physical counters as well as a
     per-timer, per-vcpu offset that complements the global one. This
     last part allows the NV timer code to be implemented on top.

   - A small set of fixes to make sure that we don't change anything
     affecting the EL1&0 translation regime just after having having
     taken an exception to EL2 until we have executed a DSB. This
     ensures that speculative walks started in EL1&0 have completed.

   - The usual selftest fixes and improvements.

  x86:

   - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is
     enabled, and by giving the guest control of CR0.WP when EPT is
     enabled on VMX (VMX-only because SVM doesn't support per-bit
     controls)

   - Add CR0/CR4 helpers to query single bits, and clean up related code
     where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long"
     return as a bool

   - Move AMD_PSFD to cpufeatures.h and purge KVM's definition

   - Avoid unnecessary writes+flushes when the guest is only adding new
     PTEs

   - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s
     optimizations when emulating invalidations

   - Clean up the range-based flushing APIs

   - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a
     single A/D bit using a LOCK AND instead of XCHG, and skip all of
     the "handle changed SPTE" overhead associated with writing the
     entire entry

   - Track the number of "tail" entries in a pte_list_desc to avoid
     having to walk (potentially) all descriptors during insertion and
     deletion, which gets quite expensive if the guest is spamming
     fork()

   - Disallow virtualizing legacy LBRs if architectural LBRs are
     available, the two are mutually exclusive in hardware

   - Disallow writes to immutable feature MSRs (notably
     PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features

   - Overhaul the vmx_pmu_caps selftest to better validate
     PERF_CAPABILITIES

   - Apply PMU filters to emulated events and add test coverage to the
     pmu_event_filter selftest

   - AMD SVM:
       - Add support for virtual NMIs
       - Fixes for edge cases related to virtual interrupts

   - Intel AMX:
       - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if
         XTILE_DATA is not being reported due to userspace not opting in
         via prctl()
       - Fix a bug in emulation of ENCLS in compatibility mode
       - Allow emulation of NOP and PAUSE for L2
       - AMX selftests improvements
       - Misc cleanups

  MIPS:

   - Constify MIPS's internal callbacks (a leftover from the hardware
     enabling rework that landed in 6.3)

  Generic:

   - Drop unnecessary casts from "void *" throughout kvm_main.c

   - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the
     struct size by 8 bytes on 64-bit kernels by utilizing a padding
     hole

  Documentation:

   - Fix goof introduced by the conversion to rST"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits)
  KVM: s390: pci: fix virtual-physical confusion on module unload/load
  KVM: s390: vsie: clarifications on setting the APCB
  KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA
  KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state
  KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init()
  KVM: selftests: Test the PMU event "Instructions retired"
  KVM: selftests: Copy full counter values from guest in PMU event filter test
  KVM: selftests: Use error codes to signal errors in PMU event filter test
  KVM: selftests: Print detailed info in PMU event filter asserts
  KVM: selftests: Add helpers for PMC asserts in PMU event filter test
  KVM: selftests: Add a common helper for the PMU event filter guest code
  KVM: selftests: Fix spelling mistake "perrmited" -> "permitted"
  KVM: arm64: vhe: Drop extra isb() on guest exit
  KVM: arm64: vhe: Synchronise with page table walker on MMU update
  KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc()
  KVM: arm64: nvhe: Synchronise with page table walker on TLBI
  KVM: arm64: Handle 32bit CNTPCTSS traps
  KVM: arm64: nvhe: Synchronise with page table walker on vcpu run
  KVM: arm64: vgic: Don't acquire its_lock before config_lock
  KVM: selftests: Add test to verify KVM's supported XCR0
  ...
2023-05-01 12:06:20 -07:00
Linus Torvalds 86e98ed15b cgroup changes for v6.4-rc1
* cpuset changes including the fix for an incorrect interaction with CPU
   hotplug and an optimization.
 
 * Other doc and cosmetic changes.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZErfng4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGVVtAQCDycK4VSgc4nsFPG1vh1Oy1A723ciEUwAbKmV/
 F1n7xwEA68FiDvE29LpMJJuYP9HnX0A5zRMyNnb52kN9jmgcEQI=
 =ALol
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - cpuset changes including the fix for an incorrect interaction with
   CPU hotplug and an optimization

 - Other doc and cosmetic changes

* tag 'cgroup-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  docs: cgroup-v1/cpusets: update libcgroup project link
  cgroup/cpuset: Minor updates to test_cpuset_prs.sh
  cgroup/cpuset: Include offline CPUs when tasks' cpumasks in top_cpuset are updated
  cgroup/cpuset: Skip task update if hotplug doesn't affect current cpuset
  cpuset: Clean up cpuset_node_allowed
  cgroup: bpf: use cgroup_lock()/cgroup_unlock() wrappers
2023-04-29 10:05:22 -07:00
Linus Torvalds 89d77f71f4 RISC-V Patches for the 6.4 Merge Window, Part 1
* Support for runtime detection of the Svnapot extension.
 * Support for Zicboz when clearing pages.
 * We've moved to GENERIC_ENTRY.
 * Support for !MMU on rv32 systems.
 * The linear region is now mapped via huge pages.
 * Support for building relocatable kernels.
 * Support for the hwprobe interface.
 * Various fixes and cleanups throughout the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmRL5rcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYibpcD/0RnmO+N2OJxsJXf0KtHv4LlChAFaMZ
 mfcsU8lv8r3Rz1USJGyVoE57885R+iUw1664ic6Gj9Ll9/A+BDVyqlNeo1BZ7nnv
 6hZawSh8XGMyCJoatjaCSMW6VKObsSpHXLoA0mxtj06w1XhtpUnzjv4SZQqBYxC2
 7+/cfy6l3uGdSKQ0R402sF8PE+l3HthhO+Cw9NYHQZisAHEQrfFpXRnrovhs+vX0
 aVxoWo8bmIhhNke2jh6dnGhfFfAs+UClbaKgZfe8af6feboo+Tal3+OibiEy1K1j
 hDQ3w/G5jAdwSqnNPdXzpk4srskUOhP9is8AG79vCasMxybQIBfZcc7/kLmmQX+2
 xt1EoDVD/lSO1p+CWRautLXEsInWbpBYaSJie7WcR4SHe8S7/nomTDlwkJHx5cma
 mkSYHJKNwCbamDTI3gXg8nrScbxsRnJQsQUolFDwAeRz7AYVwtqVh8VxAWqAdU3q
 xUNKrUpCAzNC3d5GL7pmRfZrqjpQhuFXkHFSy85vaCPuckBu926OzxpKBmX4Kea1
 qLYWfxv78bcwuY47FWJKcd97Ib63iBYDgarJxvrHrwDaHV2xjBOmdapNPUc2PswT
 a938enbYYnJHIbuSmbeNBPF4iF6nKUXshyfZu7tCZl6MzsXloUckGdm++j97Bpvr
 g6G3ZP6STSQBmw==
 =oxQd
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.4-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for runtime detection of the Svnapot extension

 - Support for Zicboz when clearing pages

 - We've moved to GENERIC_ENTRY

 - Support for !MMU on rv32 systems

 - The linear region is now mapped via huge pages

 - Support for building relocatable kernels

 - Support for the hwprobe interface

 - Various fixes and cleanups throughout the tree

* tag 'riscv-for-linus-6.4-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (57 commits)
  RISC-V: hwprobe: Explicity check for -1 in vdso init
  RISC-V: hwprobe: There can only be one first
  riscv: Allow to downgrade paging mode from the command line
  dt-bindings: riscv: add sv57 mmu-type
  RISC-V: hwprobe: Remove __init on probe_vendor_features()
  riscv: Use --emit-relocs in order to move .rela.dyn in init
  riscv: Check relocations at compile time
  powerpc: Move script to check relocations at compile time in scripts/
  riscv: Introduce CONFIG_RELOCATABLE
  riscv: Move .rela.dyn outside of init to avoid empty relocations
  riscv: Prepare EFI header for relocatable kernels
  riscv: Unconditionnally select KASAN_VMALLOC if KASAN
  riscv: Fix ptdump when KASAN is enabled
  riscv: Fix EFI stub usage of KASAN instrumented strcmp function
  riscv: Move DTB_EARLY_BASE_VA to the kernel address space
  riscv: Rework kasan population functions
  riscv: Split early and final KASAN population functions
  riscv: Use PUD/P4D/PGD pages for the linear mapping
  riscv: Move the linear mapping creation in its own function
  riscv: Get rid of riscv_pfn_base variable
  ...
2023-04-28 16:55:39 -07:00