[ Upstream commit dc6306ad5b0dda040baf1fde3cfd458e6abfc4da ]
The SRSO default safe-ret mitigation is reported as "mitigated" even if
microcode hasn't been updated. That's wrong because userspace may still
be vulnerable to SRSO attacks due to IBPB not flushing branch type
predictions.
Report the safe-ret + !microcode case as vulnerable.
Also report the microcode-only case as vulnerable as it leaves the
kernel open to attacks.
Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/a8a14f97d1b0e03ec255c81637afdf4cf0ae9c99.1693889988.git.jpoimboe@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
This reverts commits 86327e8eb9 ("memcg: drop kmem.limit_in_bytes") and
partially reverts 58056f7750 ("memcg, kmem: further deprecate
kmem.limit_in_bytes") which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out that
there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required but the non-existent file just confuses the userspace
which fails in the result. The patch to fix this on the userspace side
has been submitted but it is hard to predict how it will propagate through
the maze of 3rd party consumers of the software.
Now, reverting alone 86327e8eb9 is not an option because there is
another set of userspace which cannot cope with ENOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f7750 as
well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f7750 as well or we can keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
This should work for both known breaking workloads which depend on the
existence but do not depend on the hard limit enforcement.
Note to backporters to stable trees. a8c49af3be ("memcg: add per-memcg
total kernel memory stat") introduced in 4.18 has added memcg_account_kmem
so the accounting is not done by obj_cgroup_charge_pages directly for v1
anymore. Prior kernels need to add it explicitly (thanks to Johannes for
pointing this out).
[akpm@linux-foundation.org: fix build - remove unused local]
Link: http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net [1]
Link: https://lkml.kernel.org/r/ZRE5VJozPZt9bRPy@dhcp22.suse.cz
Fixes: 86327e8eb9 ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f7750 ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
perf tools maintainership:
- Add git information for perf-tools and perf-tools-next trees/branches to the
MAINTAINERS file. That is where development now takes place and myself and
Namhyung Kim have write access, more people to come as we emulate other
maintainer groups.
perf record:
- Record kernel data maps when 'perf record --data' is used, so that global variables can
be resolved and used in tools that do data profiling.
perf trace:
- Remove the old, experimental support for BPF events in which a .c file was passed as
an event: "perf trace -e hello.c" to then get compiled and loaded.
The only known usage for that, that shipped with the kernel as an example for such events,
augmented the raw_syscalls tracepoints and was converted to a libbpf skeleton, reusing all
the user space components and the BPF code connected to the syscalls.
In the end just the way to glue the BPF part and the user space type beautifiers changed,
now being performed by libbpf skeletons.
The next step is to use BTF to do pretty printing of all syscall types, as discussed with
Alan Maguire and others.
Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all path/filenames/strings,
some of the networking data structures, perf_event_attr, etc, i.e. systemwide tracing of
nanosleep calls and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5 seconds:
# perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
0.000 ( 9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
9.039 ( 0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
? ( ): gpm/991 ... [continued]: clock_nanosleep()) = 0
10.133 ( ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
? ( ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
30.276 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
1230.814 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
2030.886 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
? ( ): crond/1172 ... [continued]: clock_nanosleep()) = 0
3242.699 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
3728.078 ( ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
4031.409 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0
Performance counter stats for 'sleep 5':
2,617,347 cycles
1,855,997 instructions # 0.71 insn per cycle
5.002282128 seconds time elapsed
0.000855000 seconds user
0.000852000 seconds sys
#
perf annotate:
- Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1) for
licensing reasons, and we missed a build test on tools/perf/tests makefile.
Since we now default to NDEBUG=1, we ended up segfaulting when building with
BUILD_NONDISTRO=1 because a needed initialization routine was being "error
checked" via an assert.
Fix it by explicitly checking the result and aborting instead if it fails.
We better back propagate the error, but at least 'perf annotate' on samples
collected for a BPF program is back working when perf is built with
BUILD_NONDISTRO=1.
perf report/top:
- Add back TUI hierarchy mode header, that is seen when using 'perf report/top --hierarchy'.
- Fix the number of entries for 'e' key in the TUI that was preventing navigation of
lines when expanding an entry.
perf report/script:
- Support cross platform register handling, allowing a perf.data file collected
on one architecture to have registers sampled correctly displayed when
analysis tools such as 'perf report' and 'perf script' are used on a different
architecture.
- Fix handling of event attributes in pipe mode, i.e. when one uses:
perf record -o - | perf report -i -
When no perf.data files are used.
- Handle files generated via pipe mode with a version of perf and then read
also via pipe mode with a different version of perf, where the event attr
record may have changed, use the record size field to properly support this
version mismatch.
perf probe:
- Accessing global variables from uprobes isn't supported, make the error
message state that instead of stating that some minimal kernel version is
needed to have that feature. This seems just a tool limitation, the kernel
probably has all that is needed.
perf tests:
- Fix a reference count related leak in the dlfilter v0 API where the result
of a thread__find_symbol_fb() is not matched with an addr_location__exit()
to drop the reference counts of the resolved components (machine, thread, map,
symbol, etc). Add a dlfilter test to make sure that doesn't regresses.
- Lots of fixes for the 'perf test' written in shell script related to problems
found with the shellcheck utility.
- Fixes for 'perf test' shell scripts testing features enabled when perf is
built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf counters.
- Add perf record sample filtering test, things like the following example, that gets
implemented as a BPF filter attached to the event:
# perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
- Improve the way the task_analyzer test checks if libtraceevent is linked,
using 'perf version --build-options' instead of the more expensinve
'perf record -e "sched:sched_switch"'.
- Add support for riscv in the mmap-basic test. (This went as well via the RiscV tree, same contents).
libperf:
- Implement riscv mmap support (This went as well via the RiscV tree, same contents).
perf script:
- New tool that converts perf.data files to the firefox profiler format so that one can use
the visualizer at https://profiler.firefox.com/. Done by Anup Sharma as part of this year's
Google Summer of Code.
One can generate the output and upload it to the web interface but Anup also automated
everything:
perf script gecko -F 99 -a sleep 60
- Support syscall name parsing on arm64.
- Print "cgroup" field on the same line as "comm".
perf bench:
- Add new 'uprobe' benchmark to measure the overhead of uprobes with/without
BPF programs attached to it.
- breakpoints are not available on power9, skip that test.
perf stat:
- Add #num_cpus_online literal to be used in 'perf stat' metrics, and add this extra
'perf test' check that exemplifies its purpose:
TEST_ASSERT_VAL("#num_cpus_online",
expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
Miscellaneous:
- Improve tool startup time by lazily reading PMU, JSON, sysfs data.
- Improve error reporting in the parsing of events, passing YYLTYPE to error routines,
so that the output can show were the parsing error was found.
- Add 'perf test' entries to check the parsing of events improvements.
- Fix various leak for things detected by -fsanitize=address, mostly things that would
be freed at tool exit, including:
- Free evsel->filter on the destructor.
- Allow tools to register a thread->priv destructor and use it in 'perf trace'.
- Free evsel->priv in 'perf trace'.
- Free string returned by synthesize_perf_probe_point() when the caller fails
to do all it needs.
- Adjust various compiler options to not consider errors some warnings when
building with broken headers found in things like python, flex, bison, as we
otherwise build with -Werror. Some for gcc, some for clang, some for some
specific version of those, some for some specific version of flex or bison, or
some specific combination of these components, bah.
- Allow customization of clang options for BPF target, this helps building on
gentoo where there are other oddities where BPF targets gets passed some compiler
options intended for the native build, so building with WERROR=0 helps while
these oddities are fixed.
- Dont pass ERR_PTR() values to perf_session__delete() in 'perf top' and 'perf lock',
fixing some segfaults when handling some odd failures.
- Add LTO build option.
- Fix format of unordered lists in the perf docs (tools/perf/Documentation).
- Overhaul the bison files, using constructs such as YYNOMEM.
- Remove unused tokens from the bison .y files.
- Add more comments to various structs.
- A few LoongArch enablement patches.
Vendor events (JSON):
- Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
EventName, BriefDescription
visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.",
op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.",
op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
- Add AmpereOne metrics (aarch64).
- Update N2 and V2 metrics (aarch64) and events using Arm telemetry repo.
- Update scale units and descriptions of common topdown metrics on aarch64. Things like:
- "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
- "BriefDescription": "Frontend bound L1 topdown metric",
+ "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
+ "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
- Update events for intel: meteorlake to 1.04, sapphirerapids to 1.15, Icelake+ metric constraints.
- Update files for the power10 platform.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZPfJZgAKCRCyPKLppCJ+
J1/eAP9lgtavD0V75wy1p5zyotkceOmPTkk1DYFVx2Euhxa/lAD/YW/JvuVSo0Gr
HqJP52XaV0tF8gG+YxL+Lay/Ke0P5AQ=
=d12c
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
"perf tools maintainership:
- Add git information for perf-tools and perf-tools-next trees and
branches to the MAINTAINERS file. That is where development now
takes place and myself and Namhyung Kim have write access, more
people to come as we emulate other maintainer groups.
perf record:
- Record kernel data maps when 'perf record --data' is used, so that
global variables can be resolved and used in tools that do data
profiling.
perf trace:
- Remove the old, experimental support for BPF events in which a .c
file was passed as an event: "perf trace -e hello.c" to then get
compiled and loaded.
The only known usage for that, that shipped with the kernel as an
example for such events, augmented the raw_syscalls tracepoints and
was converted to a libbpf skeleton, reusing all the user space
components and the BPF code connected to the syscalls.
In the end just the way to glue the BPF part and the user space
type beautifiers changed, now being performed by libbpf skeletons.
The next step is to use BTF to do pretty printing of all syscall
types, as discussed with Alan Maguire and others.
Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
path/filenames/strings, some of the networking data structures,
perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
seconds:
# perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
0.000 ( 9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
9.039 ( 0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
? ( ): gpm/991 ... [continued]: clock_nanosleep()) = 0
10.133 ( ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
? ( ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
30.276 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
1230.814 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
2030.886 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
? ( ): crond/1172 ... [continued]: clock_nanosleep()) = 0
3242.699 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
3728.078 ( ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
4031.409 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0
Performance counter stats for 'sleep 5':
2,617,347 cycles
1,855,997 instructions # 0.71 insn per cycle
5.002282128 seconds time elapsed
0.000855000 seconds user
0.000852000 seconds sys
perf annotate:
- Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
for licensing reasons, and we missed a build test on
tools/perf/tests makefile.
Since we now default to NDEBUG=1, we ended up segfaulting when
building with BUILD_NONDISTRO=1 because a needed initialization
routine was being "error checked" via an assert.
Fix it by explicitly checking the result and aborting instead if it
fails.
We better back propagate the error, but at least 'perf annotate' on
samples collected for a BPF program is back working when perf is
built with BUILD_NONDISTRO=1.
perf report/top:
- Add back TUI hierarchy mode header, that is seen when using 'perf
report/top --hierarchy'.
- Fix the number of entries for 'e' key in the TUI that was
preventing navigation of lines when expanding an entry.
perf report/script:
- Support cross platform register handling, allowing a perf.data file
collected on one architecture to have registers sampled correctly
displayed when analysis tools such as 'perf report' and 'perf
script' are used on a different architecture.
- Fix handling of event attributes in pipe mode, i.e. when one uses:
perf record -o - | perf report -i -
When no perf.data files are used.
- Handle files generated via pipe mode with a version of perf and
then read also via pipe mode with a different version of perf,
where the event attr record may have changed, use the record size
field to properly support this version mismatch.
perf probe:
- Accessing global variables from uprobes isn't supported, make the
error message state that instead of stating that some minimal
kernel version is needed to have that feature. This seems just a
tool limitation, the kernel probably has all that is needed.
perf tests:
- Fix a reference count related leak in the dlfilter v0 API where the
result of a thread__find_symbol_fb() is not matched with an
addr_location__exit() to drop the reference counts of the resolved
components (machine, thread, map, symbol, etc). Add a dlfilter test
to make sure that doesn't regresses.
- Lots of fixes for the 'perf test' written in shell script related
to problems found with the shellcheck utility.
- Fixes for 'perf test' shell scripts testing features enabled when
perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
counters.
- Add perf record sample filtering test, things like the following
example, that gets implemented as a BPF filter attached to the
event:
# perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
- Improve the way the task_analyzer test checks if libtraceevent is
linked, using 'perf version --build-options' instead of the more
expensinve 'perf record -e "sched:sched_switch"'.
- Add support for riscv in the mmap-basic test. (This went as well
via the RiscV tree, same contents).
libperf:
- Implement riscv mmap support (This went as well via the RiscV tree,
same contents).
perf script:
- New tool that converts perf.data files to the firefox profiler
format so that one can use the visualizer at
https://profiler.firefox.com/. Done by Anup Sharma as part of this
year's Google Summer of Code.
One can generate the output and upload it to the web interface but
Anup also automated everything:
perf script gecko -F 99 -a sleep 60
- Support syscall name parsing on arm64.
- Print "cgroup" field on the same line as "comm".
perf bench:
- Add new 'uprobe' benchmark to measure the overhead of uprobes
with/without BPF programs attached to it.
- breakpoints are not available on power9, skip that test.
perf stat:
- Add #num_cpus_online literal to be used in 'perf stat' metrics, and
add this extra 'perf test' check that exemplifies its purpose:
TEST_ASSERT_VAL("#num_cpus_online",
expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
Miscellaneous:
- Improve tool startup time by lazily reading PMU, JSON, sysfs data.
- Improve error reporting in the parsing of events, passing YYLTYPE
to error routines, so that the output can show were the parsing
error was found.
- Add 'perf test' entries to check the parsing of events
improvements.
- Fix various leak for things detected by -fsanitize=address, mostly
things that would be freed at tool exit, including:
- Free evsel->filter on the destructor.
- Allow tools to register a thread->priv destructor and use it in
'perf trace'.
- Free evsel->priv in 'perf trace'.
- Free string returned by synthesize_perf_probe_point() when the
caller fails to do all it needs.
- Adjust various compiler options to not consider errors some
warnings when building with broken headers found in things like
python, flex, bison, as we otherwise build with -Werror. Some for
gcc, some for clang, some for some specific version of those, some
for some specific version of flex or bison, or some specific
combination of these components, bah.
- Allow customization of clang options for BPF target, this helps
building on gentoo where there are other oddities where BPF targets
gets passed some compiler options intended for the native build, so
building with WERROR=0 helps while these oddities are fixed.
- Dont pass ERR_PTR() values to perf_session__delete() in 'perf top'
and 'perf lock', fixing some segfaults when handling some odd
failures.
- Add LTO build option.
- Fix format of unordered lists in the perf docs
(tools/perf/Documentation)
- Overhaul the bison files, using constructs such as YYNOMEM.
- Remove unused tokens from the bison .y files.
- Add more comments to various structs.
- A few LoongArch enablement patches.
Vendor events (JSON):
- Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
EventName, BriefDescription
visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.",
op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.",
op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
- Add AmpereOne metrics (aarch64).
- Update N2 and V2 metrics (aarch64) and events using Arm telemetry
repo.
- Update scale units and descriptions of common topdown metrics on
aarch64. Things like:
- "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
- "BriefDescription": "Frontend bound L1 topdown metric",
+ "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
+ "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
- Update events for intel: meteorlake to 1.04, sapphirerapids to
1.15, Icelake+ metric constraints.
- Update files for the power10 platform"
* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
perf parse-events: Fix driver config term
perf parse-events: Fixes relating to no_value terms
perf parse-events: Fix propagation of term's no_value when cloning
perf parse-events: Name the two term enums
perf list: Don't print Unit for "default_core"
perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
perf dlfilter: Avoid leak in v0 API test use of resolve_address()
perf metric: Add #num_cpus_online literal
perf pmu: Remove str from perf_pmu_alias
perf parse-events: Make common term list to strbuf helper
perf parse-events: Minor help message improvements
perf pmu: Avoid uninitialized use of alias->str
perf jevents: Use "default_core" for events with no Unit
perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
perf test shell stat_bpf_counters: Fix test on Intel
perf test shell record_bpf_filter: Skip 6.2 kernel
libperf: Get rid of attr.id field
perf tools: Convert to perf_record_header_attr_id()
libperf: Add perf_record_header_attr_id()
perf tools: Handle old data in PERF_RECORD_ATTR
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmT7NXsQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgphevEACwpgak3KV+zE+u/2g8rsjRu9FDkgi9hW+7
RKbwPtgTN3R3/z+eGq57gWW3aSJx0Mqaxf/mtEqf+K+yxK3FbDDKAkkwMG0S/TGT
FT+beCeJHiA9tTruoW1ipGh7gxS0D3JWwPRl5i7+rtagzdbOSYWmsrNPVXl3uVee
x6sORlZ3Ye2VRd0m+3fQ7nWw+epMKebMrlpmWC7og0mdt/zhF+q6zgAV8tdEYe7E
VQ8yrWa9LyYw07a+M2kra2+IZzSWYZE7qZE1OZcwbI9ARvqNkXyslqXfW0gVrDml
lfoUnwqsRI45RfTOTao6lAzc5QZHTlrl1Mqg79139grDaEM/bCx/kQq2ZUG03Tcy
FUgKNrz6E/te5u1ar8E8ppXEAmjjAsKiYZsT+d5nHv7xPV9F15Ryu+Mr9sFAiHXw
hMyInmBJ6JcQx9BYpvxTMi7N4Mfiw2Z9GTE2yhhVPvEAG418+YYa7CSNdXVSy/M2
QuAQEc+K1CcVhmgNZbSe2FnSCve1khHURT0C3PYI0u1VHpZmkzcSY5utdKuqeRDw
SqHk/XO3bBWW13UuLBnIB8+jk1AGKDc0vJQrdzxxxSYHc7uTx30QHYQMdA90Bqyx
2z/lg7IwfITgFkRO30AGXbe2NN4YnbseJ5wyfHvlN/WcHrDo9zbZUt6gReVCcOWA
/NBftGe8WA==
=eHn2
-----END PGP SIGNATURE-----
Merge tag 'io_uring-6.6-2023-09-08' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
"A few fixes that should go into the 6.6-rc merge window:
- Fix for a regression this merge window caused by the SQPOLL
affinity patch, where we can race with SQPOLL thread shutdown and
cause an oops when trying to set affinity (Gabriel)
- Fix for a regression this merge window where fdinfo reading with
for a ring setup with IORING_SETUP_NO_SQARRAY will attempt to
deference the non-existing SQ ring array (me)
- Add the patch that allows more finegrained control over who can use
io_uring (Matteo)
- Locking fix for a regression added this merge window for IOPOLL
overflow (Pavel)
- IOPOLL fix for stable, breaking our loop if helper threads are
exiting (Pavel)
Also had a fix for unreaped iopoll requests from io-wq from Ming, but
we found an issue with that and hence it got reverted. Will get this
sorted for a future rc"
* tag 'io_uring-6.6-2023-09-08' of git://git.kernel.dk/linux:
Revert "io_uring: fix IO hang in io_wq_put_and_exit from do_exit()"
io_uring: fix unprotected iopoll overflow
io_uring: break out of iowq iopoll on teardown
io_uring: add a sysctl to disable io_uring system-wide
io_uring/fdinfo: only print ->sq_array[] if it's there
io_uring: fix IO hang in io_wq_put_and_exit from do_exit()
io_uring: Don't set affinity on a dying sqpoll thread
Introduce a new sysctl (io_uring_disabled) which can be either 0, 1, or
2. When 0 (the default), all processes are allowed to create io_uring
instances, which is the current behavior. When 1, io_uring creation is
disabled (io_uring_setup() will fail with -EPERM) for unprivileged
processes not in the kernel.io_uring_group group. When 2, calls to
io_uring_setup() fail with -EPERM regardless of privilege.
Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
[JEM: modified to add io_uring_group]
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/x49y1i42j1z.fsf@segfault.boston.devel.redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Unbound workqueues now support more flexible affinity scopes. The default
behavior is to soft-affine according to last level cache boundaries. A
work item queued from a given LLC is executed by a worker running on the
same LLC but the worker may be moved across cache boundaries as the
scheduler sees fit. On machines which multiple L3 caches, which are
becoming more popular along with chiplet designs, this improves cache
locality while not harming work conservation too much.
Unbound workqueues are now also a lot more flexible in terms of execution
affinity. Differeing levels of affinity scopes are supported and both the
default and per-workqueue affinity settings can be modified dynamically.
This should help working around amny of sub-optimal behaviors observed
recently with asymmetric ARM CPUs.
This involved signficant restructuring of workqueue code. Nothing was
reported yet but there's some risk of subtle regressions. Should keep an
eye out.
* Rescuer workers now has more identifiable comms.
* workqueue.unbound_cpus added so that CPUs which can be used by workqueue
can be constrained early during boot.
* Now that all the in-tree users have been flushed out, trigger warning if
system-wide workqueues are flushed.
* One pull commit from for-6.5-fixes to avoid cascading conflicts in the
affinity scope patchset.
-----BEGIN PGP SIGNATURE-----
iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZPERlQ4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGVqQAPwIOy9tWY5jFAmMuIyH6wV50hbmfxCc2n5xhQNr
5HoyGgEA8lw1W7afDCIPiQVA7AYsu8dhwuNSOcRCJxhrrn4XsA0=
=g/Uu
-----END PGP SIGNATURE-----
Merge tag 'wq-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
- Unbound workqueues now support more flexible affinity scopes.
The default behavior is to soft-affine according to last level cache
boundaries. A work item queued from a given LLC is executed by a
worker running on the same LLC but the worker may be moved across
cache boundaries as the scheduler sees fit. On machines which
multiple L3 caches, which are becoming more popular along with
chiplet designs, this improves cache locality while not harming work
conservation too much.
Unbound workqueues are now also a lot more flexible in terms of
execution affinity. Differeing levels of affinity scopes are
supported and both the default and per-workqueue affinity settings
can be modified dynamically. This should help working around amny of
sub-optimal behaviors observed recently with asymmetric ARM CPUs.
This involved signficant restructuring of workqueue code. Nothing was
reported yet but there's some risk of subtle regressions. Should keep
an eye out.
- Rescuer workers now has more identifiable comms.
- workqueue.unbound_cpus added so that CPUs which can be used by
workqueue can be constrained early during boot.
- Now that all the in-tree users have been flushed out, trigger warning
if system-wide workqueues are flushed.
* tag 'wq-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (31 commits)
workqueue: fix data race with the pwq->stats[] increment
workqueue: Rename rescuer kworker
workqueue: Make default affinity_scope dynamically updatable
workqueue: Add "Affinity Scopes and Performance" section to documentation
workqueue: Implement non-strict affinity scope for unbound workqueues
workqueue: Add workqueue_attrs->__pod_cpumask
workqueue: Factor out need_more_worker() check and worker wake-up
workqueue: Factor out work to worker assignment and collision handling
workqueue: Add multiple affinity scopes and interface to select them
workqueue: Modularize wq_pod_type initialization
workqueue: Add tools/workqueue/wq_dump.py which prints out workqueue configuration
workqueue: Generalize unbound CPU pods
workqueue: Factor out clearing of workqueue-only attrs fields
workqueue: Factor out actual cpumask calculation to reduce subtlety in wq_update_pod()
workqueue: Initialize unbound CPU pods later in the boot
workqueue: Move wq_pod_init() below workqueue_init()
workqueue: Rename NUMA related names to use pod instead
workqueue: Rename workqueue_attrs->no_numa to ->ordered
workqueue: Make unbound workqueues to use per-cpu pool_workqueues
workqueue: Call wq_update_unbound_numa() on all CPUs in NUMA node on CPU hotplug
...
* Per-cpu cpu usage stats are now tracked. This currently isn't printed out
in the cgroupfs interface and can only be accessed through e.g. BPF.
Should decide on a not-too-ugly way to show per-cpu stats in cgroupfs.
* cpuset received some cleanups and prepatory patches for the pending
cpus.exclusive patchset which will allow cpuset partitions to be created
below non-partition parents, which should ease the management of partition
cpusets.
* A lot of code and documentation cleanup patches.
* tools/testing/selftests/cgroup/test_cpuset.c is added. This causes trivial
conflicts in .gitignore and Makefile under the directory against
fe3b1bf19b ("selftests: cgroup: add test_zswap program"). They can be
resolved by keeping lines from both branches.
-----BEGIN PGP SIGNATURE-----
iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZPENTg4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGcyBAP44cHwpSFxXe3cehxAzb1l/2BZXtzU5l48OqUQd
MwHyrwEAm7+MTVAR2xOF4f+oVM9KWmKj7oV7Clpixl1S7hHyjwE=
=FCc9
-----END PGP SIGNATURE-----
Merge tag 'cgroup-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- Per-cpu cpu usage stats are now tracked
This currently isn't printed out in the cgroupfs interface and can
only be accessed through e.g. BPF. Should decide on a not-too-ugly
way to show per-cpu stats in cgroupfs
- cpuset received some cleanups and prepatory patches for the pending
cpus.exclusive patchset which will allow cpuset partitions to be
created below non-partition parents, which should ease the management
of partition cpusets
- A lot of code and documentation cleanup patches
- tools/testing/selftests/cgroup/test_cpuset.c added
* tag 'cgroup-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (32 commits)
cgroup: Avoid -Wstringop-overflow warnings
cgroup:namespace: Remove unused cgroup_namespaces_init()
cgroup/rstat: Record the cumulative per-cpu time of cgroup and its descendants
cgroup: clean up if condition in cgroup_pidlist_start()
cgroup: fix obsolete function name in cgroup_destroy_locked()
Documentation: cgroup-v2.rst: Correct number of stats entries
cgroup: fix obsolete function name above css_free_rwork_fn()
cgroup/cpuset: fix kernel-doc
cgroup: clean up printk()
cgroup: fix obsolete comment above cgroup_create()
docs: cgroup-v1: fix typo
docs: cgroup-v1: correct the term of Page Cache organization in inode
cgroup/misc: Store atomic64_t reads to u64
cgroup/misc: Change counters to be explicit 64bit types
cgroup/misc: update struct members descriptions
cgroup: remove cgrp->kn check in css_populate_dir()
cgroup: fix obsolete function name
cgroup: use cached local variable parent in for loop
cgroup: remove obsolete comment above struct cgroupstats
cgroup: put cgroup_tryget_css() inside CONFIG_CGROUP_SCHED
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmTxzmEACgkQCF8+vY7k
4RWP6A/+Ljbwdoq92qOcaKAG2h2HzJa/H+xKMQwqIYjpbXnjNuFD2S9FCRfhNa9b
Pt4K2g4lH2IJvYiJ3qhBbMxV8GPmovnHFX5LvyTFpRmrtZBAKp+TPXpbPt+a2/WL
IPfQ0I52/c/JNqhm3fnmKgpXorp0wHYNbfY/LXztslimZj95+t0qjW62BoBmsJ3s
hR+j/Xlgnd+9gld1OqX6OndH3mpeqDzBl4KZatQzw6yuIo8SK0ASEpu/vzgZoVy+
WiBtbzMuta2ZghnEHbnCkurwBSU/oLXhBmXsgp+Zdy0gglSk1RBdxM+3O65OVQt3
CCWSXMS0vGOk6JiogMpcPzO5piaUePcHEIjgAaaepTOzbKaf6PbEd9dj73LT9qcx
4TYFtGaDDhyDU4nzKTngfNiwmYrL1h+NuG119ZLHfrdH3MT7itIaydwFJRqLC+6D
7K6/1H2LKq25i+hRp5ZK2pgv0dAJw/nSdwFGFVgWM3Tuyt5dGdL/4SlZO4nIFKF2
pPWJUTJJP/0t9GUtwWmCh1fdgDr0A6Zg5M2OduyhC/YkqyLuD/02Bb4aR8hzloPj
pym+94/PFaT5S7zvKywpvyIc8U+87/M2tw+mAPN2r3i4c0RFJa7CkyKqlKTKFw13
jw7NLLlrRbZ3a3zlhpJVqGLKgF2FlWudLUo4Y4kddWvxTMbwYXs=
=yuz5
-----END PGP SIGNATURE-----
Merge tag 'media/v6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- new i2c drivers: ds90ub913, ds90ub953, ds90ub960, dw9719, ds90ub913
- new Intel IVSC MEI drivers
- some Mediatek platform drivers were moved to a common location
- Intel atomisp2 driver is now working with the main ov2680 driver. Due
to that, the atomisp2 ov2680 staging one was removed
- the bttv driver was finally converted to videobuf2 framework. This
was the last one upstream using videobuf version 1 core. We'll likely
remove the old videobuf framework on 6.7
- lots of improvements at atomisp driver: it now works with normal I2C
sensors. Several compile-mode dependecies to select between ISP2400
and ISP2401 are now solved in runtime
- a new ipu-bridge logic was added to work with IVSC MEI drivers
- venus driver gained better support for new VPU versions
- the v4l core async framework has gained lots of improvements and
cleanups
- lots of other cleanups, improvements and driver fixes
* tag 'media/v6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (358 commits)
media: ivsc: Add ACPI dependency
media: bttv: convert to vb2
media: bttv: use audio defaults for winfast2000
media: bttv: refactor bttv_set_dma()
media: bttv: move vbi_skip/vbi_count out of buffer
media: bttv: remove crop info from bttv_buffer
media: bttv: remove tvnorm field from bttv_buffer
media: bttv: remove format field from bttv_buffer
media: bttv: move do_crop flag out of bttv_fh
media: bttv: copy vbi_fmt from bttv_fh
media: bttv: copy vid fmt/width/height from fh
media: bttv: radio use v4l2_fh instead of bttv_fh
media: bttv: replace BUG with WARN_ON
media: bttv: use video_drvdata to get bttv
media: i2c: rdacm21: Fix uninitialized value
media: coda: Remove duplicated include
media: vivid: fix the racy dev->radio_tx_rds_owner
media: i2c: ccs: Check rules is non-NULL
media: i2c: ds90ub960: Fix PLL config for 1200 MHz CSI rate
media: i2c: ds90ub953: Fix use of uninitialized variables
...
Here is the big set of char/misc and other small driver subsystem
changes for 6.6-rc1.
Stuff all over the place here, lots of driver updates and changes and
new additions. Short summary is:
- new IIO drivers and updates
- Interconnect driver updates
- fpga driver updates and additions
- fsi driver updates
- mei driver updates
- coresight driver updates
- nvmem driver updates
- counter driver updates
- lots of smaller misc and char driver updates and additions
All of these have been in linux-next for a long time with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH64g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynr2QCfd3RKeR+WnGzyEOFhksl30UJJhiIAoNZtYT5+
t9KG0iMDXRuTsOqeEQbd
=tVnk
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of char/misc and other small driver subsystem
changes for 6.6-rc1.
Stuff all over the place here, lots of driver updates and changes and
new additions. Short summary is:
- new IIO drivers and updates
- Interconnect driver updates
- fpga driver updates and additions
- fsi driver updates
- mei driver updates
- coresight driver updates
- nvmem driver updates
- counter driver updates
- lots of smaller misc and char driver updates and additions
All of these have been in linux-next for a long time with no reported
problems"
* tag 'char-misc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (267 commits)
nvmem: core: Notify when a new layout is registered
nvmem: core: Do not open-code existing functions
nvmem: core: Return NULL when no nvmem layout is found
nvmem: core: Create all cells before adding the nvmem device
nvmem: u-boot-env:: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
nvmem: sec-qfprom: Add Qualcomm secure QFPROM support
dt-bindings: nvmem: sec-qfprom: Add bindings for secure qfprom
dt-bindings: nvmem: Add compatible for QCM2290
nvmem: Kconfig: Fix typo "drive" -> "driver"
nvmem: Explicitly include correct DT includes
nvmem: add new NXP QorIQ eFuse driver
dt-bindings: nvmem: Add t1023-sfp efuse support
dt-bindings: nvmem: qfprom: Add compatible for MSM8226
nvmem: uniphier: Use devm_platform_get_and_ioremap_resource()
nvmem: qfprom: do some cleanup
nvmem: stm32-romem: Use devm_platform_get_and_ioremap_resource()
nvmem: rockchip-efuse: Use devm_platform_get_and_ioremap_resource()
nvmem: meson-mx-efuse: Convert to devm_platform_ioremap_resource()
nvmem: lpc18xx_otp: Convert to devm_platform_ioremap_resource()
nvmem: brcm_nvram: Use devm_platform_get_and_ioremap_resource()
...
Here is the big set of tty and serial driver changes for 6.6-rc1.
Lots of cleanups in here this cycle, and some driver updates. Short
summary is:
- Jiri's continued work to make the tty code and apis be a bit more
sane with regards to modern kernel coding style and types
- cpm_uart driver updates
- n_gsm updates and fixes
- meson driver updates
- sc16is7xx driver updates
- 8250 driver updates for different hardware types
- qcom-geni driver fixes
- tegra serial driver change
- stm32 driver updates
- synclink_gt driver cleanups
- tty structure size reduction
All of these have been in linux-next this week with no reported issues.
The last bit of cleanups from Jiri and the tty structure size reduction
came in last week, a bit late but as they were just style changes and
size reductions, I figured they should get into this merge cycle so that
others can work on top of them with no merge conflicts.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH+jA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykKyACgldt6QeenTN+6dXIHS/eQHtTKZwMAn3arSeXI
QrUUnLFjOWyoX87tbMBQ
=LVw0
-----END PGP SIGNATURE-----
Merge tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big set of tty and serial driver changes for 6.6-rc1.
Lots of cleanups in here this cycle, and some driver updates. Short
summary is:
- Jiri's continued work to make the tty code and apis be a bit more
sane with regards to modern kernel coding style and types
- cpm_uart driver updates
- n_gsm updates and fixes
- meson driver updates
- sc16is7xx driver updates
- 8250 driver updates for different hardware types
- qcom-geni driver fixes
- tegra serial driver change
- stm32 driver updates
- synclink_gt driver cleanups
- tty structure size reduction
All of these have been in linux-next this week with no reported
issues. The last bit of cleanups from Jiri and the tty structure size
reduction came in last week, a bit late but as they were just style
changes and size reductions, I figured they should get into this merge
cycle so that others can work on top of them with no merge conflicts"
* tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits)
tty: shrink the size of struct tty_struct by 40 bytes
tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw()
tty: n_tty: extract ECHO_OP processing to a separate function
tty: n_tty: unify counts to size_t
tty: n_tty: use u8 for chars and flags
tty: n_tty: simplify chars_in_buffer()
tty: n_tty: remove unsigned char casts from character constants
tty: n_tty: move newline handling to a separate function
tty: n_tty: move canon handling to a separate function
tty: n_tty: use MASK() for masking out size bits
tty: n_tty: make n_tty_data::num_overrun unsigned
tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun()
tty: n_tty: use 'num' for writes' counts
tty: n_tty: use output character directly
tty: n_tty: make flow of n_tty_receive_buf_common() a bool
Revert "tty: serial: meson: Add a earlycon for the T7 SoC"
Documentation: devices.txt: Fix minors for ttyCPM*
Documentation: devices.txt: Remove ttySIOC*
Documentation: devices.txt: Remove ttyIOC*
serial: 8250_bcm7271: improve bcm7271 8250 port
...
Here is the big set of USB, Thunderbolt, and PHY driver updates for
6.6-rc1. Included in here are:
- PHY driver additions and cleanups
- Thunderbolt minor additions and fixes
- USB MIDI 2 gadget support added
- dwc3 driver updates and additions
- Removal of some old USB wireless code that was missed when that
codebase was originally removed a few years ago, cleaning up some
core USB code paths
- USB core potential use-after-free fixes that syzbot from different
people/groups keeps tripping over
- typec updates and additions
- gadget fixes and cleanups
- loads of smaller USB core and driver cleanups all over the place
Full details are in the shortlog. All of these have been in linux-next
for a while with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPIAOQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yn80gCgybzMp0YnSildFetSC8lUJTnzjQcAn3KWzb75
Zt72jxGl4ZOXHEpozG4O
=FLrK
-----END PGP SIGNATURE-----
Merge tag 'usb-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt / PHY driver updates from Greg KH:
"Here is the big set of USB, Thunderbolt, and PHY driver updates for
6.6-rc1. Included in here are:
- PHY driver additions and cleanups
- Thunderbolt minor additions and fixes
- USB MIDI 2 gadget support added
- dwc3 driver updates and additions
- Removal of some old USB wireless code that was missed when that
codebase was originally removed a few years ago, cleaning up some
core USB code paths
- USB core potential use-after-free fixes that syzbot from different
people/groups keeps tripping over
- typec updates and additions
- gadget fixes and cleanups
- loads of smaller USB core and driver cleanups all over the place
Full details are in the shortlog. All of these have been in linux-next
for a while with no reported problems"
* tag 'usb-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
platform/chrome: cros_ec_typec: Configure Retimer cable type
tcpm: Avoid soft reset when partner does not support get_status
usb: typec: tcpm: reset counter when enter into unattached state after try role
usb: typec: tcpm: set initial svdm version based on pd revision
USB: serial: option: add FOXCONN T99W368/T99W373 product
USB: serial: option: add Quectel EM05G variant (0x030e)
usb: dwc2: add pci_device_id driver_data parse support
usb: gadget: remove max support speed info in bind operation
usb: gadget: composite: cleanup function config_ep_by_speed_and_alt()
usb: gadget: config: remove max speed check in usb_assign_descriptors()
usb: gadget: unconditionally allocate hs/ss descriptor in bind operation
usb: gadget: f_uvc: change endpoint allocation in uvc_function_bind()
usb: gadget: add a inline function gether_bitrate()
usb: gadget: use working speed to calcaulate network bitrate and qlen
dt-bindings: usb: samsung,exynos-dwc3: Add Exynos850 support
usb: dwc3: exynos: Add support for Exynos850 variant
usb: gadget: udc-xilinx: fix incorrect type in assignment warning
usb: gadget: udc-xilinx: fix cast from restricted __le16 warning
usb: gadget: udc-xilinx: fix restricted __le16 degrades to integer warning
USB: dwc2: hande irq on dead controller correctly
...
* Support for the new "riscv,isa-extensions" and "riscv,isa-base" device
tree interfaces for probing extensions.
* Support for userspace access to the performance counters.
* Support for more instructions in kprobes.
* Crash kernels can be allocated above 4GiB.
* Support for KCFI.
* Support for ELFs in !MMU configurations.
* ARCH_KMALLOC_MINALIGN has been reduced to 8.
* mmap() defaults to sv48-sized addresses, with longer addresses hidden
behind a hint (similar to Arm and Intel).
* Also various fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmTx96kTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiVjRD/9DYVLlkQ/OEDJjPaEcYCP49xgIVUUU
lhs3XbSs2VNHBaiG114f6Q0AaT/uNi+uqSej3CeTmEot2kZkBk/f2yu+UNIriPZ9
GQiZsdyXhu921C+5VFtiI47KDWOVZ+Jpy3M1ll61IWt3yPSQHr1xOP0AOiyHHqe3
cmqpNnzjajlfVDoXPc2mGGzUJt/7ar4thcwnMNi98raXR5Qh7SP6rrHjoQhE1oFk
LMP3CHqEAcHE2tE4CxZVpc6HOQ5m0LpQIOK7ypufGMyoIYESm5dt/JOT4MlhTtDw
6JzyVKtiM7lartUnUaW3ZoX4trQYT5gbXxWrJ2gCnUGy3VulikoXr1Rpz0qfdeOR
XN8OLkVAqHfTGFI7oKk24f9Adw96R5NPZcdCay90h4J/kMfCiC7ZyUUI1XIa5iy1
np5pZCkf8HNcdywML7qcFd5n2O0wchyFnRLFZo6kJP9Ls5cEi6kBx/1jSdTcNgx/
fUKXyoEcriGoQiiwn29+4RZnU69gJV3zqQNLPpuwDQ5F/Q1zHTlrr+dqzezKkzcO
dRTV2d2Q4A5vIDXPptzNNLlRQdrc8qxPJ1lxQVkPIU4/mtqczmZBwlyY2u9zwPyS
sehJgJZnoAf+jm71NgQAKLck4MUBsMnMogOWunhXkVRCoZlbbkUWX4ECZYwPKsVk
W7zVPmLvSM0l5g==
=/tXb
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for the new "riscv,isa-extensions" and "riscv,isa-base"
device tree interfaces for probing extensions
- Support for userspace access to the performance counters
- Support for more instructions in kprobes
- Crash kernels can be allocated above 4GiB
- Support for KCFI
- Support for ELFs in !MMU configurations
- ARCH_KMALLOC_MINALIGN has been reduced to 8
- mmap() defaults to sv48-sized addresses, with longer addresses hidden
behind a hint (similar to Arm and Intel)
- Also various fixes and cleanups
* tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
riscv: support PREEMPT_DYNAMIC with static keys
riscv: Move create_tmp_mapping() to init sections
riscv: Mark KASAN tmp* page tables variables as static
riscv: mm: use bitmap_zero() API
riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
riscv: remove redundant mv instructions
RISC-V: mm: Document mmap changes
RISC-V: mm: Update pgtable comment documentation
RISC-V: mm: Add tests for RISC-V mm
RISC-V: mm: Restrict address space for sv39,sv48,sv57
riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
riscv: allow kmalloc() caches aligned to the smallest value
riscv: support the elf-fdpic binfmt loader
binfmt_elf_fdpic: support 64-bit systems
riscv: Allow CONFIG_CFI_CLANG to be selected
riscv/purgatory: Disable CFI
riscv: Add CFI error handling
riscv: Add ftrace_stub_graph
riscv: Add types to indirectly called assembly functions
...
- Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
configured SMT state when hotplugging CPUs into the system.
- Combine final TLB flush and lazy TLB mm shootdown IPIs when using the Radix
MMU to avoid a broadcast TLBIE flush on exit.
- Drop the exclusion between ptrace/perf watchpoints, and drop the now unused
associated arch hooks.
- Add support for the "nohlt" command line option to disable CPU idle.
- Add support for -fpatchable-function-entry for ftrace, with GCC >= 13.1.
- Rework memory block size determination, and support 256MB size on systems
with GPUs that have hotpluggable memory.
- Various other small features and fixes.
Thanks to: Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev,
Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam Menghani, Geoff Levand,
Hari Bathini, Immad Mir, Jialin Zhang, Joel Stanley, Jordan Niethe, Justin
Stitt, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He,
Linus Walleij, Mahesh Salgaonkar, Masahiro Yamada, Michal Suchanek, Nageswara
R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick
Desaulniers, Omar Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell
Currey, Sourabh Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav
Jain, Xiongfeng Wang, Yuan Tan, Zhang Rui, Zheng Zengkai.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmTwgbwTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFmpD/432vipeoqvkAYsyK0xi/Y3GcY0wcyd
WJApLXXadEbtKQrgXQ6sowWqalg5thYnQCRarg/tXKK/po3KfgwkPjGDpOL+cIdr
12QVN2XJm9VmJ1wYJxzk+yXx4F43AdmMdr94qWAGufbTHezwb4UpzVR1NxtFrOE/
X5TNsC2+2mdZY/ZaNHS5vsTIFv3EhQfqgjZPlIAdLn6CGc8xWT514Q/uHA8+ytM/
HL7Hqs33DoPSvgTa5TT/2E0d0k5nO3P5KObzAjpYlireTPaBi51mpKGewcrtm0o2
v3cBlbfx3C7pe9ZhKBK9BH8cjynfiqsVZ9/lCw/7eBNdm9tHuzG0jeS7Db9tCZXS
fM7G2R7SoIusPTqxlBmkU5DpYslwrHiVgCyy3ijxkoA/fakVwh/GgTcMsRt73IY6
n6DsUvWwuYHCIeIiHmHQJqCqCRtV+aMzU3AbbBHOjtdIanhlW16M686dEsgCirh7
akRVRD5VqKaqXs34PpkRL89Xv3wZRjl6XZ3hZFfCjSYXfpXDXhgSToIskpHYhKL8
gpY7WtG9YQP05Xz5HRCx6EluaZVeKe0lZi6fezX7Mi9AygJQO8FfXqP1mHBlEq40
ThWtvL9D89RV6lADqqFN20XepgvKNOyAXcE4szvsnIZYUSPmZQZSPxx+DHtROaLP
jX3ifxtxJp92pQ==
=5g7K
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
configured SMT state when hotplugging CPUs into the system
- Combine final TLB flush and lazy TLB mm shootdown IPIs when using the
Radix MMU to avoid a broadcast TLBIE flush on exit
- Drop the exclusion between ptrace/perf watchpoints, and drop the now
unused associated arch hooks
- Add support for the "nohlt" command line option to disable CPU idle
- Add support for -fpatchable-function-entry for ftrace, with GCC >=
13.1
- Rework memory block size determination, and support 256MB size on
systems with GPUs that have hotpluggable memory
- Various other small features and fixes
Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam
Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel
Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof
Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar,
Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor,
Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar
Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh
Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain,
Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai.
* tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits)
macintosh/ams: linux/platform_device.h is needed
powerpc/xmon: Reapply "Relax frame size for clang"
powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
powerpc/mpc5xxx: Add missing fwnode_handle_put()
powerpc/config: Disable SLAB_DEBUG_ON in skiroot
powerpc/pseries: Remove unused hcall tracing instruction
powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
powerpc: dts: add missing space before {
powerpc/eeh: Use pci_dev_id() to simplify the code
powerpc/64s: Move CPU -mtune options into Kconfig
powerpc/powermac: Fix unused function warning
powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
powerpc: Don't include lppaca.h in paca.h
powerpc/pseries: Move hcall_vphn() prototype into vphn.h
powerpc/pseries: Move VPHN constants into vphn.h
cxl: Drop unused detach_spa()
powerpc: Drop zalloc_maybe_bootmem()
powerpc/powernv: Use struct opal_prd_msg in more places
...
Chen Jiahao <chenjiahao16@huawei.com> says:
On riscv, the current crash kernel allocation logic is trying to
allocate within 32bit addressible memory region by default, if
failed, try to allocate without 4G restriction.
In need of saving DMA zone memory while allocating a relatively large
crash kernel region, allocating the reserved memory top down in
high memory, without overlapping the DMA zone, is a mature solution.
Hence this patchset introduces the parameter option crashkernel=X,[high,low].
One can reserve the crash kernel from high memory above DMA zone range
by explicitly passing "crashkernel=X,high"; or reserve a memory range
below 4G with "crashkernel=X,low". Besides, there are few rules need
to take notice:
1. "crashkernel=X,[high,low]" will be ignored if "crashkernel=size"
is specified.
2. "crashkernel=X,low" is valid only when "crashkernel=X,high" is passed
and there is enough memory to be allocated under 4G.
3. When allocating crashkernel above 4G and no "crashkernel=X,low" is
specified, a 128M low memory will be allocated automatically for
swiotlb bounce buffer.
See Documentation/admin-guide/kernel-parameters.txt for more information.
To verify loading the crashkernel, adapted kexec-tools is attached below:
https://github.com/chenjh005/kexec-tools/tree/build-test-riscv-v2
Following test cases have been performed as expected:
1) crashkernel=256M //low=256M
2) crashkernel=1G //low=1G
3) crashkernel=4G //high=4G, low=128M(default)
4) crashkernel=4G crashkernel=256M,high //high=4G, low=128M(default), high is ignored
5) crashkernel=4G crashkernel=256M,low //high=4G, low=128M(default), low is ignored
6) crashkernel=4G,high //high=4G, low=128M(default)
7) crashkernel=256M,low //low=0M, invalid
8) crashkernel=4G,high crashkernel=256M,low //high=4G, low=256M
9) crashkernel=4G,high crashkernel=4G,low //high=0M, low=0M, invalid
10) crashkernel=512M@0xd0000000 //low=512M
11) crashkernel=1G,high crashkernel=0M,low //high=1G, low=0M
* b4-shazam-merge:
docs: kdump: Update the crashkernel description for riscv
riscv: kdump: Implement crashkernel=X,[high,low]
Link: https://lore.kernel.org/r/20230726175000.2536220-1-chenjiahao16@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
- Work from Carlos Bilbao to integrate rustdoc output into the generated
HTML documentation. This took some work to figure out how to do it
without slowing the docs build and without creating people who don't have
Rust installed, but Carlos got there.
- Move the loongarch and mips architecture documentation under
Documentation/arch/.
- Some more maintainer documentation from Jakub
...plus the usual assortment of updates, translations, and fixes.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmTvqNkPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YgIgH/3drfLtlFtzLqDOzrzDXS8yGnE3pPdxw796b
/ZFzAK16wYKaKevYoIz8bVGGKaE1sEUW0mhlq4KGdfZuxLG8YnWS8URyCW4FDU2E
6qNL+8oJ8LZfID46f9Q8ZgfEz7yF/mhCqPk7MEswYtwbscs2ZTGCTGYB/5BHlBuT
LR+M89uLmHgr8S1o24v30OgiX+VvQFyu0xoxIhbiqUZvBd/XdfX2pgYd9BGzMj5q
C2ZP+V14g36c5pV0EO9TwhCXOF/WVrp7DbjbfWAsqBSLxvpXPydH2q1DUzGeQtP1
exujrBD1O8q3pPdaNA5R+h6cWlHmUZug9mE4BRLp9ErGrozwJsQ=
=C3Uv
-----END PGP SIGNATURE-----
Merge tag 'docs-6.6' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"Documentation work keeps chugging along; this includes:
- Work from Carlos Bilbao to integrate rustdoc output into the
generated HTML documentation. This took some work to figure out how
to do it without slowing the docs build and without creating people
who don't have Rust installed, but Carlos got there
- Move the loongarch and mips architecture documentation under
Documentation/arch/
- Some more maintainer documentation from Jakub
... plus the usual assortment of updates, translations, and fixes"
* tag 'docs-6.6' of git://git.lwn.net/linux: (56 commits)
Docu: genericirq.rst: fix irq-example
input: docs: pxrc: remove reference to phoenix-sim
Documentation: serial-console: Fix literal block marker
docs/mm: remove references to hmm_mirror ops and clean typos
docs/zh_CN: correct regi_chg(),regi_add() to region_chg(),region_add()
Documentation: Fix typos
Documentation/ABI: Fix typos
scripts: kernel-doc: fix macro handling in enums
scripts: kernel-doc: parse DEFINE_DMA_UNMAP_[ADDR|LEN]
Documentation: riscv: Update boot image header since EFI stub is supported
Documentation: riscv: Add early boot document
Documentation: arm: Add bootargs to the table of added DT parameters
docs: kernel-parameters: Refer to the correct bitmap function
doc: update params of memhp_default_state=
docs: Add book to process/kernel-docs.rst
docs: sparse: fix invalid link addresses
docs: vfs: clean up after the iterate() removal
docs: Add a section on surveys to the researcher guidelines
docs: move mips under arch
docs: move loongarch under arch
...
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmTuDHkLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYOqvhAApMk2/ceTgVH17sXaKE822+xKvgv377O6TlggMeGG
W4zA0KD69DNz0AfaaCc5U5f7n8Ld/YY1RsvkHW4b3jgw+KRTeQr0jjitBgP5kP2M
A1+qxdyJpCTwiPt9s2+JFVPeyZ0s52V6OJODKRG3s0ore55R+U09VySKtASON+q3
GMKfWqQteKC+thg7NkrQ7JUixuo84oICws+rZn4K9ifsX2O0HYW6aMW0feRfZjJH
r0TgqZc4RdPTSaF22oapR9Ls39+7hp/pBvoLm5sBNA3cl5C3X4VWo9ERMU1jW9h+
VYQv39NycUspgskWJmpbU06/+ooYqQlwHSR/vdNusmFIvxo4tf6/UX72YO5F8Dar
ap0wYGauiEwTjSnhVxPTXk3obWyWEsgFAeRnPdTlH2CNmv38QZU2HLb8eU1pcXxX
j+WI2Ewy9z22uBVYiPOKpdW1jkSfmlmfPp/8SbAdua7I3YQ90rQN6AvU06zAi/cL
NQTgO81E4jPkygqAVgS/LeYziWAQ73yM7m9ExThtTgqFtHortwhJ4Fd8XKtvtvEb
viXAZ/WZtQBv/CIKAW98NhgIDP/SPOT8ym6V35WK+kkNFMS6LMSQUfl9GgbHGyFa
n9icMm7BmbDtT1+AKNafG9En4DtAf9M9QNidAVOyfrsIk6S0gZoZwvIStkA7on8a
cNY=
=kVVr
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-maping updates from Christoph Hellwig:
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
* tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: optimize get_max_slots()
swiotlb: move slot allocation explanation comment where it belongs
swiotlb: search the software IO TLB only if the device makes use of it
swiotlb: allocate a new memory pool when existing pools are full
swiotlb: determine potential physical address limit
swiotlb: if swiotlb is full, fall back to a transient memory pool
swiotlb: add a flag whether SWIOTLB is allowed to grow
swiotlb: separate memory pool data from other allocator data
swiotlb: add documentation and rename swiotlb_do_find_slots()
swiotlb: make io_tlb_default_mem local to swiotlb.c
swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated
dma-contiguous: check for memory region overlap
dma-contiguous: support numa CMA for specified node
dma-contiguous: support per-numa CMA for all architectures
dma-mapping: move arch_dma_set_mask() declaration to header
swiotlb: unexport is_swiotlb_active
x86: always initialize xen-swiotlb when xen-pcifront is enabling
xen/pci: add flag for PCI passthrough being possible
("refactor Kconfig to consolidate KEXEC and CRASH options").
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h").
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands").
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions").
- Switch the handling of kdump from a udev scheme to in-kernel handling,
by Eric DeVolder ("crash: Kernel handling of CPU and memory hot
un/plug").
- Many singleton patches to various parts of the tree
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA
juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy
QHy7lL0Syha38kKLMXTM+bN6YQHi9AU=
=WJQa
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support tracking
KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the GENERIC_IOREMAP
ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take
GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency improvements
("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation, from
Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code ("Two
minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most
file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap
on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table range
API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM subsystem
documentation ("Improve mm documentation").
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA
jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6
Nfio7afS1ncD+hPYT8947UnLxTgn+ww=
=Afws
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Some swap cleanups from Ma Wupeng ("fix WARN_ON in
add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support
tracking KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with
UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the
GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
architectures to take GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency
improvements ("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation,
from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code
("Two minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
most file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
memmap on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock
migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table
range API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM
subsystem documentation ("Improve mm documentation").
* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
maple_tree: shrink struct maple_tree
maple_tree: clean up mas_wr_append()
secretmem: convert page_is_secretmem() to folio_is_secretmem()
nios2: fix flush_dcache_page() for usage from irq context
hugetlb: add documentation for vma_kernel_pagesize()
mm: add orphaned kernel-doc to the rst files.
mm: fix clean_record_shared_mapping_range kernel-doc
mm: fix get_mctgt_type() kernel-doc
mm: fix kernel-doc warning from tlb_flush_rmaps()
mm: remove enum page_entry_size
mm: allow ->huge_fault() to be called without the mmap_lock held
mm: move PMD_ORDER to pgtable.h
mm: remove checks for pte_index
memcg: remove duplication detection for mem_cgroup_uncharge_swap
mm/huge_memory: work on folio->swap instead of page->private when splitting folio
mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
mm/swap: use dedicated entry for swap in folio
mm/swap: stop using page->private on tail pages for THP_SWAP
selftests/mm: fix WARNING comparing pointer to 0
selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
...
Contents:
- Restrict linking of keys to .ima and .evm keyrings based on
digitalSignature attribute in the certificate.
- PowerVM: load machine owner keys into the .machine [1] keyring.
- PowerVM: load module signing keys into the secondary trusted keyring
(keys blessed by the vendor).
- tpm_tis_spi: half-duplex transfer mode
- tpm_tis: retry corrupted transfers
- Apply revocation list (.mokx) to an all system keyrings (e.g. .machine
keyring).
[1] https://blogs.oracle.com/linux/post/the-machine-keyring
BR, Jarkko
-----BEGIN PGP SIGNATURE-----
iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCZN5/qBIcamFya2tvQGtl
cm5lbC5vcmcACgkQGnq6IXRrq9J4GQEAstTtQfGGrx5KInOTMWOvaq/Cum5iW4AD
NefVfbUtCCQBANvFtxoPYQS5u6+rIdxzIwFiNUlOyt2uR2bkk4UUiPML
=Vvs8
-----END PGP SIGNATURE-----
Merge tag 'tpmdd-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
- Restrict linking of keys to .ima and .evm keyrings based on
digitalSignature attribute in the certificate
- PowerVM: load machine owner keys into the .machine [1] keyring
- PowerVM: load module signing keys into the secondary trusted keyring
(keys blessed by the vendor)
- tpm_tis_spi: half-duplex transfer mode
- tpm_tis: retry corrupted transfers
- Apply revocation list (.mokx) to an all system keyrings (e.g.
.machine keyring)
Link: https://blogs.oracle.com/linux/post/the-machine-keyring [1]
* tag 'tpmdd-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
certs: Reference revocation list for all keyrings
tpm/tpm_tis_synquacer: Use module_platform_driver macro to simplify the code
tpm: remove redundant variable len
tpm_tis: Resend command to recover from data transfer errors
tpm_tis: Use responseRetry to recover from data transfer errors
tpm_tis: Move CRC check to generic send routine
tpm_tis_spi: Add hardware wait polling
KEYS: Replace all non-returning strlcpy with strscpy
integrity: PowerVM support for loading third party code signing keys
integrity: PowerVM machine keyring enablement
integrity: check whether imputed trust is enabled
integrity: remove global variable from machine_keyring.c
integrity: ignore keys failing CA restrictions on non-UEFI platform
integrity: PowerVM support for loading CA keys on machine keyring
integrity: Enforce digitalSignature usage in the ima and evm keyrings
KEYS: DigitalSignature link restriction
tpm_tis: Revert "tpm_tis: Disable interrupts on ThinkPad T490s"
- Update the ACPICA code in the kernel to upstream revision 20230628
including the following changes:
* Suppress a GCC 12 dangling-pointer warning (Philip Prindeville).
* Reformat the ACPI_STATE_COMMON macro and its users (George Guo).
* Replace the ternary operator with ACPI_MIN() (Jiangshan Yi).
* Add support for _DSC as per ACPI 6.5 (Saket Dumbre).
* Remove a duplicate macro from zephyr header (Najumon B.A).
* Add data structures for GED and _EVT tracking (Jose Marinho).
* Fix misspelled CDAT DSMAS define (Dave Jiang).
* Simplify an error message in acpi_ds_result_push() (Christophe
Jaillet).
* Add a struct size macro related to SRAT (Dave Jiang).
* Add AML_NO_OPERAND_RESOLVE flag to Timer (Abhishek Mainkar).
* Add support for RISC-V external interrupt controllers in MADT (Sunil
V L).
* Add RHCT flags, CMO and MMU nodes (Sunil V L).
* Change ACPICA version to 20230628 (Bob Moore).
- Introduce new wrappers for ACPICA notify handler install/remove and
convert multiple drivers to using their own Notify() handlers instead
of the ACPI bus type .notify() slated for removal (Michal Wilczynski).
- Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2 (Hans
de Goede).
- Put ACPI video and its child devices explicitly into D0 on boot to
avoid platform firmware confusion (Kai-Heng Feng).
- Add backlight=native DMI quirk for Lenovo Ideapad Z470 (Jiri Slaby).
- Support obtaining physical CPU ID from MADT on LoongArch (Bibo Mao).
- Convert ACPI CPU initialization to using _OSC instead of _PDC that
has been depreceted since 2018 and dropped from the specification in
ACPI 6.5 (Michal Wilczynski, Rafael Wysocki).
- Drop non-functional nocrt parameter from ACPI thermal (Mario
Limonciello).
- Clean up the ACPI thermal driver, rework the handling of firmware
notifications in it and make it provide a table of generic trip point
structures to the core during initialization (Rafael Wysocki).
- Defer enumeration of devices with _DEP pointing to IVSC (Wentong Wu).
- Install SystemCMOS address space handler for ACPI000E (TAD) to meet
platform firmware expectations on some platforms (Zhang Rui).
- Fix finding the generic error data in the ACPi extlog driver for
compatibility with old and new firmware interface versions (Xiaochun
Lee).
- Remove assorted unused declarations of functions (Yue Haibing).
- Move AMBA bus scan handling into arm64 specific directory (Sudeep
Holla).
- Fix and clean up suspend-to-idle interface for AMD systems (Mario
Limonciello, Andy Shevchenko).
- Fix string truncation warning in pnpacpi_add_device() (Sunil V L).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmTslAkSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxtvgQAIsYAHXgalMFCLTqxeRZ8IMMLh/oxVfp
RDbZ7lsBNzhYKXNv0jczjDApvfOctplh+qVuRIyFPrVrD+xzPgI9eFaZx41B3UPy
3UG/KphjtF+sAu8rSFM2qcL2P+NGWe1Okr2G8Mbvu6bkGjfHShX6HGR7YgIyrZVO
VMVLWqLvjvvg/jkraozArTtlvMwTrpBFxgWQVPK9TLRwx+JfB7WV6Z1XkgbllpR5
8Q+O9wyVWfAIrLk/HZJdYNkdwREi46d226ePs6tJkwzE8KCUna0U4WbVVPRjbSfI
Mkpy7Bnn91EZdN43/3ZzPBNO6NJi7UsD46EwRMzf1v7KyfsttnKw/v/SMke6gKQW
43gi0trdbVxEdnN33CmBb2k82YQ5tjVuNOwKNy4xvFZ4vTE+WTz2imXMdW28biLT
sU1eYJXPzBUQ4Ja2x56a1DOp2C1uOcSHbTKNzmtFsmT2FIWOKN+9PrOjaFEn7cyU
FwWSzcONLE/QC0cSr7cWYym06aY8INtAlCm0hrlpwBDyiVDACz/QZ3NQAmbXKr5z
5JIDQ+8v6YvpIx48yDlTYaQPMEskkZ2udkmwZCh6Vs0fMGyo3DDWvAIltPmHuIoj
KekfDK5pkJjftK67IlHioAkS3KGHy4VTybi4Sx8KjYy9Q4GPQ4TL2h18kInXGu4f
tv7U9J6zx9mJ
=aXc6
-----END PGP SIGNATURE-----
Merge tag 'acpi-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These include new ACPICA material, a rework of the ACPI thermal
driver, a switch-over of the ACPI processor driver to using _OSC
instead of (long deprecated) _PDC for CPU initialization, a rework of
firmware notifications handling in several drivers, fixes and cleanups
for suspend-to-idle handling on AMD systems, ACPI backlight driver
updates and more.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20230628
including the following changes:
- Suppress a GCC 12 dangling-pointer warning (Philip Prindeville)
- Reformat the ACPI_STATE_COMMON macro and its users (George Guo)
- Replace the ternary operator with ACPI_MIN() (Jiangshan Yi)
- Add support for _DSC as per ACPI 6.5 (Saket Dumbre)
- Remove a duplicate macro from zephyr header (Najumon B.A)
- Add data structures for GED and _EVT tracking (Jose Marinho)
- Fix misspelled CDAT DSMAS define (Dave Jiang)
- Simplify an error message in acpi_ds_result_push() (Christophe
Jaillet)
- Add a struct size macro related to SRAT (Dave Jiang)
- Add AML_NO_OPERAND_RESOLVE flag to Timer (Abhishek Mainkar)
- Add support for RISC-V external interrupt controllers in MADT
(Sunil V L)
- Add RHCT flags, CMO and MMU nodes (Sunil V L)
- Change ACPICA version to 20230628 (Bob Moore)
- Introduce new wrappers for ACPICA notify handler install/remove and
convert multiple drivers to using their own Notify() handlers
instead of the ACPI bus type .notify() slated for removal (Michal
Wilczynski)
- Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2
(Hans de Goede)
- Put ACPI video and its child devices explicitly into D0 on boot to
avoid platform firmware confusion (Kai-Heng Feng)
- Add backlight=native DMI quirk for Lenovo Ideapad Z470 (Jiri Slaby)
- Support obtaining physical CPU ID from MADT on LoongArch (Bibo Mao)
- Convert ACPI CPU initialization to using _OSC instead of _PDC that
has been depreceted since 2018 and dropped from the specification
in ACPI 6.5 (Michal Wilczynski, Rafael Wysocki)
- Drop non-functional nocrt parameter from ACPI thermal (Mario
Limonciello)
- Clean up the ACPI thermal driver, rework the handling of firmware
notifications in it and make it provide a table of generic trip
point structures to the core during initialization (Rafael Wysocki)
- Defer enumeration of devices with _DEP pointing to IVSC (Wentong
Wu)
- Install SystemCMOS address space handler for ACPI000E (TAD) to meet
platform firmware expectations on some platforms (Zhang Rui)
- Fix finding the generic error data in the ACPi extlog driver for
compatibility with old and new firmware interface versions
(Xiaochun Lee)
- Remove assorted unused declarations of functions (Yue Haibing)
- Move AMBA bus scan handling into arm64 specific directory (Sudeep
Holla)
- Fix and clean up suspend-to-idle interface for AMD systems (Mario
Limonciello, Andy Shevchenko)
- Fix string truncation warning in pnpacpi_add_device() (Sunil V L)"
* tag 'acpi-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (66 commits)
ACPI: x86: s2idle: Add a function to get LPS0 constraint for a device
ACPI: x86: s2idle: Add for_each_lpi_constraint() helper
ACPI: x86: s2idle: Add more debugging for AMD constraints parsing
ACPI: x86: s2idle: Fix a logic error parsing AMD constraints table
ACPI: x86: s2idle: Catch multiple ACPI_TYPE_PACKAGE objects
ACPI: x86: s2idle: Post-increment variables when getting constraints
ACPI: Adjust #ifdef for *_lps0_dev use
ACPI: TAD: Install SystemCMOS address space handler for ACPI000E
ACPI: Remove assorted unused declarations of functions
ACPI: extlog: Fix finding the generic error data for v3 structure
PNP: ACPI: Fix string truncation warning
ACPI: Remove unused extern declaration acpi_paddr_to_node()
ACPI: video: Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2
ACPI: video: Put ACPI video and its child devices into D0 on boot
ACPI: processor: LoongArch: Get physical ID from MADT
ACPI: scan: Defer enumeration of devices with a _DEP pointing to IVSC device
ACPI: thermal: Eliminate code duplication from acpi_thermal_notify()
ACPI: thermal: Drop unnecessary thermal zone callbacks
ACPI: thermal: Rework thermal_get_trend()
ACPI: thermal: Use trip point table to register thermal zones
...
- Add vfio-ap support to pass-through crypto devices to secure execution
guests
- Add API ordinal 6 support to zcrypt_ep11misc device drive, which is
required to handle key generate and key derive (e.g. secure key to
protected key) correctly
- Add missing secure/has_secure sysfs files for the case where it is not
possible to figure where a system has been booted from. Existing user
space relies on that these files are always present
- Fix DCSS block device driver list corruption, caused by incorrect
error handling
- Convert virt_to_pfn() and pfn_to_virt() from defines to static inline
functions to enforce type checking
- Cleanups, improvements, and minor fixes to the kernel mapping setup
- Fix various virtual vs physical address confusions
- Move pfault code to separate file, since it has nothing to do with
regular fault handling
- Move s390 documentation to Documentation/arch/ like it has been done
for other architectures already
- Add HAVE_FUNCTION_GRAPH_RETVAL support
- Factor out the s390_hypfs filesystem and add a new config option for
it. The filesystem is deprecated and as soon as all users are gone it
can be removed some time in the not so near future
- Remove support for old CEX2 and CEX3 crypto cards from zcrypt device
driver
- Add support for user-defined certificates: receive user-defined
certificates with a diagnose call and provide them via 'cert_store'
keyring to user space
- Couple of other small fixes and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmTrqNYACgkQIg7DeRsp
bsKkUBAApWXr3WCJA2tige34AnFwmskx4sBxl/fgwcwJrC55fED1jKWaiXOM6isv
P+hqavZnks3gXZdYcD3kxXkNMh+fPNWw7BAL35J5Gu1VShA/jlbTC6ZrvUO3t+Fy
NsdLvBDbNDdyUzQF7w0Xb0jyIxqhJTRyhLfR5oXES63FHomv2F/vofu4jWR/q+cc
F9mcnoDeN4zLdssdvl6WtPX4nEY9RpG0QOh67drnxuq+8v7sL8gKN4ti94Rp6vhs
g4NhNs9xgRIPoOcX2KlSIdFqO9P12jSXZq0G4HcOp8UGQvgU/mS+UG3pQwV3ZJLS
3/kUJZ4/CwQa1xUFtPGP1/4AngGNOnhT9FCD4KrqjDkRZmLsd5RvURe6L1zQ3vbZ
KnX7q0Otx4xRVYPlbHb9aP+tC7f3Q10ytBAps616qZoA/2SMss2BLZiiPBpCCvDp
L+9dRhBGYCP2PSe6H/qGQFfMW+uY7QF+NDcDAT5mX1lS8OVrGJxqM7Q+sY2pMLGo
5nR16LvM9g6W/ZnsVn0+BWg4CgaPMi+PMfMPxs/o9RG+/0d1AJx1aLSiHdP1pXog
8/Wg4GaaJ27S4Ers0JUmH7VDO+QkkLvAArstjk8l59r1XslWiBP5USebkxtgu6EQ
ehAh0+oa432ALq8Rn1FK/X+pWFumbTVf8OPwR8YEjDbeTPIBCqg=
=ewd9
-----END PGP SIGNATURE-----
Merge tag 's390-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
- Add vfio-ap support to pass-through crypto devices to secure
execution guests
- Add API ordinal 6 support to zcrypt_ep11misc device drive, which is
required to handle key generate and key derive (e.g. secure key to
protected key) correctly
- Add missing secure/has_secure sysfs files for the case where it is
not possible to figure where a system has been booted from. Existing
user space relies on that these files are always present
- Fix DCSS block device driver list corruption, caused by incorrect
error handling
- Convert virt_to_pfn() and pfn_to_virt() from defines to static inline
functions to enforce type checking
- Cleanups, improvements, and minor fixes to the kernel mapping setup
- Fix various virtual vs physical address confusions
- Move pfault code to separate file, since it has nothing to do with
regular fault handling
- Move s390 documentation to Documentation/arch/ like it has been done
for other architectures already
- Add HAVE_FUNCTION_GRAPH_RETVAL support
- Factor out the s390_hypfs filesystem and add a new config option for
it. The filesystem is deprecated and as soon as all users are gone it
can be removed some time in the not so near future
- Remove support for old CEX2 and CEX3 crypto cards from zcrypt device
driver
- Add support for user-defined certificates: receive user-defined
certificates with a diagnose call and provide them via 'cert_store'
keyring to user space
- Couple of other small fixes and improvements all over the place
* tag 's390-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (66 commits)
s390/pci: use builtin_misc_device macro to simplify the code
s390/vfio-ap: make sure nib is shared
KVM: s390: export kvm_s390_pv*_is_protected functions
s390/uv: export uv_pin_shared for direct usage
s390/vfio-ap: check for TAPQ response codes 0x35 and 0x36
s390/vfio-ap: handle queue state change in progress on reset
s390/vfio-ap: use work struct to verify queue reset
s390/vfio-ap: store entire AP queue status word with the queue object
s390/vfio-ap: remove upper limit on wait for queue reset to complete
s390/vfio-ap: allow deconfigured queue to be passed through to a guest
s390/vfio-ap: wait for response code 05 to clear on queue reset
s390/vfio-ap: clean up irq resources if possible
s390/vfio-ap: no need to check the 'E' and 'I' bits in APQSW after TAPQ
s390/ipl: refactor deprecated strncpy
s390/ipl: fix virtual vs physical address confusion
s390/zcrypt_ep11misc: support API ordinal 6 with empty pin-blob
s390/paes: fix PKEY_TYPE_EP11_AES handling for secure keyblobs
s390/pkey: fix PKEY_TYPE_EP11_AES handling for sysfs attributes
s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_VERIFYKEY2 IOCTL
s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_KBLOB2PROTK[23]
...
doc.2023.07.14b: Documentation updates.
fixes.2023.08.16a: Miscellaneous fixes, perhaps most notably simplifying
SRCU_NOTIFIER_INIT() as suggested.
rcu-tasks.2023.07.24a: RCU Tasks updates, most notably treating
Tasks RCU callbacks as lazy while still treating synchronous
grace periods as urgent. Also fixes one bug that restores the
ability to apply debug-objects to RCU Tasks and another that
fixes a race condition that could result in false-positive
failures of the boot-time self-test code.
rcuscale.2023.07.14b: RCU-scalability performance-test updates,
most notably adding the ability to measure the RCU-Tasks's
grace-period kthread's CPU consumption. This proved
quite useful for the rcu-tasks.2023.07.24a work.
refscale.2023.07.14b: Reference-acquisition/release performance-test
updates, including a fix for an uninitialized wait_queue_head_t.
torture.2023.08.14a: Miscellaneous torture-test updates.
torturescripts.2023.07.20a: Torture-test scripting updates, including
removal of the non-longer-functional formal-verification scripts,
test builds of individual RCU Tasks flavors, better diagnostics
for loss of connectivity for distributed rcutorture tests,
disabling of reboot loops in qemu/KVM-based rcutorture testing,
and passing of init parameters to rcutorture's init program.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmTjkssTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jITND/9zEqYNbeFrcBs/YaHdoAjsNgOt1IYN
csfF/KArVgdvmrwlV/nEaQMLaJcw9X7DVU5+7E2JbbDaB/2FSacseNyKk6mfgSVK
/0rnTOXpqI9/T1HiJObWZvDQFuKL12bfteXWGJg1sMt2JUGZ4nAWhdZ3xRjp2XkO
89qB5r0fF8gyGwvQ3M29ss8T9Oy0uUNJmDY/QyVxHM6dhkpSAezFffKzD7C4zkSV
WucRTpYJ7bs6otBGtVmwz3x60UAuLwcVfQyB+CTbnGLsps9yAYU+1DDVdm7olcr3
ARXMeboeodMvy9jWXhtbWRVAAob4lVUDXQN27kb4sBgroRQBfQXMuByRAU6s0VtX
frOl6rlbORuAetsC8wFL0IFVn4yTpvXKbYw7h1MXTs7gVVbl33O9FieGvWu0r79/
VR4Xw+JbmYWtyvFV8Zaq4iIEcOe+PeNH6u0bPx+htsHYd1+DUG2UY0MVmJQ3a4sb
ygejA6mguCk7KBzWab8wdDpgAfhNwg0T9a+LQYcaskuD5SSWjYqqg6i1ulqqqyiE
bOfRKDX4mWmAobWKHLssqUrjiLbxfygIaHjCrt7rWJKPIs1bK/WfWa4JbrE0NRwK
9IDd1lWc9C+zoUpjyZWSG3ahK5lWo2u4sPNoRtMQjowjobIz1cBhaEwmFe72bG7C
FCKb7Da2oUaLOw==
=EujZ
-----END PGP SIGNATURE-----
Merge tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Documentation updates
- Miscellaneous fixes, perhaps most notably simplifying
SRCU_NOTIFIER_INIT() as suggested
- RCU Tasks updates, most notably treating Tasks RCU callbacks as lazy
while still treating synchronous grace periods as urgent. Also fixes
one bug that restores the ability to apply debug-objects to RCU Tasks
and another that fixes a race condition that could result in
false-positive failures of the boot-time self-test code
- RCU-scalability performance-test updates, most notably adding the
ability to measure the RCU-Tasks's grace-period kthread's CPU
consumption. This proved quite useful for the RCU Tasks work
- Reference-acquisition/release performance-test updates, including a
fix for an uninitialized wait_queue_head_t
- Miscellaneous torture-test updates
- Torture-test scripting updates, including removal of the
non-longer-functional formal-verification scripts, test builds of
individual RCU Tasks flavors, better diagnostics for loss of
connectivity for distributed rcutorture tests, disabling of reboot
loops in qemu/KVM-based rcutorture testing, and passing of init
parameters to rcutorture's init program
* tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (64 commits)
rcu: Use WRITE_ONCE() for assignments to ->next for rculist_nulls
rcu: Make the rcu_nocb_poll boot parameter usable via boot config
rcu: Mark __rcu_irq_enter_check_tick() ->rcu_urgent_qs load
srcu,notifier: Remove #ifdefs in favor of SRCU Tiny srcu_usage
rcutorture: Stop right-shifting torture_random() return values
torture: Stop right-shifting torture_random() return values
torture: Move stutter_wait() timeouts to hrtimers
torture: Move torture_shuffle() timeouts to hrtimers
torture: Move torture_onoff() timeouts to hrtimers
torture: Make torture_hrtimeout_*() use TASK_IDLE
torture: Add lock_torture writer_fifo module parameter
torture: Add a kthread-creation callback to _torture_create_kthread()
rcu-tasks: Fix boot-time RCU tasks debug-only deadlock
rcu-tasks: Permit use of debug-objects with RCU Tasks flavors
checkpatch: Complain about unexpected uses of RCU Tasks Trace
torture: Cause mkinitrd.sh to indicate failure on compile errors
torture: Make init program dump command-line arguments
torture: Switch qemu from -nographic to -display none
torture: Add init-program support for loongarch
torture: Avoid torture-test reboot loops
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTxQAKCRCRxhvAZXjc
okaVAP94WAlItvDRt/z2Wtzf0+RqPZeTXEdGTxua8+RxqCyYIQD+OO5nRfKQPHlV
AqqGJMKItQMSMIYgB5ftqVhNWZfnHgM=
=pSEW
-----END PGP SIGNATURE-----
Merge tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"This contains the usual miscellaneous features, cleanups, and fixes
for vfs and individual filesystems.
Features:
- Block mode changes on symlinks and rectify our broken semantics
- Report file modifications via fsnotify() for splice
- Allow specifying an explicit timeout for the "rootwait" kernel
command line option. This allows to timeout and reboot instead of
always waiting indefinitely for the root device to show up
- Use synchronous fput for the close system call
Cleanups:
- Get rid of open-coded lockdep workarounds for async io submitters
and replace it all with a single consolidated helper
- Simplify epoll allocation helper
- Convert simple_write_begin and simple_write_end to use a folio
- Convert page_cache_pipe_buf_confirm() to use a folio
- Simplify __range_close to avoid pointless locking
- Disable per-cpu buffer head cache for isolated cpus
- Port ecryptfs to kmap_local_page() api
- Remove redundant initialization of pointer buf in pipe code
- Unexport the d_genocide() function which is only used within core
vfs
- Replace printk(KERN_ERR) and WARN_ON() with WARN()
Fixes:
- Fix various kernel-doc issues
- Fix refcount underflow for eventfds when used as EFD_SEMAPHORE
- Fix a mainly theoretical issue in devpts
- Check the return value of __getblk() in reiserfs
- Fix a racy assert in i_readcount_dec
- Fix integer conversion issues in various functions
- Fix LSM security context handling during automounts that prevented
NFS superblock sharing"
* tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits)
cachefiles: use kiocb_{start,end}_write() helpers
ovl: use kiocb_{start,end}_write() helpers
aio: use kiocb_{start,end}_write() helpers
io_uring: use kiocb_{start,end}_write() helpers
fs: create kiocb_{start,end}_write() helpers
fs: add kerneldoc to file_{start,end}_write() helpers
io_uring: rename kiocb_end_write() local helper
splice: Convert page_cache_pipe_buf_confirm() to use a folio
libfs: Convert simple_write_begin and simple_write_end to use a folio
fs/dcache: Replace printk and WARN_ON by WARN
fs/pipe: remove redundant initialization of pointer buf
fs: Fix kernel-doc warnings
devpts: Fix kernel-doc warnings
doc: idmappings: fix an error and rephrase a paragraph
init: Add support for rootwait timeout parameter
vfs: fix up the assert in i_readcount_dec
fs: Fix one kernel-doc comment
docs: filesystems: idmappings: clarify from where idmappings are taken
fs/buffer.c: disable per-CPU buffer_head cache for isolated CPUs
vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
...
Merge ACPI thermal driver changes for 6.6-rc1:
- Drop non-functional nocrt parameter from ACPI thermal (Mario
Limonciello).
- Clean up the ACPI thermal driver, rework the handling of firmware
notifications in it and make it provide a table of generic trip point
structures to the core during initialization (Rafael Wysocki).
* acpi-thermal:
ACPI: thermal: Eliminate code duplication from acpi_thermal_notify()
ACPI: thermal: Drop unnecessary thermal zone callbacks
ACPI: thermal: Rework thermal_get_trend()
ACPI: thermal: Use trip point table to register thermal zones
thermal: core: Rework and rename __for_each_thermal_trip()
ACPI: thermal: Introduce struct acpi_thermal_trip
ACPI: thermal: Carry out trip point updates under zone lock
ACPI: thermal: Clean up acpi_thermal_register_thermal_zone()
thermal: core: Add priv pointer to struct thermal_trip
thermal: core: Introduce thermal_zone_device_exec()
thermal: core: Do not handle trip points with invalid temperature
ACPI: thermal: Drop redundant local variable from acpi_thermal_resume()
ACPI: thermal: Do not attach private data to ACPI handles
ACPI: thermal: Drop enabled flag from struct acpi_thermal_active
ACPI: thermal: Drop nocrt parameter
Introduce the crash_hotplug attribute for memory and CPUs for use by
userspace. These attributes directly facilitate the udev rule for
managing userspace re-loading of the crash kernel upon hot un/plug
changes.
For memory, expose the crash_hotplug attribute to the
/sys/devices/system/memory directory. For example:
# udevadm info --attribute-walk /sys/devices/system/memory/memory81
looking at device '/devices/system/memory/memory81':
KERNEL=="memory81"
SUBSYSTEM=="memory"
DRIVER==""
ATTR{online}=="1"
ATTR{phys_device}=="0"
ATTR{phys_index}=="00000051"
ATTR{removable}=="1"
ATTR{state}=="online"
ATTR{valid_zones}=="Movable"
looking at parent device '/devices/system/memory':
KERNELS=="memory"
SUBSYSTEMS==""
DRIVERS==""
ATTRS{auto_online_blocks}=="offline"
ATTRS{block_size_bytes}=="8000000"
ATTRS{crash_hotplug}=="1"
For CPUs, expose the crash_hotplug attribute to the
/sys/devices/system/cpu directory. For example:
# udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
looking at device '/devices/system/cpu/cpu0':
KERNEL=="cpu0"
SUBSYSTEM=="cpu"
DRIVER=="processor"
ATTR{crash_notes}=="277c38600"
ATTR{crash_notes_size}=="368"
ATTR{online}=="1"
looking at parent device '/devices/system/cpu':
KERNELS=="cpu"
SUBSYSTEMS==""
DRIVERS==""
ATTRS{crash_hotplug}=="1"
ATTRS{isolated}==""
ATTRS{kernel_max}=="8191"
ATTRS{nohz_full}==" (null)"
ATTRS{offline}=="4-7"
ATTRS{online}=="0-3"
ATTRS{possible}=="0-7"
ATTRS{present}=="0-3"
With these sysfs attributes in place, it is possible to efficiently
instruct the udev rule to skip crash kernel reloading for kernels
configured with crash hotplug support.
For example, the following is the proposed udev rule change for RHEL
system 98-kexec.rules (as the first lines of the rule file):
# The kernel updates the crash elfcorehdr for CPU and memory changes
SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
When examined in the context of 98-kexec.rules, the above rules test if
crash_hotplug is set, and if so, the userspace initiated
unload-then-reload of the crash kernel is skipped.
CPU and memory checks are separated in accordance with CONFIG_HOTPLUG_CPU
and CONFIG_MEMORY_HOTPLUG kernel config options. If an architecture
supports, for example, memory hotplug but not CPU hotplug, then the
/sys/devices/system/memory/crash_hotplug attribute file is present, but
the /sys/devices/system/cpu/crash_hotplug attribute file will NOT be
present. Thus the udev rule skips userspace processing of memory hot
un/plug events, but the udev rule will evaluate false for CPU events, thus
allowing userspace to process CPU hot un/plug events (ie the
unload-then-reload of the kdump capture kernel).
Link: https://lkml.kernel.org/r/20230814214446.6659-5-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alexandre Ghiti <alexghiti@rivosinc.com> says:
riscv used to allow direct access to cycle/time/instret counters,
bypassing the perf framework, this patchset intends to allow the user to
mmap any counter when accessed through perf.
**Important**: The default mode is now user access through perf only, not
the legacy so some applications will break. However, we introduce a sysctl
perf_user_access like arm64 does, which will allow to switch to the legacy
mode described above.
This version needs openSBI v1.3 *and* a kernel fix that went upstream lately
(https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/).
* b4-shazam-merge:
perf: tests: Adapt mmap-basic.c for riscv
tools: lib: perf: Implement riscv mmap support
Documentation: admin-guide: Add riscv sysctl_perf_user_access
drivers: perf: Implement perf event mmap support in the SBI backend
drivers: perf: Implement perf event mmap support in the legacy backend
riscv: Prepare for user-space perf event mmap support
drivers: perf: Rename riscv pmu sbi driver
riscv: Make legacy counter enum match the HW numbering
include: riscv: Fix wrong include guard in riscv_pmu.h
perf: Fix wrong comment about default event_idx
Link: https://lore.kernel.org/r/20230802080328.1213905-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
ttyCPM* devices belong to CPM_UART driver at the first place
and that driver provides 6 ports.
Fixes: e29c3f81eb ("Documentation: devices.txt: reconcile serial/ucc_uart minor numers")
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/27d7124cf86157e2a27c2b039e769041994d3f22.1691992627.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stored in the first tail page's flags, this flag replaces the destructor.
That removes the last of the destructors, so remove all references to
folio_dtor and compound_dtor.
Link: https://lkml.kernel.org/r/20230816151201.3655946-9-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We can use a bit in page[1].flags to indicate that this folio belongs to
hugetlb instead of using a value in page[1].dtors. That lets
folio_test_hugetlb() become an inline function like it should be. We can
also get rid of NULL_COMPOUND_DTOR.
Link: https://lkml.kernel.org/r/20230816151201.3655946-8-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ksm currently maintains several statistics, which let you determine how
successful KSM is at sharing pages. However it does not contain a metric
to determine how much work it does.
This commit adds the pages scanned metric. This allows the administrator
to determine how many pages have been scanned over a period of time.
Link: https://lkml.kernel.org/r/20230811193655.2518943-1-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently, memmap_on_memory feature is only supported with memory block
sizes that result in vmemmap pages covering full page blocks. This is
because memory onlining/offlining code requires applicable ranges to be
pageblock-aligned, for example, to set the migratetypes properly.
This patch helps to lift that restriction by reserving more pages than
required for vmemmap space. This helps the start address to be page block
aligned with different memory block sizes. Using this facility implies
the kernel will be reserving some pages for every memoryblock. This
allows the memmap on memory feature to be widely useful with different
memory block size values.
For ex: with 64K page size and 256MiB memory block size, we require 4
pages to map vmemmap pages, To align things correctly we end up adding a
reserve of 28 pages. ie, for every 4096 pages 28 pages get reserved.
Link: https://lkml.kernel.org/r/20230808091501.287660-5-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Update DAMON usage document for the newly added address range type DAMOS
filter.
Link: https://lkml.kernel.org/r/20230802214312.110532-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Update the DAMON usage document for newly added
schemes/.../tried_regions/total_bytes file and the
update_schemes_tried_bytes command.
Link: https://lkml.kernel.org/r/20230802213222.109841-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The only user of frontswap is zswap, and has been for a long time. Have
swap call into zswap directly and remove the indirection.
[hannes@cmpxchg.org: remove obsolete comment, per Yosry]
Link: https://lkml.kernel.org/r/20230719142832.GA932528@cmpxchg.org
[fengwei.yin@intel.com: don't warn if none swapcache folio is passed to zswap_load]
Link: https://lkml.kernel.org/r/20230810095652.3905184-1-fengwei.yin@intel.com
Link: https://lkml.kernel.org/r/20230717160227.GA867137@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Let's update the documentation that any signal is sufficient, and add a
comment that not only checking for fatal signals is historical baggage:
changing it now could break existing user space. although unlikely.
For example, when an app provides a custom SIGALRM handler and triggers
memory offlining, the timeout cmd would no longer stop memory offlining,
because SIGALRM would no longer be considered a fatal signal.
Note that using signal_pending() instead of fatal_signal_pending() is
an anti-pattern, but slowly deprecating that behavior to eventually
change it in the far future is probably not worth the effort. If this
ever becomes relevant for user-space, we might want to rethink.
Link: https://lkml.kernel.org/r/20230711174050.603820-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Update the userfaultfd API to advertise this feature as part of feature
flags and supported ioctls (returned upon registration).
Add basic documentation describing the new feature.
Link: https://lkml.kernel.org/r/20230707215540.2324998-7-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Jiaqi Yan <jiaqiyan@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: T.J. Alumbaugh <talumbau@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
deprecated since 58056f7750 ("memcg, kmem: further deprecate
kmem.limit_in_bytes") merged in 5.16. We haven't heard about any serious
users since then but it seems that the mere presence of the file is
causing more harm thatn good. We (SUSE) have had several bug reports from
customers where Docker based containers started to fail because a write to
kmem.limit_in_bytes has failed.
This was unexpected because runc code only expects ENOENT (kmem disabled)
or EBUSY (tasks already running within cgroup). So a new error code was
unexpected and the whole container startup failed. This has been later
addressed by
52390d6804
so current Docker runtimes do not suffer from the problem anymore. There
are still older version of Docker in use and likely hard to get rid of
completely.
Address this by wiping out the file completely and effectively get back to
pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
I would recommend backporting to stable trees which have picked up
58056f7750 ("memcg, kmem: further deprecate kmem.limit_in_bytes").
[mhocko@suse.com: restore _KMEM switch case]
Link: https://lkml.kernel.org/r/ZKe5wxdbvPi5Cwd7@dhcp22.suse.cz
Link: https://lkml.kernel.org/r/20230704115240.14672-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When use_zero_pages is enabled, the calculation of ksm profit is not
correct because ksm zero pages is not counted in. So update the
calculation of KSM profit including the documentation.
Link: https://lkml.kernel.org/r/20230613030942.186041-1-yang.yang29@zte.com.cn
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Xiaokai Ran <ran.xiaokai@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: Jiang Xuexin <jiang.xuexin@zte.com.cn>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
As pages_sharing and pages_shared don't include the number of zero pages
merged by KSM, we cannot know how many pages are zero pages placed by KSM
when enabling use_zero_pages, which leads to KSM not being transparent
with all actual merged pages by KSM. In the early days of use_zero_pages,
zero-pages was unable to get unshared by the ways like MADV_UNMERGEABLE so
it's hard to count how many times one of those zeropages was then
unmerged.
But now, unsharing KSM-placed zero page accurately has been achieved, so
we can easily count both how many times a page full of zeroes was merged
with zero-page and how many times one of those pages was then unmerged.
and so, it helps to estimate memory demands when each and every shared
page could get unshared.
So we add ksm_zero_pages under /sys/kernel/mm/ksm/ to show the number
of all zero pages placed by KSM. Meanwhile, we update the Documentation.
Link: https://lkml.kernel.org/r/20230613030934.185944-1-yang.yang29@zte.com.cn
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Xuexin Jiang <jiang.xuexin@zte.com.cn>
Reviewed-by: Xiaokai Ran <ran.xiaokai@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The parser of the CPU lists is bitmap_parselist() that supports
special notations with the plain numbers. bitmap_parse() never
supported those and will fail in case one will try it.
Fixes: b18def121f ("bitmap_parse: Support 'all' semantics")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230817140432.507889-1-andriy.shevchenko@linux.intel.com
Commit 5f47adf762 ("mm/memory_hotplug: allow to specify a default
online_type") allows to specify a default online_type which make
online memory to kernel or movable zone possible but fail to update
to doc. Update doc to fit this change.
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230802074312.2111074-1-mawupeng1@huawei.com
This patch enables config option GENERIC_IDLE_POLL_SETUP for arch
powerpc. This adds support for kernel param 'nohlt'.
Powerpc kernel also supports another kernel boot-time param called
'powersave' which can also be used to disable all cpu idle-states and
forces CPU to an idle-loop similar to what cpu_idle_poll() does. This
patch however makes powerpc kernel-parameters better aligned to the
generic boot-time parameters.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230818050739.827851-1-vaibhav@linux.ibm.com
Since for MMIO driver using FIFO registers, also known as tpm_tis, the
default (and tbh recommended) behaviour is now the polling mode, the
"tristate" workaround is no longer for benefit.
If someone wants to explicitly enable IRQs for a TPM chip that should be
without question allowed. It could very well be a piece hardware in the
existing deny list because of e.g. firmware update or something similar.
While at it, document the module parameter, as this was not done in 2006
when it first appeared in the mainline.
Link: https://lore.kernel.org/linux-integrity/20201015214430.17937-1-jsnitsel@redhat.com/
Link: https://lore.kernel.org/all/1145393776.4829.19.camel@localhost.localdomain/
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>