Merge branch 'for-6.2/apple' into for-linus
- new quirks for select Apple keyboards (Kerem Karabay, Aditya Garg)
commit cfd1f6c16f
6
.mailmap
|
@ -104,6 +104,7 @@ Christoph Hellwig <hch@lst.de>
|
|||
Colin Ian King <colin.i.king@gmail.com> <colin.king@canonical.com>
|
||||
Corey Minyard <minyard@acm.org>
|
||||
Damian Hobson-Garcia <dhobsong@igel.co.jp>
|
||||
Dan Carpenter <error27@gmail.com> <dan.carpenter@oracle.com>
|
||||
Daniel Borkmann <daniel@iogearbox.net> <danborkmann@googlemail.com>
|
||||
Daniel Borkmann <daniel@iogearbox.net> <danborkmann@iogearbox.net>
|
||||
Daniel Borkmann <daniel@iogearbox.net> <daniel.borkmann@tik.ee.ethz.ch>
|
||||
|
@ -137,6 +138,7 @@ Filipe Lautert <filipe@icewall.org>
|
|||
Finn Thain <fthain@linux-m68k.org> <fthain@telegraphics.com.au>
|
||||
Franck Bui-Huu <vagabon.xyz@gmail.com>
|
||||
Frank Rowand <frowand.list@gmail.com> <frank.rowand@am.sony.com>
|
||||
Frank Rowand <frowand.list@gmail.com> <frank.rowand@sony.com>
|
||||
Frank Rowand <frowand.list@gmail.com> <frank.rowand@sonymobile.com>
|
||||
Frank Rowand <frowand.list@gmail.com> <frowand@mvista.com>
|
||||
Frank Zago <fzago@systemfabricworks.com>
|
||||
|
@ -336,6 +338,7 @@ Oleksij Rempel <linux@rempel-privat.de> <external.Oleksij.Rempel@de.bosch.com>
|
|||
Oleksij Rempel <linux@rempel-privat.de> <fixed-term.Oleksij.Rempel@de.bosch.com>
|
||||
Oleksij Rempel <linux@rempel-privat.de> <o.rempel@pengutronix.de>
|
||||
Oleksij Rempel <linux@rempel-privat.de> <ore@pengutronix.de>
|
||||
Oliver Upton <oliver.upton@linux.dev> <oupton@google.com>
|
||||
Pali Rohár <pali@kernel.org> <pali.rohar@gmail.com>
|
||||
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
|
||||
Patrick Mochel <mochel@digitalimplant.org>
|
||||
|
@ -351,7 +354,8 @@ Peter Oruba <peter@oruba.de>
|
|||
Pratyush Anand <pratyush.anand@gmail.com> <pratyush.anand@st.com>
|
||||
Praveen BP <praveenbp@ti.com>
|
||||
Punit Agrawal <punitagrawal@gmail.com> <punit.agrawal@arm.com>
|
||||
Qais Yousef <qsyousef@gmail.com> <qais.yousef@imgtec.com>
|
||||
Qais Yousef <qyousef@layalina.io> <qais.yousef@imgtec.com>
|
||||
Qais Yousef <qyousef@layalina.io> <qais.yousef@arm.com>
|
||||
Quentin Monnet <quentin@isovalent.com> <quentin.monnet@netronome.com>
|
||||
Quentin Perret <qperret@qperret.net> <quentin.perret@arm.com>
|
||||
Rafael J. Wysocki <rjw@rjwysocki.net> <rjw@sisk.pl>
|
||||
|
|
|
@ -227,6 +227,17 @@ Contact: dmaengine@vger.kernel.org
|
|||
Description: Indicate the number of retries for an enqcmds submission on a sharedwq.
|
||||
A max value to set attribute is capped at 64.
|
||||
|
||||
What: /sys/bus/dsa/devices/wq<m>.<n>/op_config
|
||||
Date: Sept 14, 2022
|
||||
KernelVersion: 6.0.0
|
||||
Contact: dmaengine@vger.kernel.org
|
||||
Description: Shows the operation capability bits displayed in bitmap format
|
||||
presented by %*pb printk() output format specifier.
|
||||
The attribute can be configured when the WQ is disabled in
|
||||
order to configure the WQ to accept specific bits that
|
||||
correlate to the operations allowed. It's visible only
|
||||
on platforms that support the capability.
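A minimal sketch of reading such a bitmap from userspace, assuming the usual %*pb layout of hexadecimal digits in comma-separated 32-bit groups with the most significant group first. The helper name and the sample string are hypothetical and not taken from real hardware.

```python
def parse_bitmap(text: str) -> list[int]:
    """Return the positions of the set bits in a %*pb-style hex bitmap string."""
    # Drop the group separators and read the rest as one hexadecimal number.
    value = int(text.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Hypothetical op_config contents (assumption for illustration only).
sample = "00000000,00000000,00000000,00000000,00000000,00000000,00000001,00000038\n"
print(parse_bitmap(sample))  # -> [3, 4, 5, 32]
```

Which operation each bit position stands for is defined by the device specification, not by this sketch.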
|
||||
|
||||
What: /sys/bus/dsa/devices/engine<m>.<n>/group_id
|
||||
Date: Oct 25, 2019
|
||||
KernelVersion: 5.6.0
|
||||
|
@ -255,3 +266,27 @@ Contact: dmaengine@vger.kernel.org
|
|||
Description: Indicates the number of Read Buffers reserved for the use of
|
||||
engines in the group. See DSA spec v1.2 9.2.18 GRPCFG Read Buffers
|
||||
Reserved.
|
||||
|
||||
What: /sys/bus/dsa/devices/group<m>.<n>/desc_progress_limit
|
||||
Date: Sept 14, 2022
|
||||
KernelVersion: 6.0.0
|
||||
Contact: dmaengine@vger.kernel.org
|
||||
Description: Allows control of the number of work descriptors that can be
|
||||
concurrently processed by an engine in the group as a fraction
|
||||
of the Maximum Work Descriptors in Progress value specified in
|
||||
the ENGCAP register. The acceptable values are 0 (default),
|
||||
1 (1/2 of max value), 2 (1/4 of the max value), and 3 (1/8 of
|
||||
the max value). It's visible only on platforms that support
|
||||
the capability.
|
||||
|
||||
What: /sys/bus/dsa/devices/group<m>.<n>/batch_progress_limit
|
||||
Date: Sept 14, 2022
|
||||
KernelVersion: 6.0.0
|
||||
Contact: dmaengine@vger.kernel.org
|
||||
Description: Allows control of the number of batch descriptors that can be
|
||||
concurrently processed by an engine in the group as a fraction
|
||||
of the Maximum Batch Descriptors in Progress value specified in
|
||||
the ENGCAP register. The acceptable values are 0 (default),
|
||||
1 (1/2 of max value), 2 (1/4 of the max value), and 3 (1/8 of
|
||||
the max value). It's visible only on platforms that support
|
||||
the capability.
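The mapping from the written value to a fraction of the ENGCAP maximum, as described for desc_progress_limit and batch_progress_limit, can be sketched as below. Treating 0 (default) as "no reduction" is an assumption, and the maximum of 96 is a made-up number, not a value read from hardware.

```python
from fractions import Fraction


def progress_limit_fraction(setting: int) -> Fraction:
    """Map a progress-limit setting (0..3) to a fraction of the ENGCAP maximum."""
    if setting not in (0, 1, 2, 3):
        raise ValueError("acceptable values are 0, 1, 2 and 3")
    # 1 -> 1/2, 2 -> 1/4, 3 -> 1/8; 0 is assumed to leave the maximum untouched.
    return Fraction(1, 2 ** setting)


hypothetical_max = 96  # assumed "Maximum Work Descriptors in Progress"
for setting in range(4):
    print(setting, int(hypothetical_max * progress_limit_fraction(setting)))
```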
|
||||
|
|
|
@ -516,3 +516,11 @@ Contact: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|||
Description: (Read) Returns the number of special conditional P1 right-hand keys
|
||||
that the trace unit can use (0x194). The value is taken
|
||||
directly from the HW.
|
||||
|
||||
What: /sys/bus/coresight/devices/etm<N>/ts_source
|
||||
Date: October 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: Mathieu Poirier <mathieu.poirier@linaro.org> or Suzuki K Poulose <suzuki.poulose@arm.com>
|
||||
Description: (Read) When FEAT_TRF is implemented, value of TRFCR_ELx.TS used for
|
||||
trace session. Otherwise -1 indicates an unknown time source. Check
|
||||
trcidr0.tssize to see if a global timestamp is available.
|
||||
|
|
|
@ -4,6 +4,12 @@ Contact: linux-iio@vger.kernel.org
|
|||
Description:
|
||||
Count data of Count Y represented as a string.
|
||||
|
||||
What: /sys/bus/counter/devices/counterX/countY/capture
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Historical capture of the Count Y count data.
|
||||
|
||||
What: /sys/bus/counter/devices/counterX/countY/ceiling
|
||||
KernelVersion: 5.2
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
|
@ -203,6 +209,13 @@ Description:
|
|||
both edges:
|
||||
Any state transition.
|
||||
|
||||
What: /sys/bus/counter/devices/counterX/countY/num_overflows
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
This attribute indicates the number of overflows of count Y.
|
||||
|
||||
What: /sys/bus/counter/devices/counterX/countY/capture_component_id
|
||||
What: /sys/bus/counter/devices/counterX/countY/ceiling_component_id
|
||||
What: /sys/bus/counter/devices/counterX/countY/floor_component_id
|
||||
What: /sys/bus/counter/devices/counterX/countY/count_mode_component_id
|
||||
|
@ -213,11 +226,14 @@ What: /sys/bus/counter/devices/counterX/countY/prescaler_component_id
|
|||
What: /sys/bus/counter/devices/counterX/countY/preset_component_id
|
||||
What: /sys/bus/counter/devices/counterX/countY/preset_enable_component_id
|
||||
What: /sys/bus/counter/devices/counterX/countY/signalZ_action_component_id
|
||||
What: /sys/bus/counter/devices/counterX/countY/num_overflows_component_id
|
||||
What: /sys/bus/counter/devices/counterX/signalY/cable_fault_component_id
|
||||
What: /sys/bus/counter/devices/counterX/signalY/cable_fault_enable_component_id
|
||||
What: /sys/bus/counter/devices/counterX/signalY/filter_clock_prescaler_component_id
|
||||
What: /sys/bus/counter/devices/counterX/signalY/index_polarity_component_id
|
||||
What: /sys/bus/counter/devices/counterX/signalY/polarity_component_id
|
||||
What: /sys/bus/counter/devices/counterX/signalY/synchronous_mode_component_id
|
||||
What: /sys/bus/counter/devices/counterX/signalY/frequency_component_id
|
||||
KernelVersion: 5.16
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
|
@ -303,6 +319,19 @@ Description:
|
|||
Discrete set of available values for the respective Signal Y
|
||||
configuration are listed in this file.
|
||||
|
||||
What: /sys/bus/counter/devices/counterX/signalY/polarity
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Active level of Signal Y. The following polarity values are
|
||||
available:
|
||||
|
||||
positive:
|
||||
Signal high state considered active level (rising edge).
|
||||
|
||||
negative:
|
||||
Signal low state considered active level (falling edge).
|
||||
|
||||
What: /sys/bus/counter/devices/counterX/signalY/name
|
||||
KernelVersion: 5.2
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
|
@ -345,3 +374,9 @@ Description:
|
|||
via index_polarity. The index function (as enabled via
|
||||
preset_enable) is performed synchronously with the
|
||||
quadrature clock on the active level of the index input.
|
||||
|
||||
What: /sys/bus/counter/devices/counterX/signalY/frequency
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Read-only attribute that indicates the signal Y frequency, in Hz.
|
||||
|
|
|
@ -196,7 +196,7 @@ Description:
|
|||
Raw capacitance measurement from channel Y. Units after
|
||||
application of scale and offset are nanofarads.
|
||||
|
||||
What: /sys/.../iio:deviceX/in_capacitanceY-in_capacitanceZ_raw
|
||||
What: /sys/.../iio:deviceX/in_capacitanceY-capacitanceZ_raw
|
||||
KernelVersion: 3.2
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
|
@ -207,6 +207,25 @@ Description:
|
|||
is required is a consistent labeling. Units after application
|
||||
of scale and offset are nanofarads.
|
||||
|
||||
What: /sys/.../iio:deviceX/in_capacitanceY-capacitanceZ_zeropoint
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
For differential channels, this is an offset that is applied
|
||||
equally to both inputs. As the reading is of the difference
|
||||
between the two inputs, this should not be applied to the _raw
|
||||
reading by userspace (unlike _offset) and unlike calibbias
|
||||
it does not affect the differential value measured because
|
||||
the effect of _zeropoint cancels out across the two inputs
|
||||
that make up the differential pair. Its purpose is to bring
|
||||
the individual signals, before the differential is measured,
|
||||
within the measurement range of the device. The naming is
|
||||
chosen because if the separate inputs that make the
|
||||
differential pair are drawn on a graph in their
|
||||
_raw units, this is the value that the zero point on the
|
||||
measurement axis represents. It is expressed with the
|
||||
same scaling as _raw.
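A small worked example of why _zeropoint leaves the differential reading unchanged while moving the individual inputs into range. Modelling the zeropoint as a plain additive shift is an assumption about the sign convention, and the measurement window and input values are invented for illustration.

```python
# Hypothetical measurement window of the device, in _raw units.
RANGE_MIN, RANGE_MAX = 0, 4095


def shifted_inputs(input_y: int, input_z: int, zeropoint: int) -> tuple[int, int]:
    """Apply the same zeropoint shift to both inputs of the differential pair."""
    return input_y + zeropoint, input_z + zeropoint


def in_range(value: int) -> bool:
    return RANGE_MIN <= value <= RANGE_MAX


y, z = 5000, 4700  # both inputs start outside the window
for zeropoint in (0, -1000):
    sy, sz = shifted_inputs(y, z, zeropoint)
    # The difference stays at 300; only the in-range check changes.
    print(zeropoint, sy - sz, in_range(sy) and in_range(sz))
```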
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_temp_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_tempX_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_temp_x_raw
|
||||
|
@ -241,6 +260,15 @@ Description:
|
|||
Has all of the equivalent parameters as per voltageY. Units
|
||||
after application of scale and offset are m/s^2.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_accel_linear_x_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_accel_linear_y_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_accel_linear_z_raw
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
As per in_accel_X_raw attributes, but minus the
|
||||
acceleration due to gravity.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_gravity_x_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_gravity_y_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_gravity_z_raw
|
||||
|
@ -2038,3 +2066,99 @@ Description:
|
|||
Available range for the forced calibration value, expressed as:
|
||||
|
||||
- a range specified as "[min step max]"
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_voltageX_sampling_frequency
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_powerY_sampling_frequency
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_currentZ_sampling_frequency
|
||||
KernelVersion: 5.20
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Some devices have separate controls of sampling frequency for
|
||||
individual channels. If multiple channels are enabled in a scan,
|
||||
then the sampling_frequency of the scan may be computed from the
|
||||
per channel sampling frequencies.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_singletap_en
|
||||
What: /sys/.../events/in_accel_gesture_doubletap_en
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Device generates an event on a single or double tap.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_singletap_value
|
||||
What: /sys/.../events/in_accel_gesture_doubletap_value
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Specifies the threshold value that the device is comparing
|
||||
against to generate the tap gesture event. The lower
|
||||
threshold value increases the sensitivity of tap detection.
|
||||
Units and the exact meaning of value are device-specific.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_tap_value_available
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Lists all available threshold values which can be used to
|
||||
modify the sensitivity of the tap detection.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_singletap_reset_timeout
|
||||
What: /sys/.../events/in_accel_gesture_doubletap_reset_timeout
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Specifies the timeout value in seconds for the tap detector
|
||||
to not look for another tap event after the event has
|
||||
occurred. Basically the minimum quiet time between the two
|
||||
single taps or two double taps.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_tap_reset_timeout_available
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Lists all available tap reset timeout values. Units in seconds.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_doubletap_tap2_min_delay
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Specifies the minimum quiet time in seconds between the two
|
||||
taps of a double tap.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_doubletap_tap2_min_delay_available
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Lists all available delay values between two taps in the double
|
||||
tap. Units in seconds.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_tap_maxtomin_time
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Specifies the maximum time difference allowed between upper
|
||||
and lower peak of tap to consider it as the valid tap event.
|
||||
Units in seconds.
|
||||
|
||||
What: /sys/.../events/in_accel_gesture_tap_maxtomin_time_available
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Lists all available time values between upper peak to lower
|
||||
peak. Units in seconds.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_rot_yaw_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_rot_pitch_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_rot_roll_raw
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Raw (unscaled) Euler angle readings. Units after
|
||||
application of scale are deg.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/serialnumber
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
An example format is 16-bytes, 2-digits-per-byte, HEX-string
|
||||
representing the sensor unique ID number.
|
||||
|
|
|
@ -0,0 +1,81 @@
|
|||
What: /sys/bus/iio/devices/iio:deviceX/in_accel_raw_range
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Raw (unscaled) range for acceleration readings. Unit after
|
||||
application of scale is m/s^2. Note that this does not affect
|
||||
the scale (which should be used when changing the maximum and
|
||||
minimum readable value affects also the reading scaling factor).
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_anglvel_raw_range
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Range for angular velocity readings in radians per second. Note
|
||||
that this does not affect the scale (which should be used when
|
||||
changing the maximum and minimum readable value affects also the
|
||||
reading scaling factor).
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_accel_raw_range_available
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
List of allowed values for in_accel_raw_range attribute
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_anglvel_raw_range_available
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
List of allowed values for in_anglvel_raw_range attribute
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_magn_calibration_fast_enable
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Can be 1 or 0. Enables/disables the "Fast Magnetometer
|
||||
Calibration" HW function.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/fusion_enable
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Can be 1 or 0. Enables/disables the "sensor fusion" (a.k.a.
|
||||
NDOF) HW function.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/calibration_data
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Reports the binary calibration data blob for the IMU sensors.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_accel_calibration_auto_status
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Reports the autocalibration status for the accelerometer sensor.
|
||||
Can be 0 (calibration not even enabled) or 1 to 5 where the greater
|
||||
the number, the better the calibration status.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_gyro_calibration_auto_status
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Reports the autocalibration status for the gyroscope sensor.
|
||||
Can be 0 (calibration not even enabled) or 1 to 5 where the greater
|
||||
the number, the better the calibration status.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_magn_calibration_auto_status
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Reports the autocalibration status for the magnetometer sensor.
|
||||
Can be 0 (calibration not even enabled) or 1 to 5 where the greater
|
||||
the number, the better the calibration status.
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/sys_calibration_auto_status
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Reports the status for the IMU overall autocalibration.
|
||||
Can be 0 (calibration not even enabled) or 1 to 5 where the greater
|
||||
the number, the better the calibration status.
|
|
@ -0,0 +1,11 @@
|
|||
What: /sys/.../iio:deviceX/in_capacitableY_calibbias_calibration
|
||||
What: /sys/.../iio:deviceX/in_capacitableY_calibscale_calibration
|
||||
KernelVersion: 6.1
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Write 1 to trigger a calibration of the calibbias or
|
||||
calibscale. For calibscale, a full scale capacitance should
|
||||
be connected to the capacitance input and a
|
||||
calibscale_calibration then started. For calibbias see
|
||||
the device datasheet section on "capacitive system offset
|
||||
calibration".
|
|
@ -457,3 +457,36 @@ Description:
|
|||
|
||||
The file is writable if the PF is bound to a driver that
|
||||
implements ->sriov_set_msix_vec_count().
|
||||
|
||||
What: /sys/bus/pci/devices/.../resourceN_resize
|
||||
Date: September 2022
|
||||
Contact: Alex Williamson <alex.williamson@redhat.com>
|
||||
Description:
|
||||
These files provide an interface to PCIe Resizable BAR support.
|
||||
A file is created for each BAR resource (N) supported by the
|
||||
PCIe Resizable BAR extended capability of the device. Reading
|
||||
each file exposes the bitmap of available resource sizes:
|
||||
|
||||
# cat resource1_resize
|
||||
00000000000001c0
|
||||
|
||||
The bitmap represents supported resource sizes for the BAR,
|
||||
where bit0 = 1MB, bit1 = 2MB, bit2 = 4MB, etc. In the above
|
||||
example the device supports 64MB, 128MB, and 256MB BAR sizes.
|
||||
|
||||
When writing the file, the user provides the bit position of
|
||||
the desired resource size, for example:
|
||||
|
||||
# echo 7 > resource1_resize
|
||||
|
||||
This indicates to set the size value corresponding to bit 7,
|
||||
128MB. The resulting size is 2 ^ (bit# + 20). This definition
|
||||
matches the PCIe specification of this capability.
|
||||
|
||||
In order to make use of resource resizing, all PCI drivers must
|
||||
be unbound from the device and peer devices under the same
|
||||
parent bridge may need to be soft removed. In the case of
|
||||
VGA devices, writing a resize value will remove low level
|
||||
console drivers from the device. Raw users of pci-sysfs
|
||||
resourceN attributes must be terminated prior to resizing.
|
||||
Success of the resizing operation is not guaranteed.
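The resourceN_resize bitmap arithmetic described above (size = 2 ^ (bit# + 20)) can be sketched as follows. The function names are hypothetical; the input string is the example value shown in the description.

```python
def supported_bar_sizes(bitmap_hex: str) -> list[int]:
    """Decode a resourceN_resize bitmap into the supported BAR sizes in bytes."""
    bitmap = int(bitmap_hex.strip(), 16)
    # Bit n set means a BAR size of 2 ** (n + 20) bytes is supported.
    return [1 << (bit + 20) for bit in range(bitmap.bit_length()) if (bitmap >> bit) & 1]


def bit_for_size(size_bytes: int) -> int:
    """Return the bit position to write for a power-of-two size of at least 1MB."""
    if size_bytes < (1 << 20) or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a power of two of at least 1MB")
    return size_bytes.bit_length() - 21


sizes = supported_bar_sizes("00000000000001c0")
print([size >> 20 for size in sizes])  # -> [64, 128, 256] (in MB)
print(bit_for_size(128 << 20))         # -> 7
```

Writing the bit position returned by bit_for_size() to the file is still subject to the unbinding requirements above, and the resize may fail.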
|
||||
|
|
|
@ -153,7 +153,7 @@ Date: Jan 2020
|
|||
KernelVersion: 5.5
|
||||
Contact: Mika Westerberg <mika.westerberg@linux.intel.com>
|
||||
Description: This attribute reports number of RX lanes the device is
|
||||
using simultaneusly through its upstream port.
|
||||
using simultaneously through its upstream port.
|
||||
|
||||
What: /sys/bus/thunderbolt/devices/.../tx_speed
|
||||
Date: Jan 2020
|
||||
|
@ -167,7 +167,7 @@ Date: Jan 2020
|
|||
KernelVersion: 5.5
|
||||
Contact: Mika Westerberg <mika.westerberg@linux.intel.com>
|
||||
Description: This attribute reports number of TX lanes the device is
|
||||
using simultaneusly through its upstream port.
|
||||
using simultaneously through its upstream port.
|
||||
|
||||
What: /sys/bus/thunderbolt/devices/.../vendor
|
||||
Date: Sep 2017
|
||||
|
|
|
@ -0,0 +1,61 @@
|
|||
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/tune
|
||||
Date: October 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: Yicong Yang <yangyicong@hisilicon.com>
|
||||
Description: This directory contains files for tuning the PCIe link
|
||||
parameters (events). Each file is named after the event
|
||||
of the PCIe link.
|
||||
|
||||
See Documentation/trace/hisi-ptt.rst for more information.
|
||||
|
||||
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/tune/qos_tx_cpl
|
||||
Date: October 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: Yicong Yang <yangyicong@hisilicon.com>
|
||||
Description: (RW) Controls the weight of Tx completion TLPs, which influence
|
||||
the proportion of outbound completion TLPs on the PCIe link.
|
||||
The available tune data is [0, 1, 2]. Writing a negative value
|
||||
will return an error, and out of range values will be converted
|
||||
to 2. The value indicates a probable level of the event.
|
||||
|
||||
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/tune/qos_tx_np
|
||||
Date: October 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: Yicong Yang <yangyicong@hisilicon.com>
|
||||
Description: (RW) Controls the weight of Tx non-posted TLPs, which influence
|
||||
the proportion of outbound non-posted TLPs on the PCIe link.
|
||||
The available tune data is [0, 1, 2]. Writing a negative value
|
||||
will return an error, and out of range values will be converted
|
||||
to 2. The value indicates a probable level of the event.
|
||||
|
||||
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/tune/qos_tx_p
|
||||
Date: October 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: Yicong Yang <yangyicong@hisilicon.com>
|
||||
Description: (RW) Controls the weight of Tx posted TLPs, which influence the
|
||||
proportion of outbound posted TLPs on the PCIe link.
|
||||
The available tune data is [0, 1, 2]. Writing a negative value
|
||||
will return an error, and out of range values will be converted
|
||||
to 2. The value indicates a probable level of the event.
|
||||
|
||||
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/tune/rx_alloc_buf_level
|
||||
Date: October 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: Yicong Yang <yangyicong@hisilicon.com>
|
||||
Description: (RW) Control the allocated buffer watermark for inbound packets.
|
||||
The packets will be stored in the buffer first and then transmitted
|
||||
either when the watermark is reached or when timed out.
|
||||
The available tune data is [0, 1, 2]. Writing a negative value
|
||||
will return an error, and out of range values will be converted
|
||||
to 2. The value indicates a probable level of the event.
|
||||
|
||||
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/tune/tx_alloc_buf_level
|
||||
Date: October 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: Yicong Yang <yangyicong@hisilicon.com>
|
||||
Description: (RW) Control the allocated buffer watermark of outbound packets.
|
||||
The packets will be stored in the buffer first and then transmitted
|
||||
either when the watermark is reached or when timed out.
|
||||
The available tune data is [0, 1, 2]. Writing a negative value
|
||||
will return an error, and out of range values will be converted
|
||||
to 2. The value indicates a probable level of the event.
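The handling of written values that the tune entries describe (negative values rejected, out-of-range values converted to 2) can be modelled in userspace as below. This is only a sketch of the documented behaviour under that reading; it is not the driver's code, and the function name is hypothetical.

```python
def effective_tune_value(requested: int) -> int:
    """Model the value a tune file would end up holding after a write."""
    if requested < 0:
        # The documentation says a negative write returns an error.
        raise ValueError("negative tune values are rejected")
    # Anything above the available tune data [0, 1, 2] is converted to 2.
    return min(requested, 2)


for requested in (0, 1, 2, 7):
    print(requested, effective_tune_value(requested))
```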
|
|
@ -0,0 +1,8 @@
|
|||
What: /sys/.../<device>/vfio-dev/vfioX/
|
||||
Date: September 2022
|
||||
Contact: Yi Liu <yi.l.liu@intel.com>
|
||||
Description:
|
||||
This directory is created when the device is bound to a
|
||||
vfio driver. The layout under this directory matches what
|
||||
exists for a standard 'struct device'. 'X' is a unique
|
||||
index marking this device in vfio.
|
|
@ -16,7 +16,7 @@ Description: Version of the application running on the device's CPU
|
|||
|
||||
What: /sys/class/habanalabs/hl<n>/clk_max_freq_mhz
|
||||
Date: Jun 2019
|
||||
KernelVersion: not yet upstreamed
|
||||
KernelVersion: 5.7
|
||||
Contact: ogabbay@kernel.org
|
||||
Description: Allows the user to set the maximum clock frequency, in MHz.
|
||||
The device clock might be set to a lower value than the maximum.
|
||||
|
@ -26,7 +26,7 @@ Description: Allows the user to set the maximum clock frequency, in MHz.
|
|||
|
||||
What: /sys/class/habanalabs/hl<n>/clk_cur_freq_mhz
|
||||
Date: Jun 2019
|
||||
KernelVersion: not yet upstreamed
|
||||
KernelVersion: 5.7
|
||||
Contact: ogabbay@kernel.org
|
||||
Description: Displays the current frequency, in MHz, of the device clock.
|
||||
This property is valid only for the Gaudi ASIC family
|
||||
|
@ -176,6 +176,12 @@ KernelVersion: 5.1
|
|||
Contact: ogabbay@kernel.org
|
||||
Description: Version of the device's preboot F/W code
|
||||
|
||||
What: /sys/class/habanalabs/hl<n>/security_enabled
|
||||
Date: Oct 2022
|
||||
KernelVersion: 6.1
|
||||
Contact: obitton@habana.ai
|
||||
Description: Displays the device's security status
|
||||
|
||||
What: /sys/class/habanalabs/hl<n>/soft_reset
|
||||
Date: Jan 2019
|
||||
KernelVersion: 5.1
|
||||
|
@ -230,6 +236,6 @@ Description: Version of the u-boot running on the device's CPU
|
|||
|
||||
What: /sys/class/habanalabs/hl<n>/vrm_ver
|
||||
Date: Jan 2022
|
||||
KernelVersion: not yet upstreamed
|
||||
KernelVersion: 5.17
|
||||
Contact: ogabbay@kernel.org
|
||||
Description: Version of the Device's Voltage Regulator Monitor F/W code. N/A to GOYA and GAUDI
|
||||
|
|
|
@ -1417,6 +1417,15 @@ Description: This node is used to set or display whether UFS WriteBooster is
|
|||
platform that doesn't support UFSHCD_CAP_CLK_SCALING, we can
|
||||
disable/enable WriteBooster through this sysfs node.
|
||||
|
||||
What: /sys/bus/platform/drivers/ufshcd/*/enable_wb_buf_flush
|
||||
What: /sys/bus/platform/devices/*.ufs/enable_wb_buf_flush
|
||||
Date: July 2022
|
||||
Contact: Jinyoung Choi <j-young.choi@samsung.com>
|
||||
Description: This entry shows the status of WriteBooster buffer flushing
|
||||
and it can be used to enable or disable the flushing.
|
||||
If flushing is enabled, the device executes the flush
|
||||
operation when the command queue is empty.
|
||||
|
||||
What: /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_version
|
||||
What: /sys/bus/platform/devices/*.ufs/device_descriptor/hpb_version
|
||||
Date: June 2021
|
||||
|
@ -1591,6 +1600,43 @@ Description: This entry shows the status of HPB.
|
|||
|
||||
The file is read only.
|
||||
|
||||
Contact: Daniil Lunev <dlunev@chromium.org>
|
||||
What: /sys/bus/platform/drivers/ufshcd/*/capabilities/
|
||||
What: /sys/bus/platform/devices/*.ufs/capabilities/
|
||||
Date: August 2022
|
||||
Description: The group represents the effective capabilities of the
|
||||
host-device pair, i.e. the capabilities which are enabled in the
|
||||
driver for the specific host controller, supported by the host
|
||||
controller and are supported and/or have compatible
|
||||
configuration on the device side.
|
||||
|
||||
Contact: Daniil Lunev <dlunev@chromium.org>
|
||||
What: /sys/bus/platform/drivers/ufshcd/*/capabilities/clock_scaling
|
||||
What: /sys/bus/platform/devices/*.ufs/capabilities/clock_scaling
|
||||
Date: August 2022
|
||||
Contact: Daniil Lunev <dlunev@chromium.org>
|
||||
Description: Indicates status of clock scaling.
|
||||
|
||||
== ============================
|
||||
0 Clock scaling is not supported.
|
||||
1 Clock scaling is supported.
|
||||
== ============================
|
||||
|
||||
The file is read only.
|
||||
|
||||
What: /sys/bus/platform/drivers/ufshcd/*/capabilities/write_booster
|
||||
What: /sys/bus/platform/devices/*.ufs/capabilities/write_booster
|
||||
Date: August 2022
|
||||
Contact: Daniil Lunev <dlunev@chromium.org>
|
||||
Description: Indicates status of Write Booster.
|
||||
|
||||
== ============================
|
||||
0 Write Booster can not be enabled.
|
||||
1 Write Booster can be enabled.
|
||||
== ============================
|
||||
|
||||
The file is read only.
|
||||
|
||||
What: /sys/class/scsi_device/*/device/hpb_param_sysfs/activation_thld
|
||||
Date: February 2021
|
||||
Contact: Avri Altman <avri.altman@wdc.com>
|
||||
|
|
|
@ -466,6 +466,30 @@ Description: Show status of f2fs superblock in real time.
|
|||
0x4000 SBI_IS_FREEZING freefs is in process
|
||||
====== ===================== =================================
|
||||
|
||||
What: /sys/fs/f2fs/<disk>/stat/cp_status
|
||||
Date: September 2022
|
||||
Contact: "Chao Yu" <chao.yu@oppo.com>
|
||||
Description: Show status of f2fs checkpoint in real time.
|
||||
|
||||
=============================== ==============================
|
||||
cp flag value
|
||||
CP_UMOUNT_FLAG 0x00000001
|
||||
CP_ORPHAN_PRESENT_FLAG 0x00000002
|
||||
CP_COMPACT_SUM_FLAG 0x00000004
|
||||
CP_ERROR_FLAG 0x00000008
|
||||
CP_FSCK_FLAG 0x00000010
|
||||
CP_FASTBOOT_FLAG 0x00000020
|
||||
CP_CRC_RECOVERY_FLAG 0x00000040
|
||||
CP_NAT_BITS_FLAG 0x00000080
|
||||
CP_TRIMMED_FLAG 0x00000100
|
||||
CP_NOCRC_RECOVERY_FLAG 0x00000200
|
||||
CP_LARGE_NAT_BITMAP_FLAG 0x00000400
|
||||
CP_QUOTA_NEED_FSCK_FLAG 0x00000800
|
||||
CP_DISABLED_FLAG 0x00001000
|
||||
CP_DISABLED_QUICK_FLAG 0x00002000
|
||||
CP_RESIZEFS_FLAG 0x00004000
|
||||
=============================== ==============================
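A sketch of decoding a cp_status value with the flag table above. The flag names and values are copied from the table; the function name is hypothetical, and the sample value is invented. How the file's text is formatted is not stated here, so the sketch takes an already-parsed integer.

```python
CP_FLAGS = {
    "CP_UMOUNT_FLAG": 0x00000001,
    "CP_ORPHAN_PRESENT_FLAG": 0x00000002,
    "CP_COMPACT_SUM_FLAG": 0x00000004,
    "CP_ERROR_FLAG": 0x00000008,
    "CP_FSCK_FLAG": 0x00000010,
    "CP_FASTBOOT_FLAG": 0x00000020,
    "CP_CRC_RECOVERY_FLAG": 0x00000040,
    "CP_NAT_BITS_FLAG": 0x00000080,
    "CP_TRIMMED_FLAG": 0x00000100,
    "CP_NOCRC_RECOVERY_FLAG": 0x00000200,
    "CP_LARGE_NAT_BITMAP_FLAG": 0x00000400,
    "CP_QUOTA_NEED_FSCK_FLAG": 0x00000800,
    "CP_DISABLED_FLAG": 0x00001000,
    "CP_DISABLED_QUICK_FLAG": 0x00002000,
    "CP_RESIZEFS_FLAG": 0x00004000,
}


def decode_cp_status(value: int) -> list[str]:
    """List the checkpoint flags set in value, in table order."""
    return [name for name, bit in CP_FLAGS.items() if value & bit]


# Invented sample value: 0x80 | 0x04 | 0x01.
print(decode_cp_status(0x00000085))
```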
|
||||
|
||||
What: /sys/fs/f2fs/<disk>/ckpt_thread_ioprio
|
||||
Date: January 2021
|
||||
Contact: "Daeho Jeong" <daehojeong@google.com>
|
||||
|
|
|
@ -55,6 +55,14 @@ Description:
|
|||
The object directory contains subdirectories for each function
|
||||
that is patched within the object.
|
||||
|
||||
What: /sys/kernel/livepatch/<patch>/<object>/patched
|
||||
Date: August 2022
|
||||
KernelVersion: 6.1.0
|
||||
Contact: live-patching@vger.kernel.org
|
||||
Description:
|
||||
An attribute which indicates whether the object is currently
|
||||
patched.
|
||||
|
||||
What: /sys/kernel/livepatch/<patch>/<object>/<function,sympos>
|
||||
Date: Nov 2014
|
||||
KernelVersion: 3.19.0
|
||||
|
|
|
@ -0,0 +1,25 @@
|
|||
What: /sys/devices/virtual/memory_tiering/
|
||||
Date: August 2022
|
||||
Contact: Linux memory management mailing list <linux-mm@kvack.org>
|
||||
Description: A collection of all the memory tiers allocated.
|
||||
|
||||
Individual memory tier details are contained in subdirectories
|
||||
named by the abstract distance of the memory tier.
|
||||
|
||||
/sys/devices/virtual/memory_tiering/memory_tierN/
|
||||
|
||||
|
||||
What: /sys/devices/virtual/memory_tiering/memory_tierN/
|
||||
/sys/devices/virtual/memory_tiering/memory_tierN/nodes
|
||||
Date: August 2022
|
||||
Contact: Linux memory management mailing list <linux-mm@kvack.org>
|
||||
Description: Directory with details of a specific memory tier
|
||||
|
||||
This is the directory containing information about a particular
|
||||
memory tier, memory_tierN, where N is derived based on abstract distance.
|
||||
|
||||
A smaller value of N implies a higher (faster) memory tier in the
|
||||
hierarchy.
|
||||
|
||||
nodes: NUMA nodes that are part of this memory tier.
|
||||
|
|
@ -13,7 +13,7 @@ a) waiting for a CPU (while being runnable)
|
|||
b) completion of synchronous block I/O initiated by the task
|
||||
c) swapping in pages
|
||||
d) memory reclaim
|
||||
e) thrashing page cache
|
||||
e) thrashing
|
||||
f) direct compact
|
||||
g) write-protect copy
|
||||
|
||||
|
|
|
@ -299,7 +299,7 @@ Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
|
|||
lruvec->lru_lock; PG_lru bit of page->flags is cleared before
|
||||
isolating a page from its LRU under lruvec->lru_lock.
|
||||
|
||||
2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM)
|
||||
2.7 Kernel Memory Extension
|
||||
-----------------------------------------------
|
||||
|
||||
With the Kernel memory extension, the Memory Controller is able to limit
|
||||
|
@ -386,8 +386,6 @@ U != 0, K >= U:
|
|||
|
||||
a. Enable CONFIG_CGROUPS
|
||||
b. Enable CONFIG_MEMCG
|
||||
c. Enable CONFIG_MEMCG_SWAP (to use swap extension)
|
||||
d. Enable CONFIG_MEMCG_KMEM (to use kmem extension)
|
||||
|
||||
3.1. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
|
||||
-------------------------------------------------------------------
|
||||
|
|
|
@ -976,6 +976,29 @@ All cgroup core files are prefixed with "cgroup."
|
|||
killing cgroups is a process directed operation, i.e. it affects
|
||||
the whole thread-group.
|
||||
|
||||
cgroup.pressure
|
||||
A read-write single value file whose allowed values are "0" and "1".
|
||||
The default is "1".
|
||||
|
||||
Writing "0" to the file will disable the cgroup PSI accounting.
|
||||
Writing "1" to the file will re-enable the cgroup PSI accounting.
|
||||
|
||||
This control attribute is not hierarchical, so disabling or enabling PSI
|
||||
accounting in a cgroup does not affect PSI accounting in descendants
|
||||
and does not need to pass enablement via ancestors from root.
|
||||
|
||||
The reason this control attribute exists is that PSI accounts stalls for
|
||||
each cgroup separately and aggregates it at each level of the hierarchy.
|
||||
This may cause non-negligible overhead for some workloads when under
|
||||
a deep level of the hierarchy, in which case this control attribute can
|
||||
be used to disable PSI accounting in the non-leaf cgroups.
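As a rough illustration of the write semantics described above, the sketch below writes "0" or "1" to a cgroup's ``cgroup.pressure`` file. The helper name ``set_cgroup_psi`` and the directory passed to it are assumptions made for the example; where cgroup2 is mounted depends on the system.

```python
import os


def set_cgroup_psi(cgroup_dir, enabled):
    """Write "1" (enable) or "0" (disable) to <cgroup_dir>/cgroup.pressure.

    Hypothetical helper: `cgroup_dir` is whatever directory represents the
    target cgroup on the system at hand. Returns the path that was written.
    """
    path = os.path.join(cgroup_dir, "cgroup.pressure")
    with open(path, "w") as f:
        f.write("1" if enabled else "0")
    return path


# Example (requires suitable privileges; the path is an assumption):
#   set_cgroup_psi("/sys/fs/cgroup/some/non-leaf/group", False)
```

Because the attribute is not hierarchical, a tool that wants PSI accounting off for a whole subtree would have to call this once per cgroup.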
|
||||
|
||||
irq.pressure
|
||||
A read-write nested-keyed file.
|
||||
|
||||
Shows pressure stall information for IRQ/SOFTIRQ. See
|
||||
:ref:`Documentation/accounting/psi.rst <psi>` for details.
|
||||
|
||||
Controllers
|
||||
===========
|
||||
|
||||
|
@ -1355,6 +1378,11 @@ PAGE_SIZE multiple when read back.
|
|||
pagetables
|
||||
Amount of memory allocated for page tables.
|
||||
|
||||
sec_pagetables
|
||||
Amount of memory allocated for secondary page tables,
|
||||
this currently includes KVM mmu allocations on x86
|
||||
and arm64.
|
||||
|
||||
percpu (npn)
|
||||
Amount of memory used for storing per-cpu kernel
|
||||
data structures.
|
||||
|
@ -2185,75 +2213,93 @@ Cpuset Interface Files
|
|||
|
||||
It accepts only the following input values when written to.
|
||||
|
||||
======== ================================
|
||||
"root" a partition root
|
||||
"member" a non-root member of a partition
|
||||
======== ================================
|
||||
========== =====================================
|
||||
"member" Non-root member of a partition
|
||||
"root" Partition root
|
||||
"isolated" Partition root without load balancing
|
||||
========== =====================================
|
||||
|
||||
When set to be a partition root, the current cgroup is the
|
||||
root of a new partition or scheduling domain that comprises
|
||||
itself and all its descendants except those that are separate
|
||||
partition roots themselves and their descendants. The root
|
||||
cgroup is always a partition root.
|
||||
The root cgroup is always a partition root and its state
|
||||
cannot be changed. All other non-root cgroups start out as
|
||||
"member".
|
||||
|
||||
There are constraints on where a partition root can be set.
|
||||
It can only be set in a cgroup if all the following conditions
|
||||
are true.
|
||||
When set to "root", the current cgroup is the root of a new
|
||||
partition or scheduling domain that comprises itself and all
|
||||
its descendants except those that are separate partition roots
|
||||
themselves and their descendants.
|
||||
|
||||
1) The "cpuset.cpus" is not empty and the list of CPUs are
|
||||
exclusive, i.e. they are not shared by any of its siblings.
|
||||
2) The parent cgroup is a partition root.
|
||||
3) The "cpuset.cpus" is also a proper subset of the parent's
|
||||
"cpuset.cpus.effective".
|
||||
4) There is no child cgroups with cpuset enabled. This is for
|
||||
eliminating corner cases that have to be handled if such a
|
||||
condition is allowed.
|
||||
When set to "isolated", the CPUs in that partition root will
|
||||
be in an isolated state without any load balancing from the
|
||||
scheduler. Tasks placed in such a partition with multiple
|
||||
CPUs should be carefully distributed and bound to each of the
|
||||
individual CPUs for optimal performance.
|
||||
|
||||
Setting it to partition root will take the CPUs away from the
|
||||
effective CPUs of the parent cgroup. Once it is set, this
|
||||
file cannot be reverted back to "member" if there are any child
|
||||
cgroups with cpuset enabled.
|
||||
The value shown in "cpuset.cpus.effective" of a partition root
|
||||
is the CPUs that the partition root can dedicate to a potential
|
||||
new child partition root. The new child subtracts available
|
||||
CPUs from its parent "cpuset.cpus.effective".
|
||||
|
||||
A parent partition cannot distribute all its CPUs to its
|
||||
child partitions. There must be at least one cpu left in the
|
||||
parent partition.
|
||||
A partition root ("root" or "isolated") can be in one of the
|
||||
two possible states - valid or invalid. An invalid partition
|
||||
root is in a degraded state where some state information may
|
||||
be retained, but behaves more like a "member".
|
||||
|
||||
Once becoming a partition root, changes to "cpuset.cpus" is
|
||||
generally allowed as long as the first condition above is true,
|
||||
the change will not take away all the CPUs from the parent
|
||||
partition and the new "cpuset.cpus" value is a superset of its
|
||||
children's "cpuset.cpus" values.
|
||||
All possible state transitions among "member", "root" and
|
||||
"isolated" are allowed.
|
||||
|
||||
Sometimes, external factors like changes to ancestors'
|
||||
"cpuset.cpus" or cpu hotplug can cause the state of the partition
|
||||
root to change. On read, the "cpuset.sched.partition" file
|
||||
can show the following values.
|
||||
On read, the "cpuset.cpus.partition" file can show the following
|
||||
values.
|
||||
|
||||
============== ==============================
|
||||
"member" Non-root member of a partition
|
||||
"root" Partition root
|
||||
"root invalid" Invalid partition root
|
||||
============== ==============================
|
||||
============================= =====================================
|
||||
"member" Non-root member of a partition
|
||||
"root" Partition root
|
||||
"isolated" Partition root without load balancing
|
||||
"root invalid (<reason>)" Invalid partition root
|
||||
"isolated invalid (<reason>)" Invalid isolated partition root
|
||||
============================= =====================================
|
||||
|
||||
It is a partition root if the first 2 partition root conditions
|
||||
above are true and at least one CPU from "cpuset.cpus" is
|
||||
granted by the parent cgroup.
|
||||
In the case of an invalid partition root, a descriptive string on
|
||||
why the partition is invalid is included within parentheses.
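A user space agent reading "cpuset.cpus.partition" might split the value into its parts as sketched below. This assumes the quotation marks in the table are documentation markup rather than file content, and ``parse_partition_state`` is a hypothetical helper; the reason text used in the usage line is a placeholder, not a real kernel message.

```python
import re

# Matches the read values listed above: a state keyword, optionally followed
# by " invalid (<reason>)".
_STATE_RE = re.compile(r"^(member|root|isolated)(?: invalid \((.*)\))?$")


def parse_partition_state(text):
    """Split a partition state string into type, validity and reason."""
    m = _STATE_RE.match(text.strip())
    # "member" is never reported with an "invalid" suffix in the table above.
    if m is None or (m.group(1) == "member" and m.group(2) is not None):
        raise ValueError("unrecognized partition state: %r" % text)
    return {"type": m.group(1), "valid": m.group(2) is None, "reason": m.group(2)}


# "placeholder reason" is made up for illustration only.
print(parse_partition_state("root invalid (placeholder reason)\n"))
```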
|
||||
|
||||
A partition root can become invalid if none of CPUs requested
|
||||
in "cpuset.cpus" can be granted by the parent cgroup or the
|
||||
parent cgroup is no longer a partition root itself. In this
|
||||
case, it is not a real partition even though the restriction
|
||||
of the first partition root condition above will still apply.
|
||||
The cpu affinity of all the tasks in the cgroup will then be
|
||||
associated with CPUs in the nearest ancestor partition.
|
||||
For a partition root to become valid, the following conditions
|
||||
must be met.
|
||||
|
||||
An invalid partition root can be transitioned back to a
|
||||
real partition root if at least one of the requested CPUs
|
||||
can now be granted by its parent. In this case, the cpu
|
||||
affinity of all the tasks in the formerly invalid partition
|
||||
will be associated to the CPUs of the newly formed partition.
|
||||
Changing the partition state of an invalid partition root to
|
||||
"member" is always allowed even if child cpusets are present.
|
||||
1) The "cpuset.cpus" is exclusive with its siblings, i.e. they
|
||||
are not shared by any of its siblings (exclusivity rule).
|
||||
2) The parent cgroup is a valid partition root.
|
||||
3) The "cpuset.cpus" is not empty and must contain at least
|
||||
one of the CPUs from parent's "cpuset.cpus", i.e. they overlap.
|
||||
4) The "cpuset.cpus.effective" cannot be empty unless there is
|
||||
no task associated with this partition.
|
||||
|
||||
External events like hotplug or changes to "cpuset.cpus" can
|
||||
cause a valid partition root to become invalid and vice versa.
|
||||
Note that a task cannot be moved to a cgroup with empty
|
||||
"cpuset.cpus.effective".
|
||||
|
||||
For a valid partition root with the sibling cpu exclusivity
|
||||
rule enabled, changes made to "cpuset.cpus" that violate the
|
||||
exclusivity rule will invalidate the partition as well as its
|
||||
sibling partitions with conflicting cpuset.cpus values. So
|
||||
care must be taken in changing "cpuset.cpus".
|
||||
|
||||
A valid non-root parent partition may distribute out all its CPUs
|
||||
to its child partitions when there is no task associated with it.
|
||||
|
||||
Care must be taken to change a valid partition root to
|
||||
"member" as all its child partitions, if present, will become
|
||||
invalid causing disruption to tasks running in those child
|
||||
partitions. These inactivated partitions could be recovered if
|
||||
their parent is switched back to a partition root with a proper
|
||||
set of "cpuset.cpus".
|
||||
|
||||
Poll and inotify events are triggered whenever the state of
|
||||
"cpuset.cpus.partition" changes. That includes changes caused
|
||||
by writes to "cpuset.cpus.partition", cpu hotplug or other
|
||||
changes that modify the validity status of the partition.
|
||||
This will allow user space agents to monitor unexpected changes
|
||||
to "cpuset.cpus.partition" without the need to do continuous
|
||||
polling.
|
||||
|
||||
|
||||
Device controller
|
||||
|
|
|
@ -141,6 +141,10 @@ root_hash_sig_key_desc <key_description>
|
|||
also gain new certificates at run time if they are signed by a certificate
|
||||
already in the secondary trusted keyring.
|
||||
|
||||
try_verify_in_tasklet
|
||||
If verity hashes are in cache, verify data blocks in kernel tasklet instead
|
||||
of workqueue. This option can reduce IO latency.
|
||||
|
||||
Theory of operation
|
||||
===================
|
||||
|
||||
|
|
|
@ -5,143 +5,115 @@ Dynamic debug
|
|||
Introduction
|
||||
============
|
||||
|
||||
This document describes how to use the dynamic debug (dyndbg) feature.
|
||||
Dynamic debug allows you to dynamically enable/disable kernel
|
||||
debug-print code to obtain additional kernel information.
|
||||
|
||||
Dynamic debug is designed to allow you to dynamically enable/disable
|
||||
kernel code to obtain additional kernel information. Currently, if
|
||||
``CONFIG_DYNAMIC_DEBUG`` is set, then all ``pr_debug()``/``dev_dbg()`` and
|
||||
``print_hex_dump_debug()``/``print_hex_dump_bytes()`` calls can be dynamically
|
||||
enabled per-callsite.
|
||||
If ``/proc/dynamic_debug/control`` exists, your kernel has dynamic
|
||||
debug. You'll need root access (sudo su) to use this.
|
||||
|
||||
If you do not want to enable dynamic debug globally (i.e. in some embedded
|
||||
system), you may set ``CONFIG_DYNAMIC_DEBUG_CORE`` as basic support of dynamic
|
||||
debug and add ``ccflags := -DDYNAMIC_DEBUG_MODULE`` into the Makefile of any
|
||||
modules which you'd like to dynamically debug later.
|
||||
Dynamic debug provides:
|
||||
|
||||
If ``CONFIG_DYNAMIC_DEBUG`` is not set, ``print_hex_dump_debug()`` is just
|
||||
shortcut for ``print_hex_dump(KERN_DEBUG)``.
|
||||
* a Catalog of all *prdbgs* in your kernel.
|
||||
``cat /proc/dynamic_debug/control`` to see them.
|
||||
|
||||
For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, format string is
|
||||
its ``prefix_str`` argument, if it is constant string; or ``hexdump``
|
||||
in case ``prefix_str`` is built dynamically.
|
||||
|
||||
Dynamic debug has even more useful features:
|
||||
|
||||
* Simple query language allows turning on and off debugging
|
||||
statements by matching any combination of 0 or 1 of:
|
||||
* a Simple query/command language to alter *prdbgs* by selecting on
|
||||
any combination of 0 or 1 of:
|
||||
|
||||
- source filename
|
||||
- function name
|
||||
- line number (including ranges of line numbers)
|
||||
- module name
|
||||
- format string
|
||||
|
||||
* Provides a debugfs control file: ``<debugfs>/dynamic_debug/control``
|
||||
which can be read to display the complete list of known debug
|
||||
statements, to help guide you
|
||||
|
||||
Controlling dynamic debug Behaviour
|
||||
===================================
|
||||
|
||||
The behaviour of ``pr_debug()``/``dev_dbg()`` are controlled via writing to a
|
||||
control file in the 'debugfs' filesystem. Thus, you must first mount
|
||||
the debugfs filesystem, in order to make use of this feature.
|
||||
Subsequently, we refer to the control file as:
|
||||
``<debugfs>/dynamic_debug/control``. For example, if you want to enable
|
||||
printing from source file ``svcsock.c``, line 1603 you simply do::
|
||||
|
||||
nullarbor:~ # echo 'file svcsock.c line 1603 +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
|
||||
If you make a mistake with the syntax, the write will fail thus::
|
||||
|
||||
nullarbor:~ # echo 'file svcsock.c wtf 1 +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
-bash: echo: write error: Invalid argument
|
||||
|
||||
Note, for systems without 'debugfs' enabled, the control file can be
|
||||
found in ``/proc/dynamic_debug/control``.
|
||||
- class name (as known/declared by each module)
|
||||
|
||||
Viewing Dynamic Debug Behaviour
|
||||
===============================
|
||||
|
||||
You can view the currently configured behaviour of all the debug
|
||||
statements via::
|
||||
You can view the currently configured behaviour in the *prdbg* catalog::
|
||||
|
||||
nullarbor:~ # cat <debugfs>/dynamic_debug/control
|
||||
:#> head -n7 /proc/dynamic_debug/control
|
||||
# filename:lineno [module]function flags format
|
||||
net/sunrpc/svc_rdma.c:323 [svcxprt_rdma]svc_rdma_cleanup =_ "SVCRDMA Module Removed, deregister RPC RDMA transport\012"
|
||||
net/sunrpc/svc_rdma.c:341 [svcxprt_rdma]svc_rdma_init =_ "\011max_inline : %d\012"
|
||||
net/sunrpc/svc_rdma.c:340 [svcxprt_rdma]svc_rdma_init =_ "\011sq_depth : %d\012"
|
||||
net/sunrpc/svc_rdma.c:338 [svcxprt_rdma]svc_rdma_init =_ "\011max_requests : %d\012"
|
||||
...
|
||||
init/main.c:1179 [main]initcall_blacklist =_ "blacklisting initcall %s\012"
|
||||
init/main.c:1218 [main]initcall_blacklisted =_ "initcall %s blacklisted\012"
|
||||
init/main.c:1424 [main]run_init_process =_ " with arguments:\012"
|
||||
init/main.c:1426 [main]run_init_process =_ " %s\012"
|
||||
init/main.c:1427 [main]run_init_process =_ " with environment:\012"
|
||||
init/main.c:1429 [main]run_init_process =_ " %s\012"
|
||||
|
||||
The 3rd space-delimited column shows the current flags, preceded by
|
||||
a ``=`` for easy use with grep/cut. ``=p`` shows enabled callsites.
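For scripted inspection, one catalog line can be split into its columns as sketched below. The regular expression is an assumption derived only from the sample output above (``filename:lineno [module]function =flags "format"``); real control files may carry additional fields that this sketch does not handle, and ``parse_control_line`` is a hypothetical helper name.

```python
import re

# Layout inferred from the sample lines above; not an official grammar.
_LINE_RE = re.compile(
    r'^(?P<file>\S+):(?P<line>\d+) '
    r'\[(?P<module>[^\]]+)\](?P<func>\S+) '
    r'=(?P<flags>\S+) "(?P<format>.*)"$'
)


def parse_control_line(line):
    """Return a dict describing one prdbg entry, or None if it does not match."""
    m = _LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None  # e.g. the "# filename:lineno ..." header line
    entry = m.groupdict()
    entry["line"] = int(entry["line"])
    entry["enabled"] = "p" in entry["flags"]
    return entry


sample = r'init/main.c:1424 [main]run_init_process =p " with arguments:\012"'
print(parse_control_line(sample))
```

The ``enabled`` field simply mirrors the ``=p`` convention described above, so filtering for enabled callsites is a matter of checking that key.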
|
||||
|
||||
You can also apply standard Unix text manipulation filters to this
|
||||
data, e.g.::
|
||||
Controlling dynamic debug Behaviour
|
||||
===================================
|
||||
|
||||
nullarbor:~ # grep -i rdma <debugfs>/dynamic_debug/control | wc -l
|
||||
62
|
||||
The behaviour of *prdbg* sites is controlled by writing
|
||||
query/commands to the control file. Example::
|
||||
|
||||
nullarbor:~ # grep -i tcp <debugfs>/dynamic_debug/control | wc -l
|
||||
42
|
||||
# grease the interface
|
||||
:#> alias ddcmd='echo $* > /proc/dynamic_debug/control'
|
||||
|
||||
The third column shows the currently enabled flags for each debug
|
||||
statement callsite (see below for definitions of the flags). The
|
||||
default value, with no flags enabled, is ``=_``. So you can view all
|
||||
the debug statement callsites with any non-default flags::
|
||||
:#> ddcmd '-p; module main func run* +p'
|
||||
:#> grep =p /proc/dynamic_debug/control
|
||||
init/main.c:1424 [main]run_init_process =p " with arguments:\012"
|
||||
init/main.c:1426 [main]run_init_process =p " %s\012"
|
||||
init/main.c:1427 [main]run_init_process =p " with environment:\012"
|
||||
init/main.c:1429 [main]run_init_process =p " %s\012"
|
||||
|
||||
nullarbor:~ # awk '$3 != "=_"' <debugfs>/dynamic_debug/control
|
||||
# filename:lineno [module]function flags format
|
||||
net/sunrpc/svcsock.c:1603 [sunrpc]svc_send p "svc_process: st_sendto returned %d\012"
|
||||
Error messages go to console/syslog::
|
||||
|
||||
:#> ddcmd mode foo +p
|
||||
dyndbg: unknown keyword "mode"
|
||||
dyndbg: query parse failed
|
||||
bash: echo: write error: Invalid argument
|
||||
|
||||
If debugfs is also enabled and mounted, ``dynamic_debug/control`` is
|
||||
also under the mount-dir, typically ``/sys/kernel/debug/``.
|
||||
|
||||
Command Language Reference
|
||||
==========================
|
||||
|
||||
At the lexical level, a command comprises a sequence of words separated
|
||||
At the basic lexical level, a command is a sequence of words separated
|
||||
by spaces or tabs. So these are all equivalent::
|
||||
|
||||
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
nullarbor:~ # echo -n ' file svcsock.c line 1603 +p ' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
:#> ddcmd file svcsock.c line 1603 +p
|
||||
:#> ddcmd "file svcsock.c line 1603 +p"
|
||||
:#> ddcmd ' file svcsock.c line 1603 +p '
|
||||
|
||||
Command submissions are bounded by a write() system call.
|
||||
Multiple commands can be written together, separated by ``;`` or ``\n``::
|
||||
|
||||
~# echo "func pnpacpi_get_resources +p; func pnp_assign_mem +p" \
|
||||
> <debugfs>/dynamic_debug/control
|
||||
:#> ddcmd "func pnpacpi_get_resources +p; func pnp_assign_mem +p"
|
||||
:#> ddcmd <<"EOC"
|
||||
func pnpacpi_get_resources +p
|
||||
func pnp_assign_mem +p
|
||||
EOC
|
||||
:#> cat query-batch-file > /proc/dynamic_debug/control
|
||||
|
||||
If your query set is big, you can batch them too::
|
||||
You can also use wildcards in each query term. The match rule supports
|
||||
``*`` (matches zero or more characters) and ``?`` (matches exactly one
|
||||
character). For example, you can match all usb drivers::
|
||||
|
||||
~# cat query-batch-file > <debugfs>/dynamic_debug/control
|
||||
:#> ddcmd file "drivers/usb/*" +p # "" to suppress shell expansion
|
||||
|
||||
Another way is to use wildcards. The match rule supports ``*`` (matches
|
||||
zero or more characters) and ``?`` (matches exactly one character). For
|
||||
example, you can match all usb drivers::
|
||||
|
||||
~# echo "file drivers/usb/* +p" > <debugfs>/dynamic_debug/control
|
||||
|
||||
At the syntactical level, a command comprises a sequence of match
|
||||
specifications, followed by a flags change specification::
|
||||
Syntactically, a command is a sequence of keyword-value pairs, followed by a
|
||||
flags change or setting::
|
||||
|
||||
command ::= match-spec* flags-spec
|
||||
|
||||
The match-spec's are used to choose a subset of the known pr_debug()
|
||||
callsites to which to apply the flags-spec. Think of them as a query
|
||||
with implicit ANDs between each pair. Note that an empty list of
|
||||
match-specs will select all debug statement callsites.
|
||||
The match-spec's select *prdbgs* from the catalog, upon which to apply
|
||||
the flags-spec; all constraints are ANDed together. An absent keyword
|
||||
is the same as keyword "*".
|
||||
|
||||
A match specification comprises a keyword, which controls the
|
||||
attribute of the callsite to be compared, and a value to compare
|
||||
against. Possible keywords are:::
|
||||
|
||||
A match specification is a keyword, which selects the attribute of
|
||||
the callsite to be compared, and a value to compare against. Possible
|
||||
keywords are::
|
||||
|
||||
match-spec ::= 'func' string |
|
||||
'file' string |
|
||||
'module' string |
|
||||
'format' string |
|
||||
'class' string |
|
||||
'line' line-range
|
||||
|
||||
line-range ::= lineno |
|
||||
|
@ -203,6 +175,16 @@ format
|
|||
format "nfsd: SETATTR" // a neater way to match a format with whitespace
|
||||
format 'nfsd: SETATTR' // yet another way to match a format with whitespace
|
||||
|
||||
class
|
||||
The given class_name is validated against each module, which may
|
||||
have declared a list of known class_names. If the class_name is
|
||||
found for a module, callsite & class matching and adjustment
|
||||
proceeds. Examples::
|
||||
|
||||
class DRM_UT_KMS # a DRM.debug category
|
||||
class JUNK # silent non-match
|
||||
// class TLD_* # NOTICE: no wildcard in class names
|
||||
|
||||
line
|
||||
The given line number or range of line numbers is compared
|
||||
against the line number of each ``pr_debug()`` callsite. A single
|
||||
|
@ -228,17 +210,16 @@ of the characters::
|
|||
The flags are::
|
||||
|
||||
p enables the pr_debug() callsite.
|
||||
f Include the function name in the printed message
|
||||
l Include line number in the printed message
|
||||
m Include module name in the printed message
|
||||
t Include thread ID in messages not generated from interrupt context
|
||||
_ No flags are set. (Or'd with others on input)
|
||||
_ enables no flags.
|
||||
|
||||
For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only ``p`` flag
|
||||
have meaning, other flags ignored.
|
||||
Decorator flags add to the message-prefix, in order:
|
||||
t Include thread ID, or <intr>
|
||||
m Include module name
|
||||
f Include the function name
|
||||
l Include line number
|
||||
|
||||
For display, the flags are preceded by ``=``
|
||||
(mnemonic: what the flags are currently equal to).
|
||||
For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only
|
||||
the ``p`` flag has meaning, other flags are ignored.
|
||||
|
||||
Note the regexp ``^[-+=][flmpt_]+$`` matches a flags specification.
|
||||
To clear all flags at once, use ``=_`` or ``-flmpt``.
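The regexp quoted above can be used to check a flags specification before it is written to the control file. The following is a minimal sketch; ``split_flags_spec`` is a hypothetical helper, and ``re.fullmatch`` is used in place of the ``^...$`` anchors because Python's ``$`` would also accept a trailing newline.

```python
import re

# Same character classes as the regexp ^[-+=][flmpt_]+$ given above.
_FLAGS_SPEC_RE = re.compile(r"[-+=][flmpt_]+")


def split_flags_spec(spec):
    """Validate a flags-spec and return (operator, flag characters)."""
    if _FLAGS_SPEC_RE.fullmatch(spec) is None:
        raise ValueError("not a valid flags specification: %r" % spec)
    return spec[0], spec[1:]


print(split_flags_spec("+mf"))
print(split_flags_spec("=_"))
```

This only checks the syntax of the flags part; whether the preceding match-specs select anything is decided by the kernel.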
|
||||
|
@ -313,7 +294,7 @@ For ``CONFIG_DYNAMIC_DEBUG`` kernels, any settings given at boot-time (or
|
|||
enabled by ``-DDEBUG`` flag during compilation) can be disabled later via
|
||||
the debugfs interface if the debug messages are no longer needed::
|
||||
|
||||
echo "module module_name -p" > <debugfs>/dynamic_debug/control
|
||||
echo "module module_name -p" > /proc/dynamic_debug/control
|
||||
|
||||
Examples
|
||||
========
|
||||
|
@ -321,37 +302,31 @@ Examples
|
|||
::
|
||||
|
||||
// enable the message at line 1603 of file svcsock.c
|
||||
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
:#> ddcmd 'file svcsock.c line 1603 +p'
|
||||
|
||||
// enable all the messages in file svcsock.c
|
||||
nullarbor:~ # echo -n 'file svcsock.c +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
:#> ddcmd 'file svcsock.c +p'
|
||||
|
||||
// enable all the messages in the NFS server module
|
||||
nullarbor:~ # echo -n 'module nfsd +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
:#> ddcmd 'module nfsd +p'
|
||||
|
||||
// enable all 12 messages in the function svc_process()
|
||||
nullarbor:~ # echo -n 'func svc_process +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
:#> ddcmd 'func svc_process +p'
|
||||
|
||||
// disable all 12 messages in the function svc_process()
|
||||
nullarbor:~ # echo -n 'func svc_process -p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
:#> ddcmd 'func svc_process -p'
|
||||
|
||||
// enable messages for NFS calls READ, READLINK, READDIR and READDIR+.
|
||||
nullarbor:~ # echo -n 'format "nfsd: READ" +p' >
|
||||
<debugfs>/dynamic_debug/control
|
||||
:#> ddcmd 'format "nfsd: READ" +p'
|
||||
|
||||
// enable messages in files of which the paths include string "usb"
|
||||
nullarbor:~ # echo -n 'file *usb* +p' > <debugfs>/dynamic_debug/control
|
||||
:#> ddcmd 'file *usb* +p'
|
||||
|
||||
// enable all messages
|
||||
nullarbor:~ # echo -n '+p' > <debugfs>/dynamic_debug/control
|
||||
:#> ddcmd '+p'
|
||||
|
||||
// add module, function to all enabled messages
|
||||
nullarbor:~ # echo -n '+mf' > <debugfs>/dynamic_debug/control
|
||||
:#> ddcmd '+mf'
|
||||
|
||||
// boot-args example, with newlines and comments for readability
|
||||
Kernel command line: ...
|
||||
|
@ -364,3 +339,38 @@ Examples
|
|||
dyndbg="file init/* +p #cmt ; func parse_one +p"
|
||||
// enable pr_debugs in 2 functions in a module loaded later
|
||||
pc87360.dyndbg="func pc87360_init_device +p; func pc87360_find +p"
|
||||
|
||||
Kernel Configuration
|
||||
====================
|
||||
|
||||
Dynamic Debug is enabled via kernel config items::
|
||||
|
||||
CONFIG_DYNAMIC_DEBUG=y # build catalog, enables CORE
|
||||
CONFIG_DYNAMIC_DEBUG_CORE=y # enable mechanics only, skip catalog
|
||||
|
||||
If you do not want to enable dynamic debug globally (e.g. in some embedded
|
||||
system), you may set ``CONFIG_DYNAMIC_DEBUG_CORE`` as basic support of dynamic
|
||||
debug and add ``ccflags := -DDYNAMIC_DEBUG_MODULE`` into the Makefile of any
|
||||
modules which you'd like to dynamically debug later.
|
||||
|
||||
|
||||
Kernel *prdbg* API
|
||||
==================
|
||||
|
||||
The following functions are cataloged and controllable when dynamic
|
||||
debug is enabled::
|
||||
|
||||
pr_debug()
|
||||
dev_dbg()
|
||||
print_hex_dump_debug()
|
||||
print_hex_dump_bytes()
|
||||
|
||||
Otherwise, they are off by default; ``ccflags += -DDEBUG`` or
|
||||
``#define DEBUG`` in a source file will enable them appropriately.
|
||||
|
||||
If ``CONFIG_DYNAMIC_DEBUG`` is not set, ``print_hex_dump_debug()`` is
|
||||
just a shortcut for ``print_hex_dump(KERN_DEBUG)``.
|
||||
|
||||
For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, the format string is
|
||||
its ``prefix_str`` argument, if it is a constant string; or ``hexdump``
|
||||
in case ``prefix_str`` is built dynamically.
|
||||
|
|
|
@ -321,6 +321,8 @@
|
|||
force_enable - Force enable the IOMMU on platforms known
|
||||
to be buggy with IOMMU enabled. Use this
|
||||
option with care.
|
||||
pgtbl_v1 - Use v1 page table for DMA-API (Default).
|
||||
pgtbl_v2 - Use v2 page table for DMA-API.
|
||||
|
||||
amd_iommu_dump= [HW,X86-64]
|
||||
Enable AMD IOMMU driver option to dump the ACPI table
|
||||
|
@ -1467,6 +1469,14 @@
|
|||
Permit 'security.evm' to be updated regardless of
|
||||
current integrity status.
|
||||
|
||||
early_page_ext [KNL] Enforces page_ext initialization to earlier
|
||||
stages so as to cover more early boot allocations.
|
||||
Please note that as a side effect some optimizations
|
||||
might be disabled to achieve that (e.g. parallelized
|
||||
memory initialization is disabled) so the boot process
|
||||
might take longer, especially on systems with a lot of
|
||||
memory. Available with CONFIG_PAGE_EXTENSION=y.
|
||||
|
||||
failslab=
|
||||
fail_usercopy=
|
||||
fail_page_alloc=
|
||||
|
@ -3629,7 +3639,7 @@
|
|||
(bounds check bypass). With this option data leaks are
|
||||
possible in the system.
|
||||
|
||||
nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
|
||||
nospectre_v2 [X86,PPC_E500,ARM64] Disable all mitigations for
|
||||
the Spectre variant 2 (indirect branch prediction)
|
||||
vulnerability. System may allow data leaks with this
|
||||
option.
|
||||
|
@ -3748,9 +3758,9 @@
|
|||
[X86,PV_OPS] Disable paravirtualized VMware scheduler
|
||||
clock and use the default one.
|
||||
|
||||
no-steal-acc [X86,PV_OPS,ARM64] Disable paravirtualized steal time
|
||||
accounting. steal time is computed, but won't
|
||||
influence scheduler behaviour
|
||||
no-steal-acc [X86,PV_OPS,ARM64,PPC/PSERIES] Disable paravirtualized
|
||||
steal time accounting. Steal time is computed, but
|
||||
won't influence scheduler behaviour
|
||||
|
||||
nolapic [X86-32,APIC] Do not enable or use the local APIC.
|
||||
|
||||
|
@ -6039,12 +6049,6 @@
|
|||
This parameter controls use of the Protected
|
||||
Execution Facility on pSeries.
|
||||
|
||||
swapaccount= [KNL]
|
||||
Format: [0|1]
|
||||
Enable accounting of swap in memory resource
|
||||
controller if no parameter or 1 is given or disable
|
||||
it if 0 is given (See Documentation/admin-guide/cgroup-v1/memory.rst)
|
||||
|
||||
swiotlb= [ARM,IA-64,PPC,MIPS,X86]
|
||||
Format: { <int> [,<int>] | force | noforce }
|
||||
<int> -- Number of I/O TLB slabs
|
||||
|
@ -6847,6 +6851,12 @@
|
|||
Crash from Xen panic notifier, without executing late
|
||||
panic() code such as dumping handler.
|
||||
|
||||
xen_msr_safe= [X86,XEN]
|
||||
Format: <bool>
|
||||
Select whether to always use non-faulting (safe) MSR
|
||||
access functions when running as Xen PV guest. The
|
||||
default value is controlled by CONFIG_XEN_PV_MSR_SAFE.
|
||||
|
||||
xen_nopvspin [X86,XEN]
|
||||
Disables the qspinlock slowpath using Xen PV optimizations.
|
||||
This parameter is obsoleted by "nopvspin" parameter, which
|
||||
|
|
|
@ -5,10 +5,10 @@ CMA Debugfs Interface
|
|||
The CMA debugfs interface is useful to retrieve basic information out of the
|
||||
different CMA areas and to test allocation/release in each of the areas.
|
||||
|
||||
Each CMA zone represents a directory under <debugfs>/cma/, indexed by the
|
||||
kernel's CMA index. So the first CMA zone would be:
|
||||
Each CMA area is represented by a directory under <debugfs>/cma/, named after
|
||||
its CMA name like below:
|
||||
|
||||
<debugfs>/cma/cma-0
|
||||
<debugfs>/cma/<cma_name>
|
||||
|
||||
The structure of the files created under that directory is as follows:
|
||||
|
||||
|
@ -18,8 +18,8 @@ The structure of the files created under that directory is as follows:
|
|||
- [RO] bitmap: The bitmap of page states in the zone.
|
||||
- [WO] alloc: Allocate N pages from that CMA area. For example::
|
||||
|
||||
echo 5 > <debugfs>/cma/cma-2/alloc
|
||||
echo 5 > <debugfs>/cma/<cma_name>/alloc
|
||||
|
||||
would try to allocate 5 pages from the cma-2 area.
|
||||
would try to allocate 5 pages from the 'cma_name' area.
|
||||
|
||||
- [WO] free: Free N pages from that CMA area, similar to the above.
|
||||
|
|
|
@ -1,8 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
========================
|
||||
Monitoring Data Accesses
|
||||
========================
|
||||
==========================
|
||||
DAMON: Data Access MONitor
|
||||
==========================
|
||||
|
||||
:doc:`DAMON </mm/damon/index>` allows light-weight data access monitoring.
|
||||
Using DAMON, users can analyze the memory access patterns of their systems and
|
||||
|
|
|
@ -29,16 +29,9 @@ called DAMON Operator (DAMO). It is available at
|
|||
https://github.com/awslabs/damo. The examples below assume that ``damo`` is on
|
||||
your ``$PATH``. It's not mandatory, though.
|
||||
|
||||
Because DAMO is using the debugfs interface (refer to :doc:`usage` for the
|
||||
detail) of DAMON, you should ensure debugfs is mounted. Mount it manually as
|
||||
below::
|
||||
|
||||
# mount -t debugfs none /sys/kernel/debug/
|
||||
|
||||
or append the following line to your ``/etc/fstab`` file so that your system
|
||||
can automatically mount debugfs upon booting::
|
||||
|
||||
debugfs /sys/kernel/debug debugfs defaults 0 0
|
||||
Because DAMO is using the sysfs interface (refer to :doc:`usage` for the
|
||||
detail) of DAMON, you should ensure :doc:`sysfs </filesystems/sysfs>` is
|
||||
mounted.
|
||||
|
||||
|
||||
Recording Data Access Patterns
|
||||
|
|
|
@ -393,6 +393,11 @@ the files as above. Above is only for an example.
|
|||
debugfs Interface
|
||||
=================
|
||||
|
||||
.. note::
|
||||
|
||||
The DAMON debugfs interface will be removed after the next LTS kernel is released, so
|
||||
users should move to the :ref:`sysfs interface <sysfs_interface>`.
|
||||
|
||||
DAMON exports eight files, ``attrs``, ``target_ids``, ``init_regions``,
|
||||
``schemes``, ``monitor_on``, ``kdamond_pid``, ``mk_contexts`` and
|
||||
``rm_contexts`` under its debugfs directory, ``<debugfs>/damon/``.
|
||||
|
|
|
@ -32,6 +32,7 @@ the Linux memory management.
|
|||
idle_page_tracking
|
||||
ksm
|
||||
memory-hotplug
|
||||
multigen_lru
|
||||
nommu-mmap
|
||||
numa_memory_policy
|
||||
numaperf
|
||||
|
|
|
@ -184,6 +184,42 @@ The maximum possible ``pages_sharing/pages_shared`` ratio is limited by the
|
|||
``max_page_sharing`` tunable. To increase the ratio ``max_page_sharing`` must
|
||||
be increased accordingly.
|
||||
|
||||
Monitoring KSM profit
|
||||
=====================
|
||||
|
||||
KSM can save memory by merging identical pages, but also can consume
|
||||
additional memory, because it needs to generate a number of rmap_items to
|
||||
save each scanned page's brief rmap information. Some of these pages may
|
||||
be merged, but some may not be able to be merged after being checked
|
||||
several times; the memory consumed for those pages is unprofitable.
|
||||
|
||||
1) How to determine whether KSM saves memory or consumes memory in the system-wide
|
||||
range? Here is a simple approximate calculation for reference::
|
||||
|
||||
general_profit =~ pages_sharing * sizeof(page) - (all_rmap_items) *
|
||||
sizeof(rmap_item);
|
||||
|
||||
where all_rmap_items can be easily obtained by summing ``pages_sharing``,
|
||||
``pages_shared``, ``pages_unshared`` and ``pages_volatile``.
|
||||
|
||||
2) The KSM profit within a single process can be similarly obtained by the
|
||||
following approximate calculation::
|
||||
|
||||
process_profit =~ ksm_merging_pages * sizeof(page) -
|
||||
ksm_rmap_items * sizeof(rmap_item).
|
||||
|
||||
where ksm_merging_pages is shown under the directory ``/proc/<pid>/``,
|
||||
and ksm_rmap_items is shown in ``/proc/<pid>/ksm_stat``.
|
||||
|
||||
From the perspective of an application, a high ratio of ``ksm_rmap_items`` to
|
||||
``ksm_merging_pages`` means a bad madvise-applied policy, so developers or
|
||||
administrators have to rethink how to change the madvise policy. As an example
|
||||
for reference, a page's size is usually 4K, and the rmap_item's size is
|
||||
32B on 32-bit CPU architectures and 64B on 64-bit CPU architectures, respectively.
|
||||
So if the ``ksm_rmap_items/ksm_merging_pages`` ratio exceeds 64 on 64-bit CPU
|
||||
or exceeds 128 on 32-bit CPU, then the app's madvise policy should be dropped,
|
||||
because the ksm profit is approximately zero or negative.
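The two approximations and the break-even ratio discussed above can be sketched in C as follows. This is only an illustrative userspace calculation, not kernel code: the function names are hypothetical, and the page and rmap_item sizes are passed in as parameters (the example values from the text are 4K pages and 64B or 32B rmap_items) rather than read from the running kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: system-wide estimate from the KSM counters. */
static int64_t ksm_general_profit(int64_t pages_sharing, int64_t pages_shared,
				  int64_t pages_unshared, int64_t pages_volatile,
				  int64_t page_size, int64_t rmap_item_size)
{
	/* all_rmap_items is the sum of the four counters named in the text. */
	int64_t all_rmap_items = pages_sharing + pages_shared +
				 pages_unshared + pages_volatile;

	return pages_sharing * page_size - all_rmap_items * rmap_item_size;
}

/* Hypothetical helper: per-process estimate. */
static int64_t ksm_process_profit(int64_t ksm_merging_pages,
				  int64_t ksm_rmap_items,
				  int64_t page_size, int64_t rmap_item_size)
{
	return ksm_merging_pages * page_size - ksm_rmap_items * rmap_item_size;
}

/*
 * Ratio of ksm_rmap_items to ksm_merging_pages at which the per-process
 * profit reaches zero; above it the profit is negative.
 */
static int64_t ksm_break_even_ratio(int64_t page_size, int64_t rmap_item_size)
{
	assert(rmap_item_size > 0);
	return page_size / rmap_item_size;
}
```

With 4096-byte pages, the break-even ratio works out to 64 for 64-byte rmap_items and 128 for 32-byte rmap_items, which matches the thresholds quoted above.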
|
||||
|
||||
Monitoring KSM events
|
||||
=====================
|
||||
|
||||
|
|
|
@ -0,0 +1,162 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=============
|
||||
Multi-Gen LRU
|
||||
=============
|
||||
The multi-gen LRU is an alternative LRU implementation that optimizes
|
||||
page reclaim and improves performance under memory pressure. Page
|
||||
reclaim decides the kernel's caching policy and ability to overcommit
|
||||
memory. It directly impacts the kswapd CPU usage and RAM efficiency.
|
||||
|
||||
Quick start
|
||||
===========
|
||||
Build the kernel with the following configurations.
|
||||
|
||||
* ``CONFIG_LRU_GEN=y``
|
||||
* ``CONFIG_LRU_GEN_ENABLED=y``
|
||||
|
||||
All set!
|
||||
|
||||
Runtime options
|
||||
===============
|
||||
``/sys/kernel/mm/lru_gen/`` contains stable ABIs described in the
|
||||
following subsections.
|
||||
|
||||
Kill switch
|
||||
-----------
|
||||
``enabled`` accepts different values to enable or disable the
|
||||
following components. Its default value depends on
|
||||
``CONFIG_LRU_GEN_ENABLED``. All the components should be enabled
|
||||
unless some of them have unforeseen side effects. Writing to
|
||||
``enabled`` has no effect when a component is not supported by the
|
||||
hardware, and valid values will be accepted even when the main switch
|
||||
is off.
|
||||
|
||||
====== ===============================================================
|
||||
Values Components
|
||||
====== ===============================================================
|
||||
0x0001 The main switch for the multi-gen LRU.
|
||||
0x0002 Clearing the accessed bit in leaf page table entries in large
|
||||
batches, when MMU sets it (e.g., on x86). This behavior can
|
||||
theoretically worsen lock contention (mmap_lock). If it is
|
||||
disabled, the multi-gen LRU will suffer a minor performance
|
||||
degradation for workloads that contiguously map hot pages,
|
||||
whose accessed bits can be otherwise cleared by fewer larger
|
||||
batches.
|
||||
0x0004 Clearing the accessed bit in non-leaf page table entries as
|
||||
well, when MMU sets it (e.g., on x86). This behavior was not
|
||||
verified on x86 varieties other than Intel and AMD. If it is
|
||||
disabled, the multi-gen LRU will suffer a negligible
|
||||
performance degradation.
|
||||
[yYnN] Apply to all the components above.
|
||||
====== ===============================================================
|
||||
|
||||
E.g.,
|
||||
::
|
||||
|
||||
echo y >/sys/kernel/mm/lru_gen/enabled
|
||||
cat /sys/kernel/mm/lru_gen/enabled
|
||||
0x0007
|
||||
echo 5 >/sys/kernel/mm/lru_gen/enabled
|
||||
cat /sys/kernel/mm/lru_gen/enabled
|
||||
0x0005
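The values in the table are bit flags, which is why writing ``5`` reads back as ``0x0005`` (the main switch plus non-leaf clearing, without leaf batching). A minimal C sketch of composing and testing such a mask is shown here; the macro and function names are hypothetical and are not kernel identifiers, and reading back ``0x0007`` after writing ``y`` assumes the hardware supports every component.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the bit values listed in the table. */
#define LRU_GEN_MAIN_SWITCH	0x0001u
#define LRU_GEN_LEAF_CLEAR	0x0002u
#define LRU_GEN_NONLEAF_CLEAR	0x0004u
#define LRU_GEN_ALL \
	(LRU_GEN_MAIN_SWITCH | LRU_GEN_LEAF_CLEAR | LRU_GEN_NONLEAF_CLEAR)

/* Returns true when every bit of 'component' is set in 'mask'. */
static bool lru_gen_component_enabled(unsigned int mask, unsigned int component)
{
	/* Only the three documented bits are meaningful here. */
	assert((component & ~LRU_GEN_ALL) == 0);
	return (mask & component) == component;
}
```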
|
||||
|
||||
Thrashing prevention
|
||||
--------------------
|
||||
Personal computers are more sensitive to thrashing because it can
|
||||
cause janks (lags when rendering UI) and negatively impact user
|
||||
experience. The multi-gen LRU offers thrashing prevention to the
|
||||
majority of laptop and desktop users who do not have ``oomd``.
|
||||
|
||||
Users can write ``N`` to ``min_ttl_ms`` to prevent the working set of
|
||||
``N`` milliseconds from getting evicted. The OOM killer is triggered
|
||||
if this working set cannot be kept in memory. In other words, this
|
||||
option works as an adjustable pressure relief valve, and when open, it
|
||||
terminates applications that are hopefully not being used.
|
||||
|
||||
Based on the average human detectable lag (~100ms), ``N=1000`` usually
|
||||
eliminates intolerable janks due to thrashing. Larger values like
|
||||
``N=3000`` make janks less noticeable at the risk of premature OOM
|
||||
kills.
|
||||
|
||||
The default value ``0`` means disabled.
|
||||
|
||||
Experimental features
|
||||
=====================
|
||||
``/sys/kernel/debug/lru_gen`` accepts commands described in the
|
||||
following subsections. Multiple command lines are supported, as is
|
||||
concatenation with delimiters ``,`` and ``;``.
|
||||
|
||||
``/sys/kernel/debug/lru_gen_full`` provides additional stats for
|
||||
debugging. ``CONFIG_LRU_GEN_STATS=y`` keeps historical stats from
|
||||
evicted generations in this file.
|
||||
|
||||
Working set estimation
|
||||
----------------------
|
||||
Working set estimation measures how much memory an application needs
|
||||
in a given time interval, and it is usually done with little impact on
|
||||
the performance of the application. E.g., data centers want to
|
||||
optimize job scheduling (bin packing) to improve memory utilization.
|
||||
When a new job comes in, the job scheduler needs to find out whether
|
||||
each server it manages can allocate a certain amount of memory for
|
||||
this new job before it can pick a candidate. To do so, the job
|
||||
scheduler needs to estimate the working sets of the existing jobs.
|
||||
|
||||
When it is read, ``lru_gen`` returns a histogram of numbers of pages
|
||||
accessed over different time intervals for each memcg and node.
|
||||
``MAX_NR_GENS`` decides the number of bins for each histogram. The
|
||||
histograms are noncumulative.
|
||||
::
|
||||
|
||||
memcg memcg_id memcg_path
|
||||
node node_id
|
||||
min_gen_nr age_in_ms nr_anon_pages nr_file_pages
|
||||
...
|
||||
max_gen_nr age_in_ms nr_anon_pages nr_file_pages
|
||||
|
||||
Each bin contains an estimated number of pages that have been accessed
|
||||
within ``age_in_ms``. E.g., ``min_gen_nr`` contains the coldest pages
|
||||
and ``max_gen_nr`` contains the hottest pages, since ``age_in_ms`` of
|
||||
the former is the largest and that of the latter is the smallest.
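A consumer of this file could parse each bin line along the following lines. This is a sketch under an assumption: it treats a bin line as four whitespace-separated unsigned decimal integers in the order shown in the layout above, which may not hold for every kernel version. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one histogram bin. */
struct lru_gen_bin {
	unsigned long gen_nr;
	unsigned long age_in_ms;
	unsigned long nr_anon_pages;
	unsigned long nr_file_pages;
};

/* Returns 0 on success, -1 if the line does not hold four integers. */
static int parse_lru_gen_bin(const char *line, struct lru_gen_bin *bin)
{
	int n;

	assert(line != NULL && bin != NULL);
	n = sscanf(line, "%lu %lu %lu %lu", &bin->gen_nr, &bin->age_in_ms,
		   &bin->nr_anon_pages, &bin->nr_file_pages);
	return n == 4 ? 0 : -1;
}
```

Lines that start with ``memcg`` or ``node`` fail the conversion and return -1, so a caller can use that to detect the start of a new histogram.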
|
||||
|
||||
Users can write the following command to ``lru_gen`` to create a new
|
||||
generation ``max_gen_nr+1``:
|
||||
|
||||
``+ memcg_id node_id max_gen_nr [can_swap [force_scan]]``
|
||||
|
||||
``can_swap`` defaults to the swap setting and, if it is set to ``1``,
|
||||
it forces the scan of anon pages when swap is off, and vice versa.
|
||||
``force_scan`` defaults to ``1`` and, if it is set to ``0``, it
|
||||
employs heuristics to reduce the overhead, which is likely to reduce
|
||||
the coverage as well.
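As an illustration of the command syntax above, the following C sketch formats an aging command into a buffer that could then be written to ``lru_gen``. The function name and the convention of passing a negative value to omit an optional field are assumptions made for this example only.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "+ memcg_id node_id max_gen_nr [can_swap [force_scan]]".
 * A negative can_swap omits both optional fields; a negative force_scan
 * omits only the last one. Returns the snprintf() result.
 */
static int lru_gen_format_aging(char *buf, size_t len,
				unsigned int memcg_id, unsigned int node_id,
				unsigned long max_gen_nr,
				int can_swap, int force_scan)
{
	assert(buf != NULL && len > 0);

	if (can_swap < 0)
		return snprintf(buf, len, "+ %u %u %lu",
				memcg_id, node_id, max_gen_nr);
	if (force_scan < 0)
		return snprintf(buf, len, "+ %u %u %lu %d",
				memcg_id, node_id, max_gen_nr, can_swap);
	return snprintf(buf, len, "+ %u %u %lu %d %d",
			memcg_id, node_id, max_gen_nr, can_swap, force_scan);
}
```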
|
||||
|
||||
A typical use case is that a job scheduler runs this command at a
|
||||
certain time interval to create new generations, and it ranks the
|
||||
servers it manages based on the sizes of their cold pages defined by
|
||||
this time interval.
|
||||
|
||||
Proactive reclaim
|
||||
-----------------
|
||||
Proactive reclaim induces page reclaim when there is no memory
|
||||
pressure. It usually targets cold pages only. E.g., when a new job
|
||||
comes in, the job scheduler wants to proactively reclaim cold pages on
|
||||
the server it selected, to improve the chance of successfully landing
|
||||
this new job.
|
||||
|
||||
Users can write the following command to ``lru_gen`` to evict
|
||||
generations less than or equal to ``min_gen_nr``.
|
||||
|
||||
``- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]``
|
||||
|
||||
``min_gen_nr`` should be less than ``max_gen_nr-1``, since
|
||||
``max_gen_nr`` and ``max_gen_nr-1`` are not fully aged (equivalent to
|
||||
the active list) and therefore cannot be evicted. ``swappiness``
|
||||
overrides the default value in ``/proc/sys/vm/swappiness``.
|
||||
``nr_to_reclaim`` limits the number of pages to evict.
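The constraint on ``min_gen_nr`` can be checked before the command is issued. The sketch below formats an eviction command and rejects generations that are not fully aged; the function name and the use of negative values to omit optional fields are assumptions for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]".
 * Returns -1 without touching buf when min_gen_nr is not below
 * max_gen_nr - 1, since the two youngest generations cannot be evicted.
 * A negative swappiness omits both optional fields; a negative
 * nr_to_reclaim omits only the last one.
 */
static int lru_gen_format_evict(char *buf, size_t len,
				unsigned int memcg_id, unsigned int node_id,
				unsigned long min_gen_nr,
				unsigned long max_gen_nr,
				int swappiness, long nr_to_reclaim)
{
	assert(buf != NULL && len > 0);

	if (max_gen_nr < 2 || min_gen_nr >= max_gen_nr - 1)
		return -1;
	if (swappiness < 0)
		return snprintf(buf, len, "- %u %u %lu",
				memcg_id, node_id, min_gen_nr);
	if (nr_to_reclaim < 0)
		return snprintf(buf, len, "- %u %u %lu %d",
				memcg_id, node_id, min_gen_nr, swappiness);
	return snprintf(buf, len, "- %u %u %lu %d %ld",
			memcg_id, node_id, min_gen_nr, swappiness,
			nr_to_reclaim);
}
```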
|
||||
|
||||
A typical use case is that a job scheduler runs this command before it
|
||||
tries to land a new job on a server. If it fails to materialize enough
|
||||
cold pages because of the overestimation, it retries on the next
|
||||
server according to the ranking result obtained from the working set
|
||||
estimation step. This less forceful approach limits the impacts on the
|
||||
existing jobs.
|
|
@ -191,7 +191,14 @@ allocation failure to throttle the next allocation attempt::
|
|||
|
||||
/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs
|
||||
|
||||
The khugepaged progress can be seen in the number of pages collapsed::
|
||||
The khugepaged progress can be seen in the number of pages collapsed (note
|
||||
that this counter may not be an exact count of the number of pages
|
||||
collapsed, since "collapsed" could mean multiple things: (1) A PTE mapping
|
||||
being replaced by a PMD mapping, or (2) All 4K physical pages replaced by
|
||||
one 2M hugepage. Each may happen independently, or together, depending on
|
||||
the type of memory and the failures that occur. As such, this value should
|
||||
be interpreted roughly as a sign of progress, and counters in /proc/vmstat
|
||||
consulted for more accurate accounting)::
|
||||
|
||||
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed
|
||||
|
||||
|
@ -366,10 +373,9 @@ thp_split_pmd
|
|||
page table entry.
|
||||
|
||||
thp_zero_page_alloc
|
||||
is incremented every time a huge zero page is
|
||||
successfully allocated. It includes allocations which where
|
||||
dropped due race with other allocation. Note, it doesn't count
|
||||
every map of the huge zero page, only its allocation.
|
||||
is incremented every time a huge zero page used for thp is
|
||||
successfully allocated. Note, it doesn't count every map of
|
||||
the huge zero page, only its allocation.
|
||||
|
||||
thp_zero_page_alloc_failed
|
||||
is incremented if kernel fails to allocate
|
||||
|
|
|
@ -17,7 +17,10 @@ of the ``PROT_NONE+SIGSEGV`` trick.
|
|||
Design
|
||||
======
|
||||
|
||||
Userfaults are delivered and resolved through the ``userfaultfd`` syscall.
|
||||
Userspace creates a new userfaultfd, initializes it, and registers one or more
|
||||
regions of virtual memory with it. Then, any page faults which occur within the
|
||||
region(s) result in a message being delivered to the userfaultfd, notifying
|
||||
userspace of the fault.
|
||||
|
||||
The ``userfaultfd`` (aside from registering and unregistering virtual
|
||||
memory ranges) provides two primary functionalities:
|
||||
|
@ -34,12 +37,11 @@ The real advantage of userfaults if compared to regular virtual memory
|
|||
management of mremap/mprotect is that the userfaults in all their
|
||||
operations never involve heavyweight structures like vmas (in fact the
|
||||
``userfaultfd`` runtime load never takes the mmap_lock for writing).
|
||||
|
||||
Vmas are not suitable for page- (or hugepage) granular fault tracking
|
||||
when dealing with virtual address spaces that could span
|
||||
Terabytes. Too many vmas would be needed for that.
|
||||
|
||||
The ``userfaultfd`` once opened by invoking the syscall, can also be
|
||||
The ``userfaultfd``, once created, can also be
|
||||
passed using unix domain sockets to a manager process, so the same
|
||||
manager process could handle the userfaults of a multitude of
|
||||
different processes without them being aware about what is going on
|
||||
|
@ -50,6 +52,39 @@ is a corner case that would currently return ``-EBUSY``).
|
|||
API
|
||||
===
|
||||
|
||||
Creating a userfaultfd
|
||||
----------------------
|
||||
|
||||
There are two ways to create a new userfaultfd, each of which provides ways to
|
||||
restrict access to this functionality (since historically userfaultfds which
|
||||
handle kernel page faults have been a useful tool for exploiting the kernel).
|
||||
|
||||
The first way, supported since userfaultfd was introduced, is the
|
||||
userfaultfd(2) syscall. Access to this is controlled in several ways:
|
||||
|
||||
- Any user can always create a userfaultfd which traps userspace page faults
|
||||
only. Such a userfaultfd can be created using the userfaultfd(2) syscall
|
||||
with the flag UFFD_USER_MODE_ONLY.
|
||||
|
||||
- In order to also trap kernel page faults for the address space, either the
|
||||
process needs the CAP_SYS_PTRACE capability, or the system must have
|
||||
vm.unprivileged_userfaultfd set to 1. By default, vm.unprivileged_userfaultfd
|
||||
is set to 0.
|
||||
|
||||
The second way, added to the kernel more recently, is by opening
|
||||
/dev/userfaultfd and issuing a USERFAULTFD_IOC_NEW ioctl to it. This method
|
||||
yields equivalent userfaultfds to the userfaultfd(2) syscall.
|
||||
|
||||
Unlike userfaultfd(2), access to /dev/userfaultfd is controlled via normal
|
||||
filesystem permissions (user/group/mode), which gives fine-grained access to
|
||||
userfaultfd specifically, without also granting other unrelated privileges at
|
||||
the same time (as e.g. granting CAP_SYS_PTRACE would do). Users who have access
|
||||
to /dev/userfaultfd can always create userfaultfds that trap kernel page faults;
|
||||
vm.unprivileged_userfaultfd is not considered.
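The two creation paths described above might look as follows from userspace. This is a minimal sketch rather than a reference implementation: the helper names are hypothetical, error handling is reduced to returning -1, and the ``#ifndef`` fallbacks are only there for older UAPI headers and assume the values used by current kernels.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

/* Fallbacks for older headers; values assumed to match the kernel UAPI. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif
#ifndef USERFAULTFD_IOC_NEW
#define USERFAULTFD_IOC_NEW _IO(0xAA, 0x00)
#endif

/*
 * First way: userfaultfd(2), restricted to userspace page faults so that
 * no capability or sysctl setting is required.
 */
static int uffd_create_syscall(void)
{
	return (int)syscall(SYS_userfaultfd,
			    O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
}

/*
 * Second way: open /dev/userfaultfd and request a new userfaultfd.
 * Access is governed by the permissions of the device node.
 */
static int uffd_create_dev(void)
{
	int dev = open("/dev/userfaultfd", O_RDWR | O_CLOEXEC);
	int uffd;

	if (dev < 0)
		return -1;
	uffd = ioctl(dev, USERFAULTFD_IOC_NEW, O_CLOEXEC | O_NONBLOCK);
	close(dev);	/* the device fd is no longer needed */
	return uffd;
}
```

Either helper returns a file descriptor that still has to be initialized with the ``UFFDIO_API`` ioctl before use.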
|
||||
|
||||
Initializing a userfaultfd
|
||||
--------------------------
|
||||
|
||||
When first opened the ``userfaultfd`` must be enabled invoking the
|
||||
``UFFDIO_API`` ioctl specifying a ``uffdio_api.api`` value set to ``UFFD_API`` (or
|
||||
a later API version) which will specify the ``read/POLLIN`` protocol
|
||||
|
|
|
@ -65,6 +65,11 @@ combining the following values:
|
|||
4 s3_beep
|
||||
= =======
|
||||
|
||||
arch
|
||||
====
|
||||
|
||||
The machine hardware name, the same output as ``uname -m``
|
||||
(e.g. ``x86_64`` or ``aarch64``).
|
||||
|
||||
auto_msgmni
|
||||
===========
|
||||
|
@ -635,6 +640,17 @@ different types of memory (represented as different NUMA nodes) to
|
|||
place the hot pages in the fast memory. This is implemented based on
|
||||
unmapping and page fault too.
|
||||
|
||||
numa_balancing_promote_rate_limit_MBps
|
||||
======================================
|
||||
|
||||
Too high promotion/demotion throughput between different memory types
|
||||
may hurt application latency. This can be used to rate limit the
|
||||
promotion throughput. The per-node max promotion throughput in MB/s
|
||||
will be limited to be no more than the set value.
|
||||
|
||||
A rule of thumb is to set this to less than 1/10 of the PMEM node
|
||||
write bandwidth.
|
||||
|
||||
oops_all_cpu_backtrace
|
||||
======================
|
||||
|
||||
|
|
|
@ -926,6 +926,9 @@ calls without any restrictions.
|
|||
|
||||
The default value is 0.
|
||||
|
||||
Another way to control permissions for userfaultfd is to use
|
||||
/dev/userfaultfd instead of userfaultfd(2). See
|
||||
Documentation/admin-guide/mm/userfaultfd.rst.
|
||||
|
||||
user_reserve_kbytes
|
||||
===================
|
||||
|
|
|
@ -59,6 +59,7 @@ SoC-specific documents
|
|||
stm32/stm32f429-overview
|
||||
stm32/stm32mp13-overview
|
||||
stm32/stm32mp157-overview
|
||||
stm32/stm32-dma-mdma-chaining
|
||||
|
||||
sunxi
|
||||
|
||||
|
|
|
@ -0,0 +1,415 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=======================
|
||||
STM32 DMA-MDMA chaining
|
||||
=======================
|
||||
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
This document describes the STM32 DMA-MDMA chaining feature. But before going
|
||||
further, let's introduce the peripherals involved.
|
||||
|
||||
To offload data transfers from the CPU, STM32 microprocessors (MPUs) embed
|
||||
direct memory access controllers (DMA).
|
||||
|
||||
STM32MP1 SoCs embed both STM32 DMA and STM32 MDMA controllers. STM32 DMA
|
||||
request routing capabilities are enhanced by a DMA request multiplexer
|
||||
(STM32 DMAMUX).
|
||||
|
||||
**STM32 DMAMUX**
|
||||
|
||||
STM32 DMAMUX routes any DMA request from a given peripheral to any STM32 DMA
|
||||
controller (STM32MP1 counts two STM32 DMA controllers) channels.
|
||||
|
||||
**STM32 DMA**
|
||||
|
||||
STM32 DMA is mainly used to implement central data buffer storage (usually in
|
||||
the system SRAM) for different peripherals. It can access external RAMs but
|
||||
without the ability to generate convenient burst transfer ensuring the best
|
||||
load of the AXI.
|
||||
|
||||
**STM32 MDMA**
|
||||
|
||||
STM32 MDMA (Master DMA) is mainly used to manage direct data transfers between
|
||||
RAM data buffers without CPU intervention. It can also be used in a
|
||||
hierarchical structure that uses STM32 DMA as first level data buffer
|
||||
interfaces for AHB peripherals, while the STM32 MDMA acts as a second level
|
||||
DMA with better performance. As an AXI/AHB master, STM32 MDMA can take control
|
||||
of the AXI/AHB bus.
|
||||
|
||||
|
||||
Principles
|
||||
----------
|
||||
|
||||
STM32 DMA-MDMA chaining feature relies on the strengths of STM32 DMA and
|
||||
STM32 MDMA controllers.
|
||||
|
||||
STM32 DMA has a circular Double Buffer Mode (DBM). At each end of transaction
|
||||
(when DMA data counter - DMA_SxNDTR - reaches 0), the memory pointers
|
||||
(configured with DMA_SxM0AR and DMA_SxM1AR) are swapped and the DMA data
|
||||
counter is automatically reloaded. This allows the SW or the STM32 MDMA to
|
||||
process one memory area while the second memory area is being filled/used by
|
||||
the STM32 DMA transfer.
|
||||
|
||||
With STM32 MDMA linked-list mode, a single request initiates the data array
|
||||
(collection of nodes) to be transferred until the linked-list pointer for the
|
||||
channel is null. The channel transfer complete of the last node is the end of
|
||||
transfer, unless first and last nodes are linked to each other, in such a
|
||||
case, the linked-list loops on to create a circular MDMA transfer.
|
||||
|
||||
STM32 MDMA has direct connections with STM32 DMA. This enables autonomous
|
||||
communication and synchronization between peripherals, thus saving CPU
|
||||
resources and bus congestion. Transfer Complete signal of STM32 DMA channel
|
||||
can trigger STM32 MDMA transfer. STM32 MDMA can clear the request generated
|
||||
by the STM32 DMA by writing to its Interrupt Clear register (whose address is
|
||||
stored in MDMA_CxMAR, and bit mask in MDMA_CxMDR).
|
||||
|
||||
.. table:: STM32 MDMA interconnect table with STM32 DMA
|
||||
|
||||
+--------------+----------------+-----------+------------+
|
||||
| STM32 DMAMUX | STM32 DMA | STM32 DMA | STM32 MDMA |
|
||||
| channels | channels | Transfer | request |
|
||||
| | | complete | |
|
||||
| | | signal | |
|
||||
+==============+================+===========+============+
|
||||
| Channel *0* | DMA1 channel 0 | dma1_tcf0 | *0x00* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *1* | DMA1 channel 1 | dma1_tcf1 | *0x01* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *2* | DMA1 channel 2 | dma1_tcf2 | *0x02* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *3* | DMA1 channel 3 | dma1_tcf3 | *0x03* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *4* | DMA1 channel 4 | dma1_tcf4 | *0x04* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *5* | DMA1 channel 5 | dma1_tcf5 | *0x05* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *6* | DMA1 channel 6 | dma1_tcf6 | *0x06* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *7* | DMA1 channel 7 | dma1_tcf7 | *0x07* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *8* | DMA2 channel 0 | dma2_tcf0 | *0x08* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *9* | DMA2 channel 1 | dma2_tcf1 | *0x09* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *10* | DMA2 channel 2 | dma2_tcf2 | *0x0A* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *11* | DMA2 channel 3 | dma2_tcf3 | *0x0B* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *12* | DMA2 channel 4 | dma2_tcf4 | *0x0C* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *13* | DMA2 channel 5 | dma2_tcf5 | *0x0D* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *14* | DMA2 channel 6 | dma2_tcf6 | *0x0E* |
|
||||
+--------------+----------------+-----------+------------+
|
||||
| Channel *15* | DMA2 channel 7 | dma2_tcf7 | *0x0F* |
|
||||
+--------------+----------------+-----------+------------+
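The table follows a regular pattern: DMAMUX channels 0 to 7 map to DMA1 channels 0 to 7, channels 8 to 15 map to DMA2 channels 0 to 7, and the MDMA request number equals the DMAMUX channel number. A small C sketch of that mapping is given here for illustration; the struct and function names are hypothetical and do not come from the STM32 drivers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one row of the interconnect table. */
struct stm32_chain_route {
	unsigned int dma_controller;	/* 1 for DMA1, 2 for DMA2 */
	unsigned int dma_channel;	/* 0..7 within that controller */
	unsigned int mdma_request;	/* 0x00..0x0F */
};

/* Returns 0 on success, -1 if the DMAMUX channel is not in the table. */
static int stm32_chain_route_from_dmamux(unsigned int dmamux_channel,
					 struct stm32_chain_route *route)
{
	assert(route != NULL);

	if (dmamux_channel > 15)
		return -1;
	route->dma_controller = dmamux_channel / 8 + 1;
	route->dma_channel = dmamux_channel % 8;
	route->mdma_request = dmamux_channel;
	return 0;
}
```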
|
||||
|
||||
STM32 DMA-MDMA chaining feature then uses a SRAM buffer. STM32MP1 SoCs embed
|
||||
three fast access static internal RAMs of various size, used for data storage.
|
||||
Due to STM32 DMA legacy (within microcontrollers), STM32 DMA performance is
poor with DDR, while it is optimal with SRAM. Hence the SRAM buffer used
|
||||
between STM32 DMA and STM32 MDMA. This buffer is split in two equal periods
|
||||
and STM32 DMA uses one period while STM32 MDMA uses the other period
|
||||
simultaneously.
|
||||
::
|
||||
|
||||
dma[1:2]-tcf[0:7]
|
||||
.----------------.
|
||||
____________ ' _________ V____________
|
||||
| STM32 DMA | / __|>_ \ | STM32 MDMA |
|
||||
|------------| | / \ | |------------|
|
||||
| DMA_SxM0AR |<=>| | SRAM | |<=>| []-[]...[] |
|
||||
| DMA_SxM1AR | | \_____/ | | |
|
||||
|____________| \___<|____/ |____________|
|
||||
|
||||
STM32 DMA-MDMA chaining uses (struct dma_slave_config).peripheral_config to
|
||||
exchange the parameters needed to configure MDMA. These parameters are
|
||||
gathered into a u32 array with three values:
|
||||
|
||||
* the STM32 MDMA request (which is actually the DMAMUX channel ID),
|
||||
* the address of the STM32 DMA register to clear the Transfer Complete
|
||||
interrupt flag,
|
||||
* the mask of the Transfer Complete interrupt flag of the STM32 DMA channel.
|
||||
|
||||
Device Tree updates for STM32 DMA-MDMA chaining support
|
||||
-------------------------------------------------------
|
||||
|
||||
**1. Allocate a SRAM buffer**
|
||||
|
||||
SRAM device tree node is defined in SoC device tree. You can refer to it in
|
||||
your board device tree to define your SRAM pool.
|
||||
::
|
||||
|
||||
&sram {
|
||||
my_foo_device_dma_pool: dma-sram@0 {
|
||||
reg = <0x0 0x1000>;
|
||||
};
|
||||
};
|
||||
|
||||
Be careful of the start index, in case there are other SRAM consumers.
|
||||
Define your pool size strategically: to optimise chaining, the idea is that
|
||||
STM32 DMA and STM32 MDMA can work simultaneously, on each buffer of the
|
||||
SRAM.
|
||||
If the SRAM period is greater than the expected DMA transfer, then STM32 DMA
|
||||
and STM32 MDMA will work sequentially instead of simultaneously. It is not a
|
||||
functional issue but it is not optimal.
|
||||
|
||||
Don't forget to refer to your SRAM pool in your device node. You need to
|
||||
define a new property.
|
||||
::
|
||||
|
||||
&my_foo_device {
|
||||
...
|
||||
my_dma_pool = &my_foo_device_dma_pool;
|
||||
};
|
||||
|
||||
Then get this SRAM pool in your foo driver and allocate your SRAM buffer.
|
||||
|
||||
**2. Allocate a STM32 DMA channel and a STM32 MDMA channel**
|
||||
|
||||
You need to define an extra channel in your device tree node, in addition to
|
||||
the one you should already have for "classic" DMA operation.
|
||||
|
||||
This new channel must be taken from STM32 MDMA channels, so, the phandle of
|
||||
the DMA controller to use is the MDMA controller's one.
|
||||
::
|
||||
|
||||
&my_foo_device {
|
||||
[...]
|
||||
my_dma_pool = &my_foo_device_dma_pool;
|
||||
dmas = <&dmamux1 ...>, // STM32 DMA channel
|
||||
<&mdma1 0 0x3 0x1200000a 0 0>; // + STM32 MDMA channel
|
||||
};
|
||||
|
||||
Concerning STM32 MDMA bindings:
|
||||
|
||||
1. The request line number : whatever the value here, it will be overwritten
|
||||
by MDMA driver with the STM32 DMAMUX channel ID passed through
|
||||
(struct dma_slave_config).peripheral_config
|
||||
|
||||
2. The priority level : choose Very High (0x3) so that your channel will
|
||||
take priority over the others during request arbitration
|
||||
|
||||
3. A 32bit mask specifying the DMA channel configuration : source and
|
||||
destination address increment, block transfer with 128 bytes per single
|
||||
transfer
|
||||
|
||||
4. The 32bit value specifying the register to be used to acknowledge the
|
||||
request: it will be overwritten by MDMA driver, with the DMA channel
|
||||
interrupt flag clear register address passed through
|
||||
(struct dma_slave_config).peripheral_config
|
||||
|
||||
5. The 32bit mask specifying the value to be written to acknowledge the
|
||||
request: it will be overwritten by MDMA driver, with the DMA channel
|
||||
Transfer Complete flag passed through
|
||||
(struct dma_slave_config).peripheral_config
|
||||
|
||||
Driver updates for STM32 DMA-MDMA chaining support in foo driver
|
||||
----------------------------------------------------------------
|
||||
|
||||
**0. (optional) Refactor the original sg_table if using dmaengine_prep_slave_sg()**
|
||||
|
||||
In case of dmaengine_prep_slave_sg(), the original sg_table can't be used as
|
||||
is. Two new sg_tables must be created from the original one. One for
|
||||
STM32 DMA transfer (where memory address targets now the SRAM buffer instead
|
||||
of DDR buffer) and one for STM32 MDMA transfer (where memory address targets
|
||||
the DDR buffer).
|
||||
|
||||
The new sg_list items must fit SRAM period length. Here is an example for
|
||||
DMA_DEV_TO_MEM:
|
||||
::
|
||||
|
||||
/*
|
||||
* Assuming sgl and nents, respectively the initial scatterlist and its
|
||||
* length.
|
||||
* Assuming sram_dma_buf and sram_period, respectively the memory
|
||||
* allocated from the pool for DMA usage, and the length of the period,
|
||||
* which is half of the sram_buf size.
|
||||
*/
|
||||
struct sg_table new_dma_sgt, new_mdma_sgt;
|
||||
struct scatterlist *s, *_sgl;
|
||||
dma_addr_t ddr_dma_buf;
|
||||
u32 new_nents = 0, len;
|
||||
int i, ret;
|
||||
|
||||
/* Count the number of entries needed */
|
||||
for_each_sg(sgl, s, nents, i)
|
||||
if (sg_dma_len(s) > sram_period)
|
||||
new_nents += DIV_ROUND_UP(sg_dma_len(s), sram_period);
|
||||
else
|
||||
new_nents++;
|
||||
|
||||
/* Create sg table for STM32 DMA channel */
|
||||
ret = sg_alloc_table(&new_dma_sgt, new_nents, GFP_ATOMIC);
|
||||
if (ret)
|
||||
dev_err(dev, "DMA sg table alloc failed\n");
|
||||
|
||||
for_each_sg(new_dma_sgt.sgl, s, new_dma_sgt.nents, i) {
|
||||
_sgl = sgl;
|
||||
sg_dma_len(s) = min(sg_dma_len(_sgl), sram_period);
|
||||
/* Targets the beginning = first half of the sram_buf */
|
||||
s->dma_address = sram_dma_buf;
|
||||
/*
|
||||
* Targets the second half of the sram_buf
|
||||
* for odd indexes of the item of the sg_list
|
||||
*/
|
||||
if (i & 1)
|
||||
s->dma_address += sram_period;
|
||||
}
|
||||
|
||||
/* Create sg table for STM32 MDMA channel */
|
||||
ret = sg_alloc_table(&new_mdma_sgt, new_nents, GFP_ATOMIC);
|
||||
if (ret)
|
||||
dev_err(dev, "MDMA sg_table alloc failed\n");
|
||||
|
||||
_sgl = sgl;
|
||||
len = sg_dma_len(sgl);
|
||||
ddr_dma_buf = sg_dma_address(sgl);
|
||||
for_each_sg(new_mdma_sgt.sgl, s, new_mdma_sgt.nents, i) {
|
||||
size_t bytes = min_t(size_t, len, sram_period);
|
||||
|
||||
sg_dma_len(s) = bytes;
|
||||
sg_dma_address(s) = ddr_dma_buf;
|
||||
len -= bytes;
|
||||
|
||||
if (!len && sg_next(_sgl)) {
|
||||
_sgl = sg_next(_sgl);
|
||||
len = sg_dma_len(_sgl);
|
||||
ddr_dma_buf = sg_dma_address(_sgl);
|
||||
} else {
|
||||
ddr_dma_buf += bytes;
|
||||
}
|
||||
}
|
||||
|
||||
Don't forget to release these new sg_tables after getting the descriptors
|
||||
with dmaengine_prep_slave_sg().
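The entry counting and the alternation between the two SRAM halves can be tried out in plain userspace C. The sketch below is a simplified model of the logic above, not driver code: segments are represented only by their lengths, all lengths are assumed to be non-zero, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of period-sized entries needed to cover every segment. */
static size_t count_chunks(const size_t *seg_len, size_t nents,
			   size_t sram_period)
{
	size_t i, total = 0;

	assert(sram_period > 0);
	for (i = 0; i < nents; i++)
		total += DIV_ROUND_UP(seg_len[i], sram_period);
	return total;
}

/*
 * Offset into the SRAM buffer used by entry i: even entries use the
 * first half, odd entries use the second half.
 */
static size_t chunk_sram_offset(size_t i, size_t sram_period)
{
	return (i & 1) ? sram_period : 0;
}
```

For three segments of 4096, 1000 and 2048 bytes and a 2048-byte period, this yields four entries that alternate between offset 0 and offset 2048.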
|
||||
|
||||
**1. Set controller specific parameters**
|
||||
|
||||
First, use dmaengine_slave_config() with a struct dma_slave_config to
|
||||
configure STM32 DMA channel. You just have to take care of DMA addresses,
|
||||
the memory address (depending on the transfer direction) must point to your
|
||||
SRAM buffer, and set (struct dma_slave_config).peripheral_size != 0.
|
||||
|
||||
STM32 DMA driver will check (struct dma_slave_config).peripheral_size to
|
||||
determine if chaining is being used or not. If it is used, then STM32 DMA
|
||||
driver fills (struct dma_slave_config).peripheral_config with an array of
|
||||
three u32: the first one containing STM32 DMAMUX channel ID, the second one
|
||||
the channel interrupt flag clear register address, and the third one the
|
||||
channel Transfer Complete flag mask.
|
||||
|
||||
Then, use dmaengine_slave_config() with another struct dma_slave_config to
|
||||
configure STM32 MDMA channel. Take care of DMA addresses, the device address
|
||||
(depending on the transfer direction) must point to your SRAM buffer, and
|
||||
the memory address must point to the buffer originally used for "classic"
|
||||
DMA operation. Use the previous (struct dma_slave_config).peripheral_size
|
||||
and .peripheral_config that have been updated by STM32 DMA driver, to set
|
||||
(struct dma_slave_config).peripheral_size and .peripheral_config of the
|
||||
struct dma_slave_config to configure STM32 MDMA channel.
|
||||
::
|
||||
|
||||
struct dma_slave_config dma_conf;
|
||||
struct dma_slave_config mdma_conf;
|
||||
|
||||
memset(&dma_conf, 0, sizeof(dma_conf));
|
||||
[...]
|
||||
dma_conf.direction = DMA_DEV_TO_MEM;
|
||||
dma_conf.dst_addr = sram_dma_buf; // SRAM buffer
|
||||
dma_conf.peripheral_size = 1; // peripheral_size != 0 => chaining
|
||||
|
||||
dmaengine_slave_config(dma_chan, &dma_conf);
|
||||
|
||||
memset(&mdma_conf, 0, sizeof(mdma_conf));
|
||||
mdma_conf.direction = DMA_DEV_TO_MEM;
|
||||
mdma_conf.src_addr = sram_dma_buf; // SRAM buffer
|
||||
mdma_conf.dst_addr = rx_dma_buf; // original memory buffer
|
||||
mdma_conf.peripheral_size = dma_conf.peripheral_size; // <- dma_conf
|
||||
mdma_conf.peripheral_config = dma_conf.peripheral_config; // <- dma_conf
|
||||
|
||||
dmaengine_slave_config(mdma_chan, &mdma_conf);
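The hand-over of the chaining parameters from the DMA configuration to the MDMA configuration can be modelled in userspace. This is a sketch under assumptions: ``struct chain_cfg`` only mirrors the two fields discussed above, the ``CHAIN_*`` index names are hypothetical, and the three demo words are placeholders rather than real register values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the two fields used for chaining. */
struct chain_cfg {
	void *peripheral_config;	/* array of three u32 when chaining */
	size_t peripheral_size;		/* != 0 requests chaining */
};

/* Indexes into the three-word array described above (names assumed). */
enum { CHAIN_DMAMUX_ID, CHAIN_IFCR_ADDR, CHAIN_TCF_MASK, CHAIN_WORDS };

/* Non-zero peripheral_size means the client asked for chaining. */
static int chaining_requested(const struct chain_cfg *cfg)
{
	return cfg->peripheral_size != 0;
}

/* Copy the values filled in for the DMA channel into the MDMA config. */
static void inherit_chaining(struct chain_cfg *mdma,
			     const struct chain_cfg *dma)
{
	assert(chaining_requested(dma));
	mdma->peripheral_size = dma->peripheral_size;
	mdma->peripheral_config = dma->peripheral_config;
}

/* Placeholder values standing in for what the DMA driver would fill in. */
static uint32_t demo_words[CHAIN_WORDS] = { 5, 0x48000008, 0x20 };
static struct chain_cfg demo_dma = { demo_words, sizeof(demo_words) };
static struct chain_cfg demo_mdma;
```

Both configurations end up pointing at the same three-word array, which matches the assignments in the snippet above.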
|
||||
|
||||
**2. Get a descriptor for STM32 DMA channel transaction**
|
||||
|
||||
In the same way you get your descriptor for your "classic" DMA operation,
|
||||
you just have to replace the original sg_list (in case of
|
||||
dmaengine_prep_slave_sg()) with the new sg_list using SRAM buffer, or to
|
||||
replace the original buffer address, length and period (in case of
|
||||
dmaengine_prep_dma_cyclic()) with the new SRAM buffer.
|
||||
|
||||
**3. Get a descriptor for STM32 MDMA channel transaction**
|
||||
|
||||
If you previously got the descriptor (for STM32 DMA) with
|
||||
|
||||
* dmaengine_prep_slave_sg(), then use dmaengine_prep_slave_sg() for
|
||||
STM32 MDMA;
|
||||
* dmaengine_prep_dma_cyclic(), then use dmaengine_prep_dma_cyclic() for
|
||||
STM32 MDMA.
|
||||
|
||||
Use the new sg_list using SRAM buffer (in case of dmaengine_prep_slave_sg())
|
||||
or, depending on the transfer direction, either the original DDR buffer (in
|
||||
case of DMA_DEV_TO_MEM) or the SRAM buffer (in case of DMA_MEM_TO_DEV), the
|
||||
source address being previously set with dmaengine_slave_config().
|
||||
|
||||
**4. Submit both transactions**
|
||||
|
||||
Before submitting your transactions, you may need to define on which
|
||||
descriptor you want a callback to be called at the end of the transfer
|
||||
(dmaengine_prep_slave_sg()) or the period (dmaengine_prep_dma_cyclic()).
|
||||
Depending on the direction, set the callback on the descriptor that finishes
|
||||
the overall transfer:
|
||||
|
||||
* DMA_DEV_TO_MEM: set the callback on the "MDMA" descriptor
|
||||
* DMA_MEM_TO_DEV: set the callback on the "DMA" descriptor
|
||||
|
||||
Then, submit the descriptors in any order, with dmaengine_submit().
|
||||
|
||||
**5. Issue pending requests (and wait for callback notification)**
|
||||
|
||||
As STM32 MDMA channel transfer is triggered by STM32 DMA, you must issue
|
||||
STM32 MDMA channel before STM32 DMA channel.
|
||||
|
||||
If any, your callback will be called to warn you about the end of the overall
|
||||
transfer or the period completion.
|
||||
|
||||
Don't forget to terminate both channels. STM32 DMA channel is configured in
|
||||
cyclic Double-Buffer mode so it won't be disabled by HW, you need to terminate
|
||||
it. STM32 MDMA channel will be stopped by HW in case of sg transfer, but not
|
||||
in case of cyclic transfer. You can terminate it whatever the kind of transfer.
|
||||
|
||||
**STM32 DMA-MDMA chaining DMA_MEM_TO_DEV special case**
|
||||
|
||||
STM32 DMA-MDMA chaining in DMA_MEM_TO_DEV is a special case. Indeed, the
|
||||
STM32 MDMA feeds the SRAM buffer with the DDR data, and the STM32 DMA reads
|
||||
data from SRAM buffer. So some data (the first period) have to be copied in
|
||||
SRAM buffer when the STM32 DMA starts to read.
|
||||
|
||||
A trick could be pausing the STM32 DMA channel (that will raise a Transfer
|
||||
Complete signal, triggering the STM32 MDMA channel), but the first data read
|
||||
by the STM32 DMA could be "wrong". The proper way is to prepare the first SRAM
|
||||
period with dmaengine_prep_dma_memcpy(). Then this first period should be
|
||||
"removed" from the sg or the cyclic transfer.
|
||||
|
||||
Due to this complexity, rather use the STM32 DMA-MDMA chaining for
|
||||
DMA_DEV_TO_MEM and keep the "classic" DMA usage for DMA_MEM_TO_DEV, unless
|
||||
you're not afraid.
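For the memcpy-based approach described above, the split between the pre-loaded first period and the rest of the transfer is simple arithmetic. The sketch below assumes a single contiguous DDR buffer; ``struct preload_plan`` and ``plan_preload()`` are hypothetical names, and no dmaengine call is made.

```c
#include <assert.h>
#include <stdint.h>

/* How a DMA_MEM_TO_DEV buffer is divided when the first SRAM period is
 * filled up front and then removed from the chained transfer. */
struct preload_plan {
	uint64_t copy_src;	/* DDR address copied up front */
	uint32_t copy_len;	/* bytes copied into the first SRAM period */
	uint64_t rest_src;	/* where the chained transfer then starts */
	uint32_t rest_len;	/* bytes left for the chained transfer */
};

static struct preload_plan plan_preload(uint64_t ddr_buf, uint32_t len,
					uint32_t period)
{
	struct preload_plan p;

	assert(period > 0);
	p.copy_src = ddr_buf;
	p.copy_len = len < period ? len : period;
	p.rest_src = ddr_buf + p.copy_len;
	p.rest_len = len - p.copy_len;
	return p;
}
```

A buffer shorter than one period leaves nothing for the chained transfer, so ``rest_len`` is 0 in that case.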
|
||||
|
||||
Resources
|
||||
---------
|
||||
|
||||
Application note, datasheet and reference manual are available on ST website
|
||||
(STM32MP1_).
|
||||
|
||||
Dedicated focus on three application notes (AN5224_, AN4031_ & AN5001_)
|
||||
dealing with STM32 DMAMUX, STM32 DMA and STM32 MDMA.
|
||||
|
||||
.. _STM32MP1: https://www.st.com/en/microcontrollers-microprocessors/stm32mp1-series.html
|
||||
.. _AN5224: https://www.st.com/resource/en/application_note/an5224-stm32-dmamux-the-dma-request-router-stmicroelectronics.pdf
|
||||
.. _AN4031: https://www.st.com/resource/en/application_note/dm00046011-using-the-stm32f2-stm32f4-and-stm32f7-series-dma-controller-stmicroelectronics.pdf
|
||||
.. _AN5001: https://www.st.com/resource/en/application_note/an5001-stm32cube-expansion-package-for-stm32h7-series-mdma-stmicroelectronics.pdf
|
||||
|
||||
:Authors:
|
||||
|
||||
- Amelie Delaunay <amelie.delaunay@foss.st.com>
|
|
@ -65,10 +65,6 @@ linux,uefi-mmap-desc-size 32-bit Size in bytes of each entry in the UEFI
|
|||
|
||||
linux,uefi-mmap-desc-ver 32-bit Version of the mmap descriptor format.
|
||||
|
||||
linux,initrd-start 64-bit Physical start address of an initrd
|
||||
|
||||
linux,initrd-end 64-bit Physical end address of an initrd
|
||||
|
||||
kaslr-seed 64-bit Entropy used to randomize the kernel image
|
||||
base address location.
|
||||
========================== ====== ===========================================
|
||||
|
|
|
@ -76,6 +76,8 @@ stable kernels.
|
|||
+----------------+-----------------+-----------------+-----------------------------+
|
||||
| ARM | Cortex-A55 | #1530923 | ARM64_ERRATUM_1530923 |
|
||||
+----------------+-----------------+-----------------+-----------------------------+
|
||||
| ARM | Cortex-A55 | #2441007 | ARM64_ERRATUM_2441007 |
|
||||
+----------------+-----------------+-----------------+-----------------------------+
|
||||
| ARM | Cortex-A57 | #832075 | ARM64_ERRATUM_832075 |
|
||||
+----------------+-----------------+-----------------+-----------------------------+
|
||||
| ARM | Cortex-A57 | #852523 | N/A |
|
||||
|
|
|
@ -144,6 +144,42 @@ managing and controlling ublk devices with help of several control commands:
|
|||
For retrieving device info via ``ublksrv_ctrl_dev_info``. It is the server's
|
||||
responsibility to save IO target specific info in userspace.
|
||||
|
||||
- ``UBLK_CMD_START_USER_RECOVERY``
|
||||
|
||||
This command is valid if ``UBLK_F_USER_RECOVERY`` feature is enabled. This
|
||||
command is accepted after the old process has exited, ublk device is quiesced
|
||||
and ``/dev/ublkc*`` is released. User should send this command before he starts
|
||||
a new process which re-opens ``/dev/ublkc*``. When this command returns, the
|
||||
ublk device is ready for the new process.
|
||||
|
||||
- ``UBLK_CMD_END_USER_RECOVERY``
|
||||
|
||||
This command is valid if ``UBLK_F_USER_RECOVERY`` feature is enabled. This
|
||||
command is accepted after ublk device is quiesced and a new process has
|
||||
opened ``/dev/ublkc*`` and gotten all ublk queues ready. When this command
|
||||
returns, ublk device is unquiesced and new I/O requests are passed to the
|
||||
new process.
|
||||
|
||||
- user recovery feature description
|
||||
|
||||
Two new features are added for user recovery: ``UBLK_F_USER_RECOVERY`` and
|
||||
``UBLK_F_USER_RECOVERY_REISSUE``.
|
||||
|
||||
With ``UBLK_F_USER_RECOVERY`` set, after one ubq_daemon(ublk server's io
|
||||
handler) is dying, ublk does not delete ``/dev/ublkb*`` during the whole
|
||||
recovery stage and ublk device ID is kept. It is ublk server's
|
||||
responsibility to recover the device context by its own knowledge.
|
||||
Requests which have not been issued to userspace are requeued. Requests
|
||||
which have been issued to userspace are aborted.
|
||||
|
||||
With ``UBLK_F_USER_RECOVERY_REISSUE`` set, after one ubq_daemon(ublk
|
||||
server's io handler) is dying, contrary to ``UBLK_F_USER_RECOVERY``,
|
||||
requests which have been issued to userspace are requeued and will be
|
||||
re-issued to the new process after handling ``UBLK_CMD_END_USER_RECOVERY``.
|
||||
``UBLK_F_USER_RECOVERY_REISSUE`` is designed for backends that tolerate
|
||||
double-write since the driver may issue the same I/O request twice. It
|
||||
might be useful to a read-only FS or a VM backend.
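The per-request outcome described above can be written as a small decision function. This is a userspace sketch, not driver code: the ``DEMO_F_*`` constants are placeholders that do not reproduce the real flag values, and requeueing un-issued requests in the reissue case is an assumption, since the text only states it for ``UBLK_F_USER_RECOVERY``.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits; the real flag values live in the ublk UAPI header. */
#define DEMO_F_USER_RECOVERY		(1u << 0)
#define DEMO_F_USER_RECOVERY_REISSUE	(1u << 1)

enum req_action { REQ_REQUEUE, REQ_ABORT };

/*
 * What happens to one request when the io handler dies.
 * issued: the request had already been passed to userspace.
 */
static enum req_action recovery_action(unsigned int flags, bool issued)
{
	/* The model only covers devices with user recovery enabled. */
	assert(flags & DEMO_F_USER_RECOVERY);

	if (!issued)
		return REQ_REQUEUE;	/* never reached userspace */
	if (flags & DEMO_F_USER_RECOVERY_REISSUE)
		return REQ_REQUEUE;	/* will be issued a second time */
	return REQ_ABORT;
}
```

The only difference between the two modes is the fate of requests that had already been issued, which is why the reissue mode suits backends that tolerate a double write.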
|
||||
|
||||
Data plane
|
||||
----------
|
||||
|
||||
|
|
|
@ -37,6 +37,7 @@ Library functionality that is used throughout the kernel.
|
|||
kref
|
||||
assoc_array
|
||||
xarray
|
||||
maple_tree
|
||||
idr
|
||||
circular-buffers
|
||||
rbtree
|
||||
|
|
|
@ -0,0 +1,217 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0+
|
||||
|
||||
|
||||
==========
|
||||
Maple Tree
|
||||
==========
|
||||
|
||||
:Author: Liam R. Howlett
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
The Maple Tree is a B-Tree data type which is optimized for storing
|
||||
non-overlapping ranges, including ranges of size 1. The tree was designed to
|
||||
be simple to use and does not require a user written search method. It
|
||||
supports iterating over a range of entries and going to the previous or next
|
||||
entry in a cache-efficient manner. The tree can also be put into an RCU-safe
|
||||
mode of operation which allows reading and writing concurrently. Writers must
|
||||
synchronize on a lock, which can be the default spinlock, or the user can set
|
||||
the lock to an external lock of a different type.
|
||||
|
||||
The Maple Tree maintains a small memory footprint and was designed to use
|
||||
modern processor cache efficiently. The majority of the users will be able to
|
||||
use the normal API. An :ref:`maple-tree-advanced-api` exists for more complex
|
||||
scenarios. The most important usage of the Maple Tree is the tracking of the
|
||||
virtual memory areas.
|
||||
|
||||
The Maple Tree can store values between ``0`` and ``ULONG_MAX``. The Maple
|
||||
Tree reserves values with the bottom two bits set to '10' which are below 4096
|
||||
(i.e. 2, 6, 10 .. 4094) for internal use. If the entries may use reserved
|
||||
entries then the users can convert the entries using xa_mk_value() and convert
|
||||
them back by calling xa_to_value(). If the user needs to use a reserved
|
||||
value, then the user can convert the value when using the
|
||||
:ref:`maple-tree-advanced-api`, but such values are blocked by the normal API.
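The reserved pattern and the value conversion can be illustrated on plain integers. This is a userspace sketch: the ``demo_*`` helpers are hypothetical names, and the shift-and-tag encoding mirrors what xa_mk_value() and xa_to_value() are understood to do, applied here to ``unsigned long`` instead of pointers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True for the values the text describes as reserved: below 4096 with the
 * bottom two bits equal to binary 10 (2, 6, 10 .. 4094). */
static bool demo_is_reserved(unsigned long v)
{
	return v < 4096 && (v & 3UL) == 2UL;
}

/* Shift left by one and set bit 0, so the result is always odd and can
 * never match the reserved pattern. */
static unsigned long demo_mk_value(unsigned long v)
{
	assert(v <= (ULONG_MAX >> 1));	/* the top bit is lost by the shift */
	return (v << 1) | 1UL;
}

/* Undo demo_mk_value(). */
static unsigned long demo_to_value(unsigned long tagged)
{
	return tagged >> 1;
}
```

Because a converted value always has bit 0 set, its bottom two bits are either ``01`` or ``11``, never ``10``.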
|
||||
|
||||
The Maple Tree can also be configured to support searching for a gap of a given
|
||||
size (or larger).
|
||||
|
||||
Pre-allocation of nodes is also supported using the
|
||||
:ref:`maple-tree-advanced-api`. This is useful for users who must guarantee a
|
||||
successful store operation within a given
|
||||
code segment when allocating cannot be done. Allocations of nodes are
|
||||
relatively small at around 256 bytes.
|
||||
|
||||
.. _maple-tree-normal-api:
|
||||
|
||||
Normal API
|
||||
==========
|
||||
|
||||
Start by initialising a maple tree, either with DEFINE_MTREE() for statically
|
||||
allocated maple trees or mt_init() for dynamically allocated ones. A
|
||||
freshly-initialised maple tree contains a ``NULL`` pointer for the range ``0``
|
||||
- ``ULONG_MAX``. There are currently two types of maple trees supported: the
|
||||
allocation tree and the regular tree. The regular tree has a higher branching
|
||||
factor for internal nodes. The allocation tree has a lower branching factor
|
||||
but allows the user to search for a gap of a given size or larger from either
|
||||
``0`` upwards or ``ULONG_MAX`` down. An allocation tree can be used by
|
||||
passing in the ``MT_FLAGS_ALLOC_RANGE`` flag when initialising the tree.
|
||||
|
||||
You can then set entries using mtree_store() or mtree_store_range().
|
||||
mtree_store() will overwrite any entry with the new entry and return 0 on
|
||||
success or an error code otherwise. mtree_store_range() works in the same way
|
||||
but takes a range. mtree_load() is used to retrieve the entry stored at a
|
||||
given index. You can use mtree_erase() to erase an entire range by only
|
||||
knowing one value within that range, or mtree_store() call with an entry of
|
||||
NULL may be used to partially erase a range or many ranges at once.
|
||||
|
||||
If you want to only store a new entry to a range (or index) if that range is
|
||||
currently ``NULL``, you can use mtree_insert_range() or mtree_insert() which
|
||||
return -EEXIST if the range is not empty.
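The store, load, erase and insert semantics described above can be tried out with a toy model. This is not the maple tree implementation: the ``toy_*`` functions are hypothetical, the model is a flat array limited to indices 0..63, and it only imitates the observable behaviour of mtree_store_range(), mtree_load(), mtree_erase() and mtree_insert_range().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX 64UL	/* the toy only covers indices 0..63 */

static void *toy_entry[TOY_MAX];	/* NULL means "nothing stored" */
static unsigned long toy_id[TOY_MAX];	/* which store created the slot */
static unsigned long toy_next_id = 1;

/* Overwrite [first, last] with entry, whatever was there before. */
static int toy_store_range(unsigned long first, unsigned long last,
			   void *entry)
{
	if (first > last || last >= TOY_MAX)
		return -EINVAL;
	for (unsigned long i = first; i <= last; i++) {
		toy_entry[i] = entry;
		toy_id[i] = entry ? toy_next_id : 0;
	}
	toy_next_id++;
	return 0;
}

/* Store only if the whole range is currently empty. */
static int toy_insert_range(unsigned long first, unsigned long last,
			    void *entry)
{
	if (first > last || last >= TOY_MAX)
		return -EINVAL;
	for (unsigned long i = first; i <= last; i++)
		if (toy_entry[i])
			return -EEXIST;
	return toy_store_range(first, last, entry);
}

/* Return the entry covering index, or NULL. */
static void *toy_load(unsigned long index)
{
	return index < TOY_MAX ? toy_entry[index] : NULL;
}

/* Erase the whole range that contains index and return its old entry. */
static void *toy_erase(unsigned long index)
{
	void *old = toy_load(index);
	unsigned long lo = index, hi = index;
	int ret;

	if (!old)
		return NULL;
	while (lo > 0 && toy_id[lo - 1] == toy_id[index])
		lo--;
	while (hi + 1 < TOY_MAX && toy_id[hi + 1] == toy_id[index])
		hi++;
	ret = toy_store_range(lo, hi, NULL);
	assert(ret == 0);
	(void)ret;
	return old;
}
```

Erasing with any single index inside a stored range clears the whole range, while an insert over a partially occupied range fails with ``-EEXIST`` and changes nothing.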
|
||||
|
||||
You can search for an entry from an index upwards by using mt_find().
|
||||
|
||||
You can walk each entry within a range by calling mt_for_each(). You must
|
||||
provide a temporary variable to store a cursor. If you want to walk each
|
||||
element of the tree then ``0`` and ``ULONG_MAX`` may be used as the range. If
|
||||
the caller is going to hold the lock for the duration of the walk then it is
|
||||
worth looking at the mas_for_each() API in the :ref:`maple-tree-advanced-api`
|
||||
section.
|
||||
|
||||
Sometimes it is necessary to ensure the next call to store to a maple tree does
|
||||
not allocate memory. Please see :ref:`maple-tree-advanced-api` for this use case.
|
||||
|
||||
Finally, you can remove all entries from a maple tree by calling
|
||||
mtree_destroy(). If the maple tree entries are pointers, you may wish to free
|
||||
the entries first.
|
||||
|
||||
Allocating Nodes
|
||||
----------------
|
||||
|
||||
The allocations are handled by the internal tree code. See
|
||||
:ref:`maple-tree-advanced-alloc` for other options.
|
||||
|
||||
Locking
|
||||
-------
|
||||
|
||||
You do not have to worry about locking. See :ref:`maple-tree-advanced-locks`
|
||||
for other options.
|
||||
|
||||
The Maple Tree uses RCU and an internal spinlock to synchronise access:
|
||||
|
||||
Takes RCU read lock:
|
||||
* mtree_load()
|
||||
* mt_find()
|
||||
* mt_for_each()
|
||||
* mt_next()
|
||||
* mt_prev()
|
||||
|
||||
Takes ma_lock internally:
|
||||
* mtree_store()
|
||||
* mtree_store_range()
|
||||
* mtree_insert()
|
||||
* mtree_insert_range()
|
||||
* mtree_erase()
|
||||
* mtree_destroy()
|
||||
* mt_set_in_rcu()
|
||||
* mt_clear_in_rcu()
|
||||
|
||||
If you want to take advantage of the internal lock to protect the data
|
||||
structures that you are storing in the Maple Tree, you can call mtree_lock()
|
||||
before calling mtree_load(), then take a reference count on the object you
|
||||
have found before calling mtree_unlock(). This will prevent stores from
|
||||
removing the object from the tree between looking up the object and
|
||||
incrementing the refcount. You can also use RCU to avoid dereferencing
|
||||
freed memory, but an explanation of that is beyond the scope of this
|
||||
document.
|
||||
|
||||
.. _maple-tree-advanced-api:
|
||||
|
||||
Advanced API
|
||||
============
|
||||
|
||||
The advanced API offers more flexibility and better performance at the
|
||||
cost of an interface which can be harder to use and has fewer safeguards.
|
||||
You must take care of your own locking while using the advanced API.
|
||||
You can use the ma_lock, RCU or an external lock for protection.
|
||||
You can mix advanced and normal operations on the same tree, as long
|
||||
as the locking is compatible. The :ref:`maple-tree-normal-api` is implemented
|
||||
in terms of the advanced API.
|
||||
|
||||
The advanced API is based around the ma_state; this is where the 'mas'
|
||||
prefix originates. The ma_state struct keeps track of tree operations to make
|
||||
life easier for both internal and external tree users.
|
||||
|
||||
Initialising the maple tree is the same as in the :ref:`maple-tree-normal-api`.
|
||||
Please see above.
|
||||
|
||||
The maple state keeps track of the range start and end in mas->index and
|
||||
mas->last, respectively.
|
||||
|
||||
mas_walk() will walk the tree to the location of mas->index and set the
|
||||
mas->index and mas->last according to the range for the entry.
|
||||
|
||||
You can set entries using mas_store(). mas_store() will overwrite any entry
|
||||
with the new entry and return the first existing entry that is overwritten.
|
||||
The range is passed in as members of the maple state: index and last.
|
||||
|
||||
You can use mas_erase() to erase an entire range by setting index and
|
||||
last of the maple state to the desired range to erase. This will erase
|
||||
the first range that is found in that range, set the maple state index
|
||||
and last as the range that was erased and return the entry that existed
|
||||
at that location.
|
||||
|
||||
You can walk each entry within a range by using mas_for_each(). If you want
|
||||
to walk each element of the tree then ``0`` and ``ULONG_MAX`` may be used as
|
||||
the range. If the lock needs to be periodically dropped, see the locking
|
||||
section and mas_pause().
|
||||
|
||||
Using a maple state allows mas_next() and mas_prev() to function as if the
|
||||
tree was a linked list. With such a high branching factor the amortized
|
||||
performance penalty is outweighed by cache optimization. mas_next() will
|
||||
return the next entry which occurs after the entry at index. mas_prev()
|
||||
will return the previous entry which occurs before the entry at index.
|
||||
|
||||
mas_find() will find the first entry which exists at or above index on
|
||||
the first call, and the next entry on every subsequent call.
|
||||
|
||||
mas_find_rev() will find the first entry which exists at or below the last on
|
||||
the first call, and the previous entry on every subsequent call.
|
||||
|
||||
If the user needs to yield the lock during an operation, then the maple state
|
||||
must be paused using mas_pause().
|
||||
|
||||
There are a few extra interfaces provided when using an allocation tree.
|
||||
If you wish to search for a gap within a range, then mas_empty_area()
|
||||
or mas_empty_area_rev() can be used. mas_empty_area() searches for a gap
|
||||
starting at the lowest index given up to the maximum of the range.
|
||||
mas_empty_area_rev() searches for a gap starting at the highest index given
|
||||
and continues downward to the lower bound of the range.
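The difference between the two search directions can be shown on a flat occupancy map. This is a userspace sketch only: ``gap_lowest()``, ``gap_highest()`` and ``demo_used`` are hypothetical, and unlike the real interfaces, which work on a maple state, these helpers simply return the start of the gap or -1 when none fits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lowest start of a free run of `size` slots inside [min, max], or -1. */
static long gap_lowest(const bool *used, size_t min, size_t max, size_t size)
{
	size_t run = 0;

	assert(size > 0 && min <= max);
	for (size_t i = min; i <= max; i++) {
		run = used[i] ? 0 : run + 1;
		if (run == size)
			return (long)(i + 1 - size);
	}
	return -1;
}

/* Highest start of a free run of `size` slots inside [min, max], or -1. */
static long gap_highest(const bool *used, size_t min, size_t max, size_t size)
{
	size_t run = 0;

	assert(size > 0 && min <= max);
	/* Walk from max down to min, counting free slots seen so far. */
	for (size_t i = max + 1; i-- > min; ) {
		run = used[i] ? 0 : run + 1;
		if (run == size)
			return (long)i;
	}
	return -1;
}

/* Demo map: slots 0-3 and 8-11 are occupied, 4-7 and 12-15 are free. */
static const bool demo_used[16] = {
	1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
};
```

On the demo map, a forward search for four free slots lands on the gap at 4, while the reverse search lands on the gap at 12.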
|
||||
|
||||
.. _maple-tree-advanced-alloc:
|
||||
|
||||
Advanced Allocating Nodes
|
||||
-------------------------
|
||||
|
||||
Allocations are usually handled internally to the tree, however if allocations
|
||||
need to occur before a write occurs then calling mas_expected_entries() will
|
||||
allocate the worst-case number of needed nodes to insert the provided number of
|
||||
ranges. This also causes the tree to enter mass insertion mode. Once
|
||||
insertions are complete calling mas_destroy() on the maple state will free the
|
||||
unused allocations.
|
||||
|
||||
.. _maple-tree-advanced-locks:
|
||||
|
||||
Advanced Locking
|
||||
----------------
|
||||
|
||||
The maple tree uses a spinlock by default, but external locks can be used for
|
||||
tree updates as well. To use an external lock, the tree must be initialized
|
||||
with the ``MT_FLAGS_LOCK_EXTERN`` flag, this is usually done with the
|
||||
MTREE_INIT_EXT() #define, which takes an external lock as an argument.
|
||||
|
||||
Functions and structures
|
||||
========================
|
||||
|
||||
.. kernel-doc:: include/linux/maple_tree.h
|
||||
.. kernel-doc:: lib/maple_tree.c
|
|
@ -19,9 +19,6 @@ User Space Memory Access
|
|||
Memory Allocation Controls
|
||||
==========================
|
||||
|
||||
.. kernel-doc:: include/linux/gfp.h
|
||||
:internal:
|
||||
|
||||
.. kernel-doc:: include/linux/gfp_types.h
|
||||
:doc: Page mobility and placement hints
|
||||
|
||||
|
|
|
@ -612,6 +612,13 @@ Commit message
|
|||
|
||||
See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
|
||||
|
||||
**BAD_FIXES_TAG**
|
||||
The Fixes: tag is malformed or does not follow the community conventions.
|
||||
This can occur if the tag has been split into multiple lines (e.g., when
|
||||
pasted in an email program with word wrapping enabled).
|
||||
|
||||
See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
|
||||
|
||||
|
||||
Comparison style
|
||||
----------------
|
||||
|
|
|
@ -24,6 +24,7 @@ Documentation/dev-tools/testing-overview.rst
|
|||
kcov
|
||||
gcov
|
||||
kasan
|
||||
kmsan
|
||||
ubsan
|
||||
kmemleak
|
||||
kcsan
|
||||
|
|
|
@ -111,9 +111,17 @@ parameter can be used to control panic and reporting behaviour:
|
|||
report or also panic the kernel (default: ``report``). The panic happens even
|
||||
if ``kasan_multi_shot`` is enabled.
|
||||
|
||||
Hardware Tag-Based KASAN mode (see the section about various modes below) is
|
||||
intended for use in production as a security mitigation. Therefore, it supports
|
||||
additional boot parameters that allow disabling KASAN or controlling features:
|
||||
Software and Hardware Tag-Based KASAN modes (see the section about various
|
||||
modes below) support altering stack trace collection behavior:
|
||||
|
||||
- ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack
|
||||
traces collection (default: ``on``).
|
||||
- ``kasan.stack_ring_size=<number of entries>`` specifies the number of entries
|
||||
in the stack ring (default: ``32768``).
|
||||
|
||||
Hardware Tag-Based KASAN mode is intended for use in production as a security
|
||||
mitigation. Therefore, it supports additional boot parameters that allow
|
||||
disabling KASAN altogether or controlling its features:
|
||||
|
||||
- ``kasan=off`` or ``=on`` controls whether KASAN is enabled (default: ``on``).
|
||||
|
||||
|
@ -132,9 +140,6 @@ additional boot parameters that allow disabling KASAN or controlling features:
|
|||
- ``kasan.vmalloc=off`` or ``=on`` disables or enables tagging of vmalloc
|
||||
allocations (default: ``on``).
|
||||
|
||||
- ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack
|
||||
traces collection (default: ``on``).
|
||||
|
||||
Error reports
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
|
|
|
@ -0,0 +1,427 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. Copyright (C) 2022, Google LLC.
|
||||
|
||||
===================================
|
||||
The Kernel Memory Sanitizer (KMSAN)
|
||||
===================================
|
||||
|
||||
KMSAN is a dynamic error detector aimed at finding uses of uninitialized
|
||||
values. It is based on compiler instrumentation, and is quite similar to the
|
||||
userspace `MemorySanitizer tool`_.
|
||||
|
||||
An important note is that KMSAN is not intended for production use, because it
|
||||
drastically increases kernel memory footprint and slows the whole system down.
|
||||
|
||||
Usage
|
||||
=====
|
||||
|
||||
Building the kernel
|
||||
-------------------
|
||||
|
||||
In order to build a kernel with KMSAN you will need a fresh Clang (14.0.6+).
|
||||
Please refer to `LLVM documentation`_ for the instructions on how to build Clang.
|
||||
|
||||
Now configure and build the kernel with CONFIG_KMSAN enabled.
|
||||
|
||||
Example report
|
||||
--------------
|
||||
|
||||
Here is an example of a KMSAN report::
|
||||
|
||||
=====================================================
|
||||
BUG: KMSAN: uninit-value in test_uninit_kmsan_check_memory+0x1be/0x380 [kmsan_test]
|
||||
test_uninit_kmsan_check_memory+0x1be/0x380 mm/kmsan/kmsan_test.c:273
|
||||
kunit_run_case_internal lib/kunit/test.c:333
|
||||
kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
|
||||
kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
|
||||
kthread+0x721/0x850 kernel/kthread.c:327
|
||||
ret_from_fork+0x1f/0x30 ??:?
|
||||
|
||||
Uninit was stored to memory at:
|
||||
do_uninit_local_array+0xfa/0x110 mm/kmsan/kmsan_test.c:260
|
||||
test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271
|
||||
kunit_run_case_internal lib/kunit/test.c:333
|
||||
kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
|
||||
kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
|
||||
kthread+0x721/0x850 kernel/kthread.c:327
|
||||
ret_from_fork+0x1f/0x30 ??:?
|
||||
|
||||
Local variable uninit created at:
|
||||
do_uninit_local_array+0x4a/0x110 mm/kmsan/kmsan_test.c:256
|
||||
test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271
|
||||
|
||||
Bytes 4-7 of 8 are uninitialized
|
||||
Memory access of size 8 starts at ffff888083fe3da0
|
||||
|
||||
CPU: 0 PID: 6731 Comm: kunit_try_catch Tainted: G B E 5.16.0-rc3+ #104
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
|
||||
=====================================================
|
||||
|
||||
The report says that the local variable ``uninit`` was created uninitialized in
|
||||
``do_uninit_local_array()``. The third stack trace corresponds to the place
|
||||
where this variable was created.
|
||||
|
||||
The first stack trace shows where the uninit value was used (in
|
||||
``test_uninit_kmsan_check_memory()``). The tool shows the bytes which were left
|
||||
uninitialized in the local variable, as well as the stack where the value was
|
||||
copied to another memory location before use.
|
||||
|
||||
A use of uninitialized value ``v`` is reported by KMSAN in the following cases:
|
||||
- in a condition, e.g. ``if (v) { ... }``;
|
||||
- in an indexing or pointer dereferencing, e.g. ``array[v]`` or ``*v``;
|
||||
- when it is copied to userspace or hardware, e.g. ``copy_to_user(..., &v, ...)``;
|
||||
- when it is passed as an argument to a function, and
|
||||
``CONFIG_KMSAN_CHECK_PARAM_RETVAL`` is enabled (see below).
|
||||
|
||||
The mentioned cases (apart from copying data to userspace or hardware, which is
|
||||
a security issue) are considered undefined behavior from the C11 Standard point
|
||||
of view.
|
||||
|
||||
Disabling the instrumentation
|
||||
-----------------------------
|
||||
|
||||
A function can be marked with ``__no_kmsan_checks``. Doing so makes KMSAN
|
||||
ignore uninitialized values in that function and mark its output as initialized.
|
||||
As a result, the user will not get KMSAN reports related to that function.
|
||||
|
||||
Another function attribute supported by KMSAN is ``__no_sanitize_memory``.
|
||||
Applying this attribute to a function will result in KMSAN not instrumenting
|
||||
it, which can be helpful if we do not want the compiler to interfere with some
|
||||
low-level code (e.g. that marked with ``noinstr`` which implicitly adds
|
||||
``__no_sanitize_memory``).
|
||||
|
||||
This however comes at a cost: stack allocations from such functions will have
|
||||
incorrect shadow/origin values, likely leading to false positives. Functions
|
||||
called from non-instrumented code may also receive incorrect metadata for their
|
||||
parameters.
|
||||
|
||||
As a rule of thumb, avoid using ``__no_sanitize_memory`` explicitly.
|
||||
|
||||
It is also possible to disable KMSAN for a single file (e.g. main.o)::
|
||||
|
||||
KMSAN_SANITIZE_main.o := n
|
||||
|
||||
or for the whole directory::
|
||||
|
||||
KMSAN_SANITIZE := n
|
||||
|
||||
in the Makefile. Think of this as applying ``__no_sanitize_memory`` to every
|
||||
function in the file or directory. Most users won't need KMSAN_SANITIZE, unless
|
||||
their code gets broken by KMSAN (e.g. runs at early boot time).
|
||||
|
||||
Support
|
||||
=======
|
||||
|
||||
In order for KMSAN to work the kernel must be built with Clang, which so far is
|
||||
the only compiler that has KMSAN support. The kernel instrumentation pass is
|
||||
based on the userspace `MemorySanitizer tool`_.
|
||||
|
||||
The runtime library only supports x86_64 at the moment.
|
||||
|
||||
How KMSAN works
|
||||
===============
|
||||
|
||||
KMSAN shadow memory
|
||||
-------------------
|
||||
|
||||
KMSAN associates a metadata byte (also called shadow byte) with every byte of
|
||||
kernel memory. A bit in the shadow byte is set iff the corresponding bit of the
|
||||
kernel memory byte is uninitialized. Marking the memory uninitialized (i.e.
|
||||
setting its shadow bytes to ``0xff``) is called poisoning, marking it
|
||||
initialized (setting the shadow bytes to ``0x00``) is called unpoisoning.
|
||||
|
||||
When a new variable is allocated on the stack, it is poisoned by default by
|
||||
instrumentation code inserted by the compiler (unless it is a stack variable
|
||||
that is immediately initialized). Any new heap allocation done without
|
||||
``__GFP_ZERO`` is also poisoned.
|
||||
|
||||
Compiler instrumentation also tracks the shadow values as they are used along
|
||||
the code. When needed, instrumentation code invokes the runtime library in
|
||||
``mm/kmsan/`` to persist shadow values.
|
||||
|
||||
The shadow value of a basic or compound type is an array of bytes of the same
|
||||
length. When a constant value is written into memory, that memory is unpoisoned.
|
||||
When a value is read from memory, its shadow memory is also obtained and
|
||||
propagated into all the operations which use that value. For every instruction
|
||||
that takes one or more values the compiler generates code that calculates the
|
||||
shadow of the result depending on those values and their shadows.
|
||||
|
||||
Example::
|
||||
|
||||
int a = 0xff; // i.e. 0x000000ff
|
||||
int b;
|
||||
int c = a | b;
|
||||
|
||||
In this case the shadow of ``a`` is ``0``, shadow of ``b`` is ``0xffffffff``,
|
||||
shadow of ``c`` is ``0xffffff00``. This means that the upper three bytes of
|
||||
``c`` are uninitialized, while the lower byte is initialized.
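One propagation rule that reproduces the result above for a bitwise OR is sketched below. It is an illustration written for this example, not code taken from the instrumentation pass: ``or_shadow()`` is a hypothetical helper, and a set shadow bit means the corresponding value bit is uninitialized.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shadow of (a | b), given both values and their shadows.
 * A result bit is uninitialized when both input bits are uninitialized, or
 * when one is uninitialized and the other is an initialized 0.  An
 * initialized 1 forces the result bit to 1 whatever the other side holds.
 */
static uint32_t or_shadow(uint32_t a, uint32_t sa, uint32_t b, uint32_t sb)
{
	return (sa & sb) | (sa & ~b) | (sb & ~a);
}

/* Reproduces the example: a = 0xff is fully initialized, b is not, so the
 * result does not depend on whatever garbage b happens to contain. */
static void or_shadow_example(uint32_t garbage_b)
{
	assert(or_shadow(0xff, 0, garbage_b, 0xffffffffu) == 0xffffff00u);
}
```

The low byte of ``c`` counts as initialized because every bit of ``a`` in that byte is a known 1.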
|
||||
|
||||
Origin tracking
|
||||
---------------
|
||||
|
||||
Every four bytes of kernel memory also have a so-called origin mapped to them.
|
||||
This origin describes the point in program execution at which the uninitialized
|
||||
value was created. Every origin is associated with either the full allocation
|
||||
stack (for heap-allocated memory), or the function containing the uninitialized
|
||||
variable (for locals).
|
||||
|
||||
When an uninitialized variable is allocated on stack or heap, a new origin
|
||||
value is created, and that variable's origin is filled with that value. When a
|
||||
value is read from memory, its origin is also read and kept together with the
|
||||
shadow. For every instruction that takes one or more values, the origin of the
|
||||
result is one of the origins corresponding to any of the uninitialized inputs.
|
||||
If a poisoned value is written into memory, its origin is written to the
|
||||
corresponding storage as well.
|
||||
|
||||
Example 1::
|
||||
|
||||
int a = 42;
|
||||
int b;
|
||||
int c = a + b;
|
||||
|
||||
In this case the origin of ``b`` is generated upon function entry, and is
|
||||
stored to the origin of ``c`` right before the addition result is written into
|
||||
memory.
|
||||
|
||||
Several variables may share the same origin address, if they are stored in the
|
||||
same four-byte chunk. In this case every write to either variable updates the
|
||||
origin for all of them. We have to sacrifice precision in this case, because
|
||||
storing origins for individual bits (and even bytes) would be too costly.
|
||||
|
||||
Example 2::
|
||||
|
||||
int combine(short a, short b) {
|
||||
union ret_t {
|
||||
int i;
|
||||
short s[2];
|
||||
} ret;
|
||||
ret.s[0] = a;
|
||||
ret.s[1] = b;
|
||||
return ret.i;
|
||||
}
|
||||
|
||||
If ``a`` is initialized and ``b`` is not, the shadow of the result would be
|
||||
0xffff0000, and the origin of the result would be the origin of ``b``.
|
||||
``ret.s[0]`` would have the same origin, but it will never be used, because
|
||||
that variable is initialized.
|
||||
|
||||
If both function arguments are uninitialized, only the origin of the second
|
||||
argument is preserved.
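
The loss of precision described above can be modelled in user space. The sketch below is not KMSAN code: ``struct chunk_meta``, ``store16()`` and ``load32_shadow()`` are hypothetical names, and the model assumes that only poisoned stores write an origin, as the text states.

```c
#include <assert.h>
#include <stdint.h>

/* Toy metadata for one aligned four-byte chunk of memory. */
struct chunk_meta {
	uint8_t shadow[4];   /* one shadow byte per data byte */
	uint32_t origin;     /* a single origin shared by the whole chunk */
};

/* Model a two-byte store at byte offset off (0 or 2) within the chunk. */
void store16(struct chunk_meta *m, unsigned off, uint16_t shadow,
	     uint32_t origin)
{
	m->shadow[off] = shadow & 0xff;
	m->shadow[off + 1] = shadow >> 8;
	if (shadow)              /* only poisoned stores carry an origin */
		m->origin = origin;
}

/* Shadow of the chunk, read back as one little-endian 32-bit value. */
uint32_t load32_shadow(const struct chunk_meta *m)
{
	return (uint32_t)m->shadow[0] | (uint32_t)m->shadow[1] << 8 |
	       (uint32_t)m->shadow[2] << 16 | (uint32_t)m->shadow[3] << 24;
}

/* Both halves uninitialized: the second store overwrites the origin. */
void origin_sharing_demo(void)
{
	struct chunk_meta m = { {0, 0, 0, 0}, 0 };

	store16(&m, 0, 0xffff, 5);   /* ret.s[0] = a, origin 5 */
	store16(&m, 2, 0xffff, 7);   /* ret.s[1] = b, origin 7 */
	assert(m.origin == 7);
}
```

With ``a`` initialized and ``b`` not, the same model yields a shadow of ``0xffff0000`` and the origin of ``b``.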
|
||||
|
||||
Origin chaining
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
To ease debugging, KMSAN creates a new origin for every store of an
|
||||
uninitialized value to memory. The new origin references both its creation stack
|
||||
and the previous origin the value had. This may cause increased memory
|
||||
consumption, so we limit the length of origin chains in the runtime.
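
A bounded origin chain might be sketched as follows. Everything here is illustrative: the table, the handle scheme, the function names and the depth limit of 3 are assumptions made for the example, not the values or data structures used by the KMSAN runtime.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CHAIN_DEPTH 3    /* hypothetical limit, chosen for the demo */
#define MAX_ORIGINS     16   /* hypothetical table size */

struct origin {
	uint32_t site;    /* stand-in for the stack trace of the store */
	uint32_t prev;    /* handle of the previous origin, 0 if none */
	unsigned depth;   /* length of the chain ending at this origin */
};

static struct origin origin_table[MAX_ORIGINS];
static uint32_t next_handle = 1;   /* handle 0 means "no origin" */

/* Create a fresh origin, e.g. for a newly allocated variable. */
uint32_t origin_new(uint32_t site)
{
	uint32_t h;

	if (next_handle >= MAX_ORIGINS)
		return 0;                  /* table full: nothing recorded */
	h = next_handle++;
	origin_table[h].site = site;
	origin_table[h].prev = 0;
	origin_table[h].depth = 1;
	return h;
}

/* Create an origin for a store of a value whose origin is prev. */
uint32_t origin_chain(uint32_t prev, uint32_t site)
{
	uint32_t h;

	if (prev == 0 || prev >= next_handle)
		return origin_new(site);
	if (origin_table[prev].depth >= MAX_CHAIN_DEPTH)
		return prev;               /* chain is full: keep the old origin */
	h = origin_new(site);
	if (h == 0)
		return prev;
	origin_table[h].prev = prev;
	origin_table[h].depth = origin_table[prev].depth + 1;
	return h;
}

uint32_t origin_prev(uint32_t h) { return origin_table[h].prev; }
unsigned origin_depth(uint32_t h) { return origin_table[h].depth; }
```

Each successful call to ``origin_chain()`` records where the value was stored and links back to where it came from, so a report can walk the chain from the use back to the allocation.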
|
||||
|
||||
Clang instrumentation API
|
||||
-------------------------
|
||||
|
||||
The Clang instrumentation pass inserts calls to functions defined in
|
||||
``mm/kmsan/instrumentation.c`` into the kernel code.
|
||||
|
||||
Shadow manipulation
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
For every memory access the compiler emits a call to a function that returns a
|
||||
pair of pointers to the shadow and origin addresses of the given memory::
|
||||
|
||||
typedef struct {
|
||||
void *shadow, *origin;
|
||||
} shadow_origin_ptr_t
|
||||
|
||||
shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr)
|
||||
shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr)
|
||||
shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, uintptr_t size)
|
||||
shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, uintptr_t size)
|
||||
|
||||
The function name depends on the memory access size.
|
||||
|
||||
The compiler makes sure that for every loaded value its shadow and origin
|
||||
values are read from memory. When a value is stored to memory, its shadow and
|
||||
origin are also stored using the metadata pointers.
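
As a rough user-space illustration, an instrumented four-byte copy could look like the sketch below. The arrays and ``mock_metadata_ptr_4()`` are mock stand-ins invented for this example; they only imitate the role of ``__msan_metadata_ptr_for_load_4()`` and ``__msan_metadata_ptr_for_store_4()`` and say nothing about how the kernel actually locates metadata.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	void *shadow, *origin;
} shadow_origin_ptr_t;

/* Toy "memory" with parallel shadow and origin arrays (one per word). */
#define WORDS 16
uint32_t mem[WORDS], mem_shadow[WORDS], mem_origin[WORDS];

/* Mock metadata lookup for a four-byte access inside mem[]. */
shadow_origin_ptr_t mock_metadata_ptr_4(uint32_t *addr)
{
	size_t idx = (size_t)(addr - mem);
	shadow_origin_ptr_t p = { &mem_shadow[idx], &mem_origin[idx] };

	return p;
}

/* Roughly what an instrumented "*dst = *src" does for 4-byte values. */
void instrumented_copy32(uint32_t *dst, uint32_t *src)
{
	shadow_origin_ptr_t s = mock_metadata_ptr_4(src);
	shadow_origin_ptr_t d = mock_metadata_ptr_4(dst);
	uint32_t shadow = *(uint32_t *)s.shadow;   /* load shadow with value */
	uint32_t origin = *(uint32_t *)s.origin;   /* load origin with value */

	*dst = *src;
	*(uint32_t *)d.shadow = shadow;            /* store shadow with value */
	if (shadow)                                /* origin only if poisoned */
		*(uint32_t *)d.origin = origin;
}

void instrumented_copy_demo(void)
{
	mem[0] = 1;
	mem_shadow[0] = 0xffffffff;   /* mark mem[0] as uninitialized */
	mem_origin[0] = 42;           /* arbitrary origin id */

	instrumented_copy32(&mem[1], &mem[0]);
	assert(mem_shadow[1] == 0xffffffff && mem_origin[1] == 42);
}
```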
|
||||
|
||||
Handling locals
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
A special function is used to create a new origin value for a local variable and
|
||||
set the origin of that variable to that value::
|
||||
|
||||
void __msan_poison_alloca(void *addr, uintptr_t size, char *descr)
|
||||
|
||||
Access to per-task data
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
At the beginning of every instrumented function KMSAN inserts a call to
|
||||
``__msan_get_context_state()``::
|
||||
|
||||
kmsan_context_state *__msan_get_context_state(void)
|
||||
|
||||
``kmsan_context_state`` is declared in ``include/linux/kmsan.h``::
|
||||
|
||||
struct kmsan_context_state {
|
||||
char param_tls[KMSAN_PARAM_SIZE];
|
||||
char retval_tls[KMSAN_RETVAL_SIZE];
|
||||
char va_arg_tls[KMSAN_PARAM_SIZE];
|
||||
char va_arg_origin_tls[KMSAN_PARAM_SIZE];
|
||||
u64 va_arg_overflow_size_tls;
|
||||
char param_origin_tls[KMSAN_PARAM_SIZE];
|
||||
depot_stack_handle_t retval_origin_tls;
|
||||
};
|
||||
|
||||
This structure is used by KMSAN to pass parameter shadows and origins between
|
||||
instrumented functions (unless the parameters are checked immediately by
|
||||
``CONFIG_KMSAN_CHECK_PARAM_RETVAL``).
|
||||
|
||||
Passing uninitialized values to functions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Clang's MemorySanitizer instrumentation has an option,
|
||||
``-fsanitize-memory-param-retval``, which makes the compiler check function
|
||||
parameters passed by value, as well as function return values.
|
||||
|
||||
The option is controlled by ``CONFIG_KMSAN_CHECK_PARAM_RETVAL``, which is
|
||||
enabled by default to let KMSAN report uninitialized values earlier.
|
||||
Please refer to the `LKML discussion`_ for more details.
|
||||
|
||||
Because of the way the checks are implemented in LLVM (they are only applied to
|
||||
parameters marked as ``noundef``), not all parameters are guaranteed to be
|
||||
checked, so we cannot give up the metadata storage in ``kmsan_context_state``.
|
||||
|
||||
String functions
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the
|
||||
following functions. These functions are also called when data structures are
|
||||
initialized or copied, making sure shadow and origin values are copied alongside
|
||||
the data::
|
||||
|
||||
void *__msan_memcpy(void *dst, void *src, uintptr_t n)
|
||||
void *__msan_memmove(void *dst, void *src, uintptr_t n)
|
||||
void *__msan_memset(void *dst, int c, uintptr_t n)
|
||||
|
||||
Error reporting
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
For each use of a value the compiler emits a shadow check that calls
|
||||
``__msan_warning()`` if that value is poisoned::
|
||||
|
||||
void __msan_warning(u32 origin)
|
||||
|
||||
``__msan_warning()`` causes the KMSAN runtime to print an error report.
|
||||
|
||||
Inline assembly instrumentation
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
KMSAN instruments every inline assembly output with a call to::
|
||||
|
||||
void __msan_instrument_asm_store(void *addr, uintptr_t size)
|
||||
|
||||
, which unpoisons the memory region.
|
||||
|
||||
This approach may mask certain errors, but it also helps to avoid a lot of
|
||||
false positives in bitwise operations, atomics etc.
|
||||
|
||||
Sometimes the pointers passed into inline assembly do not point to valid memory.
|
||||
In such cases they are ignored at runtime.
|
||||
|
||||
|
||||
Runtime library
|
||||
---------------
|
||||
|
||||
The code is located in ``mm/kmsan/``.
|
||||
|
||||
Per-task KMSAN state
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Every task_struct has an associated KMSAN task state that holds the KMSAN
|
||||
context (see above) and a per-task flag disallowing KMSAN reports::
|
||||
|
||||
struct kmsan_context {
|
||||
...
|
||||
bool allow_reporting;
|
||||
struct kmsan_context_state cstate;
|
||||
...
|
||||
}
|
||||
|
||||
struct task_struct {
|
||||
...
|
||||
struct kmsan_context kmsan;
|
||||
...
|
||||
}
|
||||
|
||||
KMSAN contexts
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to
|
||||
hold the metadata for function parameters and return values.
|
||||
|
||||
But when the kernel is running in interrupt, softirq or NMI context,
|
||||
where ``current`` is unavailable, KMSAN switches to per-cpu interrupt state::
|
||||
|
||||
DEFINE_PER_CPU(struct kmsan_ctx, kmsan_percpu_ctx);
|
||||
|
||||
Metadata allocation
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are several places in the kernel for which the metadata is stored.
|
||||
|
||||
1. Each ``struct page`` instance contains two pointers to its shadow and
|
||||
origin pages::
|
||||
|
||||
struct page {
|
||||
...
|
||||
struct page *shadow, *origin;
|
||||
...
|
||||
};
|
||||
|
||||
At boot-time, the kernel allocates shadow and origin pages for every available
|
||||
kernel page. This is done quite late, when the kernel address space is already
|
||||
fragmented, so normal data pages may arbitrarily interleave with the metadata
|
||||
pages.
|
||||
|
||||
This means that in general for two contiguous memory pages their shadow/origin
|
||||
pages may not be contiguous. Consequently, if a memory access crosses the
|
||||
boundary of a memory block, accesses to shadow/origin memory may potentially
|
||||
corrupt other pages or read incorrect values from them.
|
||||
|
||||
In practice, contiguous memory pages returned by the same ``alloc_pages()``
|
||||
call will have contiguous metadata, whereas if these pages belong to two
|
||||
different allocations their metadata pages can be fragmented.
|
||||
|
||||
For the kernel data (``.data``, ``.bss`` etc.) and percpu memory regions
|
||||
there also are no guarantees on metadata contiguity.
|
||||
|
||||
If ``__msan_metadata_ptr_for_XXX_YYY()`` hits the border between two
|
||||
pages with non-contiguous metadata, it returns pointers to fake shadow/origin regions::
|
||||
|
||||
char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
|
||||
char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
|
||||
|
||||
``dummy_load_page`` is zero-initialized, so reads from it always yield zeroes.
|
||||
All stores to ``dummy_store_page`` are ignored.
|
||||
|
||||
2. For vmalloc memory and modules, there is a direct mapping between the memory
|
||||
range, its shadow and origin. KMSAN reduces the vmalloc area by 3/4, making only
|
||||
the first quarter available to ``vmalloc()``. The second quarter of the vmalloc
|
||||
area contains shadow memory for the first quarter, the third one holds the
|
||||
origins. A small part of the fourth quarter contains shadow and origins for the
|
||||
kernel modules. Please refer to ``arch/x86/include/asm/pgtable_64_types.h`` for
|
||||
more details.
|
||||
|
||||
When an array of pages is mapped into a contiguous virtual memory space, their
|
||||
shadow and origin pages are similarly mapped into contiguous regions.
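
The direct mapping for vmalloc memory amounts to simple address arithmetic, which the sketch below illustrates. The area bounds and all names are hypothetical, chosen only for the example; the real constants live in the header mentioned above. Aligning the origin address down to four bytes reflects the one-origin-per-four-bytes rule described earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical vmalloc area, for illustration only. */
#define X_VMALLOC_START  0x100000000ULL
#define X_VMALLOC_SIZE   0x040000000ULL          /* whole area: 1 GiB */
#define X_QUARTER        (X_VMALLOC_SIZE / 4)

/* Only the first quarter is handed out to vmalloc() users. */
int x_is_vmalloc_data(uint64_t addr)
{
	return addr >= X_VMALLOC_START && addr < X_VMALLOC_START + X_QUARTER;
}

/* Shadow lives one quarter above the data address. */
uint64_t x_vmalloc_shadow(uint64_t addr)
{
	return addr + X_QUARTER;
}

/* Origins live two quarters above, one slot per four data bytes. */
uint64_t x_vmalloc_origin(uint64_t addr)
{
	return (addr & ~3ULL) + 2 * X_QUARTER;
}

void x_vmalloc_demo(void)
{
	uint64_t addr = X_VMALLOC_START + 0x13;

	assert(x_is_vmalloc_data(addr));
	assert(x_vmalloc_shadow(addr) == X_VMALLOC_START + X_QUARTER + 0x13);
	assert(x_vmalloc_origin(addr) == X_VMALLOC_START + 2 * X_QUARTER + 0x10);
}
```

Because the offsets are constant, no per-page pointers are needed for this region, unlike the ``struct page`` case in item 1.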
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized
|
||||
memory use in C++
|
||||
<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_.
|
||||
In Proceedings of CGO 2015.
|
||||
|
||||
.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html
|
||||
.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html
|
||||
.. _LKML discussion: https://lore.kernel.org/all/20220614144853.3693273-1-glider@google.com/
|
|
@ -251,14 +251,15 @@ command line arguments:
|
|||
compiling a kernel (using ``build`` or ``run`` commands). For example:
|
||||
to enable compiler warnings, we can pass ``--make_options W=1``.
|
||||
|
||||
- ``--alltests``: Builds a UML kernel with all config options enabled
|
||||
using ``make allyesconfig``. This allows us to run as many tests as
|
||||
possible.
|
||||
- ``--alltests``: Enable a predefined set of options in order to build
|
||||
as many tests as possible.
|
||||
|
||||
.. note:: It is slow and prone to breakage as new options are
|
||||
added or modified. Instead, enable all tests
|
||||
which have satisfied dependencies by adding
|
||||
``CONFIG_KUNIT_ALL_TESTS=y`` to your ``.kunitconfig``.
|
||||
.. note:: The list of enabled options can be found in
|
||||
``tools/testing/kunit/configs/all_tests.config``.
|
||||
|
||||
If you only want to enable all tests with otherwise satisfied
|
||||
dependencies, instead add ``CONFIG_KUNIT_ALL_TESTS=y`` to your
|
||||
``.kunitconfig``.
|
||||
|
||||
- ``--kunitconfig``: Specifies the path or the directory of the ``.kunitconfig``
|
||||
file. For example:
|
||||
|
|
|
@ -75,3 +75,6 @@ always-$(CHECK_DT_BINDING) += $(patsubst $(srctree)/$(src)/%.yaml,%.example.dtb,
|
|||
# build artifacts here before they are processed by scripts/Makefile.clean
|
||||
clean-files = $(shell find $(obj) \( -name '*.example.dts' -o \
|
||||
-name '*.example.dtb' \) -delete 2>/dev/null)
|
||||
|
||||
dt_compatible_check: $(obj)/processed-schema.json
|
||||
$(Q)$(srctree)/scripts/dtc/dt-extract-compatibles $(srctree) | xargs dt-check-compatible -v -s $<
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/actions.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Actions Semi platforms device tree bindings
|
||||
title: Actions Semi platforms
|
||||
|
||||
maintainers:
|
||||
- Andreas Färber <afaerber@suse.de>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/airoha.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Airoha SoC based Platforms Device Tree Bindings
|
||||
title: Airoha SoC based Platforms
|
||||
|
||||
maintainers:
|
||||
- Felix Fietkau <nbd@nbd.name>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/altera.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Altera's SoCFPGA platform device tree bindings
|
||||
title: Altera's SoCFPGA platform
|
||||
|
||||
maintainers:
|
||||
- Dinh Nguyen <dinguyen@kernel.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/amazon,al.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Amazon's Annapurna Labs Alpine Platform Device Tree Bindings
|
||||
title: Amazon's Annapurna Labs Alpine Platform
|
||||
|
||||
maintainers:
|
||||
- Hanna Hawa <hhhawa@amazon.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/amlogic.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Amlogic MesonX device tree bindings
|
||||
title: Amlogic MesonX
|
||||
|
||||
maintainers:
|
||||
- Kevin Hilman <khilman@baylibre.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/apple.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Apple ARM Machine Device Tree Bindings
|
||||
title: Apple ARM Machine
|
||||
|
||||
maintainers:
|
||||
- Hector Martin <marcan@marcan.st>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/arm,cci-400.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ARM CCI Cache Coherent Interconnect Device Tree Binding
|
||||
title: ARM CCI Cache Coherent Interconnect
|
||||
|
||||
maintainers:
|
||||
- Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
||||
|
|
|
@ -61,6 +61,9 @@ properties:
|
|||
maxItems: 1
|
||||
description: Address translation error interrupt
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
in-ports:
|
||||
$ref: /schemas/graph.yaml#/properties/ports
|
||||
additionalProperties: false
|
||||
|
|
|
@ -98,6 +98,9 @@ properties:
|
|||
base cti node if compatible string arm,coresight-cti-v8-arch is used,
|
||||
or may appear in a trig-conns child node when appropriate.
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
arm,cti-ctm-id:
|
||||
$ref: /schemas/types.yaml#/definitions/uint32
|
||||
description:
|
||||
|
|
|
@ -54,6 +54,9 @@ properties:
|
|||
- const: apb_pclk
|
||||
- const: atclk
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
in-ports:
|
||||
$ref: /schemas/graph.yaml#/properties/ports
|
||||
|
||||
|
|
|
@ -54,6 +54,9 @@ properties:
|
|||
- const: apb_pclk
|
||||
- const: atclk
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
qcom,replicator-loses-context:
|
||||
type: boolean
|
||||
description:
|
||||
|
|
|
@ -54,6 +54,9 @@ properties:
|
|||
- const: apb_pclk
|
||||
- const: atclk
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
in-ports:
|
||||
$ref: /schemas/graph.yaml#/properties/ports
|
||||
additionalProperties: false
|
||||
|
|
|
@ -73,6 +73,9 @@ properties:
|
|||
- const: apb_pclk
|
||||
- const: atclk
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
arm,coresight-loses-context-with-cpu:
|
||||
type: boolean
|
||||
description:
|
||||
|
|
|
@ -27,6 +27,9 @@ properties:
|
|||
compatible:
|
||||
const: arm,coresight-static-funnel
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
in-ports:
|
||||
$ref: /schemas/graph.yaml#/properties/ports
|
||||
|
||||
|
|
|
@ -27,6 +27,9 @@ properties:
|
|||
compatible:
|
||||
const: arm,coresight-static-replicator
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
in-ports:
|
||||
$ref: /schemas/graph.yaml#/properties/ports
|
||||
additionalProperties: false
|
||||
|
|
|
@ -61,6 +61,9 @@ properties:
|
|||
- const: apb_pclk
|
||||
- const: atclk
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
out-ports:
|
||||
$ref: /schemas/graph.yaml#/properties/ports
|
||||
additionalProperties: false
|
||||
|
|
|
@ -55,6 +55,12 @@ properties:
|
|||
- const: apb_pclk
|
||||
- const: atclk
|
||||
|
||||
iommus:
|
||||
maxItems: 1
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
arm,buffer-size:
|
||||
$ref: /schemas/types.yaml#/definitions/uint32
|
||||
deprecated: true
|
||||
|
|
|
@ -54,6 +54,9 @@ properties:
|
|||
- const: apb_pclk
|
||||
- const: atclk
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
in-ports:
|
||||
$ref: /schemas/graph.yaml#/properties/ports
|
||||
additionalProperties: false
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/arm,corstone1000.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ARM Corstone1000 Device Tree Bindings
|
||||
title: ARM Corstone1000
|
||||
|
||||
maintainers:
|
||||
- Vishnu Banavath <vishnu.banavath@arm.com>
|
||||
|
|
|
@ -33,6 +33,9 @@ properties:
|
|||
Handle to the cpu this ETE is bound to.
|
||||
$ref: /schemas/types.yaml#/definitions/phandle
|
||||
|
||||
power-domains:
|
||||
maxItems: 1
|
||||
|
||||
out-ports:
|
||||
description: |
|
||||
Output connections from the ETE to legacy CoreSight trace bus.
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/arm,integrator.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ARM Integrator Boards Device Tree Bindings
|
||||
title: ARM Integrator Boards
|
||||
|
||||
maintainers:
|
||||
- Linus Walleij <linus.walleij@linaro.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/arm,realview.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ARM RealView Boards Device Tree Bindings
|
||||
title: ARM RealView Boards
|
||||
|
||||
maintainers:
|
||||
- Linus Walleij <linus.walleij@linaro.org>
|
||||
|
|
|
@ -0,0 +1,35 @@
|
|||
# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/arm/arm,versatile-sysreg.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Arm Versatile system registers
|
||||
|
||||
maintainers:
|
||||
- Linus Walleij <linus.walleij@linaro.org>
|
||||
|
||||
description:
|
||||
This is a system control registers block, providing multiple low level
|
||||
platform functions like board detection and identification, software
|
||||
interrupt generation, MMC and NOR Flash control, etc.
|
||||
|
||||
properties:
|
||||
compatible:
|
||||
items:
|
||||
- const: arm,versatile-sysreg
|
||||
- const: syscon
|
||||
- const: simple-mfd
|
||||
|
||||
reg:
|
||||
maxItems: 1
|
||||
|
||||
panel:
|
||||
type: object
|
||||
|
||||
required:
|
||||
- compatible
|
||||
- reg
|
||||
|
||||
additionalProperties: false
|
||||
...
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/arm,versatile.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ARM Versatile Boards Device Tree Bindings
|
||||
title: ARM Versatile Boards
|
||||
|
||||
maintainers:
|
||||
- Linus Walleij <linus.walleij@linaro.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/arm,vexpress-juno.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ARM Versatile Express and Juno Boards Device Tree Bindings
|
||||
title: ARM Versatile Express and Juno Boards
|
||||
|
||||
maintainers:
|
||||
- Sudeep Holla <sudeep.holla@arm.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/atmel-at91.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Atmel AT91 device tree bindings.
|
||||
title: Atmel AT91.
|
||||
|
||||
maintainers:
|
||||
- Alexandre Belloni <alexandre.belloni@bootlin.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/axxia.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Axxia AXM55xx device tree bindings
|
||||
title: Axxia AXM55xx
|
||||
|
||||
maintainers:
|
||||
- Anders Berg <anders.berg@lsi.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/bitmain.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Bitmain platform device tree bindings
|
||||
title: Bitmain platform
|
||||
|
||||
maintainers:
|
||||
- Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/calxeda.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Calxeda Platforms Device Tree Bindings
|
||||
title: Calxeda Platforms
|
||||
|
||||
maintainers:
|
||||
- Rob Herring <robh@kernel.org>
|
||||
|
|
|
@ -174,6 +174,7 @@ properties:
|
|||
- nvidia,tegra194-carmel
|
||||
- qcom,krait
|
||||
- qcom,kryo
|
||||
- qcom,kryo240
|
||||
- qcom,kryo250
|
||||
- qcom,kryo260
|
||||
- qcom,kryo280
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/digicolor.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Conexant Digicolor Platforms Device Tree Bindings
|
||||
title: Conexant Digicolor Platforms
|
||||
|
||||
maintainers:
|
||||
- Baruch Siach <baruch@tkos.co.il>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/fsl.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Freescale i.MX Platforms Device Tree Bindings
|
||||
title: Freescale i.MX Platforms
|
||||
|
||||
maintainers:
|
||||
- Shawn Guo <shawnguo@kernel.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/intel,keembay.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Keem Bay platform device tree bindings
|
||||
title: Keem Bay platform
|
||||
|
||||
maintainers:
|
||||
- Paul J. Murphy <paul.j.murphy@intel.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/intel,socfpga.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Intel SoCFPGA platform device tree bindings
|
||||
title: Intel SoCFPGA platform
|
||||
|
||||
maintainers:
|
||||
- Dinh Nguyen <dinguyen@kernel.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/intel-ixp4xx.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Intel IXP4xx Device Tree Bindings
|
||||
title: Intel IXP4xx
|
||||
|
||||
maintainers:
|
||||
- Linus Walleij <linus.walleij@linaro.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/mediatek.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: MediaTek SoC based Platforms Device Tree Bindings
|
||||
title: MediaTek SoC based Platforms
|
||||
|
||||
maintainers:
|
||||
- Sean Wang <sean.wang@mediatek.com>
|
||||
|
|
|
@ -23,6 +23,7 @@ properties:
|
|||
- mediatek,mt2701-infracfg
|
||||
- mediatek,mt2712-infracfg
|
||||
- mediatek,mt6765-infracfg
|
||||
- mediatek,mt6795-infracfg
|
||||
- mediatek,mt6779-infracfg_ao
|
||||
- mediatek,mt6797-infracfg
|
||||
- mediatek,mt7622-infracfg
|
||||
|
@ -60,6 +61,7 @@ if:
|
|||
enum:
|
||||
- mediatek,mt2701-infracfg
|
||||
- mediatek,mt2712-infracfg
|
||||
- mediatek,mt6795-infracfg
|
||||
- mediatek,mt7622-infracfg
|
||||
- mediatek,mt7986-infracfg
|
||||
- mediatek,mt8135-infracfg
|
||||
|
|
|
@ -25,6 +25,7 @@ properties:
|
|||
- mediatek,mt2712-mmsys
|
||||
- mediatek,mt6765-mmsys
|
||||
- mediatek,mt6779-mmsys
|
||||
- mediatek,mt6795-mmsys
|
||||
- mediatek,mt6797-mmsys
|
||||
- mediatek,mt8167-mmsys
|
||||
- mediatek,mt8173-mmsys
|
||||
|
@ -52,7 +53,8 @@ properties:
|
|||
description:
|
||||
Using mailbox to communicate with GCE, it should have this
|
||||
property and list of phandle, mailbox specifiers. See
|
||||
Documentation/devicetree/bindings/mailbox/mtk-gce.txt for details.
|
||||
Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml
|
||||
for details.
|
||||
$ref: /schemas/types.yaml#/definitions/phandle-array
|
||||
|
||||
mediatek,gce-client-reg:
|
||||
|
|
|
@ -21,6 +21,7 @@ properties:
|
|||
- mediatek,mt2701-pericfg
|
||||
- mediatek,mt2712-pericfg
|
||||
- mediatek,mt6765-pericfg
|
||||
- mediatek,mt6795-pericfg
|
||||
- mediatek,mt7622-pericfg
|
||||
- mediatek,mt7629-pericfg
|
||||
- mediatek,mt8135-pericfg
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/microchip,sparx5.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Microchip Sparx5 Boards Device Tree Bindings
|
||||
title: Microchip Sparx5 Boards
|
||||
|
||||
maintainers:
|
||||
- Lars Povlsen <lars.povlsen@microchip.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/moxart.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: MOXA ART device tree bindings
|
||||
title: MOXA ART
|
||||
|
||||
maintainers:
|
||||
- Jonas Jensen <jonas.jensen@gmail.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: "http://devicetree.org/schemas/arm/nvidia,tegra194-ccplex.yaml#"
|
||||
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
|
||||
|
||||
title: NVIDIA Tegra194 CPU Complex device tree bindings
|
||||
title: NVIDIA Tegra194 CPU Complex
|
||||
|
||||
maintainers:
|
||||
- Thierry Reding <thierry.reding@gmail.com>
|
||||
|
|
|
@ -41,31 +41,26 @@ properties:
|
|||
For implementations complying to PSCI versions prior to 0.2.
|
||||
const: arm,psci
|
||||
|
||||
- description:
|
||||
For implementations complying to PSCI 0.2.
|
||||
const: arm,psci-0.2
|
||||
|
||||
- description:
|
||||
For implementations complying to PSCI 0.2.
|
||||
Function IDs are not required and should be ignored by an OS with
|
||||
PSCI 0.2 support, but are permitted to be present for compatibility
|
||||
with existing software when "arm,psci" is later in the compatible
|
||||
list.
|
||||
minItems: 1
|
||||
items:
|
||||
- const: arm,psci-0.2
|
||||
- const: arm,psci
|
||||
|
||||
- description:
|
||||
For implementations complying to PSCI 1.0.
|
||||
const: arm,psci-1.0
|
||||
|
||||
- description:
|
||||
For implementations complying to PSCI 1.0.
|
||||
PSCI 1.0 is backward compatible with PSCI 0.2 with minor
|
||||
specification updates, as defined in the PSCI specification[2].
|
||||
minItems: 1
|
||||
items:
|
||||
- const: arm,psci-1.0
|
||||
- const: arm,psci-0.2
|
||||
- const: arm,psci
|
||||
|
||||
method:
|
||||
description: The method of calling the PSCI firmware.
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/qcom.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: QCOM device tree bindings
|
||||
title: QCOM
|
||||
|
||||
maintainers:
|
||||
- Bjorn Andersson <bjorn.andersson@linaro.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/rda.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: RDA Micro platforms device tree bindings
|
||||
title: RDA Micro platforms
|
||||
|
||||
maintainers:
|
||||
- Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/realtek.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Realtek platforms device tree bindings
|
||||
title: Realtek platforms
|
||||
|
||||
maintainers:
|
||||
- Andreas Färber <afaerber@suse.de>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/renesas.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Renesas SH-Mobile, R-Mobile, and R-Car Platform Device Tree Bindings
|
||||
title: Renesas SH-Mobile, R-Mobile, and R-Car Platform
|
||||
|
||||
maintainers:
|
||||
- Geert Uytterhoeven <geert+renesas@glider.be>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/rockchip.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Rockchip platforms device tree bindings
|
||||
title: Rockchip platforms
|
||||
|
||||
maintainers:
|
||||
- Heiko Stuebner <heiko@sntech.de>
|
||||
|
|
|
@ -22,7 +22,6 @@ properties:
|
|||
description: |
|
||||
should contain 3 regions: control register, revision register,
|
||||
operation register, in this order.
|
||||
minItems: 3
|
||||
maxItems: 3
|
||||
|
||||
interrupts:
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/spear.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ST SPEAr Platforms Device Tree Bindings
|
||||
title: ST SPEAr Platforms
|
||||
|
||||
maintainers:
|
||||
- Viresh Kumar <vireshk@kernel.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/sti.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: ST STi Platforms Device Tree Bindings
|
||||
title: ST STi Platforms
|
||||
|
||||
maintainers:
|
||||
- Patrice Chotard <patrice.chotard@foss.st.com>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/sunxi.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Allwinner platforms device tree bindings
|
||||
title: Allwinner platforms
|
||||
|
||||
maintainers:
|
||||
- Chen-Yu Tsai <wens@csie.org>
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
$id: http://devicetree.org/schemas/arm/tegra.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: NVIDIA Tegra device tree bindings
|
||||
title: NVIDIA Tegra
|
||||
|
||||
maintainers:
|
||||
- Thierry Reding <thierry.reding@gmail.com>
|
||||
|
|
Some files were not shown because too many files have changed in this diff