Pull libnvdimm fixes from Dan Williams:
"1) Fixes for a handful of smatch reports (Thanks Dan C.!) and minor
bug fixes (patches 1-6)
2) Correctness fixes to the BLK-mode nvdimm driver (patches 7-10).
Granted these are slightly large for a -rc update. They have been
out for review in one form or another since the end of May and were
deferred from the merge window while we settled on the "PMEM API"
for the PMEM-mode nvdimm driver (ie memremap_pmem, memcpy_to_pmem,
and wmb_pmem).
Now that those apis are merged we implement them in the BLK driver
to guarantee that mmio aperture moves stay ordered with respect to
incoming read/write requests, and that writes are flushed through
those mmio-windows and platform-buffers to be persistent on media.
These pass the sub-system unit tests with the updates to
tools/testing/nvdimm, and have received a successful build-report from
the kbuild robot (468 configs).
With acks from Rafael for the touches to drivers/acpi/"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm:
nfit: add support for NVDIMM "latch" flag
nfit: update block I/O path to use PMEM API
tools/testing/nvdimm: add mock acpi_nfit_flush_address entries to nfit_test
tools/testing/nvdimm: fix return code for unimplemented commands
tools/testing/nvdimm: mock ioremap_wt
pmem: add maintainer for include/linux/pmem.h
nfit: fix smatch "use after null check" report
nvdimm: Fix return value of nvdimm_bus_init() if class_create() fails
libnvdimm: smatch cleanups in __nd_ioctl
sparse: fix misplaced __pmem definition
Add support in the NFIT BLK I/O path for the "latch" flag
defined in the "Get Block NVDIMM Flags" _DSM function:
http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
This flag requires the driver to read back the command register after it
is written in the block I/O path. This ensures that the hardware has
fully processed the new command and moved the aperture appropriately.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Update the nfit block I/O path to use the new PMEM API and to adhere to
the read/write flows outlined in the "NVDIMM Block Window Driver
Writer's Guide":
http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
This includes adding support for targeted NVDIMM flushes called "flush
hints" in the ACPI 6.0 specification:
http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
For performance and media durability the mapping for a BLK aperture is
moved to a write-combining mapping which is consistent with
memcpy_to_pmem() and wmb_blk().
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Device drivers typically use ACPI _HIDs/_CIDs listed in struct device_driver
acpi_match_table to match devices. However, for generic drivers, we do not
want to list _HID for all supported devices. Also, certain classes of devices
do not have _CID (e.g. SATA, USB). Instead, we can leverage ACPI _CLS,
which specifies PCI-defined class code (i.e. base-class, subclass and
programming interface). This patch adds support for matching ACPI devices using
the _CLS method.
To support loadable module, current design uses _HID or _CID to match device's
modalias. With the new way of matching with _CLS this would requires modification
to the current ACPI modalias key to include _CLS. This patch appends PCI-defined
class-code to the existing ACPI modalias as following.
acpi:<HID>:<CID1>:<CID2>:..:<CIDn>:<bbsspp>:
E.g:
# cat /sys/devices/platform/AMDI0600:00/modalias
acpi:AMDI0600:010601:
where bb is th base-class code, ss is te sub-class code, and pp is the
programming interface code
Since there would not be _HID/_CID in the ACPI matching table of the driver,
this patch adds a field to acpi_device_id to specify the matching _CLS.
static const struct acpi_device_id ahci_acpi_match[] = {
{ ACPI_DEVICE_CLASS(PCI_CLASS_STORAGE_SATA_AHCI, 0xffffff) },
{},
};
In this case, the corresponded entry in modules.alias file would be:
alias acpi*:010601:* ahci_platform
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix a return value (which should be a negative error code) and a
memory leak (the list allocated by acpi_dev_get_resources() needs
to be freed on ioremap() errors too) in acpi_lpss_create_device()
introduced by commit 4483d59e29 'ACPI / LPSS: check the result
of ioremap()'.
Fixes: 4483d59e29 'ACPI / LPSS: check the result of ioremap()'
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This effectively reverts the following three commits:
7bc10388cc ACPI / resources: free memory on error in add_region_before()
0f1b414d19 ACPI / PNP: Avoid conflicting resource reservations
b9a5e5e18f ACPI / init: Fix the ordering of acpi_reserve_resources()
(commit b9a5e5e18f introduced regressions some of which, but not
all, were addressed by commit 0f1b414d19 and commit 7bc10388cc
was a fixup on top of the latter) and causes ACPI fixed hardware
resources to be reserved at the fs_initcall_sync stage of system
initialization.
The story is as follows. First, a boot regression was reported due
to an apparent resource reservation ordering change after a commit
that shouldn't lead to such changes. Investigation led to the
conclusion that the problem happened because acpi_reserve_resources()
was executed at the device_initcall() stage of system initialization
which wasn't strictly ordered with respect to driver initialization
(and with respect to the initialization of the pcieport driver in
particular), so a random change causing the device initcalls to be
run in a different order might break things.
The response to that was to attempt to run acpi_reserve_resources()
as soon as we knew that ACPI would be in use (commit b9a5e5e18f).
However, that turned out to be too early, because it caused resource
reservations made by the PNP system driver to fail on at least one
system and that failure was addressed by commit 0f1b414d19.
That fix still turned out to be insufficient, though, because
calling acpi_reserve_resources() before the fs_initcall stage of
system initialization caused a boot regression to happen on the
eCAFE EC-800-H20G/S netbook. That meant that we only could call
acpi_reserve_resources() at the fs_initcall initialization stage
or later, but then we might just as well call it after the PNP
initalization in which case commit 0f1b414d19 wouldn't be
necessary any more.
For this reason, the changes made by commit 0f1b414d19 are reverted
(along with a memory leak fixup on top of that commit), the changes
made by commit b9a5e5e18f that went too far are reverted too and
acpi_reserve_resources() is changed into fs_initcall_sync, which
will cause it to be executed after the PNP subsystem initialization
(which is an fs_initcall) and before device initcalls (including
the pcieport driver initialization) which should avoid the initial
issue.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581
Link: http://marc.info/?t=143092384600002&r=1&w=2
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831
Link: http://marc.info/?t=143389402600001&r=1&w=2
Fixes: b9a5e5e18f "ACPI / init: Fix the ordering of acpi_reserve_resources()"
Reported-by: Roland Dreier <roland@purestorage.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Fix system resume problems related to 32-bit and 64-bit versions
of the Firmware ACPI Control Structure (FACS) in the firmare (Lv
Zheng).
- Fix double initialization of the FACS (Lv Zheng).
- Add _CLS object processing code to ACPICA (Suravee Suthikulpanit).
- Add support for the (currently missing) new GIC version field in
the Multiple APIC Description Table (MADT) (Hanjun Guo).
- Add support for overriding objects in the ACPI namespace to
ACPICA and OSDT support (Lv Zheng, Bob Moore, Zhang Rui).
- Updates related to the TCPA and TPM2 ACPI tables (Bob Moore).
- Restore the commit modifying _REV to always return "2" (as
required by ACPI 6) and add a blacklisting mechanism for
systems that may be affected by that change (Rafael J Wysocki).
- Assorted fixes and cleanups (Bob Moore, Lv Zheng, Sascha Wildner).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJVlcwtAAoJEILEb/54YlRx/IwQAKMZaZZni2HhJ/ASBVAtF4zp
RNaS+XiTzLg2HIIR0QjRE9LT2CH3Zw2l99XzU91SqS2UfvTr+YJjnSNq3PllAgrT
SsFv5fVJZr7VfJw7gbARhOXp926INfDRqKp5WvpQ3XCFclCQRNbqzn0PD1ouooVQ
x4IhhFlxyCIOHwbINS//CsJ8H+PT7aUc2kSgEKGbVWFfKE9jfTCx1Nekh2GoEqf+
wutzaMmCoQsf0kVNldgEnF2vxIxwgcXkhYxBBdnGBl2afJz+THsPaJP6Bx6JNA+S
iaFh+iyo70jeJ4ouBxJc0E46g+pDOJdP71VQhexFu3c7OU+wmhyv30/f4SwxXLOD
+H8OhOMXFLff9PS+BVU4iR7t5SikZzbXc/AjuM6es1UT+k8zOlo+fRL1I8dXDF6V
t4GiT6hz/MX30cP3aumXtQ2dl9TksWPtfoerSjo1EowY6wPZ+WpJ2bmp5uecIDGV
TNdC4pKjDVgbFP889mZF4pG198uR4UV1gRCf4gvwEyiNMFd3xRbFhs4r7AkiSQLn
fy+V7MlgFiFaB6Ej/AU01fjarOPPSiv8uFWAZL4e9R/88UgfVVq0aFonw/r5l4jj
3rJBOH7YxNxGBhRjTL+d7cwruED6G/K2S0QbD2kZBOSHrouz1fuLFdvgKj8ahqyJ
VfQZs9A3PSv/v1wssUr/
=MlWS
-----END PGP SIGNATURE-----
Merge tag 'acpica-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPICA updates from Rafael Wysocki:
"Additional ACPICA material for v4.2-rc1
This will update the ACPICA code in the kernel to upstream revision
20150619 (a bug-fix release mostly including stable-candidate fixes)
and restore an earlier ACPICA commit that had to be reverted due to a
regression introduced by it (the regression is addressed by
blacklisting the only known system affected by it to date).
The only new feature added by this update is the support for
overriding objects in the ACPI namespace and a new ACPI table that can
be used for that called the Override System Definition Table (OSDT).
That should allow us to "patch" the ACPI namespace built from
incomplete or incorrect ACPI System Definition tables (DSDT, SSDT)
during system startup without the need to provide replacements for all
of those tables in the future.
Specifics:
- Fix system resume problems related to 32-bit and 64-bit versions of
the Firmware ACPI Control Structure (FACS) in the firmare (Lv
Zheng)
- Fix double initialization of the FACS (Lv Zheng)
- Add _CLS object processing code to ACPICA (Suravee Suthikulpanit)
- Add support for the (currently missing) new GIC version field in
the Multiple APIC Description Table (MADT) (Hanjun Guo)
- Add support for overriding objects in the ACPI namespace to ACPICA
and OSDT support (Lv Zheng, Bob Moore, Zhang Rui)
- Updates related to the TCPA and TPM2 ACPI tables (Bob Moore)
- Restore the commit modifying _REV to always return "2" (as required
by ACPI 6) and add a blacklisting mechanism for systems that may be
affected by that change (Rafael J Wysocki)
- Assorted fixes and cleanups (Bob Moore, Lv Zheng, Sascha Wildner)"
* tag 'acpica-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
Revert 'Revert "ACPICA: Permanently set _REV to the value '2'."'
ACPI / init: Make it possible to override _REV
ACPICA: Update version to 20150619
ACPICA: Comment update, no functional change
ACPICA: Update TPM2 ACPI table
ACPICA: Update definitions for the TCPA and TPM2 ACPI tables
ACPICA: Split C library prototypes to new header
ACPICA: De-macroize calls to standard C library functions
ACPI / acpidump: Update acpidump manual
ACPICA: acpidump: Convert the default behavior to dump from /sys/firmware/acpi/tables
ACPICA: acpidump: Allow customized tables to be dumped without accessing /dev/mem
ACPICA: Cleanup output for the ASL Debug object
ACPICA: Update for acpi_install_table memory types
ACPICA: Namespace: Change namespace override to avoid node deletion
ACPICA: Namespace: Add support of OSDT table
ACPICA: Namespace: Add support to allow overriding objects
ACPICA: ACPI 6.0: Add values for MADT GIC version field
ACPICA: Utilities: Add _CLS processing
ACPICA: Add dragon_fly support to unix file mapping file
ACPICA: EFI: Add EFI interface definitions to eliminate dependency of GNU EFI
...
Revert commit ff284f37fc (Revert "ACPICA: Permanently set _REV to
the value '2'.) as the regression introduced by commit b1ef297258
reverted by it is now addressed via a blacklist entry.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The platform firmware on some systems expects Linux to return "5" as
the supported ACPI revision which makes it expose system configuration
information in a special way.
For example, based on what ACPI exports as the supported revision,
Dell XPS 13 (2015) configures its audio device to either work in HDA
mode or in I2S mode, where the former is supposed to be used on Linux
until the latter is fully supported (in the kernel as well as in user
space).
Since ACPI 6 mandates that _REV should return "2" if ACPI 2 or later
is supported by the OS, a subsequent change will make that happen, so
make it possible to override that on systems where "5" is expected to
be returned for Linux to work correctly one them (such as the Dell
machine mentioned above).
Original-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit f51bf8497889a94046820639537165bbd7ccdee6
Adds acclib.h
This patch doesn't affect the Linux kernel.
Link: https://github.com/acpica/acpica/commit/f51bf849
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 3b1026e0bdd3c32eb6d5d313f3ba0b1fee7597b4
ACPICA commit 00f0dc83f5cfca53b27a3213ae0d7719b88c2d6b
ACPICA commit 47d22a738d0e19fd241ffe4e3e9d4e198e4afc69
Across all of ACPICA. Replace C library macros such as ACPI_STRLEN with the
standard names such as strlen. The original purpose for these macros is
long since obsolete.
Also cast various invocations as necessary. Bob Moore, Jung-uk Kim, Lv Zheng.
Link: https://github.com/acpica/acpica/commit/3b1026e0
Link: https://github.com/acpica/acpica/commit/00f0dc83
Link: https://github.com/acpica/acpica/commit/47d22a73
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Jung-uk Kim <jkim@FreeBSD.org>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit d4a53a396fe5d384425251b0257f8d125bbed617
Especially for use of the Index operator. For buffers and strings,
only output the actual byte pointed to by the index. For packages,
only print the package element decoded by the index.
Link: https://github.com/acpica/acpica/commit/d4a53a39
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit c0ce529e1fbb8ec47d2522a3aa10f3ab77e16e41
There is no reference counting implemented for struct acpi_namespace_node, so it
is currently not removable during runtime.
This patch changes the namespace override code to keep the old
struct acpi_namespace_node undeleted so that the override mechanism can happen
during runtime. Bob Moore.
Link: https://github.com/acpica/acpica/commit/c0ce529e
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 27415c82fcecf467446f66d1007a0691cc5f3709
This patch adds OSDT (Override System Definition Table) support.
When OSDT is loaded, conflict namespace objects will be overridden
by the AML interpreter. Bob Moore, Lv Zheng.
Link: https://github.com/acpica/acpica/commit/27415c82
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 6084e34e44565c6293f446c0202b5e59b055e351
This patch adds an "NamespaceOverride" flag in struct acpi_walk_state, and allows
namespace objects to be overridden when this flag is set. Lv Zheng.
Link: https://github.com/acpica/acpica/commit/6084e34e
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 9a2b638acb3a7215209432e070c6bd0312374229
ACPI Device object often contains a _CLS object to supply PCI-defined class
code for the device. This patch introduces logic to process the _CLS
object. Suravee Suthikulpanit, Lv Zheng.
Link: https://github.com/acpica/acpica/commit/9a2b638a
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 90f5332a15e9d9ba83831ca700b2b9f708274658
This patch adds a new FACS initialization flag for acpi_tb_initialize().
acpi_enable_subsystem() might be invoked several times in OS bootup process,
and we don't want FACS initialization to be invoked twice. Lv Zheng.
Link: https://github.com/acpica/acpica/commit/90f5332a
Cc: All applicable <stable@vger.kernel.org> # All applicable
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 368eb60778b27b6ae94d3658ddc902ca1342a963
ACPICA commit 70f62a80d65515e1285fdeeb50d94ee6f07df4bd
ACPICA commit a04dbfa308a48ab0b2d10519c54a6c533c5c8949
ACPICA commit ebd544ed24c5a4faba11f265e228b7a821a729f5
The following commit is reported to have broken s2ram on some platforms:
Commit: 0249ed2444
ACPICA: Add option to favor 32-bit FADT addresses.
The platform reports 2 FACS tables (which is not allowed by ACPI
specification) and the new 32-bit address favor rule forces OSPMs to use
the FACS table reported via FADT's X_FIRMWARE_CTRL field.
The root cause of the reported bug might be one of the followings:
1. BIOS may favor the 64-bit firmware waking vector address when the
version of the FACS is greater than 0 and Linux currently only supports
resuming from the real mode, so the 64-bit firmware waking vector has
never been set and might be invalid to BIOS while the commit enables
higher version FACS.
2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the
FADT while the commit doesn't set the firmware waking vector address of
the FACS reported by "FIRMWARE_CTRL", it only sets the firware waking
vector address of the FACS reported by "X_FIRMWARE_CTRL".
This patch excludes the cases that can trigger the bugs caused by the root
cause 2.
There is no handshaking mechanism can be used by OSPM to tell BIOS which
FACS is currently used. Thus the FACS reported by "FIRMWARE_CTRL" may still
be used by BIOS and the 0 value of the 32-bit firmware waking vector might
trigger such failure.
This patch enables the firmware waking vectors for both 32bit/64bit FACS
tables in order to ensure we can exclude the cases that trigger the bugs
caused by the root cause 2. The exclusion is split into 2 commits so that
if it turns out not to be necessary, this single commit can be reverted
without affecting the useful one. Lv Zheng, Bob Moore.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021
Link: https://github.com/acpica/acpica/commit/368eb607
Link: https://github.com/acpica/acpica/commit/70f62a80
Link: https://github.com/acpica/acpica/commit/a04dbfa3
Link: https://github.com/acpica/acpica/commit/ebd544ed
Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit f7b86f35416e3d1f71c3d816ff5075ddd33ed486
The following commit is reported to have broken s2ram on some platforms:
Commit: 0249ed2444
ACPICA: Add option to favor 32-bit FADT addresses.
The platform reports 2 FACS tables (which is not allowed by ACPI
specification) and the new 32-bit address favor rule forces OSPMs to use
the FACS table reported via FADT's X_FIRMWARE_CTRL field.
The root cause of the reported bug might be one of the followings:
1. BIOS may favor the 64-bit firmware waking vector address when the
version of the FACS is greater than 0 and Linux currently only supports
resuming from the real mode, so the 64-bit firmware waking vector has
never been set and might be invalid to BIOS while the commit enables
higher version FACS.
2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the
FADT while the commit doesn't set the firmware waking vector address of
the FACS reported by "FIRMWARE_CTRL", it only sets the firware waking
vector address of the FACS reported by "X_FIRMWARE_CTRL".
This patch excludes the cases that can trigger the bugs caused by the root
cause 2.
There is no handshaking mechanism can be used by OSPM to tell BIOS which
FACS is currently used. Thus the FACS reported by "FIRMWARE_CTRL" may still
be used by BIOS and the 0 value of the 32-bit firmware waking vector might
trigger such failure.
This patch tries to favor 32bit FACS address in another way where both the
FACS reported by "FIRMWARE_CTRL" and the FACS reported by "X_FIRMWARE_CTRL"
are loaded so that further commit can set firmware waking vector in the
both tables to ensure we can exclude the cases that trigger the bugs caused
by the root cause 2. The exclusion is split into 2 commits as this commit
is also useful for dumping more ACPI tables, it won't get reverted when
such exclusion is no longer necessary. Lv Zheng.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021
Link: https://github.com/acpica/acpica/commit/f7b86f35
Cc: 3.14.1+ <stable@vger.kernel.org> # 3.14.1+
Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 7aa598d711644ab0de5f70ad88f1e2de253115e4
The following commit is reported to have broken s2ram on some platforms:
Commit: 0249ed2444
ACPICA: Add option to favor 32-bit FADT addresses.
The platform reports 2 FACS tables (which is not allowed by ACPI
specification) and the new 32-bit address favor rule forces OSPMs to use
the FACS table reported via FADT's X_FIRMWARE_CTRL field.
The root cause of the reported bug might be one of the followings:
1. BIOS may favor the 64-bit firmware waking vector address when the
version of the FACS is greater than 0 and Linux currently only supports
resuming from the real mode, so the 64-bit firmware waking vector has
never been set and might be invalid to BIOS while the commit enables
higher version FACS.
2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the
FADT while the commit doesn't set the firmware waking vector address of
the FACS reported by "FIRMWARE_CTRL", it only sets the firware waking
vector address of the FACS reported by "X_FIRMWARE_CTRL".
This patch excludes the cases that can trigger the bugs caused by the root
cause 1.
ACPI specification says:
A. 32-bit FACS address (FIRMWARE_CTRL field in FADT):
Physical memory address of the FACS, where OSPM and firmware exchange
control information.
If the X_FIRMWARE_CTRL field contains a non zero value then this field
must be zero.
A zero value indicates that no FACS is specified by this field.
B. 64-bit FACS address (X_FIRMWARE_CTRL field in FADT):
64bit physical memory address of the FACS.
This field is used when the physical address of the FACS is above 4GB.
If the FIRMWARE_CTRL field contains a non zero value then this field
must be zero.
A zero value indicates that no FACS is specified by this field.
Thus the 32bit and 64bit firmware waking vector should indicate completely
different resuming environment - real mode (1MB addressable) and non real
mode (4GB+ addressable) and currently Linux only supports resuming from
real mode.
This patch enables 64-bit firmware waking vector for selected FACS via new
acpi_set_firmware_waking_vectors() API so that it's up to OSPMs to
determine which resuming mode should be used by BIOS and ACPICA changes
won't trigger the bugs caused by the root cause 1. Lv Zheng.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021
Link: https://github.com/acpica/acpica/commit/7aa598d7
Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Fix a recently added memory leak in an error path in the ACPI
resources management code (Dan Carpenter).
- Fix a build warning triggered by an ACPI video header function
that should be static inline (Borislav Petkov).
- Change names of helper function converting struct fwnode_handle
pointers to either struct device_node or struct acpi_device
pointers so they don't conflict with local variable names
(Alexander Sverdlin).
- Make the hibernate core re-enable nonboot CPUs on failures to
disable them as expected (Vitaly Kuznetsov).
- Increase the default timeout of the device suspend watchdog to
prevent it from triggering too early on some systems (Takashi Iwai).
- Prevent the cpuidle powernv driver from registering idle
states with CPUIDLE_FLAG_TIMER_STOP set if CONFIG_TICK_ONESHOT
is unset which leads to boot hangs (Preeti U Murthy).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJVksL4AAoJEILEb/54YlRxY9cP/jnXE12Jv2aYQAram5Fd7nY7
LSiuKAzVCqJbdBa3sRILKjMwgxciYABXypw6Zapa8TimAV374GZh6W4VXgIAifDf
gdicSSxB4A5cViUEte3uebzdaM2QMcOcQ6A+UOe849q3emfbu91f0LXTWwahR0og
Hjs7QCcvZS/swQIIY0JaIivC5mRwxrx141oPVn4GNf0tnOzH7eOUktYCZwh4U4Qo
FwUn66XI+ttlrRxs/IV5QSQ9S5qDBHOKdv6MgmGwMzUXINTL4w2Zawg+Pw0m3Puf
2VnT9jsz4anD1jZMJGLpeKNAjFJZ6ODiv06HjYhscaUZuMSECUiE9f6auUkiSU1F
r463ujaIcCW5MWUGjRBq5GwP15IuIL/NxjWAVyyMxeYcjOKrpQeEJ2liWnXPv/Bh
JVOoktFvFu1iOojlWfwnaC83bjZiJIJ6BFEywIm7l0wD1VAmk8IFepMqIrTp4U9R
ptKD8Go9GripftHryqSQA5wgp64hIQKxSHi+fEM6RfcLskXVKxEYSButva3wnFHx
2lQSxro42jsaP8O9T1wmb+br0h3efPeo9qyewmjld+vISeUhmsCILta55VtXI0wu
ektWoAuaA97d+u+mTee4U2kONXo+yUuXe0YqlCb3x/7m8oMg6hoblWZPnWFSeqeH
T/vI5cRS+a3NqAjctwmX
=vkde
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are fixes that didn't make it to the previous PM+ACPI pull
request or are fixing issues introduced by it.
Specifics:
- Fix a recently added memory leak in an error path in the ACPI
resources management code (Dan Carpenter)
- Fix a build warning triggered by an ACPI video header function that
should be static inline (Borislav Petkov)
- Change names of helper function converting struct fwnode_handle
pointers to either struct device_node or struct acpi_device
pointers so they don't conflict with local variable names
(Alexander Sverdlin)
- Make the hibernate core re-enable nonboot CPUs on failures to
disable them as expected (Vitaly Kuznetsov)
- Increase the default timeout of the device suspend watchdog to
prevent it from triggering too early on some systems (Takashi Iwai)
- Prevent the cpuidle powernv driver from registering idle states
with CPUIDLE_FLAG_TIMER_STOP set if CONFIG_TICK_ONESHOT is unset
which leads to boot hangs (Preeti U Murthy)"
* tag 'pm+acpi-4.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode
PM / sleep: Increase default DPM watchdog timeout to 60
PM / hibernate: re-enable nonboot cpus on disable_nonboot_cpus() failure
ACPI / OF: Rename of_node() and acpi_node() to to_of_node() and to_acpi_node()
ACPI / video: Inline acpi_video_set_dmi_backlight_type
ACPI / resources: free memory on error in add_region_before()
4 drivers / enabling modules:
NFIT:
Instantiates an "nvdimm bus" with the core and registers memory devices
(NVDIMMs) enumerated by the ACPI 6.0 NFIT (NVDIMM Firmware Interface
table). After registering NVDIMMs the NFIT driver then registers
"region" devices. A libnvdimm-region defines an access mode and the
boundaries of persistent memory media. A region may span multiple
NVDIMMs that are interleaved by the hardware memory controller. In
turn, a libnvdimm-region can be carved into a "namespace" device and
bound to the PMEM or BLK driver which will attach a Linux block device
(disk) interface to the memory.
PMEM:
Initially merged in v4.1 this driver for contiguous spans of persistent
memory address ranges is re-worked to drive PMEM-namespaces emitted by
the libnvdimm-core. In this update the PMEM driver, on x86, gains the
ability to assert that writes to persistent memory have been flushed all
the way through the caches and buffers in the platform to persistent
media. See memcpy_to_pmem() and wmb_pmem().
BLK:
This new driver enables access to persistent memory media through "Block
Data Windows" as defined by the NFIT. The primary difference of this
driver to PMEM is that only a small window of persistent memory is
mapped into system address space at any given point in time. Per-NVDIMM
windows are reprogrammed at run time, per-I/O, to access different
portions of the media. BLK-mode, by definition, does not support DAX.
BTT:
This is a library, optionally consumed by either PMEM or BLK, that
converts a byte-accessible namespace into a disk with atomic sector
update semantics (prevents sector tearing on crash or power loss). The
sinister aspect of sector tearing is that most applications do not know
they have a atomic sector dependency. At least today's disk's rarely
ever tear sectors and if they do one almost certainly gets a CRC error
on access. NVDIMMs will always tear and always silently. Until an
application is audited to be robust in the presence of sector-tearing
the usage of BTT is recommended.
Thanks to: Ross Zwisler, Jeff Moyer, Vishal Verma, Christoph Hellwig,
Ingo Molnar, Neil Brown, Boaz Harrosh, Robert Elliott, Matthew Wilcox,
Andy Rudoff, Linda Knippers, Toshi Kani, Nicholas Moulin, Rafael
Wysocki, and Bob Moore.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVjZGBAAoJEB7SkWpmfYgC4fkP/j+k6HmSRNU/yRYPyo7CAWvj
3P5P1i6R6nMZZbjQrQArAXaIyLlFk4sEQDYsciR6dmslhhFZAkR2eFwVO5rBOyx3
QN0yxEpyjJbroRFUrV/BLaFK4cq2oyJAFFHs0u7/pLHBJ4MDMqfRKAMtlnBxEkTE
LFcqXapSlvWitSbjMdIBWKFEvncaiJ2mdsFqT4aZqclBBTj00eWQvEG9WxleJLdv
+tj7qR/vGcwOb12X5UrbQXgwtMYos7A6IzhHbqwQL8IrOcJ6YB8NopJUpLDd7ZVq
KAzX6ZYMzNueN4uvv6aDfqDRLyVL7qoxM9XIjGF5R8SV9sF2LMspm1FBpfowo1GT
h2QMr0ky1nHVT32yspBCpE9zW/mubRIDtXxEmZZ53DIc4N6Dy9jFaNVmhoWtTAqG
b9pndFnjUzzieCjX5pCvo2M5U6N0AQwsnq76/CasiWyhSa9DNKOg8MVDRg0rbxb0
UvK0v8JwOCIRcfO3qiKcx+02nKPtjCtHSPqGkFKPySRvAdb+3g6YR26CxTb3VmnF
etowLiKU7HHalLvqGFOlDoQG6viWes9Zl+ZeANBOCVa6rL2O7ZnXJtYgXf1wDQee
fzgKB78BcDjXH4jHobbp/WBANQGN/GF34lse8yHa7Ym+28uEihDvSD1wyNLnefmo
7PJBbN5M5qP5tD0aO7SZ
=VtWG
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm
Pull libnvdimm subsystem from Dan Williams:
"The libnvdimm sub-system introduces, in addition to the
libnvdimm-core, 4 drivers / enabling modules:
NFIT:
Instantiates an "nvdimm bus" with the core and registers memory
devices (NVDIMMs) enumerated by the ACPI 6.0 NFIT (NVDIMM Firmware
Interface table).
After registering NVDIMMs the NFIT driver then registers "region"
devices. A libnvdimm-region defines an access mode and the
boundaries of persistent memory media. A region may span multiple
NVDIMMs that are interleaved by the hardware memory controller. In
turn, a libnvdimm-region can be carved into a "namespace" device and
bound to the PMEM or BLK driver which will attach a Linux block
device (disk) interface to the memory.
PMEM:
Initially merged in v4.1 this driver for contiguous spans of
persistent memory address ranges is re-worked to drive
PMEM-namespaces emitted by the libnvdimm-core.
In this update the PMEM driver, on x86, gains the ability to assert
that writes to persistent memory have been flushed all the way
through the caches and buffers in the platform to persistent media.
See memcpy_to_pmem() and wmb_pmem().
BLK:
This new driver enables access to persistent memory media through
"Block Data Windows" as defined by the NFIT. The primary difference
of this driver to PMEM is that only a small window of persistent
memory is mapped into system address space at any given point in
time.
Per-NVDIMM windows are reprogrammed at run time, per-I/O, to access
different portions of the media. BLK-mode, by definition, does not
support DAX.
BTT:
This is a library, optionally consumed by either PMEM or BLK, that
converts a byte-accessible namespace into a disk with atomic sector
update semantics (prevents sector tearing on crash or power loss).
The sinister aspect of sector tearing is that most applications do
not know they have a atomic sector dependency. At least today's
disk's rarely ever tear sectors and if they do one almost certainly
gets a CRC error on access. NVDIMMs will always tear and always
silently. Until an application is audited to be robust in the
presence of sector-tearing the usage of BTT is recommended.
Thanks to: Ross Zwisler, Jeff Moyer, Vishal Verma, Christoph Hellwig,
Ingo Molnar, Neil Brown, Boaz Harrosh, Robert Elliott, Matthew Wilcox,
Andy Rudoff, Linda Knippers, Toshi Kani, Nicholas Moulin, Rafael
Wysocki, and Bob Moore"
* tag 'libnvdimm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm: (33 commits)
arch, x86: pmem api for ensuring durability of persistent memory updates
libnvdimm: Add sysfs numa_node to NVDIMM devices
libnvdimm: Set numa_node to NVDIMM devices
acpi: Add acpi_map_pxm_to_online_node()
libnvdimm, nfit: handle unarmed dimms, mark namespaces read-only
pmem: flag pmem block devices as non-rotational
libnvdimm: enable iostat
pmem: make_request cleanups
libnvdimm, pmem: fix up max_hw_sectors
libnvdimm, blk: add support for blk integrity
libnvdimm, btt: add support for blk integrity
fs/block_dev.c: skip rw_page if bdev has integrity
libnvdimm: Non-Volatile Devices
tools/testing/nvdimm: libnvdimm unit test infrastructure
libnvdimm, nfit, nd_blk: driver for BLK-mode access persistent memory
nd_btt: atomic sector updates
libnvdimm: infrastructure for btt devices
libnvdimm: write blk label set
libnvdimm: write pmem label set
libnvdimm: blk labels and namespace instantiation
...
Add support of sysfs 'numa_node' to I/O-related NVDIMM devices
under /sys/bus/nd/devices, regionN, namespaceN.0, and bttN.x.
An example of numa_node values on a 2-socket system with a single
NVDIMM range on each socket is shown below.
/sys/bus/nd/devices
|-- btt0.0/numa_node:0
|-- btt1.0/numa_node:1
|-- btt1.1/numa_node:1
|-- namespace0.0/numa_node:0
|-- namespace1.0/numa_node:1
|-- region0/numa_node:0
|-- region1/numa_node:1
These numa_node files are then linked under the block class of
their device names.
/sys/class/block/pmem0/device/numa_node:0
/sys/class/block/pmem1s/device/numa_node:1
This enables numactl(8) to accept 'block:' and 'file:' paths of
pmem and btt devices as shown in the examples below.
numactl --preferred block:pmem0 --show
numactl --preferred file:/dev/pmem1s --show
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ACPI NFIT table has System Physical Address Range Structure entries that
describe a proximity ID of each range when ACPI_NFIT_PROXIMITY_VALID is
set in the flags.
Change acpi_nfit_register_region() to map a proximity ID to its node ID,
and set it to a new numa_node field of nd_region_desc, which is then
conveyed to the nd_region device.
The device core arranges for btt and namespace devices to inherit their
node from their parent region.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
[djbw: move set_dev_node() from region.c to bus.c]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The kernel initializes CPU & memory's NUMA topology from ACPI
SRAT table. Some other ACPI tables, such as NFIT and DMAR, also
contain proximity IDs for their device's NUMA topology. This
information can be used to improve performance of these devices.
This patch introduces acpi_map_pxm_to_online_node(), which is
similar to acpi_map_pxm_to_node(), but always returns an online
node. When the mapped node from a given proximity ID is offline,
it looks up the node distance table and returns the nearest
online node.
ACPI device drivers, which are called after the NUMA initialization
has completed in the kernel, can call this interface to obtain their
device NUMA topology from ACPI tables. Such drivers do not have to
deal with offline nodes. A node may be offline when a device
proximity ID is unique, SRAT memory entry does not exist, or NUMA is
disabled, ex. "numa=off" on x86.
This patch also moves the pxm range check from acpi_get_node() to
acpi_map_pxm_to_node().
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Upon detection of an unarmed dimm in a region, arrange for descendant
BTT, PMEM, or BLK instances to be read-only. A dimm is primarily marked
"unarmed" via flags passed by platform firmware (NFIT).
The flags in the NFIT memory device sub-structure indicate the state of
the data on the nvdimm relative to its energy source or last "flush to
persistence". For the most part there is nothing the driver can do but
advertise the state of these flags in sysfs and emit a message if
firmware indicates that the contents of the device may be corrupted.
However, for the case of ACPI_NFIT_MEM_ARMED, the driver can arrange for
the block devices incorporating that nvdimm to be marked read-only.
This is a safe default as the data is still available and new writes are
held off until the administrator either forces read-write mode, or the
energy source becomes armed.
A 'read_only' attribute is added to REGION devices to allow for
overriding the default read-only policy of all descendant block devices.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
'libnvdimm' is the first driver sub-system in the kernel to implement
mocking for unit test coverage. The nfit_test module gets built as an
external module and arranges for external module replacements of nfit,
libnvdimm, nd_pmem, and nd_blk. These replacements use the linker
--wrap option to redirect calls to ioremap() + request_mem_region() to
custom defined unit test resources. The end result is a fully
functional nvdimm_bus, as far as userspace is concerned, but with the
capability to perform otherwise destructive tests on emulated resources.
Q: Why not use QEMU for this emulation?
QEMU is not suitable for unit testing. QEMU's role is to faithfully
emulate the platform. A unit test's role is to unfaithfully implement
the platform with the goal of triggering bugs in the corners of the
sub-system implementation. As bugs are discovered in platforms, or the
sub-system itself, the unit tests are extended to backstop a fix with a
reproducer unit test.
Another problem with QEMU is that it would require coordination of 3
software projects instead of 2 (kernel + libndctl [1]) to maintain and
execute the tests. The chances for bit rot and the difficulty of
getting the tests running goes up non-linearly the more components
involved.
Q: Why submit this to the kernel tree instead of external modules in
libndctl?
Simple, to alleviate the same risk that out-of-tree external modules
face. Updates to drivers/nvdimm/ can be immediately evaluated to see if
they have any impact on tools/testing/nvdimm/.
Q: What are the negative implications of merging this?
It is a unique maintenance burden because the purpose of mocking an
interface to enable a unit test is to purposefully short circuit the
semantics of a routine to enable testing. For example
__wrap_ioremap_cache() fakes the pmem driver into "ioremap()'ing" a test
resource buffer allocated by dma_alloc_coherent(). The future
maintenance burden hits when someone changes the semantics of
ioremap_cache() and wonders what the implications are for the unit test.
[1]: https://github.com/pmem/ndctl
Cc: <linux-acpi@vger.kernel.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The libnvdimm implementation handles allocating dimm address space (DPA)
between PMEM and BLK mode interfaces. After DPA has been allocated from
a BLK-region to a BLK-namespace the nd_blk driver attaches to handle I/O
as a struct bio based block device. Unlike PMEM, BLK is required to
handle platform specific details like mmio register formats and memory
controller interleave. For this reason the libnvdimm generic nd_blk
driver calls back into the bus provider to carry out the I/O.
This initial implementation handles the BLK interface defined by the
ACPI 6 NFIT [1] and the NVDIMM DSM Interface Example [2] composed from
DCR (dimm control region), BDW (block data window), IDT (interleave
descriptor) NFIT structures and the hardware register format.
[1]: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
[2]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
BTT stands for Block Translation Table, and is a way to provide power
fail sector atomicity semantics for block devices that have the ability
to perform byte granularity IO. It relies on the capability of libnvdimm
namespace devices to do byte aligned IO.
The BTT works as a stacked blocked device, and reserves a chunk of space
from the backing device for its accounting metadata. It is a bio-based
driver because all IO is done synchronously, and there is no queuing or
asynchronous completions at either the device or the driver level.
The BTT uses 'lanes' to index into various 'on-disk' data structures,
and lanes also act as a synchronization mechanism in case there are more
CPUs than available lanes. We did a comparison between two lane lock
strategies - first where we kept an atomic counter around that tracked
which was the last lane that was used, and 'our' lane was determined by
atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
theoretically, no CPU would be blocked waiting for a lane. The other
strategy was to use the cpu number we're scheduled on to and hash it to
a lane number. Theoretically, this could block an IO that could've
otherwise run using a different, free lane. But some fio workloads
showed that the direct cpu -> lane hash performed faster than tracking
'last lane' - my reasoning is the cache thrash caused by moving the
atomic variable made that approach slower than simply waiting out the
in-progress IO. This supports the conclusion that the driver can be a
very simple bio-based one that does synchronous IOs instead of queuing.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
[jmoyer: fix nmi watchdog timeout in btt_map_init]
[jmoyer: move btt initialization to module load path]
[jmoyer: fix memory leak in the btt initialization path]
[jmoyer: Don't overwrite corrupted arenas]
Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pull thermal management updates from Zhang Rui:
"Specifics:
- enhance Thermal Framework with several new capabilities:
* use power estimates
* compute weights with relative integers instead of percentages
* allow governors to have private data in thermal zones
* export thermal zone parameters through sysfs
Thanks to the ARM thermal team (Javi, Punit, KP).
- introduce a new thermal governor: power allocator. First in kernel
closed loop PI(D) controller for thermal control. Thanks to ARM
thermal team.
- enhance OF thermal to allow thermal zones to have sustainable power
HW specification. Thanks to Punit.
- introduce thermal driver for Intel Quark SoC x1000platform. Thanks
to Ong, Boon Leong.
- introduce QPNP PMIC temperature alarm driver. Thanks to Ivan T. I.
- introduce thermal driver for Hisilicon hi6220. Thanks to
kongxinwei.
- enhance Exynos thermal driver to handle Exynos5433 TMU. Thanks to
Chanwoo C.
- TI thermal driver now has a better implementation for EOCZ bit.
From Pavel M.
- add id for Skylake processors in int340x processor thermal driver.
- a couple of small fixes and cleanups."
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (36 commits)
thermal: hisilicon: add new hisilicon thermal sensor driver
dt-bindings: Document the hi6220 thermal sensor bindings
thermal: of-thermal: add support for reading coefficients property
thermal: support slope and offset coefficients
thermal: power_allocator: round the division when divvying up power
thermal: exynos: Add the support for Exynos5433 TMU
thermal: cpu_cooling: Fix power calculation when CPUs are offline
thermal: cpu_cooling: Remove cpu_dev update on policy CPU update
thermal: export thermal_zone_parameters to sysfs
thermal: cpu_cooling: Check memory allocation of power_table
ti-soc-thermal: request temperature periodically if hw can't do that itself
ti-soc-thermal: implement eocz bit to make driver useful on omap3
cleanup ti-soc-thermal
thermal: remove stale THERMAL_POWER_ACTOR select
thermal: Default OF created trip points to writable
thermal: core: Add Kconfig option to enable writable trips
thermal: x86_pkg_temp: drop const for thermal_zone_parameters
of: thermal: Introduce sustainable power for a thermal zone
thermal: add trace events to the power allocator governor
thermal: introduce the Power Allocator governor
...
On platforms that have firmware support for reading/writing per-dimm
label space, a portion of the dimm may be accessible via an interleave
set PMEM mapping in addition to the dimm's BLK (block-data-window
aperture(s)) interface. A label, stored in a "configuration data
region" on the dimm, disambiguates which dimm addresses are accessed
through which exclusive interface.
Add infrastructure that allows the kernel to block modifications to a
label in the set while any member dimm is active. Note that this is
meant only for enforcing "no modifications of active labels" via the
coarse ioctl command. Adding/deleting namespaces from an active
interleave set is always possible via sysfs.
Another aspect of tracking interleave sets is tracking their integrity
when DIMMs in a set are physically re-ordered. For this purpose we
generate an "interleave-set cookie" that can be recorded in a label and
validated against the current configuration. It is the bus provider
implementation's responsibility to calculate the interleave set cookie
and attach it to a given region.
Cc: Neil Brown <neilb@suse.de>
Cc: <linux-acpi@vger.kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The libnvdimm region driver is an intermediary driver that translates
non-volatile "region"s into "namespace" sub-devices that are surfaced by
persistent memory block-device drivers (PMEM and BLK).
ACPI 6 introduces the concept that a given nvdimm may simultaneously
offer multiple access modes to its media through direct PMEM load/store
access, or windowed BLK mode. Existing nvdimms mostly implement a PMEM
interface, some offer a BLK-like mode, but never both as ACPI 6 defines.
If an nvdimm is single interfaced, then there is no need for dimm
metadata labels. For these devices we can take the region boundaries
directly to create a child namespace device (nd_namespace_io).
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A "region" device represents the maximum capacity of a BLK range (mmio
block-data-window(s)), or a PMEM range (DAX-capable persistent memory or
volatile memory), without regard for aliasing. Aliasing, in the
dimm-local address space (DPA), is resolved by metadata on a dimm to
designate which exclusive interface will access the aliased DPA ranges.
Support for the per-dimm metadata/label arrvies is in a subsequent
patch.
The name format of "region" devices is "regionN" where, like dimms, N is
a global ida index assigned at discovery time. This id is not reliable
across reboots nor in the presence of hotplug. Look to attributes of
the region or static id-data of the sub-namespace to generate a
persistent name. However, if the platform configuration does not change
it is reasonable to expect the same region id to be assigned at the next
boot.
"region"s have 2 generic attributes "size", and "mapping"s where:
- size: the BLK accessible capacity or the span of the
system physical address range in the case of PMEM.
- mappingN: a tuple describing a dimm's contribution to the region's
capacity in the format (<nmemX>,<dpa>,<size>). For a PMEM-region
there will be at least one mapping per dimm in the interleave set. For
a BLK-region there is only "mapping0" listing the starting DPA of the
BLK-region and the available DPA capacity of that space (matches "size"
above).
The max number of mappings per "region" is hard coded per the
constraints of sysfs attribute groups. That said the number of mappings
per region should never exceed the maximum number of possible dimms in
the system. If the current number turns out to not be enough then the
"mappings" attribute clarifies how many there are supposed to be. "32
should be enough for anybody...".
Cc: Neil Brown <neilb@suse.de>
Cc: <linux-acpi@vger.kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* Implement the device-model infrastructure for loading modules and
attaching drivers to nvdimm devices. This is a simple association of a
nd-device-type number with a driver that has a bitmask of supported
device types. To facilitate userspace bind/unbind operations 'modalias'
and 'devtype', that also appear in the uevent, are added as generic
sysfs attributes for all nvdimm devices. The reason for the device-type
number is to support sub-types within a given parent devtype, be it a
vendor-specific sub-type or otherwise.
* The first consumer of this infrastructure is the driver
for dimm devices. It simply uses control messages to retrieve and
store the configuration-data image (label set) from each dimm.
Note: nd_device_register() arranges for asynchronous registration of
nvdimm bus devices by default.
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Neil Brown <neilb@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Most discovery/configuration of the nvdimm-subsystem is done via sysfs
attributes. However, some nvdimm_bus instances, particularly the
ACPI.NFIT bus, define a small set of messages that can be passed to the
platform. For convenience we derive the initial libnvdimm-ioctl command
formats directly from the NFIT DSM Interface Example formats.
ND_CMD_SMART: media health and diagnostics
ND_CMD_GET_CONFIG_SIZE: size of the label space
ND_CMD_GET_CONFIG_DATA: read label space
ND_CMD_SET_CONFIG_DATA: write label space
ND_CMD_VENDOR: vendor-specific command passthrough
ND_CMD_ARS_CAP: report address-range-scrubbing capabilities
ND_CMD_ARS_START: initiate scrubbing
ND_CMD_ARS_STATUS: report on scrubbing state
ND_CMD_SMART_THRESHOLD: configure alarm thresholds for smart events
If a platform later defines different commands than this set it is
straightforward to extend support to those formats.
Most of the commands target a specific dimm. However, the
address-range-scrubbing commands target the bus. The 'commands'
attribute in sysfs of an nvdimm_bus, or nvdimm, enumerate the supported
commands for that object.
Cc: <linux-acpi@vger.kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Enable nvdimm devices to be registered on a nvdimm_bus. The kernel
assigned device id for nvdimm devicesis dynamic. If userspace needs a
more static identifier it should consult a provider-specific attribute.
In the case where NFIT is the provider, the 'nmemX/nfit/handle' or
'nmemX/nfit/serial' attributes may be used for this purpose.
Cc: Neil Brown <neilb@suse.de>
Cc: <linux-acpi@vger.kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The control device for a nvdimm_bus is registered as an "nd" class
device. The expectation is that there will usually only be one "nd" bus
registered under /sys/class/nd. However, we allow for the possibility
of multiple buses and they will listed in discovery order as
ndctl0...ndctlN. This character device hosts the ioctl for passing
control messages. The initial command set has a 1:1 correlation with
the commands listed in the by the "NFIT DSM Example" document [1], but
this scheme is extensible to future command sets.
Note, nd_ioctl() and the backing ->ndctl() implementation are defined in
a subsequent patch. This is simply the initial registrations and sysfs
attributes.
[1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
Cc: Neil Brown <neilb@suse.de>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: <linux-acpi@vger.kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A struct nvdimm_bus is the anchor device for registering nvdimm
resources and interfaces, for example, a character control device,
nvdimm devices, and I/O region devices. The ACPI NFIT (NVDIMM Firmware
Interface Table) is one possible platform description for such
non-volatile memory resources in a system. The nfit.ko driver attaches
to the "ACPI0012" device that indicates the presence of the NFIT and
parses the table to register a struct nvdimm_bus instance.
Cc: <linux-acpi@vger.kernel.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ACPICA commit cb3d1c79f862cd368d749c9b8d9dced40111b0d0
__FUNCTION__ is MSVC only, in Linux, it is __func__. Lv Zheng.
As noted by Christoph Hellwig: "__func__ is in C99 and never.
__FUNCTION__ is an old extension supported by various compilers."
In ACPICA, this is achieved by string replacement in release script and
this patch contains the source code difference between the Linux upstream
and ACPICA that is caused by the back porting.
Link: https://github.com/acpica/acpica/commit/cb3d1c79
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is a small memory leak on error.
Fixes: 0f1b414d19 (ACPI / PNP: Avoid conflicting resource reservations)
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- ACPICA update to upstream revision 20150515 including basic
support for ACPI 6 features: new ACPI tables introduced by
ACPI 6 (STAO, XENV, WPBT, NFIT, IORT), changes related to the
other tables (DTRM, FADT, LPIT, MADT), new predefined names
(_BTH, _CR3, _DSD, _LPI, _MTL, _PRR, _RDI, _RST, _TFP, _TSN),
fixes and cleanups (Bob Moore, Lv Zheng).
- ACPI device power management core code update to follow ACPI 6
which reflects the ACPI device power management implementation
in Windows (Rafael J Wysocki).
- Rework of the backlight interface selection logic to reduce the
number of kernel command line options and improve the handling
of DMI quirks that may be involved in that and to make the
code generally more straightforward (Hans de Goede).
- Fixes for the ACPI Embedded Controller (EC) driver related to
the handling of EC transactions (Lv Zheng).
- Fix for a regression related to the ACPI resources management
and resulting from a recent change of ACPI initialization code
ordering (Rafael J Wysocki).
- Fix for a system initialization regression related to ACPI
introduced during the 3.14 cycle and caused by running the
code that switches the platform over to the ACPI mode too
early in the initialization sequence (Rafael J Wysocki).
- Support for the ACPI _CCA device configuration object related
to DMA cache coherence (Suravee Suthikulpanit).
- ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).
- ACPI battery driver cleanups (Luis Henriques, Mathias Krause).
- ACPI processor driver cleanups (Hanjun Guo).
- Cleanups and documentation update related to the ACPI device
properties interface based on _DSD (Rafael J Wysocki).
- ACPI device power management fixes (Rafael J Wysocki).
- Assorted cleanups related to ACPI (Dominik Brodowski. Fabian
Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).
- Fix for a long-standing issue causing General Protection Faults
to be generated occasionally on return to user space after resume
from ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).
- Fix to make the suspend core code return -EBUSY consistently in
all cases when system suspend is aborted due to wakeup detection
(Ruchi Kandoi).
- Support for automated device wakeup IRQ handling allowing drivers
to make their PM support more starightforward (Tony Lindgren).
- New tracepoints for suspend-to-idle tracing and rework of the
prepare/complete callbacks tracing in the PM core (Todd E Brandt,
Rafael J Wysocki).
- Wakeup sources framework enhancements (Jin Qian).
- New macro for noirq system PM callbacks (Grygorii Strashko).
- Assorted cleanups related to system suspend (Rafael J Wysocki).
- cpuidle core cleanups to make the code more efficient (Rafael J
Wysocki).
- powernv/pseries cpuidle driver update (Shilpasri G Bhat).
- cpufreq core fixes related to CPU online/offline that should
reduce the overhead of these operations quite a bit, unless the
CPU in question is physically going away (Viresh Kumar, Saravana
Kannan).
- Serialization of cpufreq governor callbacks to avoid race
conditions in some cases (Viresh Kumar).
- intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
Bhargava, Joe Konno).
- cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
Holla, Felipe Balbi, Tang Yuantian).
- Assorted cleanups in cpufreq drivers and core (Shailendra Verma,
Fabian Frederick, Wang Long).
- New Device Tree bindings for representing Operating Performance
Points (Viresh Kumar).
- Updates for the common clock operations support code in the PM
core (Rajendra Nayak, Geert Uytterhoeven).
- PM domains core code update (Geert Uytterhoeven).
- Intel Knights Landing support for the RAPL (Running Average Power
Limit) power capping driver (Dasaratharaman Chandramouli).
- Fixes related to the floor frequency setting on Atom SoCs in the
RAPL power capping driver (Ajay Thomas).
- Runtime PM framework documentation update (Ben Dooks).
- cpupower tool fix (Herton R Krzesinski).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJViJdWAAoJEILEb/54YlRx/9gP/3gHoFevNRycvn0VpKqdufCI
Mxy2LBBLlfyW2uD3+NvqvA2WWSo0Cs/LgXa04eAVxPdU7k48s8w+54U23wSouzjW
gfwAmuHxzDR8v0h8X3h6BxNzmkIQHtmDcQlA/cZdHejY/UUw01yxRGNUUZDNbxlm
WXn2nmlBLmGqXTYq0fpBV+3jicUghJqHHsBCqa3VR2yQioHMJG01F4UZMqYTZunN
OIvDUghxByKz6alzdCqlLl1Y0exV6vwWUAzBsl1qHqmHu/bWFSZn3ujNNVrjqHhw
Kl7/8dC2pQkv3Zo3gEVvfQ0onotwWZxGHzPQRdvmxvRnBunQVCi/wynx90yABX/r
PPb/iBNV0mZskbF0zb0GZT3ZZWGA8Z0p3o5JQv2jV4m62qTzx8w50Y5kbn9N1WT+
5bre7AVbVAlGonWszcS9iE+6TOboRz9OD1CCwPFXHItFutlBkau+1hHfFoLM0o9n
LhpGuyszT/EUa1BHkLzuCckFqO2DpbF3N2CKmuTekw0CdgdsvRL2pRByuerk3j7R
WQhlcvBq5YH6j43AuoEZKp8r1iN8oG/iqlrMYQaYWrW9hJaoQOoU8dGJxp/e7gKN
r/qeYjETI+tIsjCbtH5WQzzxDI3gPISAYAtfqs7G34EEo+Lwp6kyRUAF4kDot2V3
ZIyuKMmTu4cdwDETr/O+
=7jTj
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"The rework of backlight interface selection API from Hans de Goede
stands out from the number of commits and the number of affected
places perspective. The cpufreq core fixes from Viresh Kumar are
quite significant too as far as the number of commits goes and because
they should reduce CPU online/offline overhead quite a bit in the
majority of cases.
From the new featues point of view, the ACPICA update (to upstream
revision 20150515) adding support for new ACPI 6 material to ACPICA is
the one that matters the most as some new significant features will be
based on it going forward. Also included is an update of the ACPI
device power management core to follow ACPI 6 (which in turn reflects
the Windows' device PM implementation), a PM core extension to support
wakeup interrupts in a more generic way and support for the ACPI _CCA
device configuration object.
The rest is mostly fixes and cleanups all over and some documentation
updates, including new DT bindings for Operating Performance Points.
There is one fix for a regression introduced in the 4.1 cycle, but it
adds quite a number of lines of code, it wasn't really ready before
Thursday and you were on vacation, so I refrained from pushing it on
the last minute for 4.1.
Specifics:
- ACPICA update to upstream revision 20150515 including basic support
for ACPI 6 features: new ACPI tables introduced by ACPI 6 (STAO,
XENV, WPBT, NFIT, IORT), changes related to the other tables (DTRM,
FADT, LPIT, MADT), new predefined names (_BTH, _CR3, _DSD, _LPI,
_MTL, _PRR, _RDI, _RST, _TFP, _TSN), fixes and cleanups (Bob Moore,
Lv Zheng).
- ACPI device power management core code update to follow ACPI 6
which reflects the ACPI device power management implementation in
Windows (Rafael J Wysocki).
- rework of the backlight interface selection logic to reduce the
number of kernel command line options and improve the handling of
DMI quirks that may be involved in that and to make the code
generally more straightforward (Hans de Goede).
- fixes for the ACPI Embedded Controller (EC) driver related to the
handling of EC transactions (Lv Zheng).
- fix for a regression related to the ACPI resources management and
resulting from a recent change of ACPI initialization code ordering
(Rafael J Wysocki).
- fix for a system initialization regression related to ACPI
introduced during the 3.14 cycle and caused by running the code
that switches the platform over to the ACPI mode too early in the
initialization sequence (Rafael J Wysocki).
- support for the ACPI _CCA device configuration object related to
DMA cache coherence (Suravee Suthikulpanit).
- ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).
- ACPI battery driver cleanups (Luis Henriques, Mathias Krause).
- ACPI processor driver cleanups (Hanjun Guo).
- cleanups and documentation update related to the ACPI device
properties interface based on _DSD (Rafael J Wysocki).
- ACPI device power management fixes (Rafael J Wysocki).
- assorted cleanups related to ACPI (Dominik Brodowski, Fabian
Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).
- fix for a long-standing issue causing General Protection Faults to
be generated occasionally on return to user space after resume from
ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).
- fix to make the suspend core code return -EBUSY consistently in all
cases when system suspend is aborted due to wakeup detection (Ruchi
Kandoi).
- support for automated device wakeup IRQ handling allowing drivers
to make their PM support more starightforward (Tony Lindgren).
- new tracepoints for suspend-to-idle tracing and rework of the
prepare/complete callbacks tracing in the PM core (Todd E Brandt,
Rafael J Wysocki).
- wakeup sources framework enhancements (Jin Qian).
- new macro for noirq system PM callbacks (Grygorii Strashko).
- assorted cleanups related to system suspend (Rafael J Wysocki).
- cpuidle core cleanups to make the code more efficient (Rafael J
Wysocki).
- powernv/pseries cpuidle driver update (Shilpasri G Bhat).
- cpufreq core fixes related to CPU online/offline that should reduce
the overhead of these operations quite a bit, unless the CPU in
question is physically going away (Viresh Kumar, Saravana Kannan).
- serialization of cpufreq governor callbacks to avoid race
conditions in some cases (Viresh Kumar).
- intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
Bhargava, Joe Konno).
- cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
Holla, Felipe Balbi, Tang Yuantian).
- assorted cleanups in cpufreq drivers and core (Shailendra Verma,
Fabian Frederick, Wang Long).
- new Device Tree bindings for representing Operating Performance
Points (Viresh Kumar).
- updates for the common clock operations support code in the PM core
(Rajendra Nayak, Geert Uytterhoeven).
- PM domains core code update (Geert Uytterhoeven).
- Intel Knights Landing support for the RAPL (Running Average Power
Limit) power capping driver (Dasaratharaman Chandramouli).
- fixes related to the floor frequency setting on Atom SoCs in the
RAPL power capping driver (Ajay Thomas).
- runtime PM framework documentation update (Ben Dooks).
- cpupower tool fix (Herton R Krzesinski)"
* tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (194 commits)
cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
x86: Load __USER_DS into DS/ES after resume
PM / OPP: Add binding for 'opp-suspend'
PM / OPP: Allow multiple OPP tables to be passed via DT
PM / OPP: Add new bindings to address shortcomings of existing bindings
ACPI: Constify ACPI device IDs in documentation
ACPI / enumeration: Document the rules regarding the PRP0001 device ID
ACPI / video: Make acpi_video_unregister_backlight() private
acpi-video-detect: Remove old API
toshiba-acpi: Port to new backlight interface selection API
thinkpad-acpi: Port to new backlight interface selection API
sony-laptop: Port to new backlight interface selection API
samsung-laptop: Port to new backlight interface selection API
msi-wmi: Port to new backlight interface selection API
msi-laptop: Port to new backlight interface selection API
intel-oaktrail: Port to new backlight interface selection API
ideapad-laptop: Port to new backlight interface selection API
fujitsu-laptop: Port to new backlight interface selection API
eeepc-laptop: Port to new backlight interface selection API
dell-wmi: Port to new backlight interface selection API
...
This patch reduces source code differences between the Linux kernel and the
ACPICA upstream so that the linuxized ACPICA 20150616 release can be
applied with reduced human intervention.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull x86 core updates from Ingo Molnar:
"There were so many changes in the x86/asm, x86/apic and x86/mm topics
in this cycle that the topical separation of -tip broke down somewhat -
so the result is a more traditional architecture pull request,
collected into the 'x86/core' topic.
The topics were still maintained separately as far as possible, so
bisectability and conceptual separation should still be pretty good -
but there were a handful of merge points to avoid excessive
dependencies (and conflicts) that would have been poorly tested in the
end.
The next cycle will hopefully be much more quiet (or at least will
have fewer dependencies).
The main changes in this cycle were:
* x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas
Gleixner)
- This is the second and most intrusive part of changes to the x86
interrupt handling - full conversion to hierarchical interrupt
domains:
[IOAPIC domain] -----
|
[MSI domain] --------[Remapping domain] ----- [ Vector domain ]
| (optional) |
[HPET MSI domain] ----- |
|
[DMAR domain] -----------------------------
|
[Legacy domain] -----------------------------
This now reflects the actual hardware and allowed us to distangle
the domain specific code from the underlying parent domain, which
can be optional in the case of interrupt remapping. It's a clear
separation of functionality and removes quite some duct tape
constructs which plugged the remap code between ioapic/msi/hpet
and the vector management.
- Intel IOMMU IRQ remapping enhancements, to allow direct interrupt
injection into guests (Feng Wu)
* x86/asm changes:
- Tons of cleanups and small speedups, micro-optimizations. This
is in preparation to move a good chunk of the low level entry
code from assembly to C code (Denys Vlasenko, Andy Lutomirski,
Brian Gerst)
- Moved all system entry related code to a new home under
arch/x86/entry/ (Ingo Molnar)
- Removal of the fragile and ugly CFI dwarf debuginfo annotations.
Conversion to C will reintroduce many of them - but meanwhile
they are only getting in the way, and the upstream kernel does
not rely on them (Ingo Molnar)
- NOP handling refinements. (Borislav Petkov)
* x86/mm changes:
- Big PAT and MTRR rework: making the code more robust and
preparing to phase out exposing direct MTRR interfaces to drivers -
in favor of using PAT driven interfaces (Toshi Kani, Luis R
Rodriguez, Borislav Petkov)
- New ioremap_wt()/set_memory_wt() interfaces to support
Write-Through cached memory mappings. This is especially
important for good performance on NVDIMM hardware (Toshi Kani)
* x86/ras changes:
- Add support for deferred errors on AMD (Aravind Gopalakrishnan)
This is an important RAS feature which adds hardware support for
poisoned data. That means roughly that the hardware marks data
which it has detected as corrupted but wasn't able to correct, as
poisoned data and raises an APIC interrupt to signal that in the
form of a deferred error. It is the OS's responsibility then to
take proper recovery action and thus prolonge system lifetime as
far as possible.
- Add support for Intel "Local MCE"s: upcoming CPUs will support
CPU-local MCE interrupts, as opposed to the traditional system-
wide broadcasted MCE interrupts (Ashok Raj)
- Misc cleanups (Borislav Petkov)
* x86/platform changes:
- Intel Atom SoC updates
... and lots of other cleanups, fixlets and other changes - see the
shortlog and the Git log for details"
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits)
x86/hpet: Use proper hpet device number for MSI allocation
x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled
x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail
genirq: Prevent crash in irq_move_irq()
genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain
iommu, x86: Properly handle posted interrupts for IOMMU hotplug
iommu, x86: Provide irq_remapping_cap() interface
iommu, x86: Setup Posted-Interrupts capability for Intel iommu
iommu, x86: Add cap_pi_support() to detect VT-d PI capability
iommu, x86: Avoid migrating VT-d posted interrupts
iommu, x86: Save the mode (posted or remapped) of an IRTE
iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
iommu: dmar: Provide helper to copy shared irte fields
iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
iommu: Add new member capability to struct irq_remap_ops
x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
...