There are several reports of freezes when the intel_pstate driver
enables the HWP (Hardware P-States) feature on Skylake-based systems.
The root cause has been identified as HWP interrupts causing BIOS code
to freeze.
HWP interrupts use the thermal LVT which can be handled by Linux
natively, but on the affected Skylake-based systems SMM will respond
to it by default. This is a problem for several reasons:
- On the affected systems the SMM thermal LVT handler is broken (it
will crash when invoked) and a BIOS update is necessary to fix it.
- With thermal interrupt handled in SMM we lose all of the reporting
features of the arch/x86/kernel/cpu/mcheck/therm_throt driver.
- Some thermal drivers like x86-package-temp depend on the thermal
threshold interrupts signaled via the thermal LVT.
- The HWP interrupts are useful for debugging and tuning
performance (if the kernel can handle them).
For these reasons, native handling of thermal interrupts needs to be
enabled.
This requires some way to tell SMM that the OS can handle thermal
interrupts. That can be done by using _OSC/_PDC in processor
scope very early during ACPI initialization.
The meaning of _OSC/_PDC bit 12 in processor scope is whether or
not the OS supports native handling of interrupts for Collaborative
Processor Performance Control (CPPC) notifications. Since on
HWP-capable systems CPPC is a firmware interface to HWP, setting
this bit effectively tells the firmware that the OS will handle
thermal interrupts natively going forward.
For details on _OSC/_PDC refer to:
http://www.intel.com/content/www/us/en/standards/processor-vendor-specific-acpi-specification.html
To implement the _OSC/_PDC handshake as described, introduce a new
function, acpi_early_processor_osc(), that walks the ACPI
namespace looking for ACPI processor objects and invokes _OSC for
them with bit 12 in the capabilities buffer set and terminates the
namespace walk on the first success.
Also modify intel_thermal_interrupt() to clear HWP status bits in
the HWP_STATUS MSR to acknowledge HWP interrupts (which prevents
them from firing continuously).
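A minimal sketch of the two steps described above, for illustration only. The names OSC_CPPC_NATIVE_INTR_BIT, struct osc_capbuf, build_early_osc_capbuf() and hwp_status_ack() are hypothetical and do not come from the patch. The two-DWORD buffer layout is an assumption, and the MSR write is modeled as a pure function so the example can run in user space.

```c
#include <stdint.h>

/* Hypothetical name: bit 12 tells firmware the OS handles CPPC/HWP interrupts. */
#define OSC_CPPC_NATIVE_INTR_BIT 12u

/* Illustrative capabilities buffer; the two-DWORD layout is an assumption. */
struct osc_capbuf {
	uint32_t dword[2];
};

/* Build a buffer in which only bit 12 of the capabilities DWORD is set. */
static struct osc_capbuf build_early_osc_capbuf(void)
{
	struct osc_capbuf buf = { { 0u, 0u } };

	buf.dword[1] |= UINT32_C(1) << OSC_CPPC_NATIVE_INTR_BIT;
	return buf;
}

/*
 * Model of acknowledging HWP interrupts: return the status value with the
 * handled bits cleared. A real handler would write the result to the MSR.
 */
static uint64_t hwp_status_ack(uint64_t status, uint64_t handled_bits)
{
	return status & ~handled_bits;
}
```

With this layout the capabilities DWORD ends up as 0x1000, and acknowledging every pending bit leaves a status of zero, which is what stops the interrupt from firing again.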
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject & changelog, function rename ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge tag 'pm+acpi-4.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management and ACPI updates from Rafael Wysocki:
"The second batch of power management and ACPI updates for v4.6.
Included are fixups on top of the previous PM/ACPI pull request and
other material that didn't make it into that one but still should go
into 4.6.
Among other things, there's a fix for an intel_pstate driver issue
uncovered by recent cpufreq changes, a workaround for a boot hang on
Skylake-H related to the handling of deep C-states by the platform and
a PCI/ACPI fix for the handling of IO port resources on non-x86
architectures plus some new device IDs and similar.
Specifics:
- Fix for an intel_pstate driver issue related to the handling of MSR
updates uncovered by the recent cpufreq rework (Rafael Wysocki).
- cpufreq core cleanups related to starting governors and frequency
synchronization during resume from system suspend and a locking fix
for cpufreq_quick_get() (Rafael Wysocki, Richard Cochran).
- acpi-cpufreq and powernv cpufreq driver updates (Jisheng Zhang,
Michael Neuling, Richard Cochran, Shilpasri Bhat).
- intel_idle driver update preventing some Skylake-H systems from
hanging during initialization by disabling deep C-states mishandled
by the platform in the problematic configurations (Len Brown).
- Intel Xeon Phi Processor x200 support for intel_idle
(Dasaratharaman Chandramouli).
- cpuidle menu governor updates to make it always honor PM QoS
latency constraints (and prevent C1 from being used as the fallback
C-state on x86 when they are set below its exit latency) and to
restore the previous behavior to fall back to C1 if the next timer
event is set far enough in the future that was changed in 4.4 which
led to an energy consumption regression (Rik van Riel, Rafael
Wysocki).
- New device ID for a future AMD UART controller in the ACPI driver
for AMD SoCs (Wang Hongcheng).
- Rockchip rk3399 support for the rockchip-io-domain adaptive voltage
scaling (AVS) driver (David Wu).
- ACPI PCI resources management fix for the handling of IO space
resources on architectures where the IO space is memory mapped
(IA64 and ARM64) broken by the introduction of common ACPI
resources parsing for PCI host bridges in 4.4 (Lorenzo Pieralisi).
- Fix for the ACPI backend of the generic device properties API to
make it parse non-device (data node only) children of an ACPI
device correctly (Irina Tirdea).
- Fixes for the handling of global suspend flags (introduced in 4.4)
during hibernation and resume from it (Lukas Wunner).
- Support for obtaining configuration information from Device Trees
in the PM clocks framework (Jon Hunter).
- ACPI _DSM helper code and devfreq framework cleanups (Colin Ian
King, Geert Uytterhoeven)"
* tag 'pm+acpi-4.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
PM / AVS: rockchip-io: add io selectors and supplies for rk3399
intel_idle: Support for Intel Xeon Phi Processor x200 Product Family
intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled
ACPI / PM: Runtime resume devices when waking from hibernate
PM / sleep: Clear pm_suspend_global_flags upon hibernate
cpufreq: governor: Always schedule work on the CPU running update
cpufreq: Always update current frequency before startig governor
cpufreq: Introduce cpufreq_update_current_freq()
cpufreq: Introduce cpufreq_start_governor()
cpufreq: powernv: Add sysfs attributes to show throttle stats
cpufreq: acpi-cpufreq: make Intel/AMD MSR access, io port access static
PCI: ACPI: IA64: fix IO port generic range check
ACPI / util: cast data to u64 before shifting to fix sign extension
cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path
cpuidle: menu: Fall back to polling if next timer event is near
cpufreq: acpi-cpufreq: Clean up hot plug notifier callback
intel_pstate: Do not call wrmsrl_on_cpu() with disabled interrupts
cpufreq: Make cpufreq_quick_get() safe to call
ACPI / property: fix data node parsing in acpi_get_next_subnode()
ACPI / APD: Add device HID for future AMD UART controller
...
Commit 58a1fbbb2e ("PM / PCI / ACPI: Kick devices that might have been
reset by firmware") added a runtime resume for devices that were runtime
suspended when the system entered suspend-to-RAM.
Briefly, the motivation was to ensure that devices did not remain in a
reset-power-on state after resume, potentially preventing deep SoC-wide
low-power states from being entered on idle.
Currently we're not doing the same when leaving suspend-to-disk and this
asymmetry is a problem if drivers rely on the automatic resume triggered
by pm_complete_with_resume_check(). Fix it.
Fixes: 58a1fbbb2e (PM / PCI / ACPI: Kick devices that might have been reset by firmware)
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The [0 - 64k] ACPI PCI IO port resource boundary check in:
acpi_dev_ioresource_flags()
is currently applied blindly in the ACPI resource parsing to all
architectures, but only x86 suffers from that IO space limitation.
On arches (i.e. IA64 and ARM64) where IO space is memory mapped,
the PCI root bridge IO resource windows are first initialized from
the _CRS (in acpi_decode_space()) and contain the CPU physical address
at which a root bridge decodes IO space in the CPU physical address
space with the offset value representing the offset required to translate
the PCI bus address into the CPU physical address.
The IO resource windows are then parsed and updated in arch code
before creating and enumerating PCI buses (eg IA64 add_io_space())
to map in an arch specific way the obtained CPU physical address range
to a slice of virtual address space reserved to map PCI IO space,
ending up with PCI bridges resource windows containing IO
resources like the following on a working IA64 configuration:
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io 0x1000000-0x100ffff window] (bus address [0x0000-0xffff])
pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window]
pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window]
pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window]
pci_bus 0000:00: root bus resource [bus 00]
This implies that the [0 - 64k] check in acpi_dev_ioresource_flags()
leaves platforms with memory mapped IO space (e.g. IA64) broken: the
kernel can't claim IO resources because the host bridge IO resource is
disabled and discarded by ACPI core code. See the following log from
IA64, where the root bridge IO resource is missing because it was
silently filtered by the current [0 - 64k] check in
acpi_dev_ioresource_flags():
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window]
pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window]
pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window]
pci_bus 0000:00: root bus resource [bus 00]
[...]
pci 0000:00:03.0: [1002:515e] type 00 class 0x030000
pci 0000:00:03.0: reg 0x10: [mem 0x80000000-0x87ffffff pref]
pci 0000:00:03.0: reg 0x14: [io 0x1000-0x10ff]
pci 0000:00:03.0: reg 0x18: [mem 0x88020000-0x8802ffff]
pci 0000:00:03.0: reg 0x30: [mem 0x88000000-0x8801ffff pref]
pci 0000:00:03.0: supports D1 D2
pci 0000:00:03.0: can't claim BAR 1 [io 0x1000-0x10ff]: no compatible bridge window
For this reason, the IO port resource boundary check in generic ACPI
parsing code should be made conditional on CONFIG_X86, so that more
arches (e.g. ARM64) can benefit from the generic ACPI resources parsing
interface without incurring unexpected resource filtering. This also
fixes the current breakage on IA64.
This patch factors out IO ports boundary [0 - 64k] check in generic ACPI
code and makes the IO space check X86 specific to make sure that IO
space resources are usable on other arches too.
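The idea can be sketched in user-space C as follows. This is only an illustration: struct io_window, iospace_resource_valid() and bus_to_cpu() are hypothetical names, the architecture is passed as a runtime flag instead of being selected with CONFIG_X86, and the 0xffff limit is taken from the "[0 - 64k]" wording above rather than from the actual kernel constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative IO window: CPU physical range plus bus-to-CPU offset. */
struct io_window {
	uint64_t start;  /* first CPU physical address */
	uint64_t end;    /* last CPU physical address, inclusive */
	uint64_t offset; /* added to a PCI bus address to get the CPU address */
};

/* Assumed upper bound of the x86 port IO space (64k). */
#define X86_IO_PORT_MAX UINT64_C(0xffff)

/*
 * Hypothetical per-arch check: only x86 rejects windows that end above the
 * port IO limit; arches with memory mapped IO space accept any range.
 */
static bool iospace_resource_valid(const struct io_window *w, bool arch_is_x86)
{
	if (arch_is_x86)
		return w->end <= X86_IO_PORT_MAX;
	return true;
}

/* Translate a PCI bus address into a CPU physical address in the window. */
static uint64_t bus_to_cpu(const struct io_window *w, uint64_t bus_addr)
{
	return bus_addr + w->offset;
}
```

Fed with the IA64 window from the working log (0x1000000-0x100ffff, offset 0x1000000), the x86 rule would reject the window, while the non-x86 rule accepts it and bus address 0x1000 maps to CPU address 0x1001000.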
Fixes: 3772aea7d6 (ia64/PCI/ACPI: Use common ACPI resource parsing interface for host bridge)
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
obj->buffer.pointer[i] should be cast to u64 to prevent an unintentional
sign extension. For example, if pointer[7] is 0x80, then the value
0xffffffffff000000 is or'd into mask rather than the intended value
0xff00000000000000.
Detected with static analysis by CoverityScan.
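The mechanism can be reproduced with the sketch below. The function names are hypothetical and the code is a model, not the kernel expression itself. It uses byte index 3 (a shift by 24) for the buggy variant, because shifting a 32-bit int by 32 or more is undefined in ISO C. The conversion to int32_t relies on the usual two's complement behaviour of common compilers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the buggy expression: the byte is handled as a 32-bit signed
 * value after the shift, so a set top bit is sign-extended when the result
 * is widened to 64 bits.
 */
static uint64_t or_byte_promoted(uint64_t mask, uint8_t byte, unsigned int index)
{
	assert(index < 4); /* a 32-bit shift by 32 or more is undefined */
	int32_t shifted = (int32_t)((uint32_t)byte << (index * 8));

	return mask | (uint64_t)(int64_t)shifted;
}

/* Fixed expression: widen the byte to 64 bits first, then shift. */
static uint64_t or_byte_widened(uint64_t mask, uint8_t byte, unsigned int index)
{
	assert(index < 8);
	return mask | ((uint64_t)byte << (index * 8));
}
```

For byte 0x80 at index 3, the promoted variant ORs in 0xffffffff80000000 while the widened variant ORs in only 0x80000000.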
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When an ACPI node has both ACPI device nodes and ACPI data nodes,
acpi_get_next_subnode() will return the ACPI data nodes of its last
parsed child.
To avoid that, make acpi_get_next_subnode() go back to the original
ACPI device object when all of the device node children of it have
been found already.
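The intended iteration order can be modeled as below. This is a simplified sketch, not the kernel data structures: struct node, its fields and next_subnode() are hypothetical, and each node keeps separate sibling-linked lists for its device children and its data children.

```c
#include <stddef.h>

enum node_kind { NODE_DEVICE, NODE_DATA };

/* Illustrative node: device and data children are kept in separate lists. */
struct node {
	enum node_kind kind;
	struct node *next_sibling; /* next child of the same kind */
	struct node *first_device; /* first device child, or NULL */
	struct node *first_data;   /* first data child, or NULL */
};

/*
 * Return the child of @parent that follows @prev (NULL starts the walk).
 * Device children come first. Once they are exhausted, the walk continues
 * with the data nodes of @parent, not with those of the last device child.
 */
static struct node *next_subnode(struct node *parent, struct node *prev)
{
	if (!prev)
		return parent->first_device ? parent->first_device
					    : parent->first_data;

	if (prev->kind == NODE_DEVICE) {
		if (prev->next_sibling)
			return prev->next_sibling;
		/* Go back to the original parent for its data nodes. */
		return parent->first_data;
	}

	return prev->next_sibling;
}
```

In this model, a parent with device children d1 and d2 and one data child p1 is walked as d1, d2, p1, even if d2 has data nodes of its own.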
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add device HID AMDI0020 to match the AMD ACPI Vendor ID (AMDI) as
registered in http://www.uefi.org/acpi_id_list. The UART controller
on future AMD platforms will use this HID instead of AMD0020.
Signed-off-by: Wang Hongcheng <annie.wang@amd.com>
Acked-by: Ken Xue <Ken.Xue@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge tag 'libnvdimm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
- Asynchronous address range scrub:
Given the capacities of next generation persistent memory devices, a
scrub operation to find all poison may take tens of seconds. We
want this scrub work to be done asynchronously with the rest of
system initialization, so we move it out of line from the NFIT
probing, i.e. acpi_nfit_add().
- Clear poison:
ACPI 6.1 introduces the ability to send "clear error" commands to
the ACPI0012:00 device representing the root of an "nvdimm bus".
Similar to relocating a bad block on a disk, this support clears
media errors in response to a write.
- Persistent memory resource tracking:
A persistent memory range may be designated as simply "reserved" by
platform firmware in the efi/e820 memory map. Later when the NFIT
driver loads it discovers that the range is "Persistent Memory".
The NFIT bus driver inserts a resource to advertise that
"persistent" attribute in the system resource tree for /proc/iomem
and kernel-internal usages.
- Miscellaneous cleanups and fixes:
Work around section-misaligned pmem ranges when allocating a struct
page memmap, fix handling of the read-only case in the ioctl path,
and clean up block device major number allocation.
* tag 'libnvdimm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (26 commits)
libnvdimm, pmem: clear poison on write
libnvdimm, pmem: fix kmap_atomic() leak in error path
nvdimm/btt: don't allocate unused major device number
nvdimm/blk: don't allocate unused major device number
pmem: don't allocate unused major device number
ACPI: Change NFIT driver to insert new resource
resource: Export insert_resource and remove_resource
resource: Add remove_resource interface
resource: Change __request_region to inherit from immediate parent
libnvdimm, pmem: fix ia64 build, use PHYS_PFN
nfit, libnvdimm: clear poison command support
libnvdimm, pfn: 'resource'-address and 'size' attributes for pfn devices
libnvdimm, pmem: adjust for section collisions with 'System RAM'
libnvdimm, pmem: fix 'pfn' support for section-misaligned namespaces
libnvdimm: Fix security issue with DSM IOCTL.
libnvdimm: Clean-up access mode check.
tools/testing/nvdimm: expand ars unit testing
nfit: disable userspace initiated ars during scrub
nfit: scrub and register regions in a workqueue
nfit, libnvdimm: async region scrub workqueue
...
Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"This time the majority of changes go into cpufreq and they are
significant.
First off, the way CPU frequency updates are triggered is different
now. Instead of having to set up and manage a deferrable timer for
each CPU in the system to evaluate and possibly change its frequency
periodically, cpufreq governors set up callbacks to be invoked by the
scheduler on a regular basis (basically on utilization updates). The
"old" governors, "ondemand" and "conservative", still do all of their
work in process context (although that is triggered by the scheduler
now), but intel_pstate does it all in the callback invoked by the
scheduler with no need for any additional asynchronous processing.
Of course, this eliminates the overhead related to the management of
all those timers, but also it allows the cpufreq governor code to be
simplified quite a bit. On top of that, the common code and data
structures used by the "ondemand" and "conservative" governors are
cleaned up and made more straightforward and some long-standing and
quite annoying problems are addressed. In particular, the handling of
governor sysfs attributes is modified and the related locking becomes
more fine grained which allows some concurrency problems to be avoided
(particularly deadlocks with the core cpufreq code).
In principle, the new mechanism for triggering frequency updates
allows utilization information to be passed from the scheduler to
cpufreq. Although the current code doesn't make use of it, in the
works is a new cpufreq governor that will make decisions based on the
scheduler's utilization data. That should allow the scheduler and
cpufreq to work more closely together in the long run.
In addition to the core and governor changes, cpufreq drivers are
updated too. Fixes and optimizations go into intel_pstate, the
cpufreq-dt driver is updated on top of some modification in the
Operating Performance Points (OPP) framework and there are fixes and
other updates in the powernv cpufreq driver.
Apart from the cpufreq updates there is some new ACPICA material,
including a fix for a problem introduced by previous ACPICA updates,
and some less significant changes in the ACPI code, like CPPC code
optimizations, ACPI processor driver cleanups and support for loading
ACPI tables from initrd.
Also updated are the generic power domains framework, the Intel RAPL
power capping driver and the turbostat utility and we have a bunch of
traditional assorted fixes and cleanups.
Specifics:
- Redesign of cpufreq governors and the intel_pstate driver to make
them use callbacks invoked by the scheduler to trigger CPU
frequency evaluation instead of using per-CPU deferrable timers for
that purpose (Rafael Wysocki).
- Reorganization and cleanup of cpufreq governor code to make it more
straightforward and fix some concurrency problems in it (Rafael
Wysocki, Viresh Kumar).
- Cleanup and improvements of locking in the cpufreq core (Viresh
Kumar).
- Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
Kumar, Eric Biggers).
- intel_pstate driver updates including fixes, optimizations and a
modification to make it enable hardware-coordinated P-state
selection (HWP) by default if supported by the processor (Philippe
Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
Franciosi).
- Operating Performance Points (OPP) framework updates to improve its
handling of voltage regulators and device clocks and updates of the
cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
- Updates of the powernv cpufreq driver to fix initialization and
cleanup problems in it and correct its worker thread handling with
respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
Bhat).
- ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
- ACPICA updates including one fix for a regression introduced by
previous changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
Colin Ian King).
- Support for installing ACPI tables from initrd (Lv Zheng).
- Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
Chaugule).
- Support for _HID(ACPI0010) devices (ACPI processor containers) and
ACPI processor driver cleanups (Sudeep Holla).
- Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
Aleksey Makarov).
- Modification of the ACPI PCI IRQ management code to make it treat
255 in the Interrupt Line register as "not connected" on x86 (as
per the specification) and avoid attempts to use that value as a
valid interrupt vector (Chen Fan).
- ACPI APEI fixes related to resource leaks (Josh Hunt).
- Removal of modularity from a few ACPI drivers (BGRT, GHES,
intel_pmic_crc) that cannot be built as modules in practice (Paul
Gortmaker).
- PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
as a valid resource type (Harb Abdulhamid).
- New device ID (future AMD I2C controller) in the ACPI driver for
AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
- Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
- cpuidle menu governor optimization to avoid a square root
computation in it (Rasmus Villemoes).
- Fix for potential use-after-free in the generic device properties
framework (Heikki Krogerus).
- Updates of the generic power domains (genpd) framework including
support for multiple power states of a domain, fixes and debugfs
output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
Geert Uytterhoeven).
- Intel RAPL power capping driver updates to reduce IPI overhead in
it (Jacob Pan).
- System suspend/hibernation code cleanups (Eric Biggers, Saurabh
Sengar).
- Year 2038 fix for the process freezer (Abhilash Jindal).
- turbostat utility updates including new features (decoding of more
registers and CPUID fields, sub-second intervals support, GFX MHz
and RC6 printout, --out command line option), fixes (syscall jitter
detection and workaround, reduction of the number of syscalls
made, fixes related to Xeon x200 processors, compiler warning
fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"
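The scheduler-driven update mechanism described in the pull message above can be modeled roughly as follows. This is a user-space sketch under stated assumptions: struct util_hook, set_util_hook(), scheduler_update_util() and struct sketch_governor are hypothetical names, not the kernel interface, and the governor only counts evaluations instead of changing a frequency.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU hook the "scheduler" invokes on utilization updates. */
struct util_hook {
	void (*func)(struct util_hook *hook, uint64_t time_ns,
		     unsigned long util, unsigned long max);
};

#define NR_CPUS_SKETCH 4
static struct util_hook *cpu_hooks[NR_CPUS_SKETCH];

/* Governor side: install (or clear, with NULL) the hook for one CPU. */
static void set_util_hook(unsigned int cpu, struct util_hook *hook)
{
	if (cpu < NR_CPUS_SKETCH)
		cpu_hooks[cpu] = hook;
}

/* Scheduler side: call the hook, if any, instead of relying on a timer. */
static void scheduler_update_util(unsigned int cpu, uint64_t time_ns,
				  unsigned long util, unsigned long max)
{
	struct util_hook *hook = cpu < NR_CPUS_SKETCH ? cpu_hooks[cpu] : NULL;

	if (hook)
		hook->func(hook, time_ns, util, max);
}

/* Illustrative governor state; the hook must stay the first member. */
struct sketch_governor {
	struct util_hook hook;
	uint64_t last_eval_ns;
	uint64_t min_interval_ns;
	unsigned int evaluations;
};

/* Evaluate the frequency at most once per min_interval_ns. */
static void sketch_governor_update(struct util_hook *hook, uint64_t time_ns,
				   unsigned long util, unsigned long max)
{
	struct sketch_governor *gov = (struct sketch_governor *)hook;

	(void)util; /* utilization data is passed in but not used here */
	(void)max;
	if (time_ns - gov->last_eval_ns < gov->min_interval_ns)
		return;
	gov->last_eval_ns = time_ns;
	gov->evaluations++;
}
```

The rate limit inside the callback stands in for the sampling period that the per-CPU deferrable timers used to provide.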
* tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
tools/power turbostat: bugfix: TDP MSRs print bits fixing
tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
tools/power turbostat: call __cpuid() instead of __get_cpuid()
tools/power turbostat: indicate SMX and SGX support
tools/power turbostat: detect and work around syscall jitter
tools/power turbostat: show GFX%rc6
tools/power turbostat: show GFXMHz
tools/power turbostat: show IRQs per CPU
tools/power turbostat: make fewer systems calls
tools/power turbostat: fix compiler warnings
tools/power turbostat: add --out option for saving output in a file
tools/power turbostat: re-name "%Busy" field to "Busy%"
tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
tools/power turbostat: allow sub-sec intervals
ACPI / APEI: ERST: Fixed leaked resources in erst_init
ACPI / APEI: Fix leaked resources
intel_pstate: Do not skip samples partially
intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
...
$ make tags
GEN tags
ctags: Warning: drivers/acpi/processor_idle.c:64: null expansion of name pattern "\1"
ctags: Warning: drivers/xen/events/events_2l.c:41: null expansion of name pattern "\1"
ctags: Warning: kernel/locking/lockdep.c:151: null expansion of name pattern "\1"
ctags: Warning: kernel/rcu/rcutorture.c:133: null expansion of name pattern "\1"
ctags: Warning: kernel/rcu/rcutorture.c:135: null expansion of name pattern "\1"
ctags: Warning: kernel/workqueue.c:323: null expansion of name pattern "\1"
ctags: Warning: net/ipv4/syncookies.c:53: null expansion of name pattern "\1"
ctags: Warning: net/ipv6/syncookies.c:44: null expansion of name pattern "\1"
ctags: Warning: net/rds/page.c:45: null expansion of name pattern "\1"
Which are all the result of the DEFINE_PER_CPU pattern:
scripts/tags.sh:200: '/\<DEFINE_PER_CPU([^,]*, *\([[:alnum:]_]*\)/\1/v/'
scripts/tags.sh:201: '/\<DEFINE_PER_CPU_SHARED_ALIGNED([^,]*, *\([[:alnum:]_]*\)/\1/v/'
The below cures them. All except the workqueue one are within reasonable
distance of the 80 char limit. TJ do you have any preference on how to
fix the wq one, or shall we just not care that it's too long?
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ram resource handling changes from Ingo Molnar:
"Core kernel resource handling changes to support NVDIMM error
injection.
This tree introduces a new I/O resource type, IORESOURCE_SYSTEM_RAM,
for System RAM while keeping the current IORESOURCE_MEM type bit set
for all memory-mapped ranges (including System RAM) for backward
compatibility.
With this resource flag it no longer takes a strcmp() loop through the
resource tree to find "System RAM" resources.
The new resource type is then used to extend ACPI/APEI error injection
facility to also support NVDIMM"
* 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ACPI/EINJ: Allow memory error injection to NVDIMM
resource: Kill walk_iomem_res()
x86/kexec: Remove walk_iomem_res() call with GART type
x86, kexec, nvdimm: Use walk_iomem_res_desc() for iomem search
resource: Add walk_iomem_res_desc()
memremap: Change region_intersects() to take @flags and @desc
arm/samsung: Change s3c_pm_run_res() to use System RAM type
resource: Change walk_system_ram() to use System RAM type
drivers: Initialize resource entry to zero
xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM
kexec: Set IORESOURCE_SYSTEM_RAM for System RAM
arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM
ia64: Set System RAM type and descriptor
x86/e820: Set System RAM type and descriptor
resource: Add I/O resource descriptor
resource: Handle resource flags properly
resource: Add System RAM resource type
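As a rough illustration of the IORESOURCE_SYSTEM_RAM change described in
the pull message above, the sketch below contrasts a name comparison with
a flag test. It is a user-space mock, not kernel code: the sk_ names, the
struct layout and the flag values are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Flag values assumed for this sketch only. */
#define SK_IORESOURCE_MEM        0x00000200UL
#define SK_IORESOURCE_SYSRAM     0x01000000UL
#define SK_IORESOURCE_SYSTEM_RAM (SK_IORESOURCE_MEM | SK_IORESOURCE_SYSRAM)

/* Minimal stand-in for a resource tree entry. */
struct sk_resource {
	const char *name;
	unsigned long flags;
};

/* Old style: identify System RAM by comparing the resource name. */
static int sk_is_sysram_by_name(const struct sk_resource *res)
{
	return strcmp(res->name, "System RAM") == 0;
}

/* New style: a single flag test, no string comparison needed. */
static int sk_is_sysram_by_flag(const struct sk_resource *res)
{
	return (res->flags & SK_IORESOURCE_SYSTEM_RAM) ==
	       SK_IORESOURCE_SYSTEM_RAM;
}

/* Count System RAM entries in a flat table using the flag test. */
static size_t sk_count_sysram(const struct sk_resource *tbl, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (sk_is_sysram_by_flag(&tbl[i]))
			count++;
	return count;
}
```

Because the memory bit stays set alongside the System RAM bit in this
sketch, code that only tests for memory-mapped ranges keeps working.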
* acpi-pci:
x86/ACPI/PCI: Recognize that Interrupt Line 255 means "not connected"
* acpi-soc:
i2c: designware: Add device HID for future AMD I2C controller
* pnp:
PNP / ACPI: add ACPI_RESOURCE_TYPE_SERIAL_BUS as a valid type
* acpi-processor:
ACPI / sleep: move acpi_processor_sleep to sleep.c
ACPI / processor : add support for ACPI0010 processor container
ACPI / processor_idle: replace PREFIX with pr_fmt
* acpi-cppc:
ACPI / CPPC: use MRTT/MPAR to decide if/when a req can be sent
ACPI / CPPC: replace writeX/readX to PCC with relaxed version
mailbox: pcc: optimized pcc_send_data
ACPI / CPPC: optimized cpc_read and cpc_write
ACPI / CPPC: Optimize PCC Read Write operations
* acpi-scan:
ACPI / scan: AMBA bus probing support
ACPI: introduce a function to find the first physical device
* acpi-osl:
ACPI / OSL: Add support to install tables via initrd
ACPI / OSL: Clean up initrd table override code
* acpi-apei:
ACPI / APEI: ERST: Fixed leaked resources in erst_init
ACPI / APEI: Fix leaked resources
* acpica:
ACPICA / Interpreter: Fix a regression triggered because of wrong Linux ECDT support
ACPICA: Utilities: Update trace mechinism for acquire_object
ACPICA: Namespace: Rename acpi_gbl_reg_methods_enabled to acpi_gbl_namespace_initialized
ACPICA: Namespace: Ensure \_SB._INI executed before any _REG
ACPICA: ACPICA: Tune _REG evaluations order in the initialization steps
ACPICA: Tables: make default region accessible during the table load
ACPICA: ACPI 6.0/iASL: Add support for the External AML opcode
ACPICA: Remove unnecessary arguments to ACPI_INFO
ACPICA: debugger: dbconvert: free pld_info on error return path
ACPICA: iASL: Update to use internal acpi_ut_strtoul64 function
ACPICA: iASL: Fix some typos with the name strtoul64
ACPICA: Remove incorrect "static" from a global structure
ACPICA: aclocal: Put parens around some definitions.
erst_init currently leaks resources allocated from its call to
apei_resources_init(). The data allocated there gets copied
into apei_resources_all and can be freed when we're done with it.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Reviewed-by: Chen, Gong <gong.chen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We leak the NVS and arch resources (if used), in apei_resources_request.
They are allocated to make sure we exclude them from the APEI resources,
but they are never freed at the end of the function. Free them now.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Reviewed-by: Chen, Gong <gong.chen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add device HID AMDI0010 to match the AMD ACPI Vendor ID (AMDI) that
was registered in http://www.uefi.org/acpi_id_list. The I2C
controller on future AMD platforms will use this HID instead of AMD0010.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It is reported that the following commit triggers regressions:
Linux commit: efaed9be99
ACPICA commit: 31178590dde82368fdb0f6b0e466b6c0add96c57
Subject: ACPICA: Events: Enhance acpi_ev_execute_reg_method() to
ensure no _REG evaluations can happen during OS early boot
stages
This is because the ECDT support in Linux has not been corrected yet, and
Linux needs to execute _REG for ECDT (though this sounds so wrong), so we
need to ensure acpi_gbl_namespace_initialized is set before ECDT probing in
order for _REG to be executed. Since we have to move
"acpi_gbl_namespace_initialized = TRUE" to an initialization step that
happens before ECDT probing, acpi_load_tables() is the best candidate for
now. This patch fixes the regression by doing so.
But if the ECDT support is fixed, Linux will not execute _REG for ECDT, and
ECDT probing will happen before acpi_load_tables(). At that time, we still
want to ensure acpi_gbl_namespace_initialized is set after executing
acpi_ns_initialize_objects() (under the condition of
acpi_gbl_group_module_level_code = FALSE), so this patch also moves
acpi_ns_initialize_objects() to acpi_load_tables() accordingly.
Since acpi_ns_initialize_objects() doesn't seem to be skippable, this
patch also removes ACPI_NO_OBJECT_INIT for the one invoked in
acpi_load_tables(). And since the default region handlers should always be
installed before loading the tables, this patch also removes useless
acpi_gbl_group_module_level_code check accordingly. Reported by Chris
Bainbridge, Fixed by Lv Zheng.
Fixes: efaed9be99 (ACPICA: Events: Enhance acpi_ev_execute_reg_method() to ensure no _REG evaluations can happen during OS early boot stages)
Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch adds support to install tables from initrd.
If a table in the initrd wasn't used by the override mechanism,
the table would be installed after initializing all RSDT/XSDT
tables.
Link: https://lkml.org/lkml/2014/2/28/368
Reported-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch cleans up the initrd table override code by merging
redundant logic and re-ordering code blocks.
No functional changes.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
element is &package->package.elements[i] which can never be NULL
so the check to see if it is NULL is redundant and can be removed.
Detected with static analysis by CoverityScan
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some HP laptops seem to have invalid 64 bit FADT X_PM* addresses
which are causing various boot issues. In these cases, it would
be useful to force ACPI to use the valid legacy 32 bit equivalent
PM addresses. Add an acpi_force_32bit_fadt_addr option to set the ACPICA
acpi_gbl_use32_bit_fadt_addresses to TRUE to force this override.
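The effect of the override can be pictured with the simplified sketch
below. It is not the ACPICA FADT conversion code, which is more involved.
The helper name and its exact fallback order are assumptions for this
example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the address to use for one PM register block.
 * legacy32 is the 32-bit FADT field, x64 the 64-bit X_PM* address.
 */
static uint64_t sk_select_pm_address(uint32_t legacy32, uint64_t x64,
				     bool force_32bit)
{
	/* With the override set, prefer the legacy address when present. */
	if (force_32bit && legacy32 != 0)
		return legacy32;
	/* Otherwise use the 64-bit address if the firmware provided one. */
	if (x64 != 0)
		return x64;
	/* Fall back to the legacy address. */
	return legacy32;
}
```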
Link: https://bugs.launchpad.net/bugs/1529381
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Kconfig currently controlling compilation of this code is:
drivers/acpi/Kconfig:config CRC_PMIC_OPREGION
drivers/acpi/Kconfig: bool "ACPI operation region support for CrystalCove PMIC"
...meaning that it currently is not being built as a module by anyone.
Let's remove the couple of modular references, so that when reading
the driver there is no doubt it is builtin-only.
Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.
We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Kconfig currently controlling compilation of this code is:
config ACPI_APEI_GHES
bool "APEI Generic Hardware Error Source"
...meaning that it currently is not being built as a module by anyone.
Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.
Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.
We replace module.h with moduleparam.h as we are keeping the
pre-existing module_param that the file has, as currently that is
the easiest way to maintain compatibility with the existing boot
arg use cases.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Kconfig for this driver is currently:
config ACPI_BGRT
bool "Boottime Graphics Resource Table support"
...meaning that it currently is not being built as a module by anyone.
Let's remove all modular references, so that when reading the driver
there is no doubt it is builtin-only.
Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.
We also delete the MODULE_LICENSE tag etc. since all that information
was (or is now) contained at the top of the file in the comments.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ACPI spec defines Minimum Request Turnaround Time (MRTT) and
Maximum Periodic Access Rate (MPAR) to prevent the OSPM from sending
more requests than the platform can handle. For further details
on these parameters please refer to section 14.1.3 of ACPI 6.0 spec.
This patch includes MRTT/MPAR in deciding if or when a CPPC request
can be sent to the platform to make sure CPPC implementation is
compliant with the spec.
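A minimal sketch of such a check follows, assuming MRTT is given in
microseconds and MPAR as a per-minute request budget (0 meaning no limit).
The sk_ names, the struct and the return convention are made up for this
example and are not the driver's actual code.

```c
#include <stdint.h>

#define SK_US_PER_MIN (60ULL * 1000000ULL)

/* Hypothetical per-channel rate limiting state. */
struct sk_pcc_limits {
	uint64_t mrtt_us;         /* minimum gap between requests */
	uint32_t mpar;            /* max requests per minute, 0 = no limit */
	int      sent_any;        /* has any request been sent yet? */
	uint64_t last_cmd_us;     /* time of the most recent request */
	uint64_t window_start_us; /* start of the current one-minute window */
	uint32_t budget;          /* requests left in the current window */
};

static void sk_pcc_init(struct sk_pcc_limits *l, uint64_t mrtt_us,
			uint32_t mpar, uint64_t now_us)
{
	l->mrtt_us = mrtt_us;
	l->mpar = mpar;
	l->sent_any = 0;
	l->last_cmd_us = 0;
	l->window_start_us = now_us;
	l->budget = mpar;
}

/*
 * Returns 0 if the request may be sent now (and records it),
 * a positive number of microseconds to wait if MRTT has not elapsed,
 * or -1 if the MPAR budget for the current minute is used up.
 */
static int64_t sk_pcc_try_send(struct sk_pcc_limits *l, uint64_t now_us)
{
	if (l->sent_any && now_us - l->last_cmd_us < l->mrtt_us)
		return (int64_t)(l->mrtt_us - (now_us - l->last_cmd_us));

	if (l->mpar) {
		/* Start a fresh window once a full minute has passed. */
		if (now_us - l->window_start_us >= SK_US_PER_MIN) {
			l->window_start_us = now_us;
			l->budget = l->mpar;
		}
		if (l->budget == 0)
			return -1;
		l->budget--;
	}

	l->sent_any = 1;
	l->last_cmd_us = now_us;
	return 0;
}
```

A caller would delay for the returned time in the MRTT case and report an
error to its caller in the MPAR case.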
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We do not have a strict read/write order requirement while accessing
PCC subspace. The only requirement is that all accesses are committed
before triggering the PCC doorbell to transfer the ownership of PCC
to the platform, and this requirement is enforced by the PCC driver.
Profiling on a many core system shows improvement of about 1.8us on
average per freq change request (about 10% improvement on average).
Since these operations are executed while holding the pcc_lock,
reducing this time helps the CPPC implementation to scale much
better as the number of cores increases.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpc_read and cpc_write are used while holding the pcc_lock spin_lock,
so they need to be as fast as possible. acpi_os_read/write_memory
APIs linearly search through a list for cached mapping which is
quite expensive. Since the PCC subspace is already mapped into
virtual address space during initialization, we can just add the
offset and access the necessary CPPC registers.
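The idea of "add the offset and access the register" can be sketched in
user space as below. A byte buffer stands in for the already-mapped PCC
subspace. The sk_ names, the struct and the error convention are
assumptions for this example, not the functions from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the PCC subspace that was mapped once at initialization. */
struct sk_pcc_subspace {
	uint8_t *vaddr; /* base of the existing mapping */
	size_t   len;   /* size of the mapping in bytes */
};

/* Validate width and bounds, then return base + offset (or NULL). */
static uint8_t *sk_reg_ptr(const struct sk_pcc_subspace *s, size_t offset,
			   unsigned int bit_width)
{
	size_t bytes = bit_width / 8;

	if (bit_width != 8 && bit_width != 16 &&
	    bit_width != 32 && bit_width != 64)
		return NULL;
	if (offset > s->len || bytes > s->len - offset)
		return NULL;
	return s->vaddr + offset;
}

/* Read a register directly through the mapping, no list lookup. */
static int sk_cpc_read(const struct sk_pcc_subspace *s, size_t offset,
		       unsigned int bit_width, uint64_t *val)
{
	const uint8_t *p = sk_reg_ptr(s, offset, bit_width);

	if (!p)
		return -1;
	switch (bit_width) {
	case 8:  { uint8_t v;  memcpy(&v, p, sizeof(v)); *val = v; break; }
	case 16: { uint16_t v; memcpy(&v, p, sizeof(v)); *val = v; break; }
	case 32: { uint32_t v; memcpy(&v, p, sizeof(v)); *val = v; break; }
	default: { uint64_t v; memcpy(&v, p, sizeof(v)); *val = v; break; }
	}
	return 0;
}

/* Write a register directly through the mapping. */
static int sk_cpc_write(const struct sk_pcc_subspace *s, size_t offset,
			unsigned int bit_width, uint64_t val)
{
	uint8_t *p = sk_reg_ptr(s, offset, bit_width);

	if (!p)
		return -1;
	switch (bit_width) {
	case 8:  { uint8_t v = (uint8_t)val;   memcpy(p, &v, sizeof(v)); break; }
	case 16: { uint16_t v = (uint16_t)val; memcpy(p, &v, sizeof(v)); break; }
	case 32: { uint32_t v = (uint32_t)val; memcpy(p, &v, sizeof(v)); break; }
	default: { uint64_t v = val;           memcpy(p, &v, sizeof(v)); break; }
	}
	return 0;
}
```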
This patch + similar changes to PCC driver reduce the time per freq.
transition from around 200us to about 20us for the CPPC cpufreq
driver.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Previously the send_pcc_cmd() code checked if the
PCC operation had completed before returning from
the function. This check was performed regardless
of the PCC op type (i.e. Read/Write). Knowing
the type of cmd can be used to optimize the check
and avoid needless waiting. e.g. with Write ops,
the actual Writing is done before calling send_pcc_cmd().
And the subsequent Writes will check if the channel is
free at the entry of send_pcc_cmd() anyway.
However, for Read cmds, we need to wait for the cmd
completion bit to be flipped, since the actual Read
ops follow after returning from the send_pcc_cmd(). So,
only do the looping check at the end for Read ops.
Also, instead of using udelay() calls, use ktime as a
means to check for deadlines. The current deadline
in which the Remote should flip the cmd completion bit
is defined as N * Nominal latency. Where N is arbitrary
and large enough to work on slow emulators and Nominal
latency comes from the ACPI table (PCCT). This helps
in working around the CONFIG_HZ effects on udelay()
and also avoids needing different ACPI tables for Silicon
and Emulation platforms.
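The control flow described above can be sketched as follows. This is a
user-space model only: the clock and the completion check are passed in
as callbacks, the multiplier SK_NUM_RETRIES stands for the arbitrary "N",
and all sk_ names are assumptions for this example. A small fake platform
is included so the sketch can be exercised without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

enum sk_pcc_cmd { SK_CMD_READ, SK_CMD_WRITE };

/* "N" from the text: arbitrary, large enough for slow emulators. */
#define SK_NUM_RETRIES 500ULL

struct sk_chan {
	uint64_t nominal_latency_us;    /* would come from the PCCT */
	uint64_t (*now_us)(void *ctx);  /* monotonic clock */
	bool (*cmd_complete)(void *ctx);/* has the remote flipped the bit? */
	void *ctx;
};

/* Returns 0 on success, -1 if a Read did not complete by the deadline. */
static int sk_send_cmd(struct sk_chan *c, enum sk_pcc_cmd cmd)
{
	uint64_t deadline;

	/* (The doorbell would be rung here.) */

	/* Writes were done before the call; no need to wait for them. */
	if (cmd != SK_CMD_READ)
		return 0;

	/* Reads follow the call, so wait for completion up to a deadline. */
	deadline = c->now_us(c->ctx) + SK_NUM_RETRIES * c->nominal_latency_us;
	for (;;) {
		if (c->cmd_complete(c->ctx))
			return 0;
		if (c->now_us(c->ctx) >= deadline)
			return -1;
	}
}

/* Fake platform: every clock read advances time by one microsecond. */
struct sk_fake {
	uint64_t t_us;
	uint64_t done_at_us;
};

static uint64_t sk_fake_now(void *ctx)
{
	struct sk_fake *f = ctx;

	return f->t_us++;
}

static bool sk_fake_done(void *ctx)
{
	struct sk_fake *f = ctx;

	return f->t_us >= f->done_at_us;
}
```

Comparing against a clock-based deadline, rather than counting fixed
delays, keeps the timeout tied to the nominal latency from the table.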
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
device_decode is now no longer used, so we may as well remove it.
Fixes gcc 6 warning:
drivers/acpi/acpi_video.c:221:19: warning: ‘device_decode’ defined
but not used [-Wunused-const-variable]
static const char device_decode[][30] = {
^~~~~~~~~~~~~
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In debugfs it's not enough to just set file mode to read-only to
deny write access to a file. Instead, don't provide
the write method unless write access is really requested.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Acked-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Silence the following checkpatch warning:
WARNING: struct dev_pm_ops should normally be const.
Signed-off-by: Kaiyen Chang <kaiyen.chang@intel.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPI 6 defines persistent memory (PMEM) ranges in multiple
firmware interfaces: e820, EFI, and the ACPI NFIT table. This EFI
change, however, triggers a bug in the grub bootloader, which
treats EFI_PERSISTENT_MEMORY type as regular memory and corrupts
stored user data [1].
Therefore, BIOS may set generic reserved type in e820 and EFI to
cover PMEM ranges. The kernel can initialize PMEM ranges from
ACPI NFIT table alone.
This scheme causes a problem in the iomem table, though. On x86,
for instance, e820_reserve_resources() initializes top-level entries
(iomem_resource.child) from the e820 table at early boot-time.
This creates a "reserved" entry for a PMEM range, which does not allow
region_intersects() to check with PMEM type.
Change acpi_nfit_register_region() to call acpi_nfit_insert_resource(),
which calls insert_resource() to insert a PMEM entry from NFIT when
the iomem table does not have a PMEM entry already. That is, when
a PMEM range is marked as reserved type in e820, it inserts a
"Persistent Memory" entry, which results in the following:
+ "Persistent Memory"
+ "reserved"
This allows the EINJ driver, which calls region_intersects() to check
PMEM ranges, to work continuously even if BIOS sets reserved type
(or sets nothing) to PMEM ranges in e820 and EFI.
[1]: https://lists.gnu.org/archive/html/grub-devel/2015-11/msg00209.html
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Per the x86-specific footnote to PCI spec r3.0, sec 6.2.4, the value 255 in
the Interrupt Line register means "unknown" or "no connection."
Previously, when we couldn't derive an IRQ from the _PRT, we fell back to
using the value from Interrupt Line as an IRQ. It's questionable whether
we should do that at all, but the spec clearly suggests we shouldn't do it
for the value 255 on x86.
Calling request_irq() with IRQ 255 may succeed, but the driver won't
receive any interrupts. Or, if IRQ 255 is shared with another device, it
may succeed, and the driver's ISR will be called at random times when the
*other* device interrupts. Or it may fail if another device is using IRQ
255 with incompatible flags. What we *want* is for request_irq() to fail
predictably so the driver can fall back to polling.
On x86, assume 255 in the Interrupt Line means the INTx line is not
connected. In that case, set dev->irq to IRQ_NOTCONNECTED so request_irq()
will fail gracefully with -ENOTCONN.
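The intended behavior can be modeled in user space as below. The sk_
functions, the placeholder value for the "not connected" IRQ and the
always-succeeding request stub are assumptions for this sketch. Only the
255 check and the -ENOTCONN result come from the description above.

```c
#include <errno.h>

/* Placeholder for the "not connected" IRQ number used in this sketch. */
#define SK_IRQ_NOTCONNECTED (1U << 31)

/*
 * Fallback used when no IRQ could be derived from the _PRT:
 * treat Interrupt Line 255 as "INTx not connected" (x86 only).
 */
static unsigned int sk_irq_from_interrupt_line(unsigned char line)
{
	if (line == 0xff)
		return SK_IRQ_NOTCONNECTED;
	return line;
}

/* Stub: fails predictably for the not-connected IRQ, else succeeds. */
static int sk_request_irq(unsigned int irq)
{
	if (irq == SK_IRQ_NOTCONNECTED)
		return -ENOTCONN;
	return 0;
}

/* Driver-side view: returns 1 to use interrupts, 0 to fall back to polling. */
static int sk_probe_uses_irq(unsigned char line)
{
	return sk_request_irq(sk_irq_from_interrupt_line(line)) == 0;
}
```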
We found this problem on a system where Secure Boot firmware assigned
Interrupt Line 255 to an i801_smbus device and another device was already
using MSI-X IRQ 255. This was in v3.10, where i801_probe() fails if
request_irq() fails:
i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
i801_smbus 0000:00:1f.3: PCI INT C: no GSI
genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa)
CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5
Call Trace:
dump_stack+0x19/0x1b
__setup_irq+0x54a/0x570
request_threaded_irq+0xcc/0x170
i801_probe+0x32f/0x508 [i2c_i801]
local_pci_probe+0x45/0xa0
i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
i801_smbus: probe of 0000:00:1f.3 failed with error -16
After aeb8a3d16a ("i2c: i801: Check if interrupts are disabled"),
i801_probe() will fall back to polling if request_irq() fails. But we
still need this patch because request_irq() may succeed or fail depending
on other devices in the system. If request_irq() fails, i801_smbus will
work by falling back to polling, but if it succeeds, i801_smbus won't work
because it expects interrupts that it may not receive.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit eade8f78f2aa21e8eabc3380a5728db47273bcf1
Revert commit ae90fbf562 (ACPICA: Parser: Fix for SuperName method
invocation).
Support for method invocations as part of super_name will be
removed from the ACPI specification, since no AML interpreter
supports it.
Fixes: ae90fbf562 (ACPICA: Parser: Fix for SuperName method invocation)
Link: https://github.com/acpica/acpica/commit/eade8f78
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPICA commit 0824ab90e03c2e4239e890615f447e7962b1daa2
Was not using the correct macro. Updated a comment in
acoutput.h
Link: https://github.com/acpica/acpica/commit/0824ab90
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add the boiler-plate for a 'clear error' command based on section
9.20.7.6 "Function Index 4 - Clear Uncorrectable Error" from the ACPI
6.1 specification, and add a reference implementation in nfit_test.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
While the nfit driver is issuing address range scrub commands and
reaping the results, do not permit an ars_start command issued from
userspace. The scrub thread assumes that all ars completions are for
scrubs initiated by platform firmware at boot, or by the nfit driver.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Address range scrub is a potentially long running process that we want
to complete before any pmem regions are registered. Perform this
operation asynchronously to allow other drivers to load in the meantime.
Platform firmware may have initiated a partial scrub prior to the driver
loading, so we must be careful to consume those results before kicking
off kernel initiated scrubs on other regions.
This rework also makes the registration path more tolerant of scrub
errors in that it splits scrubbing into 2 phases. The first phase
synchronously waits for a platform-firmware initiated scrub to complete.
The second phase scans the remaining address ranges asynchronously and
notifies the related driver(s) when the scrub completes.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Introduce a workqueue that will be used to run address range scrub
asynchronously with the rest of nvdimm device probing.
Userspace still wants notification when probing operations complete, so
introduce a new callback to flush this workqueue when userspace is
awaiting probe completion.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The nvdimm unit test infrastructure performs its own initialization of
an acpi_nfit_desc to specify test overrides over the native
implementation. Make it clear which attributes and operations it is
overriding by re-using acpi_nfit_init_desc() as a common starting point.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The return value from an 'ndctl_fn' reports the command execution
status, i.e. was the command properly formatted and was it successfully
submitted to the bus provider. The new 'cmd_rc' parameter allows the bus
provider to communicate command specific results, translated into
common error codes.
Convert the ARS commands to this scheme to:
1/ Consolidate status reporting
2/ Prepare for expanding ars unit test cases
3/ Make the implementation more generic
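The split between the return value and 'cmd_rc' can be illustrated with
the sketch below. The command id, the firmware status values and their
mapping to error codes are hypothetical and chosen only for this example.
They are not taken from the nfit driver.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command id and firmware status values. */
enum { SK_CMD_ARS_STATUS = 3 };
enum { SK_FW_OK = 0, SK_FW_BUSY = 1 };

/* Hypothetical payload carrying the command specific status. */
struct sk_ars_status {
	unsigned int fw_status;
};

/*
 * Return value: was the command well formed and successfully submitted?
 * *cmd_rc:      what the command itself reported, as a common error code.
 */
static int sk_ndctl(int cmd, void *buf, size_t buf_len, int *cmd_rc)
{
	const struct sk_ars_status *st = buf;

	if (cmd != SK_CMD_ARS_STATUS || !buf || buf_len < sizeof(*st))
		return -EINVAL; /* submission failure, cmd_rc untouched */

	if (cmd_rc) {
		switch (st->fw_status) {
		case SK_FW_OK:
			*cmd_rc = 0;
			break;
		case SK_FW_BUSY:
			*cmd_rc = -EBUSY;
			break;
		default:
			*cmd_rc = -EIO;
			break;
		}
	}
	return 0; /* submitted successfully */
}
```

Callers that only care about submission can pass NULL for cmd_rc.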
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ACPI 6.1 and JEDEC Annex L Release 3 formalize the format interface
code. Add definitions and update their usage in the unit test.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If firmware doesn't implement any of the ARS commands, take that to
mean that ARS is unsupported, and continue to initialize regions without
bad block lists. We cannot make the assumption that ARS commands will be
unconditionally supported on all NVDIMMs.
Reported-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>