Commit Graph

29 Commits

Author SHA1 Message Date
Dan Williams 6450ddbd5d ACPI: NFIT: Define runtime firmware activation commands
Platform reboots are expensive. Towards reducing downtime to apply
firmware updates the Intel NVDIMM command definition is growing support
for applying live firmware updates that only require temporarily
suspending memory traffic instead of a full reboot.

Follow-on commits add support for triggering firmware activation, this
patch only defines the commands, adds probe support, and validates that
they are blocked via the ioctl path. The ioctl-path block ensures that
the OS is in charge since these commands have side effects only the OS
can handle. Specifically firmware activation may cause the memory
controller to be quiesced on the order of 100s of milliseconds. In that
case Linux ensure the activation only takes place while the OS is in a
suspend state.

Link: https://pmem.io/documents/IntelOptanePMem_DSM_Interface-V2.0.pdf
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:47 -06:00
Dan Williams 92fe2aa859 libnvdimm: Validate command family indices
The ND_CMD_CALL format allows for a general passthrough of passlisted
commands targeting a given command set. However there is no validation
of the family index relative to what the bus supports.

- Update the NFIT bus implementation (the only one that supports
  ND_CMD_CALL passthrough) to also passlist the valid set of command
  family indices.

- Update the generic __nd_ioctl() path to validate that field on behalf
  of all implementations.

Fixes: 31eca76ba2 ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism")
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:47 -06:00
Vaibhav Jain f517f7925b ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a 'struct nd_pkg_pdsm' and a maximal union named
'nd_pdsm_payload'. These new structs together with 'struct nd_cmd_pkg'
for a pdsm envelop thats sent by libndctl to libnvdimm and serviced by
papr_scm in 'papr_scm_service_pdsm()'. The PDSM request is
communicated by member 'struct nd_cmd_pkg.nd_command' together with
other information on the pdsm payload (size-in, size-out).

The patch also introduces 'struct pdsm_cmd_desc' instances of which
are stored in an array __pdsm_cmd_descriptors[] indexed with PDSM cmd
and corresponding access function pdsm_cmd_desc() is
introduced. 'struct pdsm_cdm_desc' holds the service function for a
given PDSM and corresponding payload in/out sizes.

A new function papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm. The function performs validation on the PDSM
payload based on info present in corresponding PDSM descriptor and if
valid calls the 'struct pdcm_cmd_desc.service' function to service the
PDSM.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200615124407.32596-6-vaibhav@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-06-15 18:22:44 -07:00
Dexuan Cui 1194c41331 nfit: Add Hyper-V NVDIMM DSM command set to white list
Add the Hyper-V _DSM command set to the white list of NVDIMM command
sets.

This command set is documented at http://www.uefi.org/RFIC_LIST
(see "Virtual NVDIMM 0x1901").

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-01-29 22:09:31 -08:00
David Howells f366d322ae UAPI: ndctl: Remove use of PAGE_SIZE
The macro PAGE_SIZE isn't valid outside of the kernel, so it should not
appear in UAPI headers.

Furthermore, the actual machine page size could theoretically change from
an application's point of view if it's running in a container that gets
migrated to another machine (say 4K/ppc64 to 64K/ppc64).

Fixes: f2ba5a5bae ("libnvdimm, namespace: make min namespace size 4K")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-17 13:56:58 -07:00
David Howells 9607871f37 UAPI: ndctl: Fix g++-unsupported initialisation in headers
The following code in the linux/ndctl header file:

	static inline const char *nvdimm_bus_cmd_name(unsigned cmd)
	{
		static const char * const names[] = {
			[ND_CMD_ARS_CAP] = "ars_cap",
			[ND_CMD_ARS_START] = "ars_start",
			[ND_CMD_ARS_STATUS] = "ars_status",
			[ND_CMD_CLEAR_ERROR] = "clear_error",
			[ND_CMD_CALL] = "cmd_call",
		};

		if (cmd < ARRAY_SIZE(names) && names[cmd])
			return names[cmd];
		return "unknown";
	}

is broken in a number of ways:

 (1) ARRAY_SIZE() is not generally defined.

 (2) g++ does not support "non-trivial" array initialisers fully yet.

 (3) Every file that calls this function will acquire a copy of names[].

The same goes for nvdimm_cmd_name().

Fix all three by converting to a switch statement where each case returns a
string.  That way if cmd is a constant, the compiler can trivially reduce it
and, if not, the compiler can use a shared lookup table if it thinks that is
more efficient.

A better way would be to remove these functions and their arrays from the
header entirely.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-17 13:56:54 -07:00
Dan Williams f2ba5a5bae libnvdimm, namespace: make min namespace size 4K
The arbitrary 4MB minimum namespace size turns out to be too large for
some environments. Quoting Cheng-mean Liu:

    In the case of emulated NVDIMM devices in the VM environment, there
    are scenarios that NVDIMM device with much smaller sizes are
    desired, for example, we might use a single enumerated NVDIMM DAX
    device for representing each container layer, which in some cases
    could be just a few KBs size.

PAGE_SIZE is the minimum where we can still support DAX of at least
a single page.

Cc: Matthew Wilcox <willy@infradead.org>
Reported-by: Cheng-mean Liu <soccerl@microsoft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-02-01 21:27:22 -08:00
Dan Williams cdd77d3e19 nfit, libnvdimm: deprecate the generic SMART ioctl
The kernel's ND_IOCTL_SMART_THRESHOLD command is based on a payload
definition that has become broken / out-of-sync with recent versions of
the NVDIMM_FAMILY_INTEL definition. Deprecate the use of the
ND_IOCTL_SMART_THRESHOLD command in favor of the ND_CMD_CALL approach
taken by NVDIMM_FAMILY_{HPE,MSFT}, where we can manage the per-vendor
variance in userspace.

In a couple years, when the new scheme is widely deployed in userspace
packages, the ND_IOCTL_SMART_THRESHOLD support can be removed. For now
we prevent new binaries from compiling against the kernel header
definitions, but kernel still compatible with old binaries. The
libndctl.h [1] header is now the authoritative interface definition for
NVDIMM SMART.

[1]: https://github.com/pmem/ndctl
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-12-04 10:19:20 -08:00
Dan Williams 7a14724f54 libnvdimm: clean up command definitions
Remove the command payloads that do not have an associated libnvdimm
ioctl. I.e. remove the payloads that would only ever be carried in the
ND_CMD_CALL envelope. This prevents userspace from growing unnecessary
dependencies on this kernel header when userspace already has everything
it needs to craft and send these commands.

Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
Reported-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-08-28 08:33:20 -07:00
Toshi Kani 807900395e acpi/nfit: Issue Start ARS to retrieve existing records
ACPI 6.2 defines in section 9.20.7.2 that the OSPM may call a Start
ARS with Flags Bit [1] set upon receiving the 0x81 notification.

  Upon receiving the notification, the OSPM may decide to issue
  a Start ARS with Flags Bit [1] set to prepare for the retrieval
  of existing records and issue the Query ARS Status function to
  retrieve the records.

Add support to call a Start ARS from acpi_nfit_uc_error_notify()
with ND_ARS_RETURN_PREV_DATA set when HW_ERROR_SCRUB_ON is not set.

Link: http://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-07-02 09:56:37 -07:00
Jerry Hoemann 759d6a9641 libnvdimm: New ACPI 6.2 DSM functions
ACPI 6.2 added new NVDIMM root DSM functions.  Define their
data structures.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-07-01 08:49:59 -07:00
Jerry Hoemann 37d74841b9 acpi, nfit: Enable DSM pass thru for root functions.
Set ND_CMD_CALL in the cmd_mask to enable calling root
functions via the pass thru mechanism.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-07-01 08:49:59 -07:00
Dan Williams 9d62ed9651 libnvdimm: handle locked label storage areas
Per the latest version of the "NVDIMM DSM Interface Example" [1], the
label data retrieval routine can report a "locked" status. In this case
all regions associated with that DIMM are disabled until the label area
is unlocked. Provide generic libnvdimm enabling for NVDIMMs with label
data area locking capabilities.

[1]: http://pmem.io/documents/

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-05-04 15:41:39 -07:00
Dan Williams 02486c2905 libnvdimm: fix SMART Health DSM payload definition
"NVDIMM DSM Interface Example" v1.2 made an incompatible change to the
layout of function1 "SMART and Health Info".  While the kernel does not
directly consume this payload, it does define it in ndctl.h that
userpace utilities consume.

Reported-by: Brian Boylston <brian.boylston@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-08-15 11:07:21 -07:00
stuart hayes e02fb7264d nfit: add Microsoft NVDIMM DSM command set to white list
Add the Microsoft _DSM command set to the white list of NVDIMM command
sets.

This command set is documented at:

    https://msdn.microsoft.com/library/windows/hardware/mt604741

Cc: Pavel Machek <pavel@ucw.cz>
[pavel: fix up braces]
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-20 11:19:34 -07:00
Dan Williams 1f716d05f8 Merge branch 'for-4.7/dsm' into libnvdimm-for-next 2016-05-18 10:06:59 -07:00
Dan Williams 2159669f58 Merge branch 'for-4.7/libnvdimm' into libnvdimm-for-next 2016-05-18 10:06:48 -07:00
Dan Williams cd03412a51 libnvdimm, dax: introduce device-dax infrastructure
Device DAX is the device-centric analogue of Filesystem DAX
(CONFIG_FS_DAX).  It allows persistent memory ranges to be allocated and
mapped without need of an intervening file system.  This initial
infrastructure arranges for a libnvdimm pfn-device to be represented as
a different device-type so that it can be attached to a driver other
than the pmem driver.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-05-09 15:35:42 -07:00
Dan Williams 31eca76ba2 nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism
There are currently 4 known similar but incompatible definitions of the
command sets that can be sent to an NVDIMM through ACPI.  It is also
clear that future platform generations (ACPI or not) will continue to
revise and extend the DIMM command set as new devices and use cases
arrive.

It is obviously untenable to continue to proliferate divergence
of these command definitions, and to that end a standardization process
has begun to provide for a unified specification.  However, that leaves a
problem about what to do with this first generation where vendors are
already shipping divergence.

The Linux kernel can support these initial diverged platforms without
giving platform-firmware free reign to continue to diverge and compound
kernel maintenance overhead.  The kernel implementation can encourage
standardization in two ways:

1/ Require that any function code that userspace wants to send be
   explicitly white-listed in the implementation.  For ACPI this means
   function codes marked as supported by acpi_check_dsm() may
   only be invoked if they appear in the white-list.  A function must be
   publicly documented before it is added to the white-list.

2/ The above restrictions can be trivially bypassed by using the
   "vendor-specific" payload command.  However, since vendor-specific
   commands are by definition not publicly documented and have the
   potential to corrupt the kernel's view of the dimm state, we provide a
   toggle to disable vendor-specific operations.  Enabling undefined
   behavior is a policy decision that can be made by the platform owner
   and encourages firmware implementations to choose public over
   private command implementations.

Based on an initial patch from Jerry Hoemann
Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-04-28 16:59:06 -07:00
Dan Williams baa51277cf libnvdimm, test: add mock SMART data payload
Provide simulated SMART data to enable the ndctl implementation of SMART
data retrieval and parsing.

The payload is defined here, "Section 4.1 SMART and Health Info
(Function Index 1)":

    http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-04-11 11:11:14 -07:00
Dan Williams d4f323672a nfit, libnvdimm: clear poison command support
Add the boiler-plate for a 'clear error' command based on section
9.20.7.6 "Function Index 4 - Clear Uncorrectable Error" from the ACPI
6.1 specification, and add a reference implementation in nfit_test.

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-05 18:06:14 -08:00
Dan Williams 4577b06655 nfit: update address range scrub commands to the acpi 6.1 format
The original format of these commands from the "NVDIMM DSM Interface
Example" [1] are superseded by the ACPI 6.1 definition of the "NVDIMM Root
Device _DSMs" [2].

[1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
[2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
     "9.20.7 NVDIMM Root Device _DSMs"

Changes include:
1/ New 'restart' fields in ars_status, unfortunately these are
   implemented in the middle of the existing definition so this change
   is not backwards compatible.  The expectation is that shipping
   platforms will only ever support the ACPI 6.1 definition.

2/ New status values for ars_start ('busy') and ars_status ('overflow').

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-02-23 17:17:20 -08:00
Vishal Verma 39c686b862 libnvdimm: Add DSM support for Address Range Scrub commands
Add support for the three ARS DSM commands:
- Query ARS Capabilities - Queries the firmware to check if a given
  range supports scrub, and if so, which type (persistent vs. volatile)
- Start ARS - Starts a scrub for a given range/type
- Query ARS Status - Checks status of a previously started scrub, and
  provides the error logs if any.

  The commands are described by the example DSM spec at:
  http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

Also add these commands to the nfit_test test framework, and return
canned data.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-27 22:53:19 -04:00
Vishal Verma ec92777f2b libnvdimm: Update name of the ars_status_record mask field
The spec suggests that this is a simple 'length' field, not a mask.
Update the name accordingly.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-27 22:53:19 -04:00
Dan Williams bf9bccc14c libnvdimm: pmem label sets and namespace instantiation.
A complete label set is a PMEM-label per-dimm per-interleave-set where
all the UUIDs match and the interleave set cookie matches the hosting
interleave set.

Present sysfs attributes for manipulation of a PMEM-namespace's
'alt_name', 'uuid', and 'size' attributes.  A later patch will make
these settings persistent by writing back the label.

Note that PMEM allocations grow forwards from the start of an interleave
set (lowest dimm-physical-address (DPA)).  BLK-namespaces that alias
with a PMEM interleave set will grow allocations backward from the
highest DPA.

Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Neil Brown <neilb@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-24 21:24:10 -04:00
Dan Williams 4a826c83db libnvdimm: namespace indices: read and validate
This on media label format [1] consists of two index blocks followed by
an array of labels.  None of these structures are ever updated in place.
A sequence number tracks the current active index and the next one to
write, while labels are written to free slots.

    +------------+
    |            |
    |  nsindex0  |
    |            |
    +------------+
    |            |
    |  nsindex1  |
    |            |
    +------------+
    |   label0   |
    +------------+
    |   label1   |
    +------------+
    |            |
     ....nslot...
    |            |
    +------------+
    |   labelN   |
    +------------+

After reading valid labels, store the dpa ranges they claim into
per-dimm resource trees.

[1]: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf

Cc: Neil Brown <neilb@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-24 21:24:10 -04:00
Dan Williams 3d88002e4a libnvdimm: support for legacy (non-aliasing) nvdimms
The libnvdimm region driver is an intermediary driver that translates
non-volatile "region"s into "namespace" sub-devices that are surfaced by
persistent memory block-device drivers (PMEM and BLK).

ACPI 6 introduces the concept that a given nvdimm may simultaneously
offer multiple access modes to its media through direct PMEM load/store
access, or windowed BLK mode.  Existing nvdimms mostly implement a PMEM
interface, some offer a BLK-like mode, but never both as ACPI 6 defines.
If an nvdimm is single interfaced, then there is no need for dimm
metadata labels.  For these devices we can take the region boundaries
directly to create a child namespace device (nd_namespace_io).

Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-24 21:24:10 -04:00
Dan Williams 4d88a97aa9 libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver infrastructure
* Implement the device-model infrastructure for loading modules and
  attaching drivers to nvdimm devices.  This is a simple association of a
  nd-device-type number with a driver that has a bitmask of supported
  device types.  To facilitate userspace bind/unbind operations 'modalias'
  and 'devtype', that also appear in the uevent, are added as generic
  sysfs attributes for all nvdimm devices.  The reason for the device-type
  number is to support sub-types within a given parent devtype, be it a
  vendor-specific sub-type or otherwise.

* The first consumer of this infrastructure is the driver
  for dimm devices.  It simply uses control messages to retrieve and
  store the configuration-data image (label set) from each dimm.

Note: nd_device_register() arranges for asynchronous registration of
      nvdimm bus devices by default.

Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Neil Brown <neilb@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-24 21:24:10 -04:00
Dan Williams 62232e45f4 libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices
Most discovery/configuration of the nvdimm-subsystem is done via sysfs
attributes.  However, some nvdimm_bus instances, particularly the
ACPI.NFIT bus, define a small set of messages that can be passed to the
platform.  For convenience we derive the initial libnvdimm-ioctl command
formats directly from the NFIT DSM Interface Example formats.

    ND_CMD_SMART: media health and diagnostics
    ND_CMD_GET_CONFIG_SIZE: size of the label space
    ND_CMD_GET_CONFIG_DATA: read label space
    ND_CMD_SET_CONFIG_DATA: write label space
    ND_CMD_VENDOR: vendor-specific command passthrough
    ND_CMD_ARS_CAP: report address-range-scrubbing capabilities
    ND_CMD_ARS_START: initiate scrubbing
    ND_CMD_ARS_STATUS: report on scrubbing state
    ND_CMD_SMART_THRESHOLD: configure alarm thresholds for smart events

If a platform later defines different commands than this set it is
straightforward to extend support to those formats.

Most of the commands target a specific dimm.  However, the
address-range-scrubbing commands target the bus.  The 'commands'
attribute in sysfs of an nvdimm_bus, or nvdimm, enumerate the supported
commands for that object.

Cc: <linux-acpi@vger.kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-24 21:24:10 -04:00