If the driver sequence number coincidentally equals the previous
command response sequence number, the driver may proceed with
fetching the entire buffer before the OCC has processed the current
command. To be sure the correct response is obtained, check the
command type and also retry if any of the response parameters have
changed when the rest of the buffer is fetched. Also initialize the
driver with a random sequence number in order to reduce the chances
of this happening.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20220208152235.19686-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
If the SBEFIFO response indicates an error, store the response in the
user buffer and return an error. Previously, the user had no way of
obtaining the SBEFIFO FFDC.
The user's buffer now contains data in the event of a failure. No change
in the event of a successful transfer.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019205307.36946-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Allocate a large buffer for each OCC to handle response data. This
removes memory allocation during an operation, and also allows for
the maximum amount of SBE FFDC.
Previously for the putsram and attn commands, only 32 words would have
been available, and for getsram, only up to the size of the transfer.
SBE FFDC might be up to 8Kb.
The SBE interface expects data to be specified in units of words (4
bytes), defined as OCC_MAX_RESP_WORDS.
This change allows the full FFDC capture to be implemented, where before
it was not available.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019205307.36946-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Set and increment the sequence number during the submit operation.
This prevents sequence number conflicts between different users of
the interface. A sequence number conflict may result in a user
getting an OCC response meant for a different command. Since the
sequence number is now modified, the checksum must be calculated and
set before submitting the command.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210721190231.117185-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Log an error if the response checksum doesn't match the
calculated checksum.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210209171235.20624-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
If the OCC is not initialized and responds as such, the driver
should continue waiting for a valid response until the timeout
expires.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Fixes: 7ed98dddb7 ("fsi: Add On-Chip Controller (OCC) driver")
Link: https://lore.kernel.org/r/20210209171235.20624-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
This patch adds missing MODULE_DEVICE_TABLE definition which generates
correct modalias for automatic loading of this driver when it is built
as an external module.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Link: https://lore.kernel.org/r/1620896249-52769-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The P10 OCC has a different SRAM address for the command and response
buffers. In addition, the SBE commands to access the SRAM have changed
format. Add versioning to the driver to handle these differences.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20201120010315.190737-3-joel@jms.id.au
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
In case of error, the function platform_device_register_full()
returns ERR_PTR() and never returns NULL. The NULL test in the
return value check should be replaced with IS_ERR().
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Sequence numbering of the commands submitted to the OCC is required by
the OCC interface specification. Add sequence numbering and check for
the correct sequence number on the response.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Lei YU <mine260309@gmail.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
The OCC is a device embedded on a POWER processor that collects and
aggregates sensor data from the processor and system. The OCC can
provide the raw sensor data as well as perform thermal and power
management on the system.
This driver provides an atomic communications channel between a service
processor (e.g. a BMC) and the OCC. The driver is dependent on the FSI
SBEFIFO driver to get hardware access through the SBE to the OCC SRAM.
Commands are issued to the SBE to send or fetch data to the SRAM.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>