Instead of registering the hwmon device at probe time, use the
existing "occ_active" sysfs file to control when the driver polls
the OCC for sensor data and registers with hwmon. The reason for
this change is that the SBE, which is the device by which the
driver communicates with the OCC, cannot handle communications
during certain system state transitions, resulting in
unrecoverable system errors.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220427140443.11428-1-eajames@linux.ibm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Export the power caps data for the soft minimum power cap through hwmon.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20220215151022.7498-5-eajames@linux.ibm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
- SBEFIFO usersapce interfaces to perform FFDC (First Failure
Data Capture) and detect timeouts
- A fix to handle multiple messages in flight
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+nHMAt9PCBDH63wBa3ZZB4FHcJ4FAmFx+ZIACgkQa3ZZB4FH
cJ7WHhAAn840J0zZS5mUMq7oJC6oWsSwEtKoBqoEwJq0nGGTDY7KX5r5umH1uEdR
SlIjZ8WoNgZVpZEXXOXZyKWU8yzEckVjEYMWNWZGuabuAXMzTDyB/J376vn9IYQO
ZkGYbu4B9CZssn0sDC671R/FHfPFk7L5jJ0sJgi5I3nDOApw+uvcGqw9r0AWmUfp
7BHNLZvvUNO/Z3yqF+YaeDhOIXHRbN0kq7fwi+lp4s5UUhPZDmNrrbVu+6HqJe9E
ghUST309zWUdtBToyzxkRb2U8rK8QDtkZppRPf+e/64RP8Fz1yNDPHG6HQX7cyYS
1VyMbA5AzzbUE2XORmfbPGrJ9WmQSX8JOJX1bxq/eu15VJAeDVYuMVHm0ekNhi3u
gwBvPXeyCwMCn0rQBGxbqoM9bHHj07vJ1FVaSWpWYjbTWo343itjVi54nYxgQxIA
12TW8xRI1H7IHkgKtaCvrINxHgyXinfZuWrNXVKHsiT02Y+F08eXp0xotaMg0bmj
nhvTw4wRZa0rHIFNFqj/yEKpIaaoDuPo1ZxT81NOAK4kxgzuFB/z2IsUXf+xJI4r
2UAWyryth5wFbfu0BBq95onGOI+MgZwpksVNekcS1P2pByox+pKKvUsqXeSBRHnx
DmiSZVME2zRUb+9jZHVfEVHBZBrMVzDn1pb9Fm1OId8cjeFupdA=
=ug/2
-----END PGP SIGNATURE-----
Merge tag 'fsi-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi into char-misc-next
Joel writes:
FSI changes for v5.16
- SBEFIFO usersapce interfaces to perform FFDC (First Failure
Data Capture) and detect timeouts
- A fix to handle multiple messages in flight
* tag 'fsi-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi:
fsi: sbefifo: Use interruptible mutex locking
fsi: sbefifo: Add sysfs file indicating a timeout error
docs: ABI: testing: Document the SBEFIFO timeout interface
hwmon: (occ) Provide the SBEFIFO FFDC in binary sysfs
docs: ABI: testing: Document the OCC hwmon FFDC binary interface
fsi: occ: Store the SBEFIFO FFDC in the user response buffer
fsi: occ: Use a large buffer for responses
hwmon: (occ) Remove sequence numbering and checksum calculation
fsi: occ: Force sequence numbering per OCC
Checksumming of the request and sequence numbering is now done in the
OCC interface driver in order to keep unique sequence numbers. So
remove those in the hwmon driver. Also, add the command length to the
send_cmd function pointer, since the checksum must be placed in the
last two bytes of the command. The submit interface must receive the
exact size of the command - previously it could be rounded to the
nearest 8 bytes with no consequence.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210721190231.117185-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The P10 (temp sensor version 0x10) doesn't do the same VRM status
reporting that was used on P9. It just reports the temperature, so
drop the check for VRM fru type in the sysfs show function, and don't
set the name to "alarm".
Fixes: db4919ec86 ("hwmon: (occ) Add new temperature sensor type")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20210929153604.14968-1-eajames@linux.ibm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
- Bug fixes for the OCC, SCOM and SBEFIFO drivers
- Performance fix for aspeed fsi master
- Small fixes from the mailing lists
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+nHMAt9PCBDH63wBa3ZZB4FHcJ4FAmDBm5YACgkQa3ZZB4FH
cJ6UtQ/8DwO1YMvokaILRK80up5mI5EOzNu3ULvrgSDw+cb4p8lYHElNipxkEo8X
g6LhCjAU1q2Y3/y7TdGBX01AxhMOsi4PUdXQCLln+2ku0fGYZgakOxSS4313mVyZ
e5nfhqM6RGFWCgXZctZwmO3wWkvoZCsBraN8wz92hQPYOD3IYGD/R+kcvuU8Ua4y
NxiPPYaBDrZUfgaxI+30eBYzzkS9x1lWU57mecqrI/lcO49GAv+y0WjEcYhlISSi
ue+//OGCgop2S4rJKIrBOXRk5A1zjY7vm2kYdji6j7usKsURzBQn4/ARdg5q8txt
swoK9s4HgAKVg3PeXiqrCjV6VhyU2YMn+tCtCS+XRZoWxWGKBd8dyzbllEmMhYfJ
j3Lsy1VVOX82bDBHtE6trOQHx7vCm9ab+2EIvG+QWmWVYhxS2TdstFQzwuNCpvKV
BzeFEjkbo0F//PGfmVpJ0K4061qeFO4YN+7LLfxuDfgoXnZqexmcIks16GEJoxhC
TdsqneYtOUBy5FEcRVMehL35NiHpEHk3Quo8po46fVkc0FRBxAhaZrpmfll2HZ9k
3P9zXxKJa/8kdw8fqIbQcc8AN0ixcJ4VOmqaChkXAsoLzvPBbwPwgsDVXSfSIn6/
UUja2lE2bSCPLiFV7kl9deKsZvQUONZJwlOnakkTYx3wrO+qDhU=
=xHRF
-----END PGP SIGNATURE-----
Merge tag 'fsi-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi into char-misc-next
Joel writes:
FSI changes for v5.14
- Bug fixes for the OCC, SCOM and SBEFIFO drivers
- Performance fix for aspeed fsi master
- Small fixes from the mailing lists
* tag 'fsi-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi:
fsi/sbefifo: Fix reset timeout
fsi/sbefifo: Clean up correct FIFO when receiving reset request from SBE
fsi: master-ast-cf: Remove redundant error printing in fsi_master_acf_probe()
fsi: Aspeed: Reduce poll timeout
fsi: aspeed: convert to devm_platform_ioremap_resource
hwmon: (occ) Print response status in first poll error message
hwmon: (occ) Start sequence number at one
fsi: occ: Log error for checksum failure
fsi: occ: Don't accept response from un-initialized OCC
fsi: scom: Remove retries
fsi: scom: Reset the FSI2PIB engine for any error
fsi: aspeed: Emit fewer barriers in opb operations
fsi: core: Fix return of error values on failures
fsi: Add missing MODULE_DEVICE_TABLE
In order to better debug problems starting up the driver, print
the response status from the OCC in the error logged when the first
poll command fails.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210209171235.20624-5-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Initialize the sequence number at one, rather than zero, in order
to prevent false matches with the zero-initialized OCC SRAM
buffer before the OCC is fully initialized.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210209171235.20624-4-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The poll rate limiter time was initialized at zero. This breaks the
comparison in time_after if jiffies is large. Switch to storing the
next update time rather than the previous time, and initialize the
time when the device is probed.
Fixes: c10e753d43 ("hwmon (occ): Add sensor types and versions")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20210429151336.18980-1-eajames@linux.ibm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
coccicheck complains about the use of snprintf() in sysfs
show functions.
drivers/hwmon/ina3221.c:701:8-16: WARNING: use scnprintf or sprintf
This results in a large number of patch submissions. Fix it all in
one go using the following coccinelle rules. Use sysfs_emit instead
of scnprintf or sprintf since that makes more sense.
@depends on patch@
identifier show, dev, attr, buf;
@@
ssize_t show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
return
- snprintf(buf, \( PAGE_SIZE \| PAGE_SIZE - 1 \),
+ sysfs_emit(buf,
...);
...>
}
@depends on patch@
identifier show, dev, attr, buf, rc;
@@
ssize_t show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
rc =
- snprintf(buf, \( PAGE_SIZE \| PAGE_SIZE - 1 \),
+ sysfs_emit(buf,
...);
...>
}
While at it, remove unnecessary braces and as well as unnecessary
else after return statements to address checkpatch warnings in the
resulting patch.
Cc: Zihao Tang <tangzihao1@hisilicon.com>
Cc: Jay Fang <f.fangjian@huawei.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The latest version of the On-Chip Controller (OCC) has a different
format for the temperature sensor data. Add a new temperature sensor
version to handle this data.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20201120010315.190737-4-joel@jms.id.au
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The code in occ_get_powr_avg() invokes div64_u64() without checking the
divisor. In case the divisor is zero, kernel gets an "Division by zero
in kernel" error.
Check the divisor and make it return 0 if the divisor is 0.
Fixes: c10e753d43 ("hwmon (occ): Add sensor types and versions")
Signed-off-by: Lei YU <mine260309@gmail.com>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/1562813088-23708-1-git-send-email-mine260309@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Here is the "large" pull request for char and misc and other assorted
smaller driver subsystems for 5.3-rc1.
It seems that this tree is becoming the funnel point of lots of smaller
driver subsystems, which is fine for me, but that's why it is getting
larger over time and does not just contain stuff under drivers/char/ and
drivers/misc.
Lots of small updates all over the place here from different driver
subsystems:
- habana driver updates
- coresight driver updates
- documentation file movements and updates
- Android binder fixes and updates
- extcon driver updates
- google firmware driver updates
- fsi driver updates
- smaller misc and char driver updates
- soundwire driver updates
- nvmem driver updates
- w1 driver fixes
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSXmoQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylV9wCgyJGbpPch8v/ecrZGFHYS4sIMexIAoMco3zf6
wnqFmXiz1O0tyo1sgV9R
=7sqO
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver updates from Greg KH:
"Here is the "large" pull request for char and misc and other assorted
smaller driver subsystems for 5.3-rc1.
It seems that this tree is becoming the funnel point of lots of
smaller driver subsystems, which is fine for me, but that's why it is
getting larger over time and does not just contain stuff under
drivers/char/ and drivers/misc.
Lots of small updates all over the place here from different driver
subsystems:
- habana driver updates
- coresight driver updates
- documentation file movements and updates
- Android binder fixes and updates
- extcon driver updates
- google firmware driver updates
- fsi driver updates
- smaller misc and char driver updates
- soundwire driver updates
- nvmem driver updates
- w1 driver fixes
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (188 commits)
coresight: Do not default to CPU0 for missing CPU phandle
dt-bindings: coresight: Change CPU phandle to required property
ocxl: Allow contexts to be attached with a NULL mm
fsi: sbefifo: Don't fail operations when in SBE IPL state
coresight: tmc: Smatch: Fix potential NULL pointer dereference
coresight: etm3x: Smatch: Fix potential NULL pointer dereference
coresight: Potential uninitialized variable in probe()
coresight: etb10: Do not call smp_processor_id from preemptible
coresight: tmc-etf: Do not call smp_processor_id from preemptible
coresight: tmc-etr: alloc_perf_buf: Do not call smp_processor_id from preemptible
coresight: tmc-etr: Do not call smp_processor_id() from preemptible
docs: misc-devices: convert files without extension to ReST
fpga: dfl: fme: align PR buffer size per PR datawidth
fpga: dfl: fme: remove copy_to_user() in ioctl for PR
fpga: dfl-fme-mgr: fix FME_PR_INTFC_ID register address.
intel_th: msu: Start read iterator from a non-empty window
intel_th: msu: Split sgt array and pointer in multiwindow mode
intel_th: msu: Support multipage blocks
intel_th: pci: Add Ice Lake NNPI support
intel_th: msu: Fix single mode with disabled IOMMU
...
Sequence numbering of the commands submitted to the OCC is required by
the OCC interface specification. Add sequence numbering and check for
the correct sequence number on the response.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Lei YU <mine260309@gmail.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
The occ driver supports two formats for the temp sensor value.
The OCC firmware for P8 supports only the first format, for which
no range checking or error processing is performed in the driver.
Inspecting the OCC sources for P8 reveals that OCC may send
a special value 0xFFFF to indicate that a sensor read timeout
has occurred, see
https://github.com/open-power/occ/blob/master_p8/src/occ/cmdh/cmdh_fsp_cmds.c#L395
That situation wasn't handled in the driver. This patch adds invalid
temp value check for the sensor data format 1 and handles it the same
way as it is done for the format 2, where EREMOTEIO is reported for
this case.
Signed-off-by: Alexander Soldatov <a.soldatov@yadro.com>
Signed-off-by: Alexander Amelkin <a.amelkin@yadro.com>
Reviewed-by: Alexander Amelkin <a.amelkin@yadro.com>
Cc: Edward A. James <eajames@us.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The OCC driver limits the rate of sending poll commands to the OCC. If a
user reads a hwmon entry after a poll response resulted in an error and
is rate-limited, the error is invisible to the user. Fix this by storing
the last error and returning that in the rate-limited case.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Instead of duplicating the common code into the 2 (binary) drivers,
move the common code to a separate module. This is cleaner.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Eddie James <eajames@linux.ibm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Tested-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
In the case of power sensor version 0xA0, the sensor indexing overlapped
with the "caps" power sensors, resulting in probe failure and kernel
warnings. Fix this by specifying the next index for each power sensor
version.
Fixes: 54076cb3b5 ("hwmon (occ): Add sensor attributes and register ...")
Cc: stable@vger.kernel.org
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Tested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cast get_unaligned_be32(...) to u64 in order to give the compiler
complete information about the proper arithmetic to use and avoid
a potential integer overflow.
Notice that such function call is used in contexts that expect
expressions of type u64 (64 bits, unsigned); and the following
expressions are currently being evaluated using 32-bit
arithmetic:
val = get_unaligned_be32(&power->update_tag) *
occ->powr_sample_time_us;
val = get_unaligned_be32(&power->vdn.update_tag) *
occ->powr_sample_time_us;
Addresses-Coverity-ID: 1442357 ("Unintentional integer overflow")
Addresses-Coverity-ID: 1442476 ("Unintentional integer overflow")
Addresses-Coverity-ID: 1442508 ("Unintentional integer overflow")
Fixes: ff692d80b2e2 ("hwmon (occ): Add sensor types and versions")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The OCC provides a variety of additional information about the state of
the host processor, such as throttling, error conditions, and the number
of OCCs detected in the system. This information is essential to service
processor applications such as fan control and host management.
Therefore, export this data in the form of sysfs attributes attached to
the platform device (to which the hwmon device is also attached).
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Setup the sensor attributes for every OCC sensor found by the first poll
response. Register the attributes with hwmon.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Add structures to define all sensor types and versions. Add sysfs show
and store functions for each sensor type. Add a method to construct the
"set user power cap" command and send it to the OCC. Add rate limit to
polling the OCC (in case user-space reads our hwmon entries rapidly).
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Add method to parse the response from the OCC poll command. This only
needs to be done during probe(), since the OCC shouldn't change the
number or format of sensors while it's running. The parsed response
allows quick access to sensor data, as well as information on the
number and version of sensors, which we need to instantiate hwmon
attributes.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The OCC is a device embedded on a POWER processor that collects and
aggregates sensor data from the processor and system. The OCC can
provide the raw sensor data as well as perform thermal and power
management on the system.
This driver provides a hwmon interface to the OCC from a service
processor (e.g. a BMC). The driver supports both POWER8 and POWER9 OCCs.
Communications with the POWER8 OCC are established over standard I2C
bus. The driver communicates with the POWER9 OCC through the FSI-based
OCC driver, which handles the lower-level communication details.
This patch lays out the structure of the OCC hwmon driver. There are two
platform drivers, one each for P8 and P9 OCCs. These are probed through
the I2C tree and the FSI-based OCC driver, respectively. The patch also
defines the first common structures and methods between the two OCC
versions.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
[groeck: Fix up SPDX license identifier]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>