edac.txt: update information about newer Intel CPUs

There's a chapter at edac.rst written by the time Nehalem support was
added. Such information is used not only by the Nehalem driver
(i7core_edac), but by all newer Intel CPU architectures that are
supported by i7core_edac, sb_edac and sbx_edac drivers. Update the
information to reflect that.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
parent 96714bd707
commit e4b5301674

@@ -741,13 +741,25 @@ The ``test_device_edac`` sample driver is located at the
 http://bluesmoke.sourceforge.net project site for EDAC.
 
 
-Nehalem Usage of EDAC APIs
---------------------------
+Usage of EDAC APIs on Nehalem and newer Intel CPUs
+--------------------------------------------------
 
-Due to the way Nehalem exports Memory Controller data, some adjustments
-were done at i7core_edac driver. This chapter will cover those differences
+On older Intel architectures, the memory controller was part of the North
+Bridge chipset. Nehalem, Sandy Bridge, Ivy Bridge, Haswell, Sky Lake and
+newer Intel architectures integrated an enhanced version of the memory
+controller (MC) inside the CPUs.
 
-1) On Nehalem, there is one Memory Controller per Quick Patch Interconnect
+This chapter will cover the differences of the enhanced memory controllers
+found on newer Intel CPUs, such as ``i7core_edac``, ``sb_edac`` and
+``sbx_edac`` drivers.
+
+.. note::
+
+   The Xeon E7 processor families use a separate chip for the memory
+   controller, called Intel Scalable Memory Buffer. This section doesn't
+   apply for such families.
+
+1) There is one Memory Controller per Quick Patch Interconnect
 (QPI). At the driver, the term "socket" means one QPI. This is
 associated with a physical CPU socket.
 
@@ -757,7 +769,7 @@ were done at i7core_edac driver. This chapter will cover those differences
 
 The minimum known unity is DIMMs. There are no information about csrows.
 As EDAC API maps the minimum unity is csrows, the driver sequentially
-maps channel/dimm into different csrows.
+maps channel/DIMM into different csrows.
 
 For example, supposing the following layout::
 
@@ -780,8 +792,8 @@ were done at i7core_edac driver. This chapter will cover those differences
 
 Each QPI is exported as a different memory controller.
 
-2) Nehalem MC has the ability to generate errors. The driver implements this
-functionality via some error injection nodes:
+2) The MC has the ability to inject errors to test drivers. The drivers
+implement this functionality via some error injection nodes:
 
 For injecting a memory error, there are some sysfs nodes, under
 ``/sys/devices/system/edac/mc/mc?/``:
@@ -855,13 +867,14 @@ were done at i7core_edac driver. This chapter will cover those differences
 
 EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error))
 
-3) Nehalem specific Corrected Error memory counters
+3) Corrected Error memory register counters
 
-Nehalem have some registers to count memory errors. The driver uses those
-registers to report Corrected Errors on devices with Registered Dimms.
+Those newer MCs have some registers to count memory errors. The driver
+uses those registers to report Corrected Errors on devices with Registered
+DIMMs.
 
-However, those counters don't work with Unregistered Dimms. As the chipset
-offers some counters that also work with UDIMMS (but with a worse level of
+However, those counters don't work with Unregistered DIMM. As the chipset
+offers some counters that also work with UDIMMs (but with a worse level of
 granularity than the default ones), the driver exposes those registers for
 UDIMM memories.
 
@@ -896,8 +909,8 @@ were done at i7core_edac driver. This chapter will cover those differences
 4) Standard error counters
 
 The standard error counters are generated when an mcelog error is received
-by the driver. Since, with udimm, this is counted by software, it is
-possible that some errors could be lost. With rdimm's, they display the
+by the driver. Since, with UDIMM, this is counted by software, it is
+possible that some errors could be lost. With RDIMM's, they display the
 contents of the registers
 
 Reference documents used on ``amd64_edac``
@@ -958,6 +971,7 @@ Credits
 * |copy| Mauro Carvalho Chehab
 
 - 05 Aug 2009 Nehalem interface
+- 26 Oct 2016 Converted to ReST and cleanups at the Nehalem section
 
 * EDAC authors/maintainers:
 