Commit Graph

1332 Commits

Author SHA1 Message Date
Yazen Ghannam eb77e6b80f EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h
The wrong index into the csbases/csmasks arrays was being passed to
the function to compute the chip select sizes, which resulted in the
wrong size being computed. Address that so that the correct values are
computed and printed.

Also, redo how we calculate the number of pages in a CS row.

Reported-by: Benjamin Bennett <benbennett@gmail.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: <stable@vger.kernel.org> # 4.10.x
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1493313114-11260-1-git-send-email-Yazen.Ghannam@amd.com
[ Remove unneeded integer math comment, minor cleanups. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-03 16:27:36 +02:00
Borislav Petkov f8d5549df2 EDAC, ghes: Do not enable it by default
Leave it to the user to decide whether to enable this or not. Otherwise,
platform-specific drivers won't initialize (currently, EDAC supports
only a single platform driver loaded).

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-27 14:15:38 +02:00
Borislav Petkov bffc7dece9 EDAC: Rename report status accessors
Change them to have the edac_ prefix.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:15:02 +02:00
Borislav Petkov fee27d7d97 EDAC: Delete edac_stub.c
Move the remaining functionality to edac_mc.c. Convert "edac_report=" to
a module parameter.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:48 +02:00
Borislav Petkov a06b85ff07 EDAC: Update Kconfig help text
Remove the old URLs.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:44 +02:00
Borislav Petkov e3c4ff6d8c EDAC: Remove EDAC_MM_EDAC
Move all the EDAC core functionality behind CONFIG_EDAC and get rid of
that indirection. Update defconfigs which had it.

While at it, fix dependencies such that EDAC depends on RAS for the
tracepoints.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: linux-edac@vger.kernel.org
2017-04-10 17:14:41 +02:00
Borislav Petkov be1d162948 EDAC: Issue tracepoint only when it is defined
... and this happens only when CONFIG_RAS is enabled.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:38 +02:00
Borislav Petkov 8c22b4fece EDAC: Move edac_op_state to edac_mc.c
... as part of moving stuff away from edac_stub.c

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:29 +02:00
Borislav Petkov d3116a0837 EDAC: Remove edac_err_assert
... and the glue around it. It is not needed anymore.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:21 +02:00
Borislav Petkov 97bb6c17ad EDAC: Get rid of edac_handlers
Use mc_devices list instead to check whether we have EDAC driver
instances successfully registered with EDAC core.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:17 +02:00
Borislav Petkov db47d5f856 x86/nmi, EDAC: Get rid of DRAM error reporting thru PCI SERR NMI
Apparently, some machines used to report DRAM errors through a PCI SERR
NMI. This is why we have a call into EDAC in the NMI handler. See

  c0d1217202 ("drivers/edac: add new nmi rescan").

From looking at the patch above, that's two drivers: e752x_edac.c and
e7xxx_edac.c. Now, I wanna say those are old machines which are probably
decommissioned already.

Tony says that "[t]the newest CPU supported by either of those drivers
is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some
folks are still using these ... but people that hold onto h/w for 7
years generally cling to old s/w too ... so I'd guess it unlikely that
we will get complaints for breaking these in upstream."

So even if there is a small number still in use, we did load EDAC with
edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact)
which means a default EDAC setup without any parameters supplied on the
command line or otherwise would never even log the error in the NMI
handler because we're polling by default:

  inline int edac_handler_set(void)
  {
         if (edac_op_state == EDAC_OPSTATE_POLL)
                 return 0;

         return atomic_read(&edac_handlers);
  }

So, long story short, I'd like to get rid of that nastiness called
edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If
we ever have to do stuff like that again, it should be notifiers we're
using and not some insanity like this one.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
2017-04-10 17:13:48 +02:00
Borislav Petkov 76f6a26ce9 EDAC, highbank: Align Makefile directives
... like the rest of the file.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:10:43 +02:00
Sergey Temerkhanov 5195c206fd EDAC, thunderx: Remove unused code
Remove unused code reserved for upcoming CPUs.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jan.Glauber@cavium.com
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170406113834.17153-1-s.temerkhanov@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-07 11:49:32 +02:00
Sergey Temerkhanov 3d2d8c0f84 EDAC, thunderx: Change LMC index calculation
Shift the node number by 3 bits instead of 8 allowing proper functioning
with default EDAC_MAX_MCS.

Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jan.Glauber@cavium.com
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170406113755.17082-1-s.temerkhanov@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-07 11:47:44 +02:00
Thor Thayer 25b223ddfe EDAC, altera: Fix peripheral warnings for Cyclone5
The peripherals' RAS functionality only exist on the Arria10 SoCFPGA.
The Cyclone5 initialization generates EDAC warnings when the peripherals
aren't found in the device tree. Fix by checking for Arria10 in the init
functions.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1491415262-5018-1-git-send-email-thor.thayer@linux.intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-06 11:42:38 +02:00
Jan Glauber 621c4fe3cc EDAC, thunderx: Fix L2C MCI interrupt disable
Fix a typo that disabled the MCI interrupts using the wrong bitmask.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Sergey Temerkhanov <s.temerkhanov@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170405102739.6301-1-jglauber@cavium.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-05 14:46:05 +02:00
Sergey Temerkhanov 41003396f9 EDAC, thunderx: Add Cavium ThunderX EDAC driver
Add support for Cavium ThunderX EDAC capable on-chip peripherals, namely
the DRAM controller (LMC), cache coherent processor interconnect (CCPI)
and level 2 cache blocks (L2C-TAD, L2C-MCI, L2C-CBC)

Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com>
Cc: David.Daney@cavium.com
Cc: Jan.Glauber@cavium.com
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170324222837.60583-1-s.temerkhanov@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-27 11:43:56 +02:00
Qiuxu Zhuo 819f60fb7d EDAC, pnd2_edac: Fix reported DIMM number
DIMM number passed to edac_mc_handle_error() was accidentally hardcoded
to zero. Pass in the correct daddr->dimm value.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-26 09:36:28 +02:00
Borislav Petkov cd1be315ac EDAC, pnd2_edac: Fix !EDAC_DEBUG build
Provide debugfs function stubs when EDAC_DEBUG is not enabled so that we
don't fail the build:

  drivers/edac/pnd2_edac.c: In function ‘pnd2_init’:
  drivers/edac/pnd2_edac.c:1521:2: error: implicit declaration of function ‘setup_pnd2_debug’ [-Werror=implicit-function-declaration]
    setup_pnd2_debug();
    ^
  drivers/edac/pnd2_edac.c: In function ‘pnd2_exit’:
  drivers/edac/pnd2_edac.c:1529:2: error: implicit declaration of function ‘teardown_pnd2_debug’ [-Werror=implicit-function-declaration]
    teardown_pnd2_debug();
    ^

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-23 12:56:23 +01:00
Borislav Petkov 1c5bf78114 EDAC: Select DEBUG_FS
The debugfs.c functionality relies on DEBUG_FS so select it.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-23 12:56:09 +01:00
Tony Luck 5c71ad17f9 EDAC, pnd2_edac: Add new EDAC driver for Intel SoC platforms
Initial target for this driver is the Intel Apollo Lake platform and
Denverton micro-server, they use the same internal memory controller IP
called Pondicherry2.

Memory controller registers are not in PCI config space like earlier
Intel memory controllers. For Apollo Lake platform they are accessed via
a "side-band" interface, for Denverton micro-server they are access via
PCI config space and memory map I/O. This driver is for Apollo Lake and
Denverton, but only the Denverton is fully enabled while we wait for the
sideband driver.

Apollo lake driver and initial cut at Denverton driver by Tony Luck.
Extensive cleanup, refactoring and basic verification by Qiuxu Zhuo.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170308174539.14432-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-16 12:40:52 +01:00
Jérémy Lefaure e61555c29c EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro
The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used
as if it returned a boolean true if the width if 8. Fix the tests where
MTR_DRAM_WIDTH is misused.

Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170309011809.8340-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-09 09:25:29 +01:00
Colin Ian King 4bd035eae2 EDAC, xgene: Fix wrongly spelled "procesing"
Fix spelling mistake in dev_err message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Loc Ho <lho@apm.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170223002609.9440-1-colin.king@canonical.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-06 17:08:19 +01:00
Linus Torvalds 60c906bab1 Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
 "The main changes in this cycle were:

  - Assign notifier chain priorities for all RAS related handlers to
    make the ordering explicit (Borislav Petkov)

  - Improve the AMD MCA banks sysfs output (Yazen Ghannam)

  - Various cleanups and restructuring of the x86 RAS code (Borislav
    Petkov)"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority
  x86/ras: Get rid of mce_process_work()
  EDAC/mce/amd: Dump TSC value
  EDAC/mce/amd: Unexport amd_decode_mce()
  x86/ras/amd/inj: Change dependency
  x86/ras: Flip the TSC-adding logic
  x86/ras/amd: Make sysfs names of banks more user-friendly
  x86/ras/therm_throt: Do not log a fake MCE for thermal events
  x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
2017-02-20 12:47:44 -08:00
Yazen Ghannam 75bf2f6478 EDAC, mce_amd: Print IPID and Syndrome on a separate line
Currently, the IPID and Syndrome are printed on the same line as the
Address. There are cases when we can have a valid Syndrome but not a
valid Address.

For example, the MCA_SYND register can be used to hold more detailed
error info that the hardware folks can use. It's not just DRAM ECC
syndromes. There are some error types that aren't related to memory that
may have valid syndromes, like some errors related to links in the Data
Fabric, etc.

In these cases, the IPID and Syndrome are not printed at the same log
level as the rest of the stanza, so users won't see them on the console.

Console:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Dmesg:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Print the IPID first and on a new line. The IPID should always be
printed on SMCA systems. The Syndrome will then be printed with the IPID
and at the same log level when valid:

  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-02-16 15:39:32 +01:00
Borislav Petkov e62d2ca9d0 EDAC, amd64: Bump driver version
Last time we did that was when we enabled Bulldozer. Now, we enabled Zen
so it is only natural ... :-)

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
2017-02-14 11:58:05 +01:00
Wei Yongjun 279fa58035 EDAC, fsl_ddr: Make locally used symbols static
Fix the following sparse warnings:

  drivers/edac/fsl_ddr_edac.c:148:1: warning:
   symbol 'dev_attr_inject_data_hi' was not declared. Should it be static?
  drivers/edac/fsl_ddr_edac.c:150:1: warning:
   symbol 'dev_attr_inject_data_lo' was not declared. Should it be static?
  drivers/edac/fsl_ddr_edac.c:152:1: warning:
   symbol 'dev_attr_inject_ctrl' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170209150424.15124-1-weiyj.lk@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-02-09 17:40:54 +01:00
Chris Packham 321d17c19b EDAC, mpc85xx: Add T2080 l2-cache support
The L2 cache controller on the T2080 SoC has similar capabilities to the
others already supported by the mpc85xx_edac driver. Add it to the list
of compatible devices.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: devicetree@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170201231624.28843-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-02-03 10:36:35 +01:00
Yazen Ghannam 1bd9900b83 EDAC, amd64: Add x86cpuid sanity check during init
Match one of the devices in amd64_cpuids[] before loading the module.
This is an additional sanity check against users trying to load
amd64_edac_mod on unsupported systems.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com
[ Get rid of err_ret label, make it a bit more readable this way. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 14:43:06 +01:00
Yazen Ghannam 4688c9b42d EDAC, amd64: Don't treat ECC disabled as failure
Having ECC disabled on a node doesn't necessarily mean that it's
disabled for the entire system. So let's return a non-failing code when
ECC is disabled on a node. This way we can skip initialization for the
node but still continue with the remaining nodes.

After probing all instances, make sure we have at least one MC device
allocated.

This issue is seen and fix tested on Fam15h and Fam17h MCM systems.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485537863-2707-8-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 14:38:49 +01:00
Yazen Ghannam d7fc9d77ac EDAC: Add routine to check if MC devices list is empty
We need to know if any MC devices have been allocated.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485537863-2707-7-git-send-email-Yazen.Ghannam@amd.com
[ Prettify text. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 14:36:47 +01:00
Yazen Ghannam df64636fa4 EDAC, amd64: Remove unused printing macros
amd64_{debug,notice} don't have any users, so remove them.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485537863-2707-6-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 13:20:06 +01:00
Yazen Ghannam 11ab1cae58 EDAC, amd64: Rework messages in ecc_enabled()
Print the node number when informing that DRAM ECC is disabled so
that we can show which nodes have DRAM ECC disabled. Also, print more
detailed system information as edac_dbg(), so as to not bother general
users.

Switch amd64_notice to amd64_info to match the message above it.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485537863-2707-5-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 13:18:33 +01:00
Yazen Ghannam 234365f56e EDAC, amd64: Move global code out of instance functions
We have a few functions that register/unregister an ECC error decoding
routine. These functions are called when we init/remove instances.
However, they are global and so don't need to be registered/unregistered
multiple times.

So move them out of the init/remove instance functions and into the
module init/exit routines.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485297149-13733-4-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 13:08:10 +01:00
Yazen Ghannam 2b9b2c4659 EDAC, amd64: Free unused memory when init_one_instance() fails
Jump to memory freeing routines when init_one_instance() fails.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485297149-13733-3-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 13:03:40 +01:00
Yazen Ghannam 67d7fd306e EDAC, mce_amd: Give more context to deferred error message
Users may not be familiar with the concept of deferred errors. There is
no action for users to take on this type of error, so give more context
in the error message to make this more clear.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28 13:03:29 +01:00
Borislav Petkov 58fb24cb95 EDAC, i7300: Test for the second channel properly
REDMEMB[17] is the ECC_Locator bit, which, when set, identifies the
CS[3:2] as the simbols in error. And thus the second channel.

The macro computing it was wrong so get rid of it (it was used at one
place only) and get rid of the conditional too. Generates better code
this way anyway.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-01-26 11:35:23 +01:00
Borislav Petkov 9026cc82b6 x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority
Assign all notifiers on the MCE decode chain a priority so that they get
called in the correct order.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24 09:14:57 +01:00
Borislav Petkov 0bceab677d EDAC/mce/amd: Dump TSC value
Dump the TSC value of the time when the MCE got logged.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-8-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24 09:14:56 +01:00
Borislav Petkov 1fbcd90903 EDAC/mce/amd: Unexport amd_decode_mce()
It is not used outside of the driver anymore.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-7-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24 09:14:55 +01:00
Nicolas Iooss 127c1225bf EDAC, sb_edac: Get rid of ->show_interleave_mode()
Function sbridge_register_mci() sets pvt->info.show_interleave_mode
to knl_show_interleave_mode() on Knight's Landing and
show_interleave_mode() anywhere else.

Merge show_interleave_mode() and knl_show_interleave_mode() in a single
implementation and use it without an indirect function pointer.

Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170122172806.10412-1-nicolas.iooss_linux@m4x.org
[ Call it get_intlv_mode_str(). ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-23 11:39:48 +01:00
Aaron Miller 4fb6fde74d EDAC: Expose per-DIMM error counts in sysfs
The old csrowX sysfs directories have per-csrow error counters, but the
new dimmX directories do not currently expose error counts.

EDAC already keeps these counts, add them to sysfs so per-DIMM counts
are still available when CONFIG_EDAC_LEGACY_SYSFS=n.

Signed-off-by: Aaron Miller <aaronmiller@fb.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20161103220153.3997328-1-aaronmiller@fb.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-19 10:29:40 +01:00
Yazen Ghannam 2287c63643 EDAC, amd64: Save and return err code from probe_one_instance()
We should save the return code from probe_one_instance() so that it can
be returned from the module init function. Otherwise, we'll be returning
the -ENOMEM from above.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1484322741-41884-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-16 11:53:39 +01:00
Arvind Yadav f5c61277f6 EDAC, i82975x: Add ioremap_nocache() error handling
If ioremap_nocache() fails, it will return NULL. Which will then cause a
NULL-pointer dereference. Handle the returned value properly.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1484549092-11349-1-git-send-email-arvind.yadav.cs@gmail.com
[ Boris: massage commit message and improve error message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-16 11:09:22 +01:00
Ben Dooks 628ea92f09 EDAC: Make dev_attr_sdram_scrub_rate static
The dev_attr_sdram_scrub_rate is not declared in a header or used
anywhere else, so make it static to fix the following warning:

  drivers/edac/edac_mc_sysfs.c:816:1: warning: symbol
  'dev_attr_sdram_scrub_rate' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1465407356-7357-1-git-send-email-ben.dooks@codethink.co.uk
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-06 15:54:16 +01:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Mauro Carvalho Chehab 66c222a02f edac: fix kernel-doc tags at the drivers/edac_*.h
Some kernel-doc tags don't provide good descriptions or use
a different style. Adjust them.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-12-15 08:58:10 -02:00
Mauro Carvalho Chehab 6634fbb6b6 driver-api: create an edac.rst file with EDAC documentation
Currently, there's no device driver documentation for the EDAC
subsystem at the driver-api book. Fill in the blanks for the
structures and functions that misses documentation, uniform
the word on the existing ones, and add a new edac.rst file at
driver-api, in order to document the EDAC subsystem.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-12-15 08:54:52 -02:00
Mauro Carvalho Chehab e01aa14cf2 edac: move documentation from edac_mc.c to edac_core.h
Several functions are documented at edac_mc.c.

As we'll be including edac_core.h at drivers-api book, move
those, in order for the kernel-doc markups be part of the API
documentation book.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-12-15 08:54:52 -02:00
Mauro Carvalho Chehab fdaf0b3505 edac: move documentation from edac_pci*.c to edac_pci.h
Several functions are documented at edac_pci.c and edac_pci_sysfs.c.

As we'll be including edac_pci.h at drivers-api book, move those,
in order for the kernel-doc markups be part of the API
documentation book.

As several of those kernel-doc macros are not in the right format,
fix them.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-12-15 08:54:51 -02:00