OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Gustavo A. R. Silva	a8e9b186f1	EDAC, sb_edac: Fix missing break in switch Add missing break statement in order to prevent the code from falling through. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20171016174029.GA19757@embeddedor.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-10-19 10:53:42 +02:00
Luis Felipe Sandoval Castro	24281a2f4c	EDAC, sb_edac: Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode When figuring out the size of the DIMMs and the cluster mode is SNC2 or SNC4 the current algorithm ignores the contribution of some of the channels resulting in EDAC never knowing of the existence of some DIMMs attached to such channels (thus sysfs is not populated). Instead of selectively iterating from 0 to interlv_ways when looking for all the participants in the interleave, do an exhaustive search and iterate from 0 to KNL_MAX_CHANNELS. The algorithm is already smart enough to consider participants only one time. This works fine in all KNL cluster modes and even when there are missing DIMMs as the contribution of those channels is 0. Signed-off-by: Luis Felipe Sandoval Castro <luis.felipe.sandoval.castro@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: arozansk@redhat.com Cc: linux-edac <linux-edac@vger.kernel.org> Cc: qiuxu.zhuo@intel.com Link: http://lkml.kernel.org/r/1506606882-90521-1-git-send-email-luis.felipe.sandoval.castro@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-10-11 15:57:25 +02:00
Tony Luck	88ae80aa60	EDAC, skx_edac: Handle systems with segmented PCI busses Large systems separate their PCI busses into segments since the limit of only 256 PCI busses can be too restrictive. Extend this driver to check whether <segment, bus-number> matches when deciding how to group memory controller PCI devices to CPU sockets. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Charles Rose <charles.rose@dell.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/f58abfd10bf73c8bc5adc1fe4de7408128b00625.1506358467.git.tony.luck@intel.com [ Make skx_dev.seg an int. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-28 18:01:55 +02:00
Jan Glauber	f821fe8cc7	EDAC, thunderx: Remove suspend/resume support The memory controller on ThunderX/OcteonTX systems does not support power management. Therefore remove the suspend/resume callbacks. Signed-off-by: Jan Glauber <jglauber@cavium.com> Cc: David Daney <david.daney@cavium.com> Cc: Jan Glauber <jglauber@cavium.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Zhangshaokun <zhangshaokun@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/20170925123502.17289-2-jglauber@cavium.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-27 17:35:58 +02:00
Charles Rose	a9c0a10888	EDAC, skx_edac: Fix detection of single-rank DIMMs Single-rank DIMMs did not get detected on Skylake/Kabylake systems due to wrong limit check. The single rank DIMM check is a simple typo. "0" is a legal value in this field meaning single rank. Signed-off-by: Charles Rose <charles.rose@dell.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/66df72d327c265fbf92fe25df96daa228a35f076.1506358467.git.tony.luck@intel.com [ Also fix debug message to print number of ranks. ] Signed-off-by: Tony Luck <tony.luck@intel.com> [ Expand commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-27 17:31:38 +02:00
Qiuxu Zhuo	15cc3ae001	EDAC, sb_edac: Don't create a second memory controller if HA1 is not present Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3) server (DELL PowerEdge 730xd): EDAC sbridge: Some needed devices are missing EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0 EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0 EDAC sbridge: Couldn't find mci handler EDAC sbridge: Couldn't find mci handler EDAC sbridge: Failed to register device with error -19. The refactored sb_edac driver creates the IMC1 (the 2nd memory controller) if any IMC1 device is present. In this case only HA1_TA of IMC1 was present, but the driver expected to find HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure. The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi Zhang inserted one DIMM per channel for each CPU, and did random error address injection test with this patch: 4024 addresses fell in TOLM hole area 12715 addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 12774 addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 12798 addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 12913 addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 12674 addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 12686 addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 12882 addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 12934 addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 106400 addresses were injected totally. The test result shows that all the 4 channels belong to IMC0 per CPU, so the server really only has one IMC per CPU. In the 1st page of chapter 2 in datasheet [2], it also says 'E5-2600 v3' implements either one or two IMCs. For CPUs with one IMC, IMC1 is not used and should be ignored. Thus, do not create a second memory controller if the key HA1 is absent. [1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz [2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf Reported-and-tested-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `e2f747b1f4` ("EDAC, sb_edac: Assign EDAC memory controller per h/w controller") Link: http://lkml.kernel.org/r/20170913104214.7325-1-qiuxu.zhuo@intel.com [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-27 12:15:43 +02:00
Toshi Kani	301375e764	EDAC: Add owner check to the x86 platform drivers Change x86 EDAC platform drivers to verify the module owner at the beginning of their module init functions. This allows them to fail their init immediately when ghes_edac is enabled. Similar change can be made to other edac drivers if necessary. Also, remove ".c" from module names of pnp2_edac, sb_edac, and skx_edac. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Suggested-by: Borislav Petkov <bp@alien8.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170823225447.15608-6-toshi.kani@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-25 13:09:39 +02:00
Toshi Kani	3877c7d1e2	EDAC: Add helper which returns the loaded platform driver Only a single EDAC platform driver can be loaded. When ghes_edac is enabled, an EDAC platform driver still attempts to register itself and fails in edac_mc_add_mc(). Add edac_get_owner() so that EDAC platform drivers can check the owner first. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Suggested-by: Borislav Petkov <bp@alien8.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170823225447.15608-5-toshi.kani@hpe.com [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-25 12:55:59 +02:00
Toshi Kani	5deed6b6a4	EDAC, ghes: Add platform check The ghes_edac driver was introduced in 2013 [1], but it has not been enabled by any distro yet. This driver obtains error info from firmware interfaces (APEI), which are not properly implemented on many platforms, as the driver says on load: This EDAC driver relies on BIOS to enumerate memory and get error reports. Unfortunately, not all BIOSes reflect the memory layout correctly. So, the end result of using this driver varies from vendor to vendor. If you find incorrect reports, please contact your hardware vendor to correct its BIOS. To get out from this situation, add a platform check to selectively enable the driver on platforms that are known to have proper APEI firmware implementation. "ghes_edac.force_load=1" skips this platform check. [1]: https://lkml.kernel.org/r/cover.1360931635.git.mchehab@redhat.com Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-acpi@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170823225447.15608-4-toshi.kani@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-25 12:55:12 +02:00
Borislav Petkov	0fe5f281f7	EDAC, ghes: Model a single, logical memory controller We're enumerating the DIMMs through a DMI walk and since we can't get any more detailed topological information about which DIMMs belong to which memory controller, convert it to a single, logical controller which contains all the DIMMs. The error reporting path from GHES ghes_edac_report_mem_error() doesn't get called in NMI context but add a warning about it to catch any changes in the future as if so, our locking scheme will be insufficient then. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-25 12:35:28 +02:00
Borislav Petkov	c9c8b4d6d0	EDAC, ghes: Remove symbol exports They're called from builtin code so no need for the exports. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-25 12:35:27 +02:00
Arvind Yadav	75f029c3a8	EDAC: Handle return value of kasprintf() kasprintf() can fail and we must check its return value. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: linux-edac@vger.kernel.org [ Merged into a single patch, small formatting fixups. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-09-21 12:18:44 +02:00
Borislav Petkov	398443471f	EDAC, mce_amd: Get rid of local var in amd_filter_mce() ... and use the macro for that. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-21 17:59:38 +02:00
Borislav Petkov	f3c0891c2f	EDAC, mce_amd: Get rid of most struct cpuinfo_x86 uses struct mce.cpuid contains CPUID(1).EAX which contains family, model and stepping and thus has enough information for our purposes. Thus get rid of some external dependencies which are not really needed. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-21 17:54:57 +02:00
Borislav Petkov	4ab1784b48	EDAC, mce_amd: Rename decode_smca_errors() to decode_smca_error() Singular fits better because it decodes a single error. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-21 17:44:09 +02:00
Bhumika Goyal	b2b3e7362e	EDAC: Make device_type const Make these const as they are only stored in the type field of a device structure, which is const. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1503130946-2854-2-git-send-email-bhumirks@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-20 13:12:32 +02:00
Qiuxu Zhuo	bc8f10babc	EDAC, pnd2: Properly toggle hidden state for P2SB PCI device Properly handle hidden state of P2SB PCI device (DEV:D, FUN:0) for Apollo Lake. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170814154905.21707-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-19 10:51:11 +02:00
Qiuxu Zhuo	5fd77cb3ba	EDAC, pnd2: Conditionally unhide/hide the P2SB PCI device to read BAR On Deverton server, the P2SB PCI device (DEV:1F, FUN:1) is used by multiple device drivers. If it's hidden by some device driver (e.g. with the i801 I2C driver, the commit `9424693035` ("i2c: i801: Create iTCO device on newer Intel PCHs") unconditionally hid the P2SB PCI device wrongly) it will make the pnd2_edac driver read out an invalid BAR value of 0xffffffff and then fail on ioremap(). Therefore, store the presence state of P2SB PCI device before unhiding it for reading BAR and restore the presence state after reading BAR. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-i2c@vger.kernel.org Link: http://lkml.kernel.org/r/20170814154845.21663-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-19 10:47:24 +02:00
Qiuxu Zhuo	d84676a9e1	EDAC, pnd2: Mask off the lower four bits of a BAR Bit[0] of BAR is always zero. Bit[2:1] and bit[3] of BAR contain the information of 'type' and the 'prefetchable' accordingly. Therefore, mask the lower four bits to retrieve the actual base address of a BAR. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170814154813.21619-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-19 10:33:30 +02:00
Christophe JAILLET	3eaef0fa39	EDAC, thunderx: Fix error handling path in thunderx_lmc_probe() Return the proper error value if ioremap() fails (and not 0). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: David Daney <david.daney@cavium.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/20170816045821.14165-1-christophe.jaillet@wanadoo.fr [ Massage commit message, remove newline. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-18 19:12:08 +02:00
Christophe JAILLET	8b073d945c	EDAC, altera: Fix error handling path in altr_edac_device_probe() Return the proper error value if devm_ioremap() fails (and not 0). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170816050506.14541-1-christophe.jaillet@wanadoo.fr [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-18 18:21:27 +02:00
Tony Luck	3e5d2bd191	EDAC, pnd2: Build in a minimal sideband driver for Apollo Lake I've been waing a long time for the generic sideband driver to appear. Patience has run out, so include the minimum here to just read registers. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Patrick Geary <patrickg@supermicro.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170803210536.5662-1-tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-04 05:58:23 +02:00
Qiuxu Zhuo	039d7af651	EDAC, sb_edac: Classify memory mirroring modes Basically, there are full memory mirroring and address range partial memory mirroring (supported by Haswell EX and Broadwell EX) modes. a) In full memory mirroring, the memory behind each memory controller is mirrored, i.e. the memory is split into two identical mirrors (primary and secondary), half of the memory is reserved for redundancy. b) In address range partial memory mirroring, the memory size (range) of primary and secondary behind each memory controller can be user defined by the TAD0 register. The rest of memory ranges defined by TAD1/TAD2/... in that memory controller are non-mirrored. For more detail on memory mirroring, see the following link written by Tony Luck: https://01.org/lkp/blogs/tonyluck/2016/address-range-partial-memory-mirroring-linux Currently the sb_edac driver only supports address decoding in full memory mirroring and non-mirroring modes. In address range partial memory mirroring mode, it may fail to decode an address that falls in a non-mirroring area (the following was one of this kind of failed logs). mce: Uncorrected hardware memory error in user-access at 566d53a400 Memory failure: 0x566d53a: Killing einj_mem_uc:4647 due to hardware memory corruption Memory failure: 0x566d53a: recovery action for dirty LRU page: Recovered mce: [Hardware Error]: Machine check events logged EDAC sbridge MC1: HANDLING MCE MEMORY ERROR EDAC sbridge MC1: CPU 48: Machine Check Event: 0 Bank 7: ec00000000010090 EDAC sbridge MC1: TSC 4b914aa5a99dab EDAC sbridge MC1: ADDR 566d53a400 EDAC sbridge MC1: MISC 1443a0c86 EDAC sbridge MC1: PROCESSOR 0:406f1 TIME 1499712764 SOCKET 2 APIC 80 EDAC MC1: 0 UE Can't discover the memory rank for ch addr 0x7fb54e900 on any memory ( page:0x0 offset:0x0 grain:32) mce: [Hardware Error]: Machine check events logged Therefore, classify memory mirroring modes and make the address decoding in address range partial memory mode correct. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170730180651.30060-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-08-02 05:40:11 +02:00
Rob Herring	2efdda4a41	EDAC, cpc925, ppc4xx: Convert to using %pOF instead of full_name Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170718214339.7774-19-robh@kernel.org Signed-off-by: Borislav Petkov <bp@suse.de>	2017-07-19 07:42:41 +02:00
Borislav Petkov	c54182ec0e	EDAC: Get rid of mci->mod_ver It is a write-only variable so get rid of it. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Robert Richter <rric@kernel.org> Acked-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Thor Thayer <thor.thayer@linux.intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: "Sören Brinkmann" <soren.brinkmann@xilinx.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Loc Ho <lho@apm.com> Cc: linux-edac@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@linux-mips.org	2017-07-17 13:42:48 +02:00
Arvind Yadav	1c18be5a4e	EDAC: Constify attribute_group structures attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> CC: linux-edac@vger.kernel.org Link: http://lkml.kernel.org/r/776cb8265509054abd01b0b551624cc0da3b88e7.1499078335.git.arvind.yadav.cs@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-07-17 10:20:25 +02:00
Yazen Ghannam	fbe63acf62	EDAC, mce_amd: Use cpu_to_node() to find the node ID Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine while the L3 to node mapping was 1:1. And Zen topology broke this. So let's start slowly moving away from it and use the topology interfaces instead. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-07-17 07:01:08 +02:00
Tony Luck	164c29244d	EDAC, pnd2: Fix Apollo Lake DIMM detection Non-existent or empty DIMM slots result in error return from RD_REGP(). But we shouldn't give up on failure. So long as we find at least one DIMM we can continue. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170628234407.21521-1-tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-29 10:37:50 +02:00
Jérémy Lefaure	a8c8261425	EDAC, i5000, i5400: Fix definition of NRECMEMB register In the i5000 and i5400 drivers, the NRECMEMB register is defined as a 16-bit value, which results in wrong shifts in the code, as reported by sparse. In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21), this register is a 32-bit register. A u32 value for the register fixes the wrong shifts warnings and matches the datasheet. Also fix the mask to access to the CAS bits [27:16] in the i5000 driver. [1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf [2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170629005729.8478-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-29 10:33:13 +02:00
Colin Ian King	77641dacea	EDAC, pnd2: Make function sbi_send() static The function sbi_send() is local to just pnd2_edac.c and does not need to be in global scope, so make it static. Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170623084855.9197-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-26 16:13:25 +02:00
Gustavo A. R. Silva	ee514c7a23	EDAC, pnd2: Return proper error value from apl_rd_reg() Add code comment to make it clear that the fall-through is intentional and, OR ret with its previous value to avoid overwriting it so that callers can check the correct return value. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170622220535.GA4896@embeddedgus [ Massage a bit. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-23 09:48:50 +02:00
Chris Packham	ff0abed492	EDAC, altera: Simplify calculation of total memory Use of_address_to_resource() and resource_size() instead of manually parsing the "reg" property from the "memory" node(s). Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170606235500.22772-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-14 13:49:25 +02:00
Qiuxu Zhuo	133e4455c9	EDAC, sb_edac: Avoid creating SOCK memory controller Xiaolong Ye reported the following failure on Broadwell D server: EDAC sbridge: Some needed devices are missing EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0 EDAC sbridge: Couldn't find mci handler EDAC sbridge: Failed to register device with error -19. Broadwell D (only IMC0 per socket) and Broadwell X (IMC0 and IMC1 per socket) use the same PCI device IDs for IMC0 per socket, then they share pci_dev_descr_broadwell_table (n_imcs_per_sock=2). In this case, Broadwell D wrongly creates the nonexistent SOCK EDAC memory controller and reports above error messages, since it has no IMC1 per socket. Avoid creating the nonexistent SOCK memory controller. Reported-and-tested-by: Xiaolong Ye <xiaolong.ye@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170608113351.25323-1-qiuxu.zhuo@intel.com [ Massage. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-14 11:53:39 +02:00
Yazen Ghannam	bdf1bf1744	EDAC, mce_amd: Fix typo in SMCA error description Fix typo in "poison consumption" error description. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1497286703-62853-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-12 19:03:55 +02:00
Chris Packham	3b405e30cb	EDAC, mv64x60: Sanity check edac_op_state before registering edac_op_state is a module parameter which affects the behaviour of the driver probe which can potentially be invoked as soon as the platform driver registration happens. Because of this we need to ensure that we sanity check the module parameter before calling platform_register_drivers(). Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170607215530.8604-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-09 11:55:55 +02:00
Vadim Lomovtsev	cf97825862	EDAC, thunderx: Fix a warning during l2c debugfs node creation Compare the number of debugfs entries created by thunderx_create_debugfs_nodes() with the requested number of entries to properly determine whether to print a warning. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/20170531155157.93583-1-stemerkhanov@cavium.com Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2017-06-01 20:08:17 +02:00
Chris Packham	7d2fdaa694	EDAC, mv64x60: Check driver registration success Check the return status of platform_driver_register() in mv64x60_edac_init(). Only output messages and initialise the edac_op_state if the registration is successful. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170529212142.25572-2-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-30 09:55:51 +02:00
Jason Baron	7103de0e58	EDAC, ie31200: Add Intel Kaby Lake CPU support Kaby Lake seems to work just like Skylake. Reported-and-tested-by: Doug Thompson <bc.tdw@recursor.net> Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1495823683-32569-1-git-send-email-jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-28 19:29:40 +02:00
Chris Packham	8b9afe5946	EDAC, mv64x60: Replace in_le32()/out_le32() with readl()/writel() To allow this driver to be used on non-powerpc platforms it needs to use io accessors suitable for all platforms. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170518083135.28048-4-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-26 22:55:50 +02:00
Chris Packham	0b3df44eeb	EDAC, mv64x60: Fix pdata->name Change this from mpc85xx_pci_err to mv64x60_pci_err. The former is likely a hangover from when this driver was created. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170518083135.28048-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-26 22:54:04 +02:00
Qiuxu Zhuo	d14e3a201f	EDAC, sb_edac: Bump driver version and do some cleanups Collapse 'case:' in *_mci_bind_devs() and update driver version from 1.1.1 to 1.1.2. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000934.87971-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 15:00:36 +02:00
Qiuxu Zhuo	4d475dde79	EDAC, sb_edac: Check if ECC enabled when at least one DIMM is present This is based on previous work by Patrick Geary, see Link. Additional cleanups ontop: - Remove the code to read MCMTR from pci_ha1_ta and CHN_TO_HA macro, now that TA0 and TA1 are unified. - Remove get_pdev_same_bus(), since in get_dimm_config() the variable "pvt->pci_ta" for KNL is also ready, we can simply use pci_read_config_dword(pvt->pci_ta, KNL_MCMTR, &pvt->info.mcmtr) to read MCMTR. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: https://lkml.kernel.org/r/57884350.1030401@supermicro.com Link: http://lkml.kernel.org/r/20170523000910.87925-1-qiuxu.zhuo@intel.com [ Make __populate_dimms() return int. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 14:57:52 +02:00
Qiuxu Zhuo	3286d3eb90	EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4 We don't need this quirk anymore now that the EDAC memory controller representation matches the hardware. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000834.87881-1-qiuxu.zhuo@intel.com [ Commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 14:40:40 +02:00
Borislav Petkov	6696522957	EDAC, sb_edac: Carve out dimm-populating loop ... to slim down get_dimm_config(). No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 14:37:34 +02:00
Borislav Petkov	199389acd9	EDAC, sb_edac: Fix mod_name It is called "sb_edac.c" now. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 14:37:33 +02:00
Qiuxu Zhuo	e2f747b1f4	EDAC, sb_edac: Assign EDAC memory controller per h/w controller Tony pointed out: "currently the driver pretends there is one big 8-channel memory controller per socket instead of 2 4-channel controllers. This is fine with all memory controller populated with symmetrical DIMM configurations, but runs into difficulties on asymmetrical setups". Restructure the driver to assign an EDAC memory controller to each real h/w memory controller to resolve the issue. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000731.87793-1-qiuxu.zhuo@intel.com [ Break some lines at convenient points. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 14:37:21 +02:00
Tony Luck	7fd562b75d	EDAC, sb_edac: Don't use "Socket#" in the memory controller name EDAC assigns logical memory controller numbers in the order that we find memory controllers, which depends on which PCI bus they are on. Some systems end up with MC0 on socket0, others (e.g Haswell) have MC0 on socket3. All this is made more confusing for users because we use the string "Socket" while generating names for memory controllers, but the number that we attach there is the memory controller number. E.g. EDAC MC0: Giving out device to module sbridge_edac.c controller Haswell Socket#0: DEV 0000:ff:12.0 (INTERRUPT) Change the names to say "SrcID#%d" (where the number we use is read from the h/w associated with the memory controller instead of some logical number internal to the EDAC driver). New message: EDAC MC0: Giving out device to module sbridge_edac.c controller Haswell SrcID#3: DEV 0000:ff:12.0 (INTERRUPT) Reported-by: Andrey Korolyov <andrey@xdel.ru> Reported-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000603.87748-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 11:47:11 +02:00
Qiuxu Zhuo	00cf50d90a	EDAC, sb_edac: Classify PCI-IDs by topology Each of the PCI device IDs belongs to a CPU socket, or to one of the integrated memory controllers. Provide an enum to specify the domain of each, and distinguish the resource number in each domain: the number of the PCI device IDs per integrated memory controller/socket, and the number of integrated memory controllers per socket. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000533.87704-1-qiuxu.zhuo@intel.com [ Realign pci_dev_descr_knl members. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-25 11:19:25 +02:00
Tobias Klauser	18caec20bf	EDAC, altera: Constify irq_domain_ops struct irq_domain_ops is not modified, so it can be made const. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Cc: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170524133505.1233-1-tklauser@distanz.ch Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-24 15:46:25 +02:00
Yazen Ghannam	eb77e6b80f	EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h The wrong index into the csbases/csmasks arrays was being passed to the function to compute the chip select sizes, which resulted in the wrong size being computed. Address that so that the correct values are computed and printed. Also, redo how we calculate the number of pages in a CS row. Reported-by: Benjamin Bennett <benbennett@gmail.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: <stable@vger.kernel.org> # 4.10.x Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1493313114-11260-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded integer math comment, minor cleanups. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-05-03 16:27:36 +02:00
Borislav Petkov	f8d5549df2	EDAC, ghes: Do not enable it by default Leave it to the user to decide whether to enable this or not. Otherwise, platform-specific drivers won't initialize (currently, EDAC supports only a single platform driver loaded). Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-27 14:15:38 +02:00
Borislav Petkov	bffc7dece9	EDAC: Rename report status accessors Change them to have the edac_ prefix. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:15:02 +02:00
Borislav Petkov	fee27d7d97	EDAC: Delete edac_stub.c Move the remaining functionality to edac_mc.c. Convert "edac_report=" to a module parameter. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:48 +02:00
Borislav Petkov	a06b85ff07	EDAC: Update Kconfig help text Remove the old URLs. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:44 +02:00
Borislav Petkov	e3c4ff6d8c	EDAC: Remove EDAC_MM_EDAC Move all the EDAC core functionality behind CONFIG_EDAC and get rid of that indirection. Update defconfigs which had it. While at it, fix dependencies such that EDAC depends on RAS for the tracepoints. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: linux-edac@vger.kernel.org	2017-04-10 17:14:41 +02:00
Borislav Petkov	be1d162948	EDAC: Issue tracepoint only when it is defined ... and this happens only when CONFIG_RAS is enabled. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:38 +02:00
Borislav Petkov	8c22b4fece	EDAC: Move edac_op_state to edac_mc.c ... as part of moving stuff away from edac_stub.c Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:29 +02:00
Borislav Petkov	d3116a0837	EDAC: Remove edac_err_assert ... and the glue around it. It is not needed anymore. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:21 +02:00
Borislav Petkov	97bb6c17ad	EDAC: Get rid of edac_handlers Use mc_devices list instead to check whether we have EDAC driver instances successfully registered with EDAC core. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:17 +02:00
Borislav Petkov	db47d5f856	x86/nmi, EDAC: Get rid of DRAM error reporting thru PCI SERR NMI Apparently, some machines used to report DRAM errors through a PCI SERR NMI. This is why we have a call into EDAC in the NMI handler. See `c0d1217202` ("drivers/edac: add new nmi rescan"). From looking at the patch above, that's two drivers: e752x_edac.c and e7xxx_edac.c. Now, I wanna say those are old machines which are probably decommissioned already. Tony says that "[t]the newest CPU supported by either of those drivers is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some folks are still using these ... but people that hold onto h/w for 7 years generally cling to old s/w too ... so I'd guess it unlikely that we will get complaints for breaking these in upstream." So even if there is a small number still in use, we did load EDAC with edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact) which means a default EDAC setup without any parameters supplied on the command line or otherwise would never even log the error in the NMI handler because we're polling by default: inline int edac_handler_set(void) { if (edac_op_state == EDAC_OPSTATE_POLL) return 0; return atomic_read(&edac_handlers); } So, long story short, I'd like to get rid of that nastiness called edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If we ever have to do stuff like that again, it should be notifiers we're using and not some insanity like this one. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com>	2017-04-10 17:13:48 +02:00
Borislav Petkov	76f6a26ce9	EDAC, highbank: Align Makefile directives ... like the rest of the file. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:10:43 +02:00
Sergey Temerkhanov	5195c206fd	EDAC, thunderx: Remove unused code Remove unused code reserved for upcoming CPUs. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David Daney <david.daney@cavium.com> Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170406113834.17153-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-07 11:49:32 +02:00
Sergey Temerkhanov	3d2d8c0f84	EDAC, thunderx: Change LMC index calculation Shift the node number by 3 bits instead of 8 allowing proper functioning with default EDAC_MAX_MCS. Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David Daney <david.daney@cavium.com> Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170406113755.17082-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-07 11:47:44 +02:00
Thor Thayer	25b223ddfe	EDAC, altera: Fix peripheral warnings for Cyclone5 The peripherals' RAS functionality only exist on the Arria10 SoCFPGA. The Cyclone5 initialization generates EDAC warnings when the peripherals aren't found in the device tree. Fix by checking for Arria10 in the init functions. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1491415262-5018-1-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-06 11:42:38 +02:00
Jan Glauber	621c4fe3cc	EDAC, thunderx: Fix L2C MCI interrupt disable Fix a typo that disabled the MCI interrupts using the wrong bitmask. Signed-off-by: Jan Glauber <jglauber@cavium.com> Cc: David Daney <david.daney@cavium.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170405102739.6301-1-jglauber@cavium.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-05 14:46:05 +02:00
Sergey Temerkhanov	41003396f9	EDAC, thunderx: Add Cavium ThunderX EDAC driver Add support for Cavium ThunderX EDAC capable on-chip peripherals, namely the DRAM controller (LMC), cache coherent processor interconnect (CCPI) and level 2 cache blocks (L2C-TAD, L2C-MCI, L2C-CBC) Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David.Daney@cavium.com Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170324222837.60583-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-27 11:43:56 +02:00
Qiuxu Zhuo	819f60fb7d	EDAC, pnd2_edac: Fix reported DIMM number DIMM number passed to edac_mc_handle_error() was accidentally hardcoded to zero. Pass in the correct daddr->dimm value. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-26 09:36:28 +02:00
Borislav Petkov	cd1be315ac	EDAC, pnd2_edac: Fix !EDAC_DEBUG build Provide debugfs function stubs when EDAC_DEBUG is not enabled so that we don't fail the build: drivers/edac/pnd2_edac.c: In function ‘pnd2_init’: drivers/edac/pnd2_edac.c:1521:2: error: implicit declaration of function ‘setup_pnd2_debug’ [-Werror=implicit-function-declaration] setup_pnd2_debug(); ^ drivers/edac/pnd2_edac.c: In function ‘pnd2_exit’: drivers/edac/pnd2_edac.c:1529:2: error: implicit declaration of function ‘teardown_pnd2_debug’ [-Werror=implicit-function-declaration] teardown_pnd2_debug(); ^ Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-23 12:56:23 +01:00
Borislav Petkov	1c5bf78114	EDAC: Select DEBUG_FS The debugfs.c functionality relies on DEBUG_FS so select it. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-23 12:56:09 +01:00
Tony Luck	5c71ad17f9	EDAC, pnd2_edac: Add new EDAC driver for Intel SoC platforms Initial target for this driver is the Intel Apollo Lake platform and Denverton micro-server, they use the same internal memory controller IP called Pondicherry2. Memory controller registers are not in PCI config space like earlier Intel memory controllers. For Apollo Lake platform they are accessed via a "side-band" interface, for Denverton micro-server they are access via PCI config space and memory map I/O. This driver is for Apollo Lake and Denverton, but only the Denverton is fully enabled while we wait for the sideband driver. Apollo lake driver and initial cut at Denverton driver by Tony Luck. Extensive cleanup, refactoring and basic verification by Qiuxu Zhuo. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170308174539.14432-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-16 12:40:52 +01:00
Jérémy Lefaure	e61555c29c	EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used as if it returned a boolean true if the width if 8. Fix the tests where MTR_DRAM_WIDTH is misused. Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170309011809.8340-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-09 09:25:29 +01:00
Colin Ian King	4bd035eae2	EDAC, xgene: Fix wrongly spelled "procesing" Fix spelling mistake in dev_err message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Loc Ho <lho@apm.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170223002609.9440-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-06 17:08:19 +01:00
Linus Torvalds	60c906bab1	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "The main changes in this cycle were: - Assign notifier chain priorities for all RAS related handlers to make the ordering explicit (Borislav Petkov) - Improve the AMD MCA banks sysfs output (Yazen Ghannam) - Various cleanups and restructuring of the x86 RAS code (Borislav Petkov)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority x86/ras: Get rid of mce_process_work() EDAC/mce/amd: Dump TSC value EDAC/mce/amd: Unexport amd_decode_mce() x86/ras/amd/inj: Change dependency x86/ras: Flip the TSC-adding logic x86/ras/amd: Make sysfs names of banks more user-friendly x86/ras/therm_throt: Do not log a fake MCE for thermal events x86/ras/inject: Make it depend on X86_LOCAL_APIC=y	2017-02-20 12:47:44 -08:00
Yazen Ghannam	75bf2f6478	EDAC, mce_amd: Print IPID and Syndrome on a separate line Currently, the IPID and Syndrome are printed on the same line as the Address. There are cases when we can have a valid Syndrome but not a valid Address. For example, the MCA_SYND register can be used to hold more detailed error info that the hardware folks can use. It's not just DRAM ECC syndromes. There are some error types that aren't related to memory that may have valid syndromes, like some errors related to links in the Data Fabric, etc. In these cases, the IPID and Syndrome are not printed at the same log level as the rest of the stanza, so users won't see them on the console. Console: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over\|CE\|MiscV\|-\|-\|-\|-\|SyndV\|-]: 0xd82000000002080b [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Dmesg: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over\|CE\|MiscV\|-\|-\|-\|-\|SyndV\|-]: 0xd82000000002080b , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Print the IPID first and on a new line. The IPID should always be printed on SMCA systems. The Syndrome will then be printed with the IPID and at the same log level when valid: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over\|CE\|MiscV\|-\|-\|-\|-\|SyndV\|-]: 0xd82000000002080b [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-02-16 15:39:32 +01:00
Borislav Petkov	e62d2ca9d0	EDAC, amd64: Bump driver version Last time we did that was when we enabled Bulldozer. Now, we enabled Zen so it is only natural ... :-) Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>	2017-02-14 11:58:05 +01:00
Wei Yongjun	279fa58035	EDAC, fsl_ddr: Make locally used symbols static Fix the following sparse warnings: drivers/edac/fsl_ddr_edac.c:148:1: warning: symbol 'dev_attr_inject_data_hi' was not declared. Should it be static? drivers/edac/fsl_ddr_edac.c:150:1: warning: symbol 'dev_attr_inject_data_lo' was not declared. Should it be static? drivers/edac/fsl_ddr_edac.c:152:1: warning: symbol 'dev_attr_inject_ctrl' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170209150424.15124-1-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-02-09 17:40:54 +01:00
Chris Packham	321d17c19b	EDAC, mpc85xx: Add T2080 l2-cache support The L2 cache controller on the T2080 SoC has similar capabilities to the others already supported by the mpc85xx_edac driver. Add it to the list of compatible devices. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170201231624.28843-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>	2017-02-03 10:36:35 +01:00
Yazen Ghannam	1bd9900b83	EDAC, amd64: Add x86cpuid sanity check during init Match one of the devices in amd64_cpuids[] before loading the module. This is an additional sanity check against users trying to load amd64_edac_mod on unsupported systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com [ Get rid of err_ret label, make it a bit more readable this way. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 14:43:06 +01:00
Yazen Ghannam	4688c9b42d	EDAC, amd64: Don't treat ECC disabled as failure Having ECC disabled on a node doesn't necessarily mean that it's disabled for the entire system. So let's return a non-failing code when ECC is disabled on a node. This way we can skip initialization for the node but still continue with the remaining nodes. After probing all instances, make sure we have at least one MC device allocated. This issue is seen and fix tested on Fam15h and Fam17h MCM systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-8-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 14:38:49 +01:00
Yazen Ghannam	d7fc9d77ac	EDAC: Add routine to check if MC devices list is empty We need to know if any MC devices have been allocated. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-7-git-send-email-Yazen.Ghannam@amd.com [ Prettify text. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 14:36:47 +01:00
Yazen Ghannam	df64636fa4	EDAC, amd64: Remove unused printing macros amd64_{debug,notice} don't have any users, so remove them. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-6-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:20:06 +01:00
Yazen Ghannam	11ab1cae58	EDAC, amd64: Rework messages in ecc_enabled() Print the node number when informing that DRAM ECC is disabled so that we can show which nodes have DRAM ECC disabled. Also, print more detailed system information as edac_dbg(), so as to not bother general users. Switch amd64_notice to amd64_info to match the message above it. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-5-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:18:33 +01:00
Yazen Ghannam	234365f56e	EDAC, amd64: Move global code out of instance functions We have a few functions that register/unregister an ECC error decoding routine. These functions are called when we init/remove instances. However, they are global and so don't need to be registered/unregistered multiple times. So move them out of the init/remove instance functions and into the module init/exit routines. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-4-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:08:10 +01:00
Yazen Ghannam	2b9b2c4659	EDAC, amd64: Free unused memory when init_one_instance() fails Jump to memory freeing routines when init_one_instance() fails. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-3-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:03:40 +01:00
Yazen Ghannam	67d7fd306e	EDAC, mce_amd: Give more context to deferred error message Users may not be familiar with the concept of deferred errors. There is no action for users to take on this type of error, so give more context in the error message to make this more clear. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:03:29 +01:00
Borislav Petkov	58fb24cb95	EDAC, i7300: Test for the second channel properly REDMEMB[17] is the ECC_Locator bit, which, when set, identifies the CS[3:2] as the simbols in error. And thus the second channel. The macro computing it was wrong so get rid of it (it was used at one place only) and get rid of the conditional too. Generates better code this way anyway. Signed-off-by: Borislav Petkov <bp@suse.de> Reported-by: David Binderman <dcb314@hotmail.com> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-01-26 11:35:23 +01:00
Borislav Petkov	9026cc82b6	x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority Assign all notifiers on the MCE decode chain a priority so that they get called in the correct order. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-24 09:14:57 +01:00
Borislav Petkov	0bceab677d	EDAC/mce/amd: Dump TSC value Dump the TSC value of the time when the MCE got logged. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-8-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-24 09:14:56 +01:00
Borislav Petkov	1fbcd90903	EDAC/mce/amd: Unexport amd_decode_mce() It is not used outside of the driver anymore. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-7-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-24 09:14:55 +01:00
Nicolas Iooss	127c1225bf	EDAC, sb_edac: Get rid of ->show_interleave_mode() Function sbridge_register_mci() sets pvt->info.show_interleave_mode to knl_show_interleave_mode() on Knight's Landing and show_interleave_mode() anywhere else. Merge show_interleave_mode() and knl_show_interleave_mode() in a single implementation and use it without an indirect function pointer. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170122172806.10412-1-nicolas.iooss_linux@m4x.org [ Call it get_intlv_mode_str(). ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-23 11:39:48 +01:00
Aaron Miller	4fb6fde74d	EDAC: Expose per-DIMM error counts in sysfs The old csrowX sysfs directories have per-csrow error counters, but the new dimmX directories do not currently expose error counts. EDAC already keeps these counts, add them to sysfs so per-DIMM counts are still available when CONFIG_EDAC_LEGACY_SYSFS=n. Signed-off-by: Aaron Miller <aaronmiller@fb.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161103220153.3997328-1-aaronmiller@fb.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-19 10:29:40 +01:00
Yazen Ghannam	2287c63643	EDAC, amd64: Save and return err code from probe_one_instance() We should save the return code from probe_one_instance() so that it can be returned from the module init function. Otherwise, we'll be returning the -ENOMEM from above. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1484322741-41884-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-16 11:53:39 +01:00
Arvind Yadav	f5c61277f6	EDAC, i82975x: Add ioremap_nocache() error handling If ioremap_nocache() fails, it will return NULL. Which will then cause a NULL-pointer dereference. Handle the returned value properly. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1484549092-11349-1-git-send-email-arvind.yadav.cs@gmail.com [ Boris: massage commit message and improve error message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-16 11:09:22 +01:00
Ben Dooks	628ea92f09	EDAC: Make dev_attr_sdram_scrub_rate static The dev_attr_sdram_scrub_rate is not declared in a header or used anywhere else, so make it static to fix the following warning: drivers/edac/edac_mc_sysfs.c:816:1: warning: symbol 'dev_attr_sdram_scrub_rate' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1465407356-7357-1-git-send-email-ben.dooks@codethink.co.uk Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-06 15:54:16 +01:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Mauro Carvalho Chehab	66c222a02f	edac: fix kernel-doc tags at the drivers/edac_*.h Some kernel-doc tags don't provide good descriptions or use a different style. Adjust them. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:58:10 -02:00
Mauro Carvalho Chehab	6634fbb6b6	driver-api: create an edac.rst file with EDAC documentation Currently, there's no device driver documentation for the EDAC subsystem at the driver-api book. Fill in the blanks for the structures and functions that misses documentation, uniform the word on the existing ones, and add a new edac.rst file at driver-api, in order to document the EDAC subsystem. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:52 -02:00
Mauro Carvalho Chehab	e01aa14cf2	edac: move documentation from edac_mc.c to edac_core.h Several functions are documented at edac_mc.c. As we'll be including edac_core.h at drivers-api book, move those, in order for the kernel-doc markups be part of the API documentation book. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:52 -02:00
Mauro Carvalho Chehab	fdaf0b3505	edac: move documentation from edac_pci*.c to edac_pci.h Several functions are documented at edac_pci.c and edac_pci_sysfs.c. As we'll be including edac_pci.h at drivers-api book, move those, in order for the kernel-doc markups be part of the API documentation book. As several of those kernel-doc macros are not in the right format, fix them. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	5336f75499	edac: move documentation from edac_device to edac_core.h Several functions are documented at edac_device.c. As we'll be including edac_core.h at drivers-api book, move those, in order for the kernel-doc markups be part of the API documentation book. As several of those kernel-doc macros are not in the right format, fix them. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	78d88e8a3d	edac: rename edac_core.h to edac_mc.h Now, all left at edac_core.h are at drivers/edac/edac_mc.c, so rename it to edac_mc.h. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	6d8ef24724	edac: move EDAC device definitions to drivers/edac/edac_device.h The edac_core.h header contain data structures and function definitions for both EDAC MC and EDAC device. Let's move the devices ones to a separate header file, as part of a header reorganization. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	0b892c7177	edac: move EDAC PCI definitions to drivers/edac/edac_pci.h The edac_core.h header contain data structures and function definitions for the 3 parts of EDAC: MC, PCI and device. Let's move the PCI ones to a separate header file, as part of a header reorganization. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:50 -02:00
Mauro Carvalho Chehab	7230617537	edac: edac_core.h: remove prototype for edac_pci_reset_delay_period() This function doesn't exist. So, remove its prototype. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:49 -02:00
Mauro Carvalho Chehab	a2c223b5ed	edac: edac_core.h: get rid of unused kobj_complete This element of struct edac_pci_ctl_info is never used. So, get rid of it. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:49 -02:00
Pan Bian	0de2788447	EDAC, amd64: Fix improper return value When the call to zalloc_cpumask_var() fails, returning "false" seems improper. The real value of macro "false" is 0, and 0 means no error. Return -ENOMEM instead. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189071 Signed-off-by: Pan Bian <bianpan2016@163.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1480831638-5361-1-git-send-email-bianpan201604@163.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-12-04 10:51:42 +01:00
Borislav Petkov	5246c54007	EDAC, amd64: Improve amd64-specific printing macros Prefix the warn and error macros with the respective string so that callers don't have to say "Error" or "Warning". We save us string length this way in the actual calls. While at it, shorten the calls in reserve_mc_sibling_devs(). Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>	2016-12-01 11:35:07 +01:00
Yazen Ghannam	95d3af6bd1	EDAC, amd64: Autoload amd64_edac_mod on Fam17h systems Add Fam17h to the list of families to autoload amd64_edac_mod. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-18-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:05:49 +01:00
Yazen Ghannam	713ad54675	EDAC, amd64: Define and register UMC error decode function How we need to decode UMC errors is different from how we decode bus errors, so let's define a new function for this. We also need a way to determine the UMC channel since we're not guaranteed that there is a fixed relation between channel and MCA bank. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480359593-80369-1-git-send-email-Yazen.Ghannam@amd.com [ Fold in decode_synd_reg(), simplify. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:05:48 +01:00
Yazen Ghannam	d27f3a348e	EDAC, amd64: Determine EDAC capabilities on Fam17h systems We need to determine the EDAC capabilities from all UMCs on the node. We should only check UMCs that are enabled and make sure they all agree. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-15-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:05:47 +01:00
Yazen Ghannam	2d09d8f301	EDAC, amd64: Determine EDAC MC capabilities on Fam17h The UMCs on Fam17h are independent memory controllers so we need to read the capabilities from all UMCs and make sure they agree. Once we determine what capabilities are available we should save them for convenience. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480431116-94683-1-git-send-email-Yazen.Ghannam@amd.com [ Simplify f17h_determine_edac_ctl_cap(), preinit edac_mode in init_csrows(). ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:04:54 +01:00
Yazen Ghannam	07ed82ef93	EDAC, amd64: Add Fam17h debug output Read a few more UMC registers and provide debug output in order to be as similar as possible to older AMD systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480344621-14966-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded K8 check and comments, fixup others. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 17:16:09 +01:00
Yazen Ghannam	8051c0af3c	EDAC, amd64: Add Fam17h scrubber support Fam17h has new register offsets and fields for setting up the DRAM scrubber so add support for this. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-17-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:50:12 +01:00
Yazen Ghannam	a6c14dce85	EDAC, mce_amd: Don't report poison bit on Fam15h, bank 4 MCA_STATUS[43] has been defined as "Poison" or "Reserved" for every bank since Fam15h except for Fam15h, bank 4 in which case it's defined as part of the McaStatSubCache bitfield. Filter out that case. Reported-by: Dean Liberty <Dean.Liberty@amd.com> Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479478222-19896-1-git-send-email-Yazen.Ghannam@amd.com [ Split an almost unparseable ternary conditional, add a comment. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:50:12 +01:00
Yazen Ghannam	b64ce7cd7f	EDAC, amd64: Read MC registers on AMD Fam17h Fam17h has a different set of registers and bitfields. Most of these registers are read through SMN (System Management Network) rather than PCI config space. Also, the derivation of various values is now different. Update amd64_edac to read the appropriate registers and extract the correct values for Fam17h. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-12-git-send-email-Yazen.Ghannam@amd.com [ Save us the indentation level in read_mc_regs(), add defines ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:50:11 +01:00
Yazen Ghannam	936fc3afaa	EDAC, amd64: Reserve correct PCI devices on AMD Fam17h Fam17h needs PCI device functions 0 and 6 instead of 1 and 2 as on older systems. Update struct amd64_pvt to hold the new functions and reserve them if on Fam17h. Also, allocate an array of UMC structs within our newly allocated PVT struct. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-11-git-send-email-Yazen.Ghannam@amd.com [ init_one_instance() error handling, shorten lines, unbreak >80 cols lines. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:49:40 +01:00
Yazen Ghannam	f1cbbec9fc	EDAC, amd64: Add AMD Fam17h family type and ops Add a family type and associated ops for Fam17h. Define a struct to hold all the UMC registers that we need. Make this a part of struct amd64_pvt in order to maximize code reuse in the rest of the driver. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-10-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-24 21:24:09 +01:00
Yazen Ghannam	196b79fcc8	EDAC, amd64: Extend ecc_enabled() to Fam17h Update the ecc_enabled() function to work on Fam17h. This entails reading a different set of registers and using the SMN (System Management Network) rather than PCI devices. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-9-git-send-email-Yazen.Ghannam@amd.com [ Fixup ecc_en assignment and get_umc_base(). ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-24 21:07:03 +01:00
Borislav Petkov	627bc29ed9	Merge tip:ras/core to pick up dependent changes tip:ras/core contains the respective Fam17h x86 RAS bits which amd64_edac is going to use. So merge it into the EDAC branch. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-23 21:13:40 +01:00
Yazen Ghannam	044e7a414b	EDAC, amd64: Don't force-enable ECC checking on newer systems It's not recommended for the OS to try and force-enable ECC checking. This is considered a firmware task since it includes memory training, etc, so don't change ECC settings on Fam17h or newer systems and inform the user. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479850816-1595-1-git-send-email-Yazen.Ghannam@amd.com [ Put the "forcing" message in an else branch. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-23 19:04:11 +01:00
Yazen Ghannam	d12a969ebb	EDAC, amd64: Add Deferred Error type Currently, deferred errors are classified as correctable in EDAC. Add a new error type for deferred errors so that they are correctly reported to the user. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-7-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 10:57:19 +01:00
Yazen Ghannam	e70984d9eb	EDAC, amd64: Rename __log_bus_error() to be more specific We only use __log_bus_error() to log DRAM ECC errors, so let's change the name to reflect this. We'll also use this function for DRAM ECC errors on Fam17h, but we'll call it from a different function than decode_bus_error(). Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-6-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 10:42:55 +01:00
Yazen Ghannam	e7934b70d7	EDAC, amd64: Change target of pci_name from F2 to F3 AMD Fam17h will not be using PCI function 2 for EDAC, but will continue to use function 3. So let's get the name of F3 instead of F2 to support Fam17h and previous families. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-5-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 10:24:20 +01:00
Yazen Ghannam	5c332202f8	EDAC, mce_amd: Rename nb_bus_decoder to dram_ecc_decoder nb_bus_decoder() is only used for DRAM ECC errors so rename it so that the name is more generic and descriptive. Also, call it for DRAM ECC errors on SMCA systems. [ Boris: rename it to real function name with a verb in it. ] Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-4-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 09:43:15 +01:00
Yanjiang Jin	27bda205ba	EDAC, mpc85xx: Implement remove method for the platform driver If we execute the below steps without this patch: modprobe mpc85xx_edac [The first insmod, everything is well.] modprobe -r mpc85xx_edac modprobe mpc85xx_edac [insmod again, error happens.] We would get the error messages as below: BUG: recent printk recursion! Oops: Kernel access of bad area, sig: 11 [#48] Modules linked in: mpc85xx_edac edac_core softdog [last unloaded: mpc85xx_edac] CPU: 5 PID: 14773 Comm: modprobe Tainted: G D C 4.8.3-rt2 .vsnprintf .vscnprintf .vprintk_emit .printk .edac_pci_add_device .mpc85xx_pci_err_probe .platform_drv_probe .driver_probe_device .__driver_attach .bus_for_each_dev .driver_attach .bus_add_driver .driver_register .__platform_register_drivers .mpc85xx_mc_init .do_one_initcall .do_init_module .load_module .SyS_finit_module system_call Address this by cleaning up properly when removing the platform driver. Tested on a T4240QDS board. Signed-off-by: Yanjiang Jin <yanjiang.jin@windriver.com> Acked-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: york.sun@nxp.com Link: http://lkml.kernel.org/r/1479351380-17109-2-git-send-email-yanjiang.jin@windriver.com [ Boris: massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-17 11:19:50 +01:00
Colin Ian King	8176170e03	EDAC, xgene: Fix spelling mistake in error messages Trivial fix to spelling mistake "Mutilple" to "Multiple" in error messages. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Loc Ho <lho@apm.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161114231104.5585-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-15 11:10:02 +01:00
Borislav Petkov	c73e8833be	EDAC, mc: Fix locking around mc_devices list When accessing the mc_devices list of memory controller descriptors, we need to hold mem_ctls_mutex. This was not always the case, fix that. Make all external callers call a version which grabs the mutex since the last is local to edac_mc.c. Reported-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-14 13:26:11 +01:00
Borislav Petkov	c09a8c40e0	x86/RAS: Hide SMCA bank names Add accessor functions and hide the smca_names array. Also, add a sanity-check to bank HWID assignment in get_smca_bank_info(). Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20161104152317.5r276t35df53qk76@pd.tnic Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-08 17:10:15 +01:00
Borislav Petkov	a9a1c0ee04	x86/RAS: Rename smca_bank_names to smca_names Make it differ more from struct smca_bank_name for better readability. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20161103125556.15482-3-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-08 17:10:14 +01:00
Borislav Petkov	1ce9cd7f9f	x86/RAS: Simplify SMCA HWID descriptor struct Call it simply smca_hwid and call local variables "hwid". More readable. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20161103125556.15482-2-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-08 17:10:14 +01:00
Thor Thayer	90e493d7d5	EDAC, altera: Disable IRQs while injecting SDRAM errors Disable IRQs while injecting SDRAM errors. The RT patches exposed a spinlock deadlock where the spinlock taken for the regmap write deadlocked with the IRQ clear regmap write. Error injection is not normally enabled for ECC but only for testing. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1476906827-9412-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-22 20:14:03 +02:00
Wei Yongjun	240ea9214a	EDAC, skx_edac: Fix non static symbol warnings Fix the following sparse warnings: drivers/edac/skx_edac.c:266:25: warning: symbol 'skx_cpuids' was not declared. Should it be static? drivers/edac/skx_edac.c:1040:12: warning: symbol 'skx_init' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1477147098-2842-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-22 20:10:38 +02:00
Piotr Luc	9a9260ca92	EDAC, sb_edac: Add Knights Mill support Add Knights Mill (KNM) to the list of CPU models supported by sb_edac. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161013153105.2517-6-piotr.luc@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-19 12:37:44 +02:00
Dave Hansen	20f4d69243	EDAC, {sb,skx}_edac: Use Intel model macros instead of open-coding them We now have symbolic names for a bunch of Intel CPU models via asm/intel-family.h. The original conversion missed the EDAC drivers. Convert them. Signed-off-by: Dave Hansen <dave.hansen@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160929204321.9FAE5F84@viggo.jf.intel.com [ Remove comment, macro name is descriptive enough. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-19 12:32:40 +02:00
Linus Torvalds	19fe416532	* Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO buffers (Thor Thayer) * Split the memory controller part out of mpc85xx and share it with a * new Freescale ARM Layerscape driver (York Sun) * amd64_edac fixes (Yazen Ghannam) * Misc cleanups, refactoring and fixes all over the place -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJX80EwAAoJEBLB8Bhh3lVKWwYP/1bbODQ7o+XhO8IaCDffYk30 8y4WdSnI0/QcP8JbSvFA7y6Zn4L0BbrbYhKLRDAg9c34V2bMaqonCnkDtT6YatUb 6l0H/hQ/Cah9AOm5PJLYg6O9s+ZBT8zA5b+F2Z9kUsuB6LSnVhp9skNrH6KPlm0U 4pFaLnHQenQPZbuRCfRxPU49ZuKtBZtQDkJLJlHXwn7e1qZy2Q4tMnnEtsY6U2ea t3Hj+F8g+cdoiTQXOceCcOTR8GqDI6szgzn7vpXAGYvljBndszauAkxO7by79jg1 I8AQfgwoBF5CYL2Q0pzT1maHmmG2sydeRAHIvhmGxiEfFz1abWhriXbS33c32q8a iFiVMAUIaSKpB/sB+5w5ymuBctI1mX5EQVW+8Xl2Gxt+olnhdJMocHnvQdYkfsYm Ka8LcbaiK6ZQTbs/cIMOc2paE0AFPu5uXKHCPeZlhQAxOBvSPuDAv0+qUB/of5Uq 1SPidtsTmCI7X2hrdHAH9hLEkSjq68v3kqL5YnZL3H4gA3WohQEmX9ybjk097Kus WWEhdi/PSFX0qQKotMUUDuxfNcKI6PZH9p+i2dN6tNCkiTDdb0Eo5lCXN7RVVhvq qfE0Fcc4uDzh5MUS5jT58MWpA1cfdu9jbAf2BwFIU/poJcaeqy/SMyzCL+1D2/u6 dmDAtQbKUUwiltB8QzQd =pcI8 -----END PGP SIGNATURE----- Merge tag 'edac_for_4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: "A lot of movement in the EDAC tree this time around, coarse summary below: - Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO buffers (Thor Thayer) - split the memory controller part out of mpc85xx and share it with a new Freescale ARM Layerscape driver (York Sun) - amd64_edac fixes (Yazen Ghannam) - misc cleanups, refactoring and fixes all over the place" * tag 'edac_for_4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (37 commits) EDAC, altera: Add IRQ Flags to disable IRQ while handling EDAC, altera: Correct EDAC IRQ error message EDAC, amd64: Autoload module using x86_cpu_id EDAC, sb_edac: Remove NULL pointer check on array pci_tad EDAC: Remove NO_IRQ from powerpc-only drivers EDAC, fsl_ddr: Fix error return code in fsl_mc_err_probe() EDAC, fsl_ddr: Add entry to MAINTAINERS EDAC: Move Doug Thompson to CREDITS EDAC, I3000: Orphan driver EDAC, fsl_ddr: Replace simple_strtoul() with kstrtoul() EDAC, layerscape: Add Layerscape EDAC support EDAC, fsl_ddr: Fix IRQ dispose warning when module is removed EDAC, fsl_ddr: Add support for little endian EDAC, fsl_ddr: Add missing DDR DRAM types EDAC, fsl_ddr: Rename macros and names EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx EDAC, mpc85xx: Replace printk() with pr_* format EDAC, mpc85xx: Drop setting/clearing RFXE bit in HID1 EDAC, altera: Rename MC trigger to common name EDAC, altera: Rename device trigger to common name ...	2016-10-04 12:06:26 -07:00
Thor Thayer	a29d64a45e	EDAC, altera: Add IRQ Flags to disable IRQ while handling Add the IRQF_ONESHOT and IRQF_TRIGGER_HIGH flags to disable the IRQ while executing the IRQ handler. Remove the IRQF_SHARED because these are not shared IRQs in the domain. Exposed when flooding IRQs. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1474582419-7053-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-23 12:03:34 +02:00
Thor Thayer	3763569f4c	EDAC, altera: Correct EDAC IRQ error message Correct the error message sent out in the case of a single bit error IRQ allocation. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1474582419-7053-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-23 11:52:39 +02:00
Yazen Ghannam	d6efab74f6	EDAC, amd64: Autoload module using x86_cpu_id Reinstate driver autoloading now that PCI dependency is gone. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1473984445-1726-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-21 12:48:15 +02:00
Yazen Ghannam	a884675b87	x86/MCE/AMD, EDAC: Handle reserved bank 4 on Fam17h properly Bank 4 is reserved on family 0x17 and shouldn't generate any MCE records. However, broken hardware and software is not something unheard of so warn about bank 4 errors. They shouldn't be coming from bank 4 naturally but users can still use mce_amd_inj to simulate errors from it for testing purposed. Also, avoid special handling in the injector mce_amd_inj like it is being done on the older families. [ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ] Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:14 +02:00
Yazen Ghannam	4b711f92c9	x86/mce, EDAC/mce_amd: Print MCA_SYND and MCA_IPID during MCE on SMCA systems The MCA_SYND and MCA_IPID registers contain valuable information and should be included in MCE output. The MCA_SYND register contains syndrome and other error information, and the MCA_IPID register will uniquely identify the MCA bank's type without having to rely on system software. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472680624-34221-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:13 +02:00
Yazen Ghannam	5896820e0a	x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types Scalable MCA defines a number of IP types. An MCA bank on an SMCA system is defined as one of these IP types. A bank's type is uniquely identified by the combination of the HWID and MCATYPE values read from its MCA_IPID register. Add the required tables in order to be able to lookup error descriptions based on a bank's type and the error's extended error code. [ bp: Align comments, simplify a bit. ] Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472741832-1690-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:10 +02:00
Yazen Ghannam	856095b179	EDAC/mce_amd: Use SMCA prefix for error descriptions arrays The error descriptions defined for Fam17h can be reused for other SMCA systems, so their names should reflect this. Change f17h prefix to smca for error descriptions. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472673994-12235-4-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:09 +02:00
Yazen Ghannam	c019b951e1	EDAC/mce_amd: Add missing SMCA error descriptions Add missing SMCA error descriptions to the error descriptions arrays. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472673994-12235-3-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:09 +02:00
Yazen Ghannam	b300e87300	EDAC/mce_amd: Print syndrome register value on SMCA systems Print SyndV bit status and print the raw value of the MCA_SYND register. Further decoding of the syndrome from struct mce.synd can be done in other places where appropriate, e.g. DRAM ECC. Boris: make the error stanza more compact by putting the error address and syndrome on the same line: [Hardware Error]: Corrected error, no action required. [Hardware Error]: CPU:2 (17:0:0) MC4_STATUS[-\|CE\|-\|PCC\|AddrV\|-\|-\|SyndV\|CECC]: 0x96204100001e0117 [Hardware Error]: Error Addr: 0x000000007f4c52e3, Syndrome: 0x0000000000000000 [Hardware Error]: Invalid IP block specified. [Hardware Error]: cache level: L3/GEN, tx: DATA, mem-tx: RD Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1467633035-32080-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:07 +02:00
Colin Ian King	c7c35407cd	EDAC, sb_edac: Remove NULL pointer check on array pci_tad pvt->pci_tad is a NUM_CHANNELS array of struct pci_dev pointers and hence cannot be NULL, so the NULL pointer check on pci_tad is redundant. Remove it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160908083801.14766-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-12 20:15:43 +02:00
Michael Ellerman	372095723a	EDAC: Remove NO_IRQ from powerpc-only drivers We'd like to eventually remove NO_IRQ on powerpc, so remove usages of it from powerpc-only drivers. The pdata structs are kzalloc'ed, so we don't need to initialise those to 0, we can just drop the assignments entirely. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/1473674436-19467-1-git-send-email-mpe@ellerman.id.au Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-12 13:04:56 +02:00
Wei Yongjun	43fa9ba632	EDAC, fsl_ddr: Fix error return code in fsl_mc_err_probe() Return negative error code from the edac_mc_add_mc() error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1473350284-26482-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-09 19:27:22 +02:00
York Sun	f47ae798d8	EDAC, fsl_ddr: Replace simple_strtoul() with kstrtoul() Replace obsolete simple_strtoul() with kstrtoul(). Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471990593-27536-1-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:03 +02:00
York Sun	eeb3d68b6c	EDAC, layerscape: Add Layerscape EDAC support Add DDR EDAC driver for ARM-based compatible controllers. Both big-endian and little-endian are supported, as specified in device tree. Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471990465-27443-1-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:03 +02:00
York Sun	55764ed37e	EDAC, fsl_ddr: Fix IRQ dispose warning when module is removed When compiled as a module, removing it causes kernel warnings when irq_dispose_mapping() is called. Instead of calling irq_of_parse_and_map(), use platform_get_irq() to acquire the IRQ number. Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: morbidrsa@gmail.com Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-8-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:02 +02:00
York Sun	339fdff14c	EDAC, fsl_ddr: Add support for little endian Get endianness from device tree. Both big endian and little endian are supported. Default to big endian for backwards compatibility to MPC85xx. Signed-off-by: York Sun <york.sun@nxp.com> Acked-by: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: morbidrsa@gmail.com Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-7-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:02 +02:00
York Sun	4e2c3252d2	EDAC, fsl_ddr: Add missing DDR DRAM types The compatible DDR controllers may support DDR, DDR2, DDR3, DDR4 DRAM. An individual controller doesn't support all of them. The EDAC driver reads SDRAM_CFG to determine which mode is configured. Add DDR4 and drop the defines used only in the mtype assignment. Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: morbidrsa@gmail.com Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-6-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:01 +02:00
York Sun	d43a9fb202	EDAC, fsl_ddr: Rename macros and names Use FSL-specific prefix for macros, variables and functions. Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-5-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:01 +02:00
York Sun	ea2eb9a8b6	EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx The mpc85xx-compatible DDR controllers are used on ARM-based SoCs too. Carve out the DDR part from the mpc85xx EDAC driver in preparation to support both architectures. Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470946525-3410-1-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:00 +02:00
York Sun	88857ebe71	EDAC, mpc85xx: Replace printk() with pr_* format Replace printk() with pr_err/pr_warn/pr_info macros. Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-3-git-send-email-york.sun@nxp.com [ Boris: unbreak strings for easier greppability. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:27:59 +02:00
York Sun	9e6a03a044	EDAC, mpc85xx: Drop setting/clearing RFXE bit in HID1 On e500v1, read fault exception enable (RFXE) controls whether assertion of core_fault_in causes a machine check interrupt. Assertion of core_fault_in can result from uncorrectable data error, such as an L2 multi-bit ECC error. It can also occur from a system error if logic on the integrated device signals a fault for nonfatal errors. RFXE bit is cleared out of reset, and should be left clear for normal operation. Assertion of core_fault_in does not cause a machine check. RFXE is set specifically for RIO (Rapid IO) and PCI for book E to catch the errors by machine check. With this bit set, the EDAC driver can't get the interrupt in case of uncorrectable error. So this bit is cleared in favor of EDAC. However, the benefit of catching such uncorrectable error doesn't outweigh the other errors which may hang the system. Besides, e500v2 has different errors masked by RFXE, and e500mc doesn't support this bit. It is more reasonable to leave RFXE as is in the EDAC driver, and leave the uncorrectable errors triggering machine check for e500v1. Suggested-by: Scott Wood <oss@buserror.net> Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-2-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:27:59 +02:00
Thor Thayer	b8978badc4	EDAC, altera: Rename MC trigger to common name Rename the Memory Controller debug trigger to the same common name as the EDAC devices. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471622666-15197-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 09:06:42 +02:00
Thor Thayer	f399f34bdb	EDAC, altera: Rename device trigger to common name The L2 and OCRAM devices have different ecc trigger names than the other EDAC devices (FIFO peripherals). Make them all the same and remove the character array from the device structure. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471622666-15197-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 08:55:24 +02:00
Tony Luck	4ec656bdf4	EDAC, skx_edac: Add EDAC driver for Skylake This is an entirely new driver instead of yet another set of patches to sb_edac.c because: 1) Mapping from PCI devices to socket/memory controller is significantly different. Skylake scatters devices on a socket across a number of PCI buses. 2) There is an extra level of interleaving via the "mcroute" register that would be a little messy to squeeze into the old driver. 3) Validation is getting too expensive. Changes to sb_edac need to be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and Knights Landing. Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-21 10:58:34 -07:00
Tillmann Heidsieck	6fa06b0d9e	EDAC, mpc85xx: Fix PCIe error capture According to the reference manual of MPC8572 and T4240, bit 31 of PEX_ERR_CAP_STAT is W1C (write 1 to clear). Add the corresponding write to PEX_ERR_CAP_STAT in order to fix the PCIe error capture. Tested on a T4240 processor. Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de> Acked-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160815190849.29327-1-theidsieck@leenox.de Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-18 10:17:40 +02:00
Bhaktipriya Shridhar	7bb8b77779	EDAC, wq: Remove deprecated create_singlethread_workqueue() Replace the deprecated create_singlethread_workqueue() with alloc_ordered_workqueue() with WQ_MEM_RECLAIM. This is the identity conversion. It's not recommended to stall it from memory pressure. Hence, WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160813164124.GA9077@Karyakshetra Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-15 07:21:29 +02:00
Wei Yongjun	9bcd919eb8	EDAC, altera: Make a10_eccmgr_ic_ops static Fix the following sparse warning: drivers/edac/altera_edac.c:1649:23: warning: symbol 'a10_eccmgr_ic_ops' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Reviewed-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lkml <linux-kernel@vger.kernel.org> Link: http://lkml.kernel.org/r/1470836667-11822-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-10 18:37:06 +02:00
Thor Thayer	911049845d	EDAC, altera: Add Arria10 SD-MMC EDAC support Add Altera Arria10 SD-MMC FIFO memory EDAC support. The SD-MMC is a dual port RAM implementation which is different than any of the other peripherals and therefore requires additional code. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1470753653-23465-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-10 14:43:14 +02:00
Yazen Ghannam	dc0a50a841	EDAC, amd64: Fix channel decode on Fam15hMod60h systems Fam15hMod60h systems are using the channel decode of Fam15hMod30h which gives incorrect results. Fam15hMod60h systems should use the generic channel decode method plus a couple more cases. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1470236355-30039-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:42 +02:00
Thor Thayer	485fe9e24e	EDAC, altera: Add Arria10 QSPI support Add Altera Arria10 QSPI FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-9-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:39 +02:00
Thor Thayer	c609581d1f	EDAC, altera: Add Arria10 USB support Add Altera Arria10 USB FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-8-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:37 +02:00
Thor Thayer	e8263793b7	EDAC, altera: Add Arria10 DMA support Add Altera Arria10 DMA FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-7-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:36 +02:00
Thor Thayer	c6882fb2e8	EDAC, altera: Add Arria10 NAND support Add Altera Arria10 NAND FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-6-git-send-email-tthayer@opensource.altera.com [ Reformat loop in altr_edac_a10_probe() for better readability. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:35 +02:00
Lukasz Odzioba	c5b48fa7e2	EDAC, sb_edac: Fix channel reporting on Knights Landing On Intel Xeon Phi Knights Landing processor family the channels of the memory controller have untypical arrangement - MC0 is mapped to CH3,4,5 and MC1 is mapped to CH0,1,2. This causes the EDAC driver to report the channel name incorrectly. We missed this change earlier, so the code already contains similar comment, but the translation function is incorrect. Without this patch: errors in DIMM_A and DIMM_D were reported in DIMM_D errors in DIMM_B and DIMM_E were reported in DIMM_E errors in DIMM_C and DIMM_F were reported in DIMM_F Correct this. Hubert Chrzaniuk: - rebased to 4.8 - comments and code cleanup Fixes: `d0cdf90031` ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support") Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Cc: lukasz.odzioba@intel.com Cc: mchehab@kernel.org Cc: <stable@vger.kernel.org> # v4.5.. Link: http://lkml.kernel.org/r/1469231089-22837-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com> [ Boris: Simplify a bit by removing char mc. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:52:08 +02:00
Linus Torvalds	c79a14defb	* Altera Arria10 ethernet FIFO buffer support (Thor Thayer) * Minor cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXlbvGAAoJEBLB8Bhh3lVK9g0P/16Z5/WIqrrLjJegcZ3qKb4m u2b5FBwbTDQyj/smZmfZfhZUleO7HDayL188fHPKO11my+Mvjyws8IZCvHshy7+3 b6Krno8NXZuNYtm32srSQQoLE4eN5pookOPMe2KJcRFghwnDqOo0ZfMZ8Lt6RVGg YtqiVZ3DwaqoHUbvxkHvdJuMbtVEZIEC6SzK9t4URmdncAx+3/BTLqDSYc4LaL+B tI6REgMy+QT1IUMImuUbOA5ktd78K+D26YKF9NuTSWNQvwzjSYy4y6WmDsjYpZF6 h6YWgGwERIy6gws5IdvkBMKX0s1RV7/sY6xFBsEQrX+IFywyY5OcI1gPRttD6VAn ORJ4j7WNH8Phqld9YxjEX8yTwGxlxge88J/lb0i9s5jXPMX7ioxAqAjdpsJj6PlA sNDP0fTyCECqME73cWaAS9bdrzB2dgOxEDfcWKWR24vZeHfnh72owD1k+J0iWu9c 0R2usPB1Yjbo6ODGpqG1R+u09LR+RpNe38oT5GcsFrj6/EaGz0kn2QulaNtylsDo J5qkm02X1h3+F8/UHrzKavOGk82IQzLCbcW7eHPmyzkH8PTe1semLk9jHnXN5sK3 uL2nirZQVIF6QLbYxf0Uf2oSVs05xc7u9ZT+QOl6hqh/6/K7DWC45SoABwMXDaHH jXr+y4vtr5IR8GQfZ/nd =mdxx -----END PGP SIGNATURE----- Merge tag 'edac_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: "This last cycle, Thor was busy adding Arria10 eth FIFO support to the altera_edac driver along with other improvements. We have two cleanups/fixes too. Summary: - Altera Arria10 ethernet FIFO buffer support (Thor Thayer) - Minor cleanups" * tag 'edac_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: ARM: dts: Add Arria10 Ethernet EDAC devicetree entry EDAC, altera: Add Arria10 Ethernet EDAC support EDAC, altera: Add Arria10 ECC memory init functions Documentation: dt: socfpga: Add Arria10 Ethernet binding EDAC, altera: Drop some ifdeffery EDAC, altera: Add panic flag check to A10 IRQ EDAC, altera: Check parent status for Arria10 EDAC block EDAC, altera: Make all private data structures static EDAC: Correct channel count limit EDAC, amd64_edac: Init opstate at the proper time during init EDAC, altera: Handle Arria10 SDRAM child node EDAC, altera: Add ECC Manager IRQ controller support Documentation: dt: socfpga: Add interrupt-controller to ecc-manager	2016-07-27 13:40:47 -07:00
Tony Luck	0ba169ac36	EDAC, sb_edac: Fix Knights Landing In commit `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") I broke Knights Landing because I failed to notice that it called a wrapper macro "sbridge_get_all_devices_knl" instead of "sbridge_get_all_devices" like all the other types. Now that we include the processor type in the pci_id_table structure we can skip the wrappers and just have the sbridge_get_all_devices() check the type to decide whether to allow duplicate devices and controllers to have registers spread across buses. Fixes: `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-16 06:11:59 +09:00
Thor Thayer	ab8c1e0fb0	EDAC, altera: Add Arria10 Ethernet EDAC support Add Altera Arria10 Ethernet FIFO memory EDAC support. Update to support a common compatibility string for all Ethernet FIFOs in the DT. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-8-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-25 11:31:34 +02:00
Thor Thayer	1166fde93d	EDAC, altera: Add Arria10 ECC memory init functions In preparation for additional memory module ECCs, add the memory initialization functions and helpers. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-7-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 21:16:38 +02:00
Thor Thayer	6b300fb953	EDAC, altera: Drop some ifdeffery Make the IRQ and check_deps() functions available to all the memory buffers by moving them outside of the OCRAM only area. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-5-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 14:18:33 +02:00
Thor Thayer	2b083d65ff	EDAC, altera: Add panic flag check to A10 IRQ In preparation for additional memory module ECCs, the IRQ function will check a panic flag before doing a kernel panic on double bit errors. OCRAM uncorrectable errors cause a panic because sleep/resume functions and FPGA contents during sleep are stored in OCRAM. ECCs on peripheral FIFO buffers will not cause a kernel panic on DBERRs because the packet can be retried and therefore recovered. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 11:58:43 +02:00
Thor Thayer	44ec9b307e	EDAC, altera: Check parent status for Arria10 EDAC block In preparation for the Arria10 ECC modules, check the status of the parent in the device tree to ensure the block is enabled. Skip if no parent phandle is set in the device tree. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 11:58:42 +02:00
Thor Thayer	1cf7037724	EDAC, altera: Make all private data structures static The device private data structures are used only here so make them static. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-4-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 11:58:42 +02:00
Borislav Petkov	bba142957e	EDAC: Correct channel count limit `c44696fff0` ("EDAC: Remove arbitrary limit on number of channels") lifted the arbitrary limit on memory controller channels in EDAC. However, the dynamic channel attributes dynamic_csrow_dimm_attr and dynamic_csrow_ce_count_attr remained 6. This wasn't a problem except channels 6 and 7 weren't visible in sysfs on machines with more than 6 channels after the conversion to static attr groups with `2c1946b6d6` ("EDAC: Use static attribute groups for managing sysfs entries") [ without that, we're exploding in edac_create_sysfs_mci_device() because we're dereferencing out of the bounds of the dynamic_csrow_dimm_attr array. ] Add attributes for channels 6 and 7 along with a guard for the future, should more channels be required and/or to sanity check for misconfigured machines. We still need to check against the number of channels present on the MC first, as Thor reported. Signed-off-by: Borislav Petkov <bp@suse.de> Reported-by: Hironobu Ishii <ishii.hironobu@jp.fujitsu.com> Tested-by: Thor Thayer <tthayer@opensource.altera.com> Cc: <stable@vger.kernel.org> # 4.2	2016-06-16 10:06:35 +02:00
Borislav Petkov	6ba92fea1b	EDAC, amd64_edac: Init opstate at the proper time during init It is useless to do it if we're loaded on unsupported hardware so do that only after we have detected at least 1 supported AMD northbridge. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-16 01:13:18 +02:00
Thor Thayer	ab564cb51e	EDAC, altera: Handle Arria10 SDRAM child node Separate the device match arrays for each platform to prevent CycloneV matches when calling of_platform_populate() on the Arria10 ECC manager node. If the SDRAM is a child node of ECC manager, call probe function via of_platform_populate(). Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1464193783-5071-4-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-08 13:23:09 +02:00
Thor Thayer	13ab8448d2	EDAC, altera: Add ECC Manager IRQ controller support To better support child devices, the ECC manager needs to be implemented as an IRQ controller. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1465331757-10227-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-08 13:23:01 +02:00
Tony Luck	665f05e0b8	EDAC, sb_edac: Readd accidentally dropped Broadwell-D support In commit `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") we switched from using PCI ids to determine which platform we are running on to using CPU model instead. I forgot that Broadwell-DE has its own distinct model number different from Broadwell-EP or -EX. Fixing this isn't just adding a line to the array of cpuids - the exising code assumed a 1:1 mapping between entries in that array and the "enum type" values. Added the type to pci_id_table structure to remove this dependency and allows two Broadwell cpu models. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") Link: http://lkml.kernel.org/r/b3cffe40dec6dfe0235a5d52a504f0ba86a07ce7.1464902605.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-03 17:28:21 +02:00
Nicholas Krause	fbedcaf43f	EDAC: Fix workqueues poll period resetting After the workqueue cleanup, we're registering workqueues based on the presence of an ->edac_check function. When that is the case, we're setting OP_RUNNING_POLL. But we forgot to check that in edac_mc_reset_delay_period(), leading to: BUG: unable to handle kernel paging request at 0000000000015d10 IP: [ .. ] queued_spin_lock_slowpath PGD 3ffcc8067 PUD 3ffc56067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: ... CPU: 1 PID: 2792 Comm: edactest Not tainted 4.6.0-dirty #1 Hardware name: HP ProLiant MicroServer, BIOS O41 10/01/2013 Stack: Call Trace: ? _raw_spin_lock_irqsave ? lock_timer_base.isra.34 ? del_timer ? try_to_grab_pending ? mod_delayed_work_on ? edac_mc_reset_delay_period ? edac_set_poll_msec ? param_attr_store ? module_attr_store ? kernfs_fop_write ? __vfs_write ? __vfs_read ? __alloc_fd ? vfs_write ? SyS_write ? entry_SYSCALL_64_fastpath Code: RIP [ .. ] queued_spin_lock_slowpath RSP <> CR2: 0000000000015d10 ---[ end trace 3f286bc71cca15d1 ]--- Kernel panic - not syncing: Fatal exception Fix it. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1463697958-13406-1-git-send-email-xerofoify@gmail.com [ Rewrite commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-03 11:14:27 +02:00
Tony Luck	c7103f650a	EDAC, sb_edac: Fix rank lookup on Broadwell Broadwell made a small change to the rank target register moving the target rank ID field up from bits 16:19 to bits 20:23. Also found that the offset field grew by one bit in the IVY_BRIDGE to HASWELL transition, so fix the RIR_OFFSET() macro too. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: stable@vger.kernel.org # v3.19+ Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/2943fb819b1f7e396681165db9c12bb3df0e0b16.1464735623.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-03 10:05:49 +02:00
Linus Torvalds	16bf834805	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits) gitignore: fix wording mfd: ab8500-debugfs: fix "between" in printk memstick: trivial fix of spelling mistake on management cpupowerutils: bench: fix "average" treewide: Fix typos in printk IB/mlx4: printk fix pinctrl: sirf/atlas7: fix printk spelling serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/ w1: comment spelling s/minmum/minimum/ Blackfin: comment spelling s/divsor/divisor/ metag: Fix misspellings in comments. ia64: Fix misspellings in comments. hexagon: Fix misspellings in comments. tools/perf: Fix misspellings in comments. cris: Fix misspellings in comments. c6x: Fix misspellings in comments. blackfin: Fix misspelling of 'register' in comment. avr32: Fix misspelling of 'definitions' in comment. treewide: Fix typos in printk Doc: treewide : Fix typos in DocBook/filesystem.xml ...	2016-05-17 17:05:30 -07:00
Linus Torvalds	1cc3880a3c	* Altera Arria10 L2 cache and On-Chip RAM ECC handling. (Thor Thayer) * Remove ad-hoc buffering of MCE records in sb_edac and i7core_edac. (Tony Luck) * Do not register sb_edac with pci_register_driver(). (Tony Luck) * Add support for Skylake to ie31200_edac. (Jason Baron) * Do not register amd64_edac with pci_register_driver(). (Borislav Petkov) + the usual round of cleanups and fixes all over the place. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXOaHaAAoJEBLB8Bhh3lVKRdoP/jDewaX4GAxgcKaK9Kx7VjMe i42E/7syh5iLoyFgZ1hUjvqiQHN4RyWloUYyNHbNxGS9uXCMXIUoJQd2FjOY442s oeEaoMCfZsJLzKQti3fUPK9GugZ57BSZxfn6NlJPpyG4FxSirOcZFCxGaNuyqqrh 0M+gFvbWfZcVDLE0FI5CUYZWRxsk//+pLM4ZkQZFA8Eo1tGVc3r0zEEAlL/uEMDW P0ATvRHNUeib9YQGTdSD7cNUpRX4SX+T8lDUiaVm+tHL5ES3b4rayMBdbVMrahfc Gnxu9GtO5gWXEY6QDDpOx+VkbfFmDFodx833psJ3MVD8evEFHHdinkgDaptLrrV6 92ZDKR5s3W6tKXkcmGuExrtc17UgjLcRZCebXbv+5FJlVrslzvQ9ESOTRyiZBrCD ZpFi2TZhpYU1uEWuoBZCbWNXW2pcSt7/bQ9bYUvfrvNfgPzPnblubJuVKRcfY0WB x2vj0PNnckpoPRvskV7GEe0Y/JISzAxBQUK6XO+GJgMgz5M+23SEaSVU5yeyf26e x/yD5yImQGN84AxfRMMvbR2JvpVLN3vdFWtWigneht160erVA/Qgw5Gcrpw53tZr zPD3fGaetrNgS8BvJAy/c6xHV1j6A1RGCGThy8Ivqep9WD90a5JhR1NRjZp6m8sf aB0A1zTP7e+hRFkZGahO =ud2P -----END PGP SIGNATURE----- Merge tag 'edac_for_4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: "It was pretty busy in EDAC land this time: - Altera Arria10 L2 cache and On-Chip RAM ECC handling (Thor Thayer) - Remove ad-hoc buffering of MCE records in sb_edac and i7core_edac (Tony Luck) - Do not register sb_edac with pci_register_driver() (Tony Luck) - Add support for Skylake to ie31200_edac (Jason Baron) - Do not register amd64_edac with pci_register_driver() (Borislav Petkov) ... plus the usual round of cleanups and fixes all over the place" * tag 'edac_for_4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (25 commits) EDAC, amd64_edac: Drop pci_register_driver() use EDAC, ie31200_edac: Add Skylake support EDAC, sb_edac: Use cpu family/model in driver detection EDAC, i7core: Remove double buffering of error records EDAC, amd64_edac: Issue driver banner only on success ARM: socfpga: Initialize Arria10 OCRAM ECC on startup EDAC: Increment correct counter in edac_inc_ue_error() EDAC, sb_edac: Remove double buffering of error records EDAC: Fix used after kfree() error in edac_unregister_sysfs() EDAC, altera: Avoid unused function warnings EDAC, altera: Remove useless casts ARM: socfpga: Enable Arria10 OCRAM ECC on startup EDAC, altera: Add Arria10 OCRAM ECC support Documentation: dt: socfpga: Add Altera Arria10 OCRAM binding EDAC, altera: Make OCRAM ECC dependency check generic EDAC, altera: Add register offset for ECC Enable EDAC, altera: Extract error inject operations to a struct fops ARM: socfpga: Enable Arria10 L2 cache ECC on startup EDAC, altera: Add Arria10 L2 Cache ECC handling Documentation, dt, socfpga: Add Altera Arria10 L2 cache binding ...	2016-05-16 18:44:39 -07:00
Yazen Ghannam	a348ed83d9	EDAC, mce_amd: Detect SMCA using X86_FEATURE_SMCA Use X86_FEATURE_SMCA when detecting if SMCA is available instead of directly using CPUID 0x80000007_EBX. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1462971509-3856-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-05-12 09:08:23 +02:00
Borislav Petkov	3f37a36b62	EDAC, amd64_edac: Drop pci_register_driver() use - remove homegrown instances counting. - take F3 PCI device from amd_nb caching instead of F2 which was used with the PCI core. With those changes, the driver doesn't need to register a PCI driver and relies on the northbridges caching which we do anyway on AMD. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com>	2016-05-09 20:41:16 +02:00
Jason Baron	953dee9bbd	EDAC, ie31200_edac: Add Skylake support Skylake adjusts some register locations, but otherwise follows the existing model quite closely. I was able to verify that the 'ce_count' increments when 'bad dimms' are used. The accounting of 'ce_count' and 'ue_count' is the primary functionality of interest for us. Tested on Intel(R) Xeon(R) CPU E3-1260L v5 @ 2.90GHz. Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1462547927-22679-1-git-send-email-jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-05-06 18:50:14 +02:00
Tony Luck	2c1ea4c700	EDAC, sb_edac: Use cpu family/model in driver detection Instead of picking a random PCI ID from the dozen or so we need to access, just use x86_match_cpu() to pick based on CPU model number. The choosing of PCI devices has been problematic in the past, see `11249e7399` ("sb_edac: Fix detection on SNB machines") which fixed problems introduced by `d0585cd815` ("sb_edac: Claim a different PCI device"). This is especially ugly if future hardware might not even have EDAC-relevant registers in PCI config space and we would still be required to choose some "random" PCI devices to scan for just so our driver loads. Is this cleaner/clearer? It deletes much more code than it adds. Only tested on Broadwell. The driver loads/unloads and loads again. Still decodes errors too. Signed-off-by: Tony Luck <tony.luck@intel.com> Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Borislav Petkov <bp@suse.de>	2016-05-02 19:44:43 +02:00
Tony Luck	5359534505	EDAC, i7core: Remove double buffering of error records In the bad old days the functions from x86_mce_decoder_chain could be called in machine check context. So we used to carefully copy them and defer processing until later. But in `f29a7aff4b` ("x86/mce: Avoid potential deadlock due to printk() in MCE context") we switched the logging code to save the record in a genpool, and call the functions that registered to be notified later from a work queue. So drop all the double buffering and do all the work we want to do as soon as i7core_mce_check_error() is called. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/29ab2c370915c6e132fc5d88e7b72cb834bedbfe.1461855008.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-29 16:41:24 +02:00
Tony Luck	c4fc1956fa	EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback Both of these drivers can return NOTIFY_BAD, but this terminates processing other callbacks that were registered later on the chain. Since the driver did nothing to log the error it seems wrong to prevent other interested parties from seeing it. E.g. neither of them had even bothered to check the type of the error to see if it was a memory error before the return NOTIFY_BAD. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-29 15:43:10 +02:00
Borislav Petkov	de0336b30d	EDAC, amd64_edac: Issue driver banner only on success ... and don't mislead users into thinking that the driver has loaded successfully. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-27 12:30:26 +02:00
Emmanouil Maroudas	993f88f1cc	EDAC: Increment correct counter in edac_inc_ue_error() Fix typo in edac_inc_ue_error() to increment ue_noinfo_count instead of ce_noinfo_count. Signed-off-by: Emmanouil Maroudas <emmanouil.maroudas@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `4275be6355` ("edac: Change internal representation to work with layers") Link: http://lkml.kernel.org/r/1461425580-5898-1-git-send-email-emmanouil.maroudas@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 18:10:09 +02:00
Tony Luck	ad08c4e974	EDAC, sb_edac: Remove double buffering of error records In the bad old days the functions from x86_mce_decoder_chain could be called in machine check context. So we used to carefully copy them and defer processing until later. But in `f29a7aff4b` ("x86/mce: Avoid potential deadlock due to printk() in MCE context") we switched the logging code to save the record in a genpool, and call the functions that registered to be notified later from a work queue. So drop all the double buffering and do all the work we want to do as soon as sbridge_mce_check_error() is called. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: patrickg@supermicro.com Link: http://lkml.kernel.org/r/100025611cd780d9bca72792b2b2146760da53e0.1460756761.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 14:02:02 +02:00
Tony Luck	ab67b6c22d	EDAC: Fix used after kfree() error in edac_unregister_sysfs() Code flow looks like this: device_unregister(&mci->dev); -> kobject_put+0x25/0x50 -> kobject_cleanup+0x77/0x190 -> device_release+0x32/0xa0 -> mci_attr_release+0x36/0x70 -> kfree(mci); bus_unregister(mci->bus); Fix is to grab a local copy of "mci->bus" and use that when we call bus_unregister(). Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/21d595b0ab3d718d9cb206647f4ec91c05e62ec4.1461261078.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 13:23:54 +02:00
Arnd Bergmann	1aa6eb5c5b	EDAC, altera: Avoid unused function warnings The recently added Arria10 OCRAM ECC support caused some new harmless warnings about unused functions when it is disabled: drivers/edac/altera_edac.c:1067:20: error: 'altr_edac_a10_ecc_irq' defined but not used [-Werror=unused-function] drivers/edac/altera_edac.c:658:12: error: 'altr_check_ecc_deps' defined but not used [-Werror=unused-function] This rearranges the code slightly to have those two functions inside of the same #ifdef that hides their callers. It also manages to avoid a forward declaration of the IRQ handler in the process. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thor Thayer <tthayer@opensource.altera.com> Cc: Alan Tull <atull@opensource.altera.com> Cc: Dinh Nguyen <dinguyen@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `c7b4be8db8` ("EDAC, altera: Add Arria10 OCRAM ECC support") Link: http://lkml.kernel.org/r/1460837650-1237650-2-git-send-email-arnd@arndb.de Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 12:00:12 +02:00
Arnd Bergmann	2c911f6cac	EDAC, altera: Remove useless casts The altera EDAC driver refers to its per-device data using a cast to '(void *)', which makes the pointer non-const, though both the source and destination are actually const. Removing the annotation makes the reference (almost) fit into a single line for improved readability, and ensures that it is actually defined as const. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thor Thayer <tthayer@opensource.altera.com> Cc: Alan Tull <atull@opensource.altera.com> Cc: Dinh Nguyen <dinguyen@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1460837650-1237650-1-git-send-email-arnd@arndb.de Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 11:54:42 +02:00
Tony Luck	ea5dfb5fae	x86 EDAC, sb_edac.c: Take account of channel hashing when needed Haswell and Broadwell can be configured to hash the channel interleave function using bits [27:12] of the physical address. On those processor models we must check to see if hashing is enabled (bit21 of the HASWELL_HASYSDEFEATURE2 register) and act accordingly. Based on a patch by patrickg <patrickg@supermicro.com> Tested-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-22 10:10:01 +02:00
Tony Luck	ff15e95c82	x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address In commit: `eb1af3b71f` ("Fix computation of channel address") I switched the "sck_way" variable from holding the log2 value read from the h/w to instead be the actual number. Unfortunately it is needed in log2 form when used to shift the address. Tested-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Fixes: `eb1af3b71f` ("Fix computation of channel address") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-22 10:10:01 +02:00

... 2 3 4 5 6 ...

1531 Commits