Basic support for the single socket Broadwell-DE processor
was added back in commit 1f39581a9a
sb_edac: Add support for Broadwell-DE processor
This patch extends Broadwell support to cover the two
socket "-EP" and four socket "-EX" versions of Broadwell.
Only tested on the 2 socket - but this code is largely
cloned from the Haswell path.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
First noticed a problem on a 4 socket machine where EDAC only reported
half the DIMMS. Tracked this down to the code that assumes that systems
with two home agents only have two memory channels on each agent. This
is true on 2 sockect ("-EP") machines. But four socket ("-EX") machines
have four memory channels on each home agent.
The old code would have had problems on two socket systems as it did
a shuffling trick to make the internals of the code think that the
channels from the first agent were '0' and '1', with the second agent
providing '2' and '3'. But the code didn't uniformly convert from
{ha,channel} tuples to this internal representation.
New code always considers up to eight channels.
On a machine with a single home agent these map easily to edac channels
0, 1, 2, 3. On machines with two home agents we map using:
edac_channel = 4*ha# + channel
So on a -EP machine where each home agent supports only two channels
we'll fill in channels 0, 1, 4, 5, and on a -EX machine we use all of 0,
1, 2, 3, 4, 5, 6, 7.
[mchehab@osg.samsung.com: fold a fixup patch as per Tony's request and fixed
a few CodingStyle issues]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
typo: "a7mode" chooses whether to use bits {8, 7, 9} or {8, 7, 6}
in the algorithm to spread access between memory resources. But
the non-a7mode path was incorrectly using GET_BITFIELD(addr, 7, 9)
and so picking bits {9, 8, 7}
thinko: BIT(1) of the dram_rule registers chooses whether to just
use the {8, 7, 6} (or {8, 7, 9}) bits mentioned above as they are,
or to XOR them with bits {18, 17, 16} but the code inverted the
test. We need the additional XOR when dram_rule{1} == 0.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Currently set to "6", but the reset of the code will dynamically
allocate as needed. We need to go to "8" today, but drop the check
completely to save doing this again when we need even larger numbers.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
<asm/edac.h> contains only the arch-specific scrubbing function and is
thus not needed in edac_stub.c. Kill it.
Signed-off-by: Borislav Petkov <bp@suse.de>
So first of all, this atomic_scrub() function's naming is bad. It looks
like an atomic_t helper. Change it to edac_atomic_scrub().
The bigger problem is that this function is arch-specific and every new
arch which doesn't necessarily need that functionality still needs to
define it, otherwise EDAC doesn't compile.
So instead of doing that and including arch-specific headers, have each
arch define an EDAC_ATOMIC_SCRUB symbol which can be used in edac_mc.c
for ifdeffery. Much cleaner.
And we already are doing this with another symbol - EDAC_SUPPORT. This
is also much cleaner than having CONFIG_EDAC enumerate all the arches
which need/have EDAC support and drivers.
This way I can kill the useless edac.h header in tile too.
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: "Maciej W. Rozycki" <macro@codesourcery.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Steven J. Hill" <Steven.Hill@imgtec.com>
Cc: x86@kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
While testing asynchronous PCI probe on this driver I noticed it failed
because the driver checks if any of the PCI devices have been bound to
the driver after registering it, which obviously does not work if
probing is asynchronous.
While there are patches and discussions on how the driver should behave
are ongoing, let's enforce synchronous probe for this driver for now.
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The SDRAM EDAC requires SDRAM configuration/initialization before SDRAM
is accessed (in the preloader) and therefore before Linux is loaded.
Having a module compile is not desired so force to be built into kernel.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: mchehab@osg.samsung.com
Cc: Takashi Iwai <tiwai@suse.de>
Link: http://lkml.kernel.org/r/1429308974-26380-3-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
type T;
identifier f;
@@
static T f (...) { ... }
@@
identifier r.f;
declarer name EXPORT_SYMBOL_GPL;
@@
-EXPORT_SYMBOL_GPL(f);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Tim Small <tim@buttersideup.com>
Link: http://lkml.kernel.org/r/1426092997-30605-13-git-send-email-Julia.Lawall@lip6.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
edac_init() does not deallocate already allocated resources on failure
path.
Found by Linux Driver Verification project (linuxtesting.org).
[ Boris: The unwind path functions have __exit annotation but are being
used in an __init function, leading to section mismatches. Drop the
section annotation and make them normal functions. ]
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Link: http://lkml.kernel.org/r/1423203162-26368-1-git-send-email-khoroshilov@ispras.ru
Signed-off-by: Borislav Petkov <bp@suse.de>
Instead of calling device_create_file() and device_remove_file()
manually, pass the static attribute groups with the new
edac_mc_add_mc_with_groups(). The conditional creation of inject sysfs
files is done by a proper is_visible callback.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: http://lkml.kernel.org/r/1423046938-18111-4-git-send-email-tiwai@suse.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Add edac_mc_add_mc_with_groups() for initializing the mem_ctl_info
object with the optional attribute groups. This allows drivers to
pass additional sysfs entries without manual (and racy)
device_create_file() and co calls.
edac_mc_add_mc() is kept as is, just calling edac_mc_add_with_groups()
with NULL groups.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: http://lkml.kernel.org/r/1423046938-18111-3-git-send-email-tiwai@suse.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Instead of manual calls of device_create_file() and
device_remove_file(), use static attribute groups with proper
is_visible callbacks for managing the sysfs entries.
This simplifies the code a lot and avoids the possible races.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: http://lkml.kernel.org/r/1423046938-18111-2-git-send-email-tiwai@suse.de
Signed-off-by: Borislav Petkov <bp@suse.de>
The pci_dev_put() function tests whether its argument is NULL and thus
the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: http://lkml.kernel.org/r/54CFC12C.9010002@users.sourceforge.net
Signed-off-by: Borislav Petkov <bp@suse.de>
* A fix to amd64_edac to not explode on Numascale machines with more
than 16 memory controllers, from Daniel J Blueman.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU5aSqAAoJEBLB8Bhh3lVKGbgP+gOzBaeBlIa9BcdrohiF2mKz
UyS/2v/RN/OK0F9u/LUJr15jwZex4TPbE7QPoMF6IvsHQR/5Jh5k4fZUaU8/3sY8
F6ugd7/I4x2mFrYvcJsK5PUm5ZqdYG6dyQKNAInqQw+/sAdL9i9shXz9SUUJXh98
Qa2zGSEyJVNIYmTi3rcSy8gxwJR8xdwL56iGmt4HEc/asjziSoIxOCwEVLh6OR9e
vgLDzOntgS56ymbPiNXHfpEE7IqBkJELCFma6RHer4L4R2fTlJRqk1oahS91tx3X
/m70IVrLu/v6/aDWMdDzSj9oTm6mZD2IpQEAurrp+kFqzLj7EUOlhnXJF1fsyEJY
OmDO+/Z72iRfgVv0SlT0AsrDkVvXUOfERBhyKZpd4wxUV2/XUYFqhwVPE86+al8p
wSuwUJrvKQ4bGdRQYEufZBO7JOUTk8K09iFEzREEYbzvEZPc4ZPMUTXAgOA54x6V
HOhD6NhPR0RJwg9OgqJndgnF0XTcruh6/LuFO2ioyKCR92hjutwuoYyxVI8jfv1W
rHB09wdNzv4EIIoH/5BeBNIv3Vtc04n5d6MRWbYHmBWAt6Ib+jBXWVNMoWSrAOIQ
4MNBgxH5sEhTFKnFOt+/cXZktLVr0G79vii5GaodXKgaTnV59Jm4sV06k5vxHxly
eee2zEWMoT3M4fhYC3St
=Zypx
-----END PGP SIGNATURE-----
Merge tag 'edac_fixes_for_3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull two EDAC fixes from Borislav Petkov:
- A fix to sb_edac for proper detection on SNB machines
- A fix to amd64_edac to not explode on Numascale machines with more
than 16 memory controllers, from Daniel J Blueman.
* tag 'edac_fixes_for_3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, amd64_edac: Prevent OOPS with >16 memory controllers
sb_edac: Fix detection on SNB machines
d0585cd815 ("sb_edac: Claim a different PCI device") changed the
probing of sb_edac to look for PCI device 0x3ca0:
3f:0e.0 System peripheral: Intel Corporation Xeon E5/Core i7 Processor Home Agent (rev 07)
00: 86 80 a0 3c 00 00 00 00 07 00 80 08 00 00 80 00
...
but we're matching for 0x3ca8, i.e. PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA
in sbridge_probe() therefore the probing fails.
Changing it to probe for 0x3ca0 (PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_HA0),
.i.e., the 14.0 device, fixes the issue and driver loads successfully
again:
[ 2449.013120] EDAC DEBUG: sbridge_init:
[ 2449.017029] EDAC sbridge: Seeking for: PCI ID 8086:3ca0
[ 2449.022368] EDAC DEBUG: sbridge_get_onedevice: Detected 8086:3ca0
[ 2449.028498] EDAC sbridge: Seeking for: PCI ID 8086:3ca0
[ 2449.033768] EDAC sbridge: Seeking for: PCI ID 8086:3ca8
[ 2449.039028] EDAC DEBUG: sbridge_get_onedevice: Detected 8086:3ca8
[ 2449.045155] EDAC sbridge: Seeking for: PCI ID 8086:3ca8
...
Add a debug printk while at it to be able to catch the failure in the
future and dump driver version on successful load.
Fixes: d0585cd815 ("sb_edac: Claim a different PCI device")
Cc: stable@vger.kernel.org # 3.18
Acked-by: Aristeu Rozanski <aris@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
If edac_mc_add_mc() fails then we should preserve the error code, but
instead the current code returns success.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: http://lkml.kernel.org/r/20150128191351.GC10259@mwanda
Signed-off-by: Borislav Petkov <bp@suse.de>
Also use goto labels for all failure paths in
edac_create_sysfs_mci_device and update meaningless labels.
Signed-off-by: Junjie Mao <junjie.mao@hotmail.com>
Link: http://lkml.kernel.org/r/BLU436-SMTP25291B6B612942A212AEBFE95300@phx.gbl
[ Boris: Use ! for 0 checks and add newlines for less crammed code. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Add EDAC support for ecc errors reporting on the synopsys ddr
controller. The ddr ecc controller corrects single bit errors and
detects double bit errors.
Selected important-ish notes from the changelog:
- I have not taken care of spliting synps_edac_geterror_info function as
it adds additional indentation levels and moreover the existing changes
were made as part of the v2 review comments
- Removed dt binding info as already there is a binding info available
under memorycontroller. so, updated ecc info there.
- Shortened the prefix "sysnopsys" to "synps"
Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com>
Link: http://lkml.kernel.org/r/a728a8d4678f4dbf9de189a480297c3d@BY2FFO11FD034.protection.gbl
[ Boris: massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Fixes the following sparse warnings:
drivers/edac/mce_amd_inj.c:204:3: warning:
symbol 'dfs_fls' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Link: http://lkml.kernel.org/r/1418087095-14174-1-git-send-email-weiyj_lk@163.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Here's the set of driver core patches for 3.19-rc1.
They are dominated by the removal of the .owner field in platform
drivers. They touch a lot of files, but they are "simple" changes, just
removing a line in a structure.
Other than that, a few minor driver core and debugfs changes. There are
some ath9k patches coming in through this tree that have been acked by
the wireless maintainers as they relied on the debugfs changes.
Everything has been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlSOD20ACgkQMUfUDdst+ylLPACg2QrW1oHhdTMT9WI8jihlHVRM
53kAoLeteByQ3iVwWurwwseRPiWa8+MI
=OVRS
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core update from Greg KH:
"Here's the set of driver core patches for 3.19-rc1.
They are dominated by the removal of the .owner field in platform
drivers. They touch a lot of files, but they are "simple" changes,
just removing a line in a structure.
Other than that, a few minor driver core and debugfs changes. There
are some ath9k patches coming in through this tree that have been
acked by the wireless maintainers as they relied on the debugfs
changes.
Everything has been in linux-next for a while"
* tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (324 commits)
Revert "ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries"
fs: debugfs: add forward declaration for struct device type
firmware class: Deletion of an unnecessary check before the function call "vunmap"
firmware loader: fix hung task warning dump
devcoredump: provide a one-way disable function
device: Add dev_<level>_once variants
ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries
ath: use seq_file api for ath9k debugfs files
debugfs: add helper function to create device related seq_file
drivers/base: cacheinfo: remove noisy error boot message
Revert "core: platform: add warning if driver has no owner"
drivers: base: support cpu cache information interface to userspace via sysfs
drivers: base: add cpu_device_create to support per-cpu devices
topology: replace custom attribute macros with standard DEVICE_ATTR*
cpumask: factor out show_cpumap into separate helper function
driver core: Fix unbalanced device reference in drivers_probe
driver core: fix race with userland in device_add()
sysfs/kernfs: make read requests on pre-alloc files use the buffer.
sysfs/kernfs: allow attributes to request write buffer be pre-allocated.
fs: sysfs: return EGBIG on write if offset is larger than file size
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUhyDVAAoJEAhfPr2O5OEVsdMP/it7jKAKu1uHvVoWfwPcB0nG
QZ7Dt3Hu9aZtaZJs9OFOl2RrGyp4tXOK1SM2QInRGRjjquEP83AASk7XJRkXHr61
DTiMxHnKXvGfky4ONS76ZP3JWwfBffT5Y8oXIGiarAyUgcK66mnelrlju0r1XM11
uoUZqOzE5ySfARFcYm9VG53qLG4RxOERqP+EKSyctRE5gCsuymPaSKIwqj7rcg4R
QikDJQv2H8y2Ui0T9Dk6urdOjlUpBczGRVaXdY+2FyDfs0KPLEWUQVl8q2Lj/bur
1A10H+p9HdIA9emV2WEetQmJzz7POgn/j+BLkRKImPS5NgKv2Gu5BWiVUKQg4oRt
shvcpp38PYSmeDIy16v+3RDf0THSSYxhq7ExlGacIzaRk7G2asgRAWixiVS45NzI
yMb17ZJlHhzVMEyoSlTJH90o4y6UT9njZ7TcK3+9wmhbFyIlGWrrD1Mp56PDCYes
KO5VRYkwnw8GAd7gH0xwtHRREQYt4VneVyb7Ccw2VxrdIffVp1fDhpxnS+tw+Qri
294+RoJlnrp4JIpXerAJgQIIiMsRhg64EQ3rsT4Skj6Hfq4KExkmdrHE38p+FZLO
xHURKAt2jDSuu/DqaWpBclJXE4Q+MhS303pn80x0MHbFmRAYxwmWFjgPwQKrzNU9
x47JgdYNpBoF2+WK6ZGo
=CeIU
-----END PGP SIGNATURE-----
Merge tag 'edac/v3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull edac updates from Mauro Carvalho Chehab:
- Broadwell-DE support on sb-edac driver
- Some fixes at sb-edac driver
* tag 'edac/v3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
sb_edac: Fix typo computing number of banks
sb_edac: Add support for Broadwell-DE processor
sb_edac: Fix discovery of top-of-low-memory for Haswell
sb_edac: Fix erroneous bytes->gigabytes conversion
sb_edac: Fix off-by-one error in number of channels
Pull x86 RAS update from Ingo Molnar:
"The biggest change in this cycle is better support for UCNA
(UnCorrected No Action) events:
"Handle all uncorrected error reports in the same way (soft
offline the page). We used to only do that for SRAO
(software recoverable action optional) machine checks, but
it makes sense to also do it for UCNA (UnCorrected No
Action) logs found by CMCI or polling."
plus various x86 MCE handling updates and fixes"
* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Spell "panicked" correctly
x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_poll
x86, mce, severity: Extend the the mce_severity mechanism to handle UCNA/DEFERRED error
x86, MCE, AMD: Assign interrupt handler only when bank supports it
x86, MCE, AMD: Drop software-defined bank in error thresholding
x86, MCE, AMD: Move invariant code out from loop body
x86, MCE, AMD: Correct thresholding error logging
x86, MCE, AMD: Use macros to compute bank MSRs
RAS, HWPOISON: Fix wrong error recovery status
GHES: Make ghes_estatus_caches static
APEI, GHES: Cleanup unnecessary function for lockless list
* Enablement for AMD F15h models 0x60 CPUs. Most notably DDR4 RAM
support. Out of tree stuff is
arch/x86/kernel/amd_nb.c | 2 +
include/linux/pci_ids.h | 2 +
adding the required PCI IDs. From Aravind Gopalakrishnan.
* Enable amd64_edac for 32-bit due to popular demand. From Tomasz Pala.
* Convert the AMD MCE injection module to debugfs, where it belongs.
* Misc EDAC cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUhZbcAAoJEBLB8Bhh3lVKd5kQAK77qsB4ebpke1rEBerQl9jQ
YCVrByCKu7QTRt/xvlqU9Vyp7EvcnpNxFbRCCqzIpBcJjre9v17QVRA2/zFS0q81
QRDTOWf9uhMWbssI2Zu1hbjDNMWYiEb9+aeZHjScVtkzPDmsgYuKGdWfTSLw9dkS
AG/UUZ3ojwyc2dK3i0W3pCjakKYLUsCmijyTZBfb37+u4rRGuAgrQ9G8fBn2lL+l
huptgV2BsTCQqdL554zTs64Yt912PnidsJWYCPCjMubgEPSeNcWRzTTBYDUf9NIn
RxFXYHnOQxetPSQqfLVXlX3V/cGNQg0yEXFZ9S0tCt5uLNbbN/D8Uumtst0rq9x3
XkJ/EGHXBFP+KwHdV/i9j6OYM5rq4z+4ql4OqbWzsvPrEDbh/4p5gRbhqd1Hhy9U
zgHJjVPpD/l2t82Tpz0jIOscQruZ6VqGMDSYo3LiLnNt724pcrmr3DiN9mc6ljzJ
rsNsemMH0IoH8KbBHKGtMLnBVO6HbnrtC6iKFfocNBisvo1PKKzn9s2O1pdjmsCs
jHwz5njoM7Ki/ygkJhbKiSDMXPs67eggwoGIGzpNMoY4RWxrcQzYE9yKfKzRNxET
Qb3xUwDWDyL8ErwHtL3xMxGwyfkhb+SZdMd5aKYA5Rdbf+TN8P6iAv05nrnfpkyk
lPTv5o9EQYvr8/Tc9FZW
=ULmb
-----END PGP SIGNATURE-----
Merge tag 'edac_for_3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC updates from Borislav Petkov:
"EDAC updates all over the place:
- Enablement for AMD F15h models 0x60 CPUs. Most notably DDR4 RAM
support. Out of tree stuff is adding the required PCI IDs. From
Aravind Gopalakrishnan.
- Enable amd64_edac for 32-bit due to popular demand. From Tomasz
Pala.
- Convert the AMD MCE injection module to debugfs, where it belongs.
- Misc EDAC cleanups"
* tag 'edac_for_3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, MCE, AMD: Correct formatting of decoded text
EDAC, mce_amd_inj: Add an injector function
EDAC, mce_amd_inj: Add hw-injection attributes
EDAC, mce_amd_inj: Enable direct writes to MCE MSRs
EDAC, mce_amd_inj: Convert mce_amd_inj module to debugfs
EDAC: Delete unnecessary check before calling pci_dev_put()
EDAC, pci_sysfs: remove unneccessary ifdef around entire file
ghes_edac: Use snprintf() to silence a static checker warning
amd64_edac: Build module on x86-32
EDAC, MCE, AMD: Add decoding table for MC6 xec
amd64_edac: Add F15h M60h support
{mv64x60,ppc4xx}_edac,: Remove deprecated IRQF_DISABLED
EDAC: Sync memory types and names
EDAC: Add DDR3 LRDIMM entries to edac_mem_types
x86, amd_nb: Add device IDs to NB tables for F15h M60h
pci_ids: Add PCI device IDs for F15h M60h
Code will always think there are 16 banks because of a typo
Reported-by: Misha
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Broadwell-DE is the microserver version of next generation Xeon
processors. A whole bunch of new PCIe device ids, but otherwise
pretty much the same as Haswell.
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Haswell moved the TOLM/TOHM registers to a different device and offset.
The sb_edac driver accounted for the change of device, but not for the
new offset. There was also a typo in the constant to fill in the low
26 bits (was 0x1ffffff, should be 0x3ffffff).
This resulted in a bogus value for the top of low memory:
EDAC DEBUG: get_memory_layout: TOLM: 0.032 GB (0x0000000001ffffff)
which would result in EDAC refusing to translate addresses for
errors above the bogus value and below 4GB:
sbridge MC3: HANDLING MCE MEMORY ERROR
sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
sbridge MC3: TSC 0
sbridge MC3: ADDR 2000000
sbridge MC3: MISC 523eac86
sbridge MC3: PROCESSOR 0:306f3 TIME 1414600951 SOCKET 0 APIC 0
MC3: 1 CE Error at TOLM area, on addr 0x02000000 on any memory ( page:0x0 offset:0x0 grain:32 syndrome:0x0)
With the fix we see the correct TOLM value:
DEBUG: get_memory_layout: TOLM: 2.048 GB (0x000000007fffffff)
and we decode address 2000000 correctly:
sbridge MC3: HANDLING MCE MEMORY ERROR
sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
sbridge MC3: TSC 0
sbridge MC3: ADDR 2000000
sbridge MC3: MISC 523e1086
sbridge MC3: PROCESSOR 0:306f3 TIME 1414601319 SOCKET 0 APIC 0
DEBUG: get_memory_error_data: SAD interleave package: 0 = CPU socket 0, HA 0, shiftup: 0
DEBUG: get_memory_error_data: TAD#0: address 0x0000000002000000 < 0x000000007fffffff, socket interleave 1, channel interleave 4 (offset 0x00000000), index 0, base ch: 0, ch mask: 0x01
DEBUG: get_memory_error_data: RIR#0, limit: 4.095 GB (0x00000000ffffffff), way: 1
DEBUG: get_memory_error_data: RIR#0: channel address 0x00200000 < 0xffffffff, RIR interleave 0, index 0
DEBUG: sbridge_mce_output_error: area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0
MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x2000 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Write out MCx_ADDR into the more humanly readable "MCx Error Address"
and remove double colon in the output.
Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Selectively inject either a real MCE or a sw-only version which
exercises the decoding path only. The hardware-injected MCE triggers a
machine check exception (#MC) so that the MCE handler can be bothered to
do something too.
Signed-off-by: Borislav Petkov <bp@suse.de>
Until now, the mce_severity mechanism can only identify the severity
of UCNA error as MCE_KEEP_SEVERITY. Meanwhile, it is not able to filter
out DEFERRED error for AMD platform.
This patch extends the mce_severity mechanism for handling
UCNA/DEFERRED error. In order to do this, the patch introduces a new
severity level - MCE_UCNA/DEFERRED_SEVERITY.
In addition, mce_severity is specific to machine check exception,
and it will check MCIP/EIPV/RIPV bits. In order to use mce_severity
mechanism in non-exception context, the patch also introduces a new
argument (is_excp) for mce_severity. `is_excp' is used to explicitly
specify the calling context of mce_severity.
Reviewed-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Chen Yucong <slaoub@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The pci_dev_put() function tests whether its argument is NULL and then
returns immediately. Thus the test before the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: http://lkml.kernel.org/r/546CB20D.4070808@users.sourceforge.net
[ Boris: commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
The file edac_pci_sysfs.c is dependent on CONFIG_PCI. This is already
modelled in the Makefile, but edac_pci_sysfs.o is still contained in
the list of files compiled even without CONFIG_PCI.
This change removes edac_pci_sysfs.o from the list of built objects
when not having CONFIG_PCI enabled and removes the then-unnecessary
ifdef from the source file.
Signed-off-by: Andreas Ruprecht <rupran@einserver.de>
Link: http://lkml.kernel.org/r/1407697803-3837-1-git-send-email-rupran@einserver.de
Signed-off-by: Borislav Petkov <bp@suse.de>
My static checker complains because the "e->location" has up to 256
characters but we are copying it into the "pvt->detail_location" which
only has space for 240 characters. That's not counting the surrounding
text and the "e->other_detail" string which can be over 80 characters
long.
I am not familiar with this code but presumably it normally works.
Let's add a limit though for safety.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/20140801082514.GD28869@mwanda
Signed-off-by: Borislav Petkov <bp@suse.de>
By popular demand, enable amd64_edac on 32-bit too.
Boris:
- update Kconfig text.
- add a warning on load which states that 32-bit configurations are unsupported.
Signed-off-by: Tomasz Pala <gotar@polanet.pl>
Link: http://lkml.kernel.org/r/20141102102212.GA7034@polanet.pl
Signed-off-by: Borislav Petkov <bp@suse.de>
This patch adds support for ECC error decoding for F15h M60h processor.
Aside from the usual changes, the patch adds support for some new features
in the processor:
- DDR4(unbuffered, registered); LRDIMM DDR3 support
- relevant debug messages have been modified/added to report these
memory types
- new dbam_to_cs mappers
- if (F15h M60h && LRDIMM); we need a 'multiplier' value to find
cs_size. This multiplier value is obtained from the per-dimm
DCSM register. So, change the interface to accept a 'cs_mask_nr'
value to facilitate this calculation
- switch-casing determine_memory_type()
- done to cleanse the function of too many if-else statements
and improve readability
- This is now called early in read_mc_regs() to cache dram_type
Misc cleanup:
- amd64_pci_table[] is condensed by using PCI_VDEVICE macro.
Testing details:
Tested the patch by injecting 'ECC' type errors using mce_amd_inj
and error decoding works fine.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1414617483-4941-1-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Boris: determine_memory_type() cleanups ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Make keeping the sync between the mem_types enum and the actual string
names simpler by using designated initializers.
Signed-off-by: Borislav Petkov <bp@suse.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUNsUKAAoJEAhfPr2O5OEVXSMQAJ953vtyqmMEi02NMf24NpA+
OuTmoe5c8RT7/+YD2edHzG7VgYIE1L4evVOm71XLoPwI3mmGwuAIYZAFvVXsYPtB
U7xfWHpC5r29DlTQen7L0doD1NnRTQfOWFUtsHnd5ygdrzmIHToXqqN1IjoQudpK
i8ttU5zshSMt1IPwFh+CXMHkp8wA11hGX4LyonmJ0WD/bCeu8kreilRrK/ehZtAU
sBjzEif2+xtqAWBaxyZ0IzWBJdYBjo6u68jyifK0liM8oBZ8vov11i6cCZBuGGAy
eNG6lNBmak77U187yUeeyqjbdhnPy7NPLEvvfDN/C5voGHqPkrKEjH64j54ayeaR
TQ6u6VlPJ+3RXeqtRfYbMOgHAsxNMpJZLkKx0NNty073RQc2qgX8Xxs4t33+9B37
TfkoL8fnkh7Mq9x8czRw4n5X4qiw2ZeDDNX4TZPdGP2QQwP3JVDfMOzyr6x+BhMp
4TwCp+Sr9PeohyGYZ8InjRdxkA3yTrc1CbqSC00lXBe0lt9Pw4aQ9HGFQWSXxYH3
uZ8PoqBpxy3C5xvcwDQ8kjmu6y0GPtsRHAdb0G3HW3bjMiLuQbVohTxILEvg9HWr
2tSyKSfUZXJaTckXfWOVelyexkh3IHvQ0AoQFtG6qXQh2HF12S1OV7EoGxTua2Ye
/XDAjKzdi12tRLlY172L
=FRmr
-----END PGP SIGNATURE-----
Merge tag 'edac/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull edac updates from Mauro Carvalho Chehab:
"Nothing really exiting here: just one bug fix at sb_edac, and some
changes to allow other drivers to use some shared PCI addresses"
* tag 'edac/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
sb_edac: Claim a different PCI device
Move Intel SNB device ids from sb_edac to pci_ids.h
sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel
These are changes for drivers that are intimately tied to some SoC
and for some reason could not get merged through the respective
subsystem maintainer tree.
Most of the new code is for the Keystone Navigator driver, which is
new base support that is going to be needed for their hardware
accelerated network driver and other units.
Most of the commits are for moving old code around from at91 and omap
for things that are done in device drivers nowadays.
- at91: move reset, poweroff, memory and clocksource code into drivers
directories
- socfpga: add edac driver (through arm-soc, as requested by Boris)
- omap: move omap-intc code to drivers/irqchip
- sunxi: added an RTC driver for sun6i
- omap: mailbox driver related changes
- keystone: support for the "Navigator" component
- versatile: new reboot, led and soc drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAVDWWQGCrR//JCVInAQKX7Q//bDkoseKCZsGaXN7vfQ2YhT3SAc52mROV
YQKdNmtMUrHqDgngATZTx5ogOh1hInnqueFjGGhfMYsHQO1Vj8+odj0r+4jhjuUY
3YfY+qZ+91tq33JlUOhKn+mfVMdxJc8XarGgR6MSWYkqWVYCtLtBluum7hKm2UJ6
/e4hd2zzImX5ATwj/LXWLx5eTf1qAVFGWzNUph1DrW+1V5lOu58X4gKwk1QOCVEh
Pa0GV9oRTkjoswwz9drzjeFtie2yofQ2mygj6QKxg5NsosIF0+B8kJ61Sxwg56Ak
tF+qn1hGtB2cDQkpxK4o2cZgCELhkh5Aqgol/vZUS1DMBSUEGCV9PPp2eOW83r3B
0zsTgsShyVcTh7khdpQmHNRigvcc7e69LaAGC4o/RxaZpCU/LUNCQ+/iqVExSE8A
VNEXr+JNxGxhj3m9KUHuEktdWx1oNvaYR8Rr4RPr6EWR8R6emJ04I7kXInvzhJZL
HOGh75vSuAU83FrsP8fFRLadoHNVDXylAs38BPfGEMngVpjvwJLgQ3+729CwW+Q4
+xQXAKSwKfr8xA8eg6wBSbFcwnEW4QwRqFqQ5XPw7zTZkCZbiLtvn3JpI5bH5A5Q
/d2D+M2vFbB7VbWJBM4etO95eNS/pfhqJhcQh4t0DjXjoW6WqLiHCxhEx8Ogfvop
/4ckyGvtEOI=
=POJD
-----END PGP SIGNATURE-----
Merge tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Arnd Bergmann:
"These are changes for drivers that are intimately tied to some SoC and
for some reason could not get merged through the respective subsystem
maintainer tree.
Most of the new code is for the Keystone Navigator driver, which is
new base support that is going to be needed for their hardware
accelerated network driver and other units.
Most of the commits are for moving old code around from at91 and omap
for things that are done in device drivers nowadays.
- at91: move reset, poweroff, memory and clocksource code into
drivers directories
- socfpga: add edac driver (through arm-soc, as requested by Boris)
- omap: move omap-intc code to drivers/irqchip
- sunxi: added an RTC driver for sun6i
- omap: mailbox driver related changes
- keystone: support for the "Navigator" component
- versatile: new reboot, led and soc drivers"
* tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (92 commits)
bus: arm-ccn: Fix spurious warning message
leds: add device tree bindings for register bit LEDs
soc: add driver for the ARM RealView
power: reset: driver for the Versatile syscon reboot
leds: add a driver for syscon-based LEDs
drivers/soc: ti: fix build break with modules
MAINTAINERS: Add Keystone Multicore Navigator drivers entry
soc: ti: add Keystone Navigator DMA support
Documentation: dt: soc: add Keystone Navigator DMA bindings
soc: ti: add Keystone Navigator QMSS driver
Documentation: dt: soc: add Keystone Navigator QMSS bindings
rtc: sunxi: Depend on platforms sun4i/sun7i that actually have the rtc
rtc: sun6i: Add sun6i RTC driver
irqchip: omap-intc: remove unnecessary comments
irqchip: omap-intc: correct maximum number or MIR registers
irqchip: omap-intc: enable TURBO idle mode
irqchip: omap-intc: enable IP protection
irqchip: omap-intc: remove unnecesary of_address_to_resource() call
irqchip: omap-intc: comment style cleanup
irqchip: omap-intc: minor improvement to omap_irq_pending()
...
sb_edac controls a large number of different PCI functions. Rather
than registering as a normal PCI driver for all of them, it
registers for just one so that it gets probed and, at probe time, it
looks for all the others.
Coincidentally, the device it registers for also contains the SMBUS
registers, so the PCI core will refuse to probe both sb_edac and a
future iMC SMBUS driver. The drivers don't actually conflict, so
just change sb_edac's device table to probe a different device.
An alternative fix would be to merge the two drivers, but sb_edac
will also refuse to load on non-ECC systems, whereas i2c_imc would
still be useful without ECC.
The only user-visible change should be that sb_edac appears to bind
a different device.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Rui Wang <ruiv.wang@gmail.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
The i2c_imc driver will use two of them, and moving only part of
the list seems messier.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Intel IA32 SDM Table 15-14 defines channel 0xf as 'not specified', but
EDAC doesn't know about this and returns and INTERNAL ERROR when the
channel is greater than NUM_CHANNELS:
kernel: [ 1538.886456] CPU 0: Machine Check Exception: 0 Bank 1: 940000000000009f
kernel: [ 1538.886669] TSC 2bc68b22e7e812 ADDR 46dae7000 MISC 0 PROCESSOR 0:306e4 TIME 1390414572 SOCKET 0 APIC 0
kernel: [ 1538.971948] EDAC MC1: INTERNAL ERROR: channel value is out of range (15 >= 4)
kernel: [ 1538.972203] EDAC MC1: 0 CE memory read error on unknown memory (slot:0 page:0x46dae7 offset:0x0 grain:0 syndrome:0x0 - area:DRAM err_code:0000:009f socket:1 channel_mask:1 rank:0)
This commit changes sb_edac to forward a channel of -1 to EDAC if the
channel is not specified. edac_mc_handle_error() sets the channel to -1
internally after the error message anyway, so this commit should have no
effect other than avoiding the INTERNAL ERROR message when the channel
is not specified.
Signed-off-by: Seth Jennings <sjenning@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Rationale behind this change:
- F2x1xx addresses were stopped from being mapped explicitly to DCT1
from F15h (OR) onwards. They use _dct[0:1] mechanism to access the
registers. So we should move away from using address ranges to select
DCT for these families.
- On newer processors, the address ranges used to indicate DCT1 (0x140,
0x1a0) have different meanings than what is assumed currently.
Changes introduced:
- amd64_read_dct_pci_cfg() now takes in dct value and uses it for
'selecting the dct'
- Update usage of the function. Keep in mind that different families
have specific handling requirements
- Remove [k8|f10]_read_dct_pci_cfg() as they don't do much different
from amd64_read_pci_cfg()
- Move the k8 specific check to amd64_read_pci_cfg
- Remove f15_read_dct_pci_cfg() and move logic to amd64_read_dct_pci_cfg()
- Remove now needless .read_dct_pci_cfg
Testing:
- Tested on Fam 10h; Fam15h Models: 00h, 30h; Fam16h using 'EDAC_DEBUG'
and mce_amd_inj
- driver obtains info from F2x registers and caches it in pvt
structures correctly
- ECC decoding works fine
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1410799058-3149-1-git-send-email-aravind.gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Fix the following error
drivers/edac/ppc4xx_edac.c:977:45: error: request for member 'dimm' in something
not a structure or union
by changing member access to pointer dereference.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Link: http://lkml.kernel.org/r/1408482646-22541-1-git-send-email-bobby.prani@gmail.com
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
This patch adds support for the CycloneV and ArriaV SDRAM controllers.
Correction and reporting of SBEs, Panic on DBEs.
There was a discussion thread on whether this driver should be an mfd driver
or just make use of syscon, which is already a mfd. Ultimately, the
decision to use a simple syscon interface was reached.[1]
[1] https://lkml.org/lkml/2014/7/30/514
[dinguyen] Fixed Kconfig to have EDAC_ALTERA_MC as a tristate to prevent a
build failure for allmodconfig.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Acked-by: Borislav Petkov <bp@suse.de>
[dinguyen] cleaned up commit message
Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Pull EDAC updates from Mauro Carvalho Chehab.
* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
sb_edac: add support for Haswell based systems
sb_edac: Fix mix tab/spaces alignments
edac: add DDR4 and RDDR4
sb_edac: remove bogus assumption on mc ordering
sb_edac: make minimal use of channel_mask
sb_edac: fix socket detection on Ivy Bridge controllers
sb_edac: update Kconfig description
sb_edac: search devices using product id
sb_edac: make RIR limit retrieval per model
sb_edac: make node id retrieval per model
sb_edac: make memory type detection per memory controller
window:
Group changes to the device tree. In preparation for adding device tree
overlay support, OF_DYNAMIC is reworked so that a set of device tree
changes can be prepared and applied to the tree all at once. OF_RECONFIG
notifiers see the most significant change here so that users always get
a consistent view of the tree. Notifiers generation is moved from before
a change to after it, and notifiers for a group of changes are emitted
after the entire block of changes have been applied
Automatic console selection from DT. Console drivers can now use
of_console_check() to see if the device node is specified as a console
device. If so then it gets added as a preferred console. UART devices
get this support automatically when uart_add_one_port() is called.
DT unit tests no longer depend on pre-loaded data in the device tree.
Data is loaded dynamically at the start of unit tests, and then unloaded
again when the tests have completed.
Also contains a few bugfixes for reserved regions and early memory setup.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT6M7XAAoJEMWQL496c2LNTTgP/2rXyrTTGZpK/qrLKHWKYHvr
XL7tcTkhA0OLU64E37fB+xtDXyBYnLsharuwUFSd1LlL1Wnc1cZN5ORRlMJbmjUR
Wvwl0A8/mkhGl4tzzKgJ4z4rMJXvlZfJRpnVoRB5FOn90LI7k/jsf5rIwF/6S90B
6D6II0r4RG9ku1m7g70cToxcIFCzp0V+eu2tym9GnhsyGKlunPT9iNiTpwfVhPAj
QUvMPKIQXReOv6xDU5q6E07839IMf7SdAvciBTHGnCDD7sGziHvnBIShj/2vTDAF
27sGRKrWUnnuxRUMOoIudiYyeHXIdt1WXp6FsS/ztVI37Ijh9YPShJwWGhQDppnp
4tcoSdefqw9IRUajAVWsB0RUW/tCL4a7tggWofylOA6itDi+HZnw6YafE1G1YzxF
q8OFo9uqLcmFQfHDJpk+sdtXoMZzOgrxlEscsGsQ8kd2Uoe8+chgR9EY1sqPkWVF
Zw0FJCTB6spBlsnXeboBGrnvpbPkacwhvesIFO0IANy4j4R55xlEeTcs1fe3ubUf
UhNyyFdnCAUA7e5CAabcAQYsdbEKG6v0Il3H6xts36c8lXCSFXVgNcw5mdCpFCcQ
DZ3/1FGSVzkYNXX8hlzIn1W0dqvwn4x0ZbnpXUJrGUBpSmjt9qkXx0XbdeFwartU
7X4tNGS1jfNYhOdP87jF
=8IFz
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree updates from Grant Likely:
"The branch contains the following device tree changes the v3.17 merge
window:
Group changes to the device tree. In preparation for adding device
tree overlay support, OF_DYNAMIC is reworked so that a set of device
tree changes can be prepared and applied to the tree all at once.
OF_RECONFIG notifiers see the most significant change here so that
users always get a consistent view of the tree. Notifiers generation
is moved from before a change to after it, and notifiers for a group
of changes are emitted after the entire block of changes have been
applied
Automatic console selection from DT. Console drivers can now use
of_console_check() to see if the device node is specified as a console
device. If so then it gets added as a preferred console. UART
devices get this support automatically when uart_add_one_port() is
called.
DT unit tests no longer depend on pre-loaded data in the device tree.
Data is loaded dynamically at the start of unit tests, and then
unloaded again when the tests have completed.
Also contains a few bugfixes for reserved regions and early memory
setup"
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux: (21 commits)
of: Fixing OF Selftest build error
drivers: of: add automated assignment of reserved regions to client devices
of: Use proper types for checking memory overflow
of: typo fix in __of_prop_dup()
Adding selftest testdata dynamically into live tree
of: Add todo tasklist for Devicetree
of: Transactional DT support.
of: Reorder device tree changes and notifiers
of: Move dynamic node fixups out of powerpc and into common code
of: Make sure attached nodes don't carry along extra children
of: Make devicetree sysfs update functions consistent.
of: Create unlocked versions of node and property add/remove functions
OF: Utility helper functions for dynamic nodes
of: Move CONFIG_OF_DYNAMIC code into a separate file
of: rename of_aliases_mutex to just of_mutex
of/platform: Fix of_platform_device_destroy iteration of devices
of: Migrate of_find_node_by_name() users to for_each_node_by_name()
tty: Update hypervisor tty drivers to use core stdout parsing code.
arm/versatile: Add the uart as the stdout device.
of: Enable console on serial ports specified by /chosen/stdout-path
...
Pull RAS updates from Ingo Molnar:
"The main changes in this cycle are:
- RAS tracing/events infrastructure, by Gong Chen.
- Various generalizations of the APEI code to make it available to
non-x86 architectures, by Tomasz Nowicki"
* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ras: Fix build warnings in <linux/aer.h>
acpi, apei, ghes: Factor out ioremap virtual memory for IRQ and NMI context.
acpi, apei, ghes: Make NMI error notification to be GHES architecture extension.
apei, mce: Factor out APEI architecture specific MCE calls.
RAS, extlog: Adjust init flow
trace, eMCA: Add a knob to adjust where to save event log
trace, RAS: Add eMCA trace event interface
RAS, debugfs: Add debugfs interface for RAS subsystem
CPER: Adjust code flow of some functions
x86, MCE: Robustify mcheck_init_device
trace, AER: Move trace into unified interface
trace, RAS: Add basic RAS trace event
x86, MCE: Kill CPU_POST_DEAD
Haswell memory controllers are very similar to Ivy Bridge and Sandy Bridge
ones. This patch adds support to Haswell based systems.
[m.chehab@samsung.com: Fix CodingStyle issues]
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Haswell memory controller can make use of DDR4 and Registered DDR4
Cc: tony.luck@intel.com
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
When a MC is handled, the correct sbridge_dev is searched based on the node,
checking again later with the assumption the first memory controller found is
the first socket's memory controller is a bogus assumption. Get rid of it.
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
channel_mask will be used in the future to determine which group of memory
modules is causing the errors since when mirroring, lockstep and close page
are enabled you can't. While that doesn't happen, use the channel_mask to
determine the channel instead of relying on the MC event/exception.
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
This patch fixes the obvious bug while handling the socket/HA bitmask used in
Ivy Bridge memory controllers.
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
This patch changes the way devices are searched by using product id instead of
device/function numbers. Tested in a Sandy Bridge and a Ivy Bridge machine to
make sure everything works properly.
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Haswell has a different way to retrieve RIR limits, make this procedure per
model.
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Haswell has a different way to retrieve the node id, make so this procedure
can be reimplemented.
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Haswell has different register, offset to determine memory type and supports
DDR4 in some models. This patch makes it easier to have a different method
depending on the memory controller type.
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
There are a bunch of users open coding the for_each_node_by_name() by
calling of_find_node_by_name() directly instead of using the macro. This
is getting in the way of some cleanups, and the possibility of removing
of_find_node_by_name() entirely. Clean it up so that all the users are
consistent.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Takashi Iwai <tiwai@suse.de>
To avoid confuision and conflict of usage for RAS related trace event,
add an unified RAS trace event stub.
Start a RAS subsystem menu which will be fleshed out in time, when more
features get added to it.
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1402475691-30045-2-git-send-email-gong.chen@linux.intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Assign PCI resources before pci_bus_add_device(). The resources must be
assigned before a driver can claim the device.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>