OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Borislav Petkov	73ba85937b	amd64_edac: Add a fix for Erratum 505 When accessing the scrub rate control register (F3x58) on F15h, the DRAM controller selector (F1x10C[DctCfgSel]) has to point to DCT0 so that the scrub rate configuration can take effect. See Erratum 505 in the AMD F15h revision guide for more details. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-10-06 12:34:05 +02:00
Borislav Petkov	b0b07a2bd4	EDAC, MCE, AMD: Simplify NB MCE decoder interface Drop third nbcfg argument which is old remains and not required anymore. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-10-06 12:34:04 +02:00
Borislav Petkov	c1ae68309b	amd64_edac: Erratum #637 workaround F15h CPUs may report a non-DRAM address when reporting an error address belonging to a CC6 state save area. Add a workaround to detect this condition and compute the actual DRAM address of the error as documented in the Revision Guide for AMD Family 15h Models 00h-0Fh Processors. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-04-26 16:18:56 +02:00
Borislav Petkov	f08e457cec	amd64_edac: Factor in CC6 save area F15h and later use a portion of DRAM as a CC6 storage area. BIOS programs D18F1x[17C:140,7C:40] DRAM Base/Limit accordingly by subtracting the storage area from the DRAM limit setting. However, in order for edac to consider that part of DRAM too, we need to include it into the per-node range. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-04-26 16:18:44 +02:00
Borislav Petkov	f030ddfb37	amd64_edac: Remove node interleave warning This warning was wrongfully added for a normal condition - intlvsel actually selects the destination node when node interleaving is enabled and it is not a mismatch. For a detailed example, see section 2.8.10.2 "Node Interleaving" in F10h BKDG. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-04-26 16:18:12 +02:00
Markus Trippelsdorf	4949603a6f	EDAC: Remove debugging output in scrub rate handling This patch removes superfluous debugging output in the sysfs scrub rate handler. It also consolidates the error handling in the scrub rate accessors. Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-04-21 12:44:58 +02:00
Borislav Petkov	a9f0fbe2bb	amd64_edac: Fix potential memleak We check the pointers together but at least one of them could be invalid due to failed allocation. Since we cannot continue if either of the two allocations has failed, exit early by freeing them both. Cc: <stable@kernel.org> # 38.x Reported-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-29 18:19:06 +02:00
Borislav Petkov	d34a6ecd45	amd64_edac: Fix decode_syndrome types Those should all be unsigned. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:40 +01:00
Borislav Petkov	8c6717510f	amd64_edac: Fix DCT argument type Fix amd64_debug_display_dimm_sizes() arguments order per convention (pvt is always first). Also, the now second arg denotes the DCT so adjust its type. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:34 +01:00
Borislav Petkov	e761359a25	amd64_edac: Fix ranges signedness The dram ranges make sense only as an unsigned type. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:33 +01:00
Borislav Petkov	972ea17ab9	amd64_edac: Drop local variable Use the macro directly instead Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:32 +01:00
Borislav Petkov	71d2a32e8e	amd64_edac: Fix PCI config addressing types Adjust argument types to the PCI config API's types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:31 +01:00
Borislav Petkov	151fa71c58	amd64_edac: Fix DRAM base macros Return unsigned u8 values only. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:30 +01:00
Borislav Petkov	b487c33e55	amd64_edac: Fix node id signedness A node id can never be negative since we use it as an index into the DRAM ranges array. This also makes one of the BUG_ON conditions redundant. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:28 +01:00
Borislav Petkov	df71a05324	amd64_edac: Enable driver on F15h Add the PCI device ids required for driver registration. Remove pvt->ctl_name and use the family descriptor directly, instead. Then, bump driver version and fixup its format. Finally, enable DRAM ECC decoding on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:26 +01:00
Borislav Petkov	a3b7db09a6	amd64_edac: Adjust ECC symbol size to F15h F15h has the same ECC symbol size options as F10h revD and later so adjust checks to that. Simplify code a bit. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:26 +01:00
Borislav Petkov	87b3e0e6e4	amd64_edac: Simplify scrubrate setting Drop per-instance variable and compute min scrubrate dynamically. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:25 +01:00
Borislav Petkov	41d8bfaba7	amd64_edac: Improve DRAM address mapping Drop static tables which map the bits in F2x80 to a chip select size in favor of functions doing the mapping with some bit fiddling. Also, add F15 support. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:24 +01:00
Borislav Petkov	5a5d237169	amd64_edac: Sanitize ->read_dram_ctl_register This function is relevant for F10h and higher, and it has only one callsite so drop its function pointer from the low_ops struct. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:23 +01:00
Borislav Petkov	b15f0fcab1	amd64_edac: Adjust sys_addr to chip select conversion routine to F15h F15h sys_addr to chip select mapping is almost identical to F10h's so reuse that. Rename functions on that path accordingly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:22 +01:00
Borislav Petkov	355fba6005	amd64_edac: Beef up early exit reporting Add paranoid checks for the sys address before going off and decoding it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:22 +01:00
Borislav Petkov	614ec9d853	amd64_edac: Revamp online spare handling Replace per-DCT macros with smarter ones, drop hack and look for the spare rank on all chip selects on a channel. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:21 +01:00
Borislav Petkov	5d4b58e84a	amd64_edac: Fix channel interleave removal Remove the channel interleave select bit properly. See F2x110[DctSelIntLvAddr] for details. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:21 +01:00
Borislav Petkov	e2f79dbdfb	amd64_edac: Correct node interleaving removal When node interleaving is enabled, a subset of the addr[14:12] bits has to be removed in order to get the normalized DCT address of the DRAM channel. The actual number of bits to remove is determined by F1x[1, 0][7C:40][IntlvEn]. Do this correctly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:20 +01:00
Borislav Petkov	95b0ef55cd	amd64_edac: Add support for interleaved region swapping On revC3 and revE Fam10h machines and later, non-interleaved graphics framebuffer memory under the 16G mark can be swapped with a region located at the bottom of memory so that the GPU can use the interleaved region and thus two channels. Add support for that. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:20 +01:00
Borislav Petkov	700466249f	amd64_edac: Unify get_error_address The address bits from MC4_STATUS differ only between K8 and the rest so no need for a per-family method. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	f192c7b16c	amd64_edac: Simplify decoding path Use the struct mce directly instead of copying from it into a custom struct err_regs. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	7d20d14da1	amd64_edac: Adjust channel counting to F15h The only difference is that F10h used to sport ganged DCTs and F15h doesn't so adjust the F10h routine and reuse it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	5980bb9cd8	amd64_edac: Cleanup old defines cruft Remove unused defines, drop family names from define names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:18 +01:00
Borislav Petkov	bcd781f46a	amd64_edac: Cleanup NBSH cruft Remove reporting of errors with UC bit set - this is done by the MCE decoding code anyway and this driver deals with DRAM ECC errors only. UC (NB uncorrectable error) doesn't necessarily mean it is a DRAM error. Remove unused macros while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:17 +01:00
Borislav Petkov	a97fa68ec4	amd64_edac: Cleanup NBCFG handling The fact whether we are chipkill capable or not does not have any bearing when computing the channel index on a ganged DCT configuration so remove that. Also, simplify debug statements. Finally, remove old error injection leftovers, while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:16 +01:00
Borislav Petkov	c9f4f26eae	amd64_edac: Cleanup NBCTL code Remove family names from macro names, drop single bit defines and comment their meaning instead. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:15 +01:00
Borislav Petkov	78da121e15	amd64_edac: Cleanup DCT Select Low/High code Shorten macro names, remove family name from macros, fix macro arguments, shorten debug strings. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:15 +01:00
Borislav Petkov	cb32850744	amd64_edac: Cleanup Dram Configuration registers handling * Restrict DCT ganged mode check since only Fam10h supports it * Adjust DRAM type detection for BD since it only supports DDR3 * Remove second and thus unneeded DCLR read in k8_early_channel_count() - we do that in read_mc_regs() * Cleanup comments and remove family names from register macros * Remove unused defines There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:14 +01:00
Borislav Petkov	525a1b20a6	amd64_edac: Cleanup DBAM handling Do not read DBAM regs twice and simplify code around them. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:14 +01:00
Borislav Petkov	f678b8ccce	amd64_edac: Replace huge bitmasks with a macro Replace hard to read hex constants with a continuous masks macro. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:14 +01:00
Borislav Petkov	c8e518d567	amd64_edac: Sanitize f10_get_base_addr_offset This function maps the system address to the normalized DCT address. Document what the code does for more clarity and wrap insane bitmasks in a more understandable macro which generates them. Also, reduce number of arguments passed to the function. Finally, rename this function to what it actually does. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:13 +01:00
Borislav Petkov	229a7a11ac	amd64_edac: Sanitize channel extraction Cleanup and simplify f10_determine_channel(); make it more readable. Also drop f10_map_intlv_en_to_shift() in favor of simply counting the bits in F1x124[DramIntlvEn] which is equivalent. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:13 +01:00
Borislav Petkov	11c75eadaf	amd64_edac: Cleanup chipselect handling Add a struct representing the DRAM chip select base/limit register pairs. Concentrate all CS handling in a single function. Also, add CS looping macros for cleaner, more readable code. While at it, adjust code to F15h. Finally, do smaller macro names cleanups (remove family names from register macros) and debug messages clarification. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:12 +01:00
Borislav Petkov	bc21fa5787	amd64_edac: Cleanup DHAR handling Adjust to F15h, simplify code, fixup macros. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:12 +01:00
Borislav Petkov	7f19bf755c	amd64_edac: Remove DRAM base/limit subfields caching Add a struct representing the DRAM base/limit range pairs and remove all cached subfields. Replace them with accessor functions, which actually saves us some space: text data bss dec hex filename 14712 1577 336 16625 40f1 drivers/edac/amd64_edac_mod.o.after 14831 1609 336 16776 4188 drivers/edac/amd64_edac_mod.o.before Also, it simplifies the code a lot allowing to merge the K8 and F10h routines. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:11 +01:00
Borislav Petkov	b2b0c60543	amd64_edac: Add support for F15h DCT PCI config accesses F15h "multiplexes" between the configuration space of the two DRAM controllers by toggling D18F1x10C[DctCfgSel] while F10h has a different set of registers for DCT0, and DCT1 in extended PCI config space. Add DCT configuration space accessors per family thus wrapping all the different access prerequisites. Clean up code while at it, shorten names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:11 +01:00
Borislav Petkov	4d7963648f	amd64_edac: Fix DIMMs per DCTs output amd64_debug_display_dimm_sizes() reports the distribution of the DIMMs on each DRAM controller and its chip select sizes. Thus, the last don't have anything to do with whether we're running in ganged DCT mode or not - their sizes don't change all of a sudden. Fix that by removing the ganged-check and dump DCT0's config for DCT1 when in ganged mode since they're identical. Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-02-10 14:41:49 +01:00
Linus Torvalds	128283a47e	Merge branch 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, MCE: Fix NB error formatting EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit EDAC, MCE: Enable MCE decoding on F15h EDAC, MCE: Allow F15h bank 6 MCE injection EDAC, MCE: Shorten error report formatting EDAC, MCE: Overhaul error fields extraction macros EDAC, MCE: Add F15h FP MCE decoder EDAC, MCE: Add F15 EX MCE decoder EDAC, MCE: Add an F15h NB MCE decoder EDAC, MCE: No F15h LS MCE decoder EDAC, MCE: Add F15h CU MCE decoder EDAC, MCE: Add F15h IC MCE decoder EDAC, MCE: Add F15h DC MCE decoder EDAC, MCE: Select extended error code mask	2011-01-07 14:54:03 -08:00
Borislav Petkov	6245288232	EDAC, MCE: Overhaul error fields extraction macros Make macro names shorter thus making code shorter and more clear. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:21 +01:00
Borislav Petkov	a135cef79a	amd64_edac: Disable DRAM ECC injection on K8 K8 does not allow for an atomic RMW to a cacheline as F10h does so disable the error injection interface for it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:38:46 +01:00
Borislav Petkov	390944439f	EDAC: Fixup scrubrate manipulation Make the ->{get\|set}_sdram_scrub_rate return the actual scrub rate bandwidth it succeeded setting and remove superfluous arg pointer used for that. A negative value returned still means that an error occurred while setting the scrubrate. Document this for future reference. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:38:31 +01:00
Borislav Petkov	360b7f3c60	amd64_edac: Remove two-stage initialization Now that all prerequisites are in place, drop the two-stage driver instances initialization in favor of the following simple init sequence: 1. Probe PCI device: we only test ECC capabilities here and if none exit early. 2. If the hw supports ECC and it is/can be enabled, we init the per-node instance. Remove "amd64_" prefix from static functions touched, while at it. There actually should be no visible functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:03 +01:00
Borislav Petkov	2299ef7114	amd64_edac: Check ECC capabilities initially Rework the code to check the hardware ECC capabilities at PCI probing time. We do all further initialization only if we actually can/have ECC enabled. While at it: 0. Fix function naming. 1. Simplify/clarify debug output. 2. Remove amd64_ prefix from the static functions 3. Reorganize code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:02 +01:00
Borislav Petkov	ae7bb7c679	amd64_edac: Carve out ECC-related hw settings This is in preparation for the init path reorganization where we want only to 1) test whether a particular node supports ECC 2) can it be enabled and only then do the necessary allocation/initialization. For that, we need to decouple the ECC settings of the node from the instance's descriptor. The should be no functional change introduced by this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:00 +01:00
Borislav Petkov	f1db274e1b	amd64_edac: Remove PCI ECS enabling functions PCI ECS is being enabled by default since 2.6.26 on AMD so this code is just superfluous now, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:59 +01:00
Borislav Petkov	cc4d8860fc	amd64_edac: Allocate driver instances dynamically Remove static allocation in favor of dynamically allocating space for as many driver instances as northbridges present on the system. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:57 +01:00
Borislav Petkov	24f9a7fe3f	amd64_edac: Rework printk macros Add a macro per printk level, shorten up error messages. Add relevant information to KERN_INFO level. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:56 +01:00
Borislav Petkov	8d5b5d9c7b	amd64_edac: Rename CPU PCI devices Rename variables representing PCI devices to their BKDG names for faster search and shorter, clearer code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:54 +01:00
Borislav Petkov	b8cfa02f83	amd64_edac: Concentrate per-family init even more Move the remaining per-family init code into the proper place and simplify the rest of the initialization. Reorganize error handling in amd64_init_one_instance(). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:53 +01:00
Borislav Petkov	bbd0c1f675	amd64_edac: Cleanup the CPU PCI device reservation Shorten code and clarify comments, return proper -E* values on error. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:52 +01:00
Borislav Petkov	0092b20d4c	amd64_edac: Simplify CPU family detection Concentrate CPU family detection in the per-family init function. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:51 +01:00
Borislav Petkov	395ae783b3	amd64_edac: Add per-family init function Run a per-family init function which does all the settings based on the family this driver instance is running on. Move the scrubrate calculation in it and simplify code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:50 +01:00
Borislav Petkov	9f56da0e3c	amd64_edac: Use cached extended CPU model ... instead of computing it needlessly again. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:49 +01:00
Borislav Petkov	3ab0e7dc2e	amd64_edac: Remove F11h support F11h doesn't support DRAM ECC so whack it away. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:47 +01:00
Linus Torvalds	42cbd8efb0	Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Cleanup L3 cache index disable support x86, amd-nb: Cleanup AMD northbridge caching code x86, amd-nb: Complete the rename of AMD NB and related code	2011-01-06 10:50:28 -08:00
Borislav Petkov	e726f3c368	amd64_edac: Fix interleaving check When matching error address to the range contained by one memory node, we're in valid range when node interleaving 1. is disabled, or 2. enabled and when the address bits we interleave on match the interleave selector on this node (see the "Node Interleaving" section in the BKDG for an enlightening example). Thus, when we early-exit, we need to reverse the compound logic statement properly. Cc: <stable@kernel.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-12-08 19:52:54 +01:00
Hans Rosenfeld	9653a5c76c	x86, amd-nb: Cleanup AMD northbridge caching code Support more than just the "Misc Control" part of the northbridges. Support more flags by turning "gart_supported" into a single bit flag that is stored in a flags member. Clean up related code by using a set of functions (amd_nb_num(), amd_nb_has_feature() and node_to_amd_nb()) instead of accessing the NB data structures directly. Reorder the initialization code and put the GART flush words caching in a separate function. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:05 +01:00
Hans Rosenfeld	eec1d4fa00	x86, amd-nb: Complete the rename of AMD NB and related code Not only the naming of the files was confusing, it was even more so for the function and variable names. Renamed the K8 NB and NUMA stuff that is also used on other AMD platforms. This also renames the CONFIG_K8_NUMA option to CONFIG_AMD_NUMA and the related file k8topology_64.c to amdtopology_64.c. No functional changes intended. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:04 +01:00
Linus Torvalds	c029e405bd	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (21 commits) EDAC, MCE: Fix shift warning on 32-bit EDAC, MCE: Add a BIT_64() macro EDAC, MCE: Enable MCE decoding on F12h EDAC, MCE: Add F12h NB MCE decoder EDAC, MCE: Add F12h IC MCE decoder EDAC, MCE: Add F12h DC MCE decoder EDAC, MCE: Add support for F11h MCEs EDAC, MCE: Enable MCE decoding on F14h EDAC, MCE: Fix FR MCEs decoding EDAC, MCE: Complete NB MCE decoders EDAC, MCE: Warn about LS MCEs on F14h EDAC, MCE: Adjust IC decoders to F14h EDAC, MCE: Adjust DC decoders to F14h EDAC, MCE: Rename files EDAC, MCE: Rework MCE injection EDAC: Export edac sysfs class to users. EDAC, MCE: Pass complete MCE info to decoders EDAC, MCE: Sanitize error codes EDAC, MCE: Remove unused function parameter EDAC, MCE: Add HW_ERR prefix ...	2010-10-21 14:04:58 -07:00
Borislav Petkov	7cfd4a8744	EDAC, MCE: Pass complete MCE info to decoders ... instead of the MCi_STATUS info only for improved handling of certain types of errors later. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:58 +02:00
Andreas Herrmann	23ac4ae827	x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB The file names are somehow misleading as the code is not specific to AMD K8 CPUs anymore. The files accomodate code for other AMD CPU northbridges as well. Same is true for the config option which is valid for AMD CPU northbridges in general and not specific to K8. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160343.GD4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-20 14:22:58 -07:00
Andreas Herrmann	900f9ac9f1	x86, k8-gart: Decouple handling of garts and northbridges So far we only provide num_k8_northbridges. This is required in different areas (e.g. L3 cache index disable, GART). But not all AMD CPUs provide a GART. Thus it is useful to split off the GART handling from the generic caching of AMD northbridge misc devices. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160254.GC4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-17 13:26:21 -07:00
Borislav Petkov	37b7370a8d	amd64_edac: Do not report error overflow as a separate error When the Overflow MCi_STATUS bit is set, EDAC reports the lost error with a "no information available" message which often puzzles users parsing the dmesg. This doesn't make much sense since this error has been lost anyway so no need for reporting it separately. Thus, report the overflow bit setting in the MCE dump instead. While at it, remove reporting of MiscV and ErrorEnable (en) which are superfluous. Now it looks like this: [ 1501.650024] MC4_STATUS: Corrected error, other errors lost: yes, CPU context corrupt: no, CECC Error [ 1501.666887] Northbridge Error, node 2 Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-26 12:46:03 +02:00
Borislav Petkov	c4799c7570	amd64_edac: Minor formatting fix EDAC MC3: CE page 0xc32281, offset 0x8a0, grain 0, syndrome 0x1, row 2, channel 1, label "": amd64_edac EDAC MC3: CE - no information available: amd64_edacError Overflow Add the missing space before "Error Overflow" on the second line. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-04 11:16:01 +02:00
Borislav Petkov	962b70a1eb	amd64_edac: Fix operator precendence error The bitwise AND is of higher precedence, make that explicit. Cc: <stable@kernel.org> # 34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-04 11:15:09 +02:00
Borislav Petkov	eba042a81e	edac, mc: Improve scrub rate handling Fortify the interface to not accept negative values, remove memctrl_int_store() as a result. Also, sanitize bandwidth setting by making the argument a simple u32 instead of strange u32 pointer being passed around for no obvious reason. Then, fix error handling and teach it to return proper error values. Finally, make code more readable, simplify debug messages. Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Arthur Jones <ajones@riverbed.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:06 +02:00
Borislav Petkov	bc57117856	amd64_edac: Correct scrub rate setting Exit early when setting scrub rate on unknown/unsupported families. Cc: <stable@kernel.org> # 32.x 33.x 34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:05 +02:00
Borislav Petkov	9975a5f22a	amd64_edac: Fix DCT base address selector The correct check is to verify whether in high range we're below 4GB and not to extract the DctSelBaseAddr again. See "2.8.5 Routing DRAM Requests" in the F10h BKDG. Cc: <stable@kernel.org> # .32.x .33.x .34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:04 +02:00
Borislav Petkov	f4347553b3	amd64_edac: Remove polling mechanism Switch to reusing the mcheck core's machine check polling mechanism instead of duplicating functionality by using the EDAC polling routine. Correct formatting while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:03 +02:00
Borislav Petkov	ad6a32e969	amd64_edac: Sanitize syndrome extraction Remove the two syndrome extraction macros and add a single function which does the same thing but with proper typechecking. While at it, make sure to cache ECC syndrome size and dump it in debug output. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-03 16:13:31 +02:00
Borislav Petkov	41c310447f	amd64_edac: Fix syndrome calculation on K8 When calculating the DCT channel from the syndrome we need to know the syndrome type (x4 vs x8). On F10h, this is read out from extended PCI cfg space register F3x180 while on K8 we only support x4 syndromes and don't have extended PCI config space anyway. Make the code accessing F3x180 F10h only and fall back to x4 syndromes on everything else. Cc: <stable@kernel.org> # .33.x .34.x Reported-by: Jeffrey Merkey <jeffmerkey@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-07-02 17:32:34 +02:00
Linus Torvalds	eaa5eec739	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Simplify ECC override handling	2010-03-03 09:25:37 -08:00
Linus Torvalds	0a135ba14d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.	2010-03-03 07:34:18 -08:00
Borislav Petkov	d95cf4de6a	amd64_edac: Simplify ECC override handling No need for clearing ecc_enable_override and checking it in two places. Instead, simply check it during probing and act accordingly. Also, rename the flag bitfields according to the functionality they actually represent. What is more, make sure original BIOS ECC settings are restored when the module is unloaded. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-03-01 19:25:12 +01:00
Tejun Heo	a29d8b8e2d	percpu: add __percpu sparse annotations to what's left Add __percpu sparse annotations to places which didn't make it in one of the previous patches. All converions are trivial. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Neil Brown <neilb@suse.de>	2010-02-17 11:17:38 +09:00
Borislav Petkov	cab4d27764	amd64_edac: Do not falsely trigger kerneloops An unfortunate "WARNING" in the message amd64_edac dumps when the system doesn't support DRAM ECC or ECC checking is not enabled in the BIOS used to trigger kerneloops which qualified the message as an OOPS thus misleading the users. See, e.g. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/422536 http://bugzilla.kernel.org/show_bug.cgi?id=15238 Downgrade the message level to KERN_NOTICE and fix the formulation. Cc: stable@kernel.org # .32.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-02-11 20:32:14 +01:00
Roel Kluin	926311fd7d	amd64_edac: Ensure index stays within bounds in amd64_get_scrub_rate Add a missing iterator variable thus fixing the conditional of the for-loop in amd64_get_scrub_rate(). Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-01-15 10:45:58 +01:00
Borislav Petkov	92389102b6	amd64_edac: restrict PCI config space access Do not access F2x19[0,4] on K8 since they're undefined there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:08 +01:00
Borislav Petkov	43f5e68733	amd64_edac: fix forcing module load/unload Clear the override flag after force-loading the module. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:08 +01:00
Borislav Petkov	56b34b91e2	amd64_edac: make driver loading more robust Currently, the module does not initialize fully when the DIMMs aren't ECC but remains still loaded. Propagate the error when no instance of the driver is properly initialized and prevent further loading. Reorganize and polish error handling in amd64_edac_init() while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	8f68ed9728	amd64_edac: fix driver instance freeing Fix use-after-free errors by pushing all memory-freeing calls to the end of amd64_remove_one_instance(). Reported-by: Darren Jenkins <darrenrjenkins@gmail.com> LKML-Reference: <1261370306.11354.52.camel@ICE-BOX> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	603adaf6b3	amd64_edac: fix K8 chip select reporting Fix the case when amd64_debug_display_dimm_sizes() reports only half the amount of DRAM on it because it doesn't account for when the single DCT operates in 128-bit mode and merges chip selects from different DIMMs. Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> LKML-Reference: <200912112202.48173.johannes.hirte@fem.tu-ilmenau.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	505422517d	x86, msr: Add support for non-contiguous cpumasks The current rd/wrmsr_on_cpus helpers assume that the supplied cpumasks are contiguous. However, there are machines out there like some K8 multinode Opterons which have a non-contiguous core enumeration on each node (e.g. cores 0,2 on node 0 instead of 0,1), see http://www.gossamer-threads.com/lists/linux/kernel/1160268. This patch fixes out-of-bounds writes (see URL above) by adding per-CPU msr structs which are used on the respective cores. Additionally, two helpers, msrs_{alloc,free}, are provided for use by the callers of the MSR accessors. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091211171440.GD31998@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-12-11 10:59:21 -08:00
Andrew Morton	18ba54ac12	amd64_edac: fix use-uninitialised bug drivers/edac/amd64_edac.c: In function 'amd64_edac_init': drivers/edac/amd64_edac.c:2840: warning: 'ret' may be used uninitialized in this function Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:38:13 +01:00
Borislav Petkov	bdc30a0c8c	amd64_edac: correct sys address to chip select mapping The routine does the reverse mapping of the error address of a CECC back to the node id, DRAM controller and chip select of the DIMM which caused the error. We should lookup the channel using the syndromes _only_ when the DCTs are ganged so fix that. Also, add an early exit when there's an error while scanning for the csrow thus decreasing indentation levels for better readability. Finally, fixup comments. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:38:12 +01:00
Borislav Petkov	bfc04aec7d	amd64_edac: add a leaner syndrome decoding algorithm Instead of using the whole syndrome tables for channel decoding, use a set of eigenvectors with which the tables can be generated to search for the syndrome in error. The algorithm operates independently of symbol size and can be used for both x4 and x8 syndromes. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:37:59 +01:00
Borislav Petkov	986a42a250	amd64_edac: remove early hw support check The .probe_valid_hardware low_ops member checked whether the DCTs are in DDR3 mode and bailed out if so. Now that all the needed changes for DDR3 support is in place, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:31 +01:00
Borislav Petkov	6b4c0bdeb0	amd64_edac: detect DDR3 memory type Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:31 +01:00
Borislav Petkov	239642fe19	edac: add memory types strings for debugging Instead of using deeply-nested conditionals for dumping the DIMM type in debug mode, add a strings array of the supported DIMM types. This is useful in cases where an edac driver supports multiple DRAM types and is only defined in debug builds. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:31 +01:00
Borislav Petkov	1f6bcee75e	amd64_edac: remove unneeded extract_error_address wrapper Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:30 +01:00
Borislav Petkov	44e9e2ee21	amd64_edac: rename StinkyIdentifier SystemAddress -> sys_addr Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:30 +01:00
Borislav Petkov	ad858bfa14	amd64_edac: remove superfluous dbg printk Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:29 +01:00
Borislav Petkov	1433eb9903	amd64_edac: enhance address to DRAM bank mapping Add cs mode to cs size mapping tables for DDR2 and DDR3 and F10 and all K8 flavors and remove klugdy table of pseudo values. Add a low_ops->dbam_to_cs member which is family-specific and replaces low_ops->dbam_map_to_pages since the pages calculation is a one liner now. Further cleanups, while at it: - shorten family name defines - align amd64_family_types struct members Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:29 +01:00
Borislav Petkov	d16149e8c3	amd64_edac: cleanup f10_early_channel_count Do not read DCLR[01] again since this is done in amd64_read_mc_registers() earlier. There can be more than two physical DIMMs present so clamp the channels value to max 2. Also, do not report DCT data width - it is also done earlier. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:29 +01:00

1 2 3 4 5

201 Commits