Commit Graph

715 Commits

Author SHA1 Message Date
Borislav Petkov b487c33e55 amd64_edac: Fix node id signedness
A node id can never be negative since we use it as an index into
the DRAM ranges array. This also makes one of the BUG_ON conditions
redundant.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:28 +01:00
Borislav Petkov d88977a9c4 amd64_edac: Drop redundant declarations
Those were moved to the mce_amd.h header.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:27 +01:00
Borislav Petkov df71a05324 amd64_edac: Enable driver on F15h
Add the PCI device ids required for driver registration. Remove
pvt->ctl_name and use the family descriptor directly, instead. Then,
bump driver version and fixup its format. Finally, enable DRAM ECC
decoding on F15h.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:26 +01:00
Borislav Petkov a3b7db09a6 amd64_edac: Adjust ECC symbol size to F15h
F15h has the same ECC symbol size options as F10h revD and later so
adjust checks to that. Simplify code a bit.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:26 +01:00
Borislav Petkov 87b3e0e6e4 amd64_edac: Simplify scrubrate setting
Drop per-instance variable and compute min scrubrate dynamically.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:25 +01:00
Borislav Petkov 41d8bfaba7 amd64_edac: Improve DRAM address mapping
Drop static tables which map the bits in F2x80 to a chip select size in
favor of functions doing the mapping with some bit fiddling. Also, add
F15 support.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:24 +01:00
Borislav Petkov 5a5d237169 amd64_edac: Sanitize ->read_dram_ctl_register
This function is relevant for F10h and higher, and it has only one
callsite so drop its function pointer from the low_ops struct.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:23 +01:00
Borislav Petkov b15f0fcab1 amd64_edac: Adjust sys_addr to chip select conversion routine to F15h
F15h sys_addr to chip select mapping is almost identical to F10h's so
reuse that. Rename functions on that path accordingly.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:22 +01:00
Borislav Petkov 355fba6005 amd64_edac: Beef up early exit reporting
Add paranoid checks for the sys address before going off and decoding
it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:22 +01:00
Borislav Petkov 614ec9d853 amd64_edac: Revamp online spare handling
Replace per-DCT macros with smarter ones, drop hack and look for the
spare rank on all chip selects on a channel.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:21 +01:00
Borislav Petkov 5d4b58e84a amd64_edac: Fix channel interleave removal
Remove the channel interleave select bit properly. See
F2x110[DctSelIntLvAddr] for details.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:21 +01:00
Borislav Petkov e2f79dbdfb amd64_edac: Correct node interleaving removal
When node interleaving is enabled, a subset of the addr[14:12] bits has
to be removed in order to get the normalized DCT address of the DRAM
channel. The actual number of bits to remove is determined by F1x[1,
0][7C:40][IntlvEn]. Do this correctly.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:20 +01:00
Borislav Petkov 95b0ef55cd amd64_edac: Add support for interleaved region swapping
On revC3 and revE Fam10h machines and later, non-interleaved graphics
framebuffer memory under the 16G mark can be swapped with a region
located at the bottom of memory so that the GPU can use the interleaved
region and thus two channels. Add support for that.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:20 +01:00
Borislav Petkov 700466249f amd64_edac: Unify get_error_address
The address bits from MC4_STATUS differ only between K8 and the rest so
no need for a per-family method.

No functional change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:19 +01:00
Borislav Petkov f192c7b16c amd64_edac: Simplify decoding path
Use the struct mce directly instead of copying from it into a custom
struct err_regs.

No functionality change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:19 +01:00
Borislav Petkov 7d20d14da1 amd64_edac: Adjust channel counting to F15h
The only difference is that F10h used to sport ganged DCTs and F15h
doesn't so adjust the F10h routine and reuse it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:19 +01:00
Borislav Petkov 5980bb9cd8 amd64_edac: Cleanup old defines cruft
Remove unused defines, drop family names from define names.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:18 +01:00
Borislav Petkov bcd781f46a amd64_edac: Cleanup NBSH cruft
Remove reporting of errors with UC bit set - this is done by the MCE
decoding code anyway and this driver deals with DRAM ECC errors only. UC
(NB uncorrectable error) doesn't necessarily mean it is a DRAM error.
Remove unused macros while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:17 +01:00
Borislav Petkov a97fa68ec4 amd64_edac: Cleanup NBCFG handling
The fact whether we are chipkill capable or not does not have any
bearing when computing the channel index on a ganged DCT configuration
so remove that. Also, simplify debug statements. Finally, remove old
error injection leftovers, while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:16 +01:00
Borislav Petkov c9f4f26eae amd64_edac: Cleanup NBCTL code
Remove family names from macro names, drop single bit defines and
comment their meaning instead.

No functional change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:15 +01:00
Borislav Petkov 78da121e15 amd64_edac: Cleanup DCT Select Low/High code
Shorten macro names, remove family name from macros, fix macro
arguments, shorten debug strings.

No functionality change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:15 +01:00
Borislav Petkov cb32850744 amd64_edac: Cleanup Dram Configuration registers handling
* Restrict DCT ganged mode check since only Fam10h supports it
* Adjust DRAM type detection for BD since it only supports DDR3
* Remove second and thus unneeded DCLR read in k8_early_channel_count() - we do
  that in read_mc_regs()
* Cleanup comments and remove family names from register macros
* Remove unused defines

There should be no functional change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:14 +01:00
Borislav Petkov 525a1b20a6 amd64_edac: Cleanup DBAM handling
Do not read DBAM regs twice and simplify code around them.

There should be no functional change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:14 +01:00
Borislav Petkov f678b8ccce amd64_edac: Replace huge bitmasks with a macro
Replace hard to read hex constants with a continuous masks macro.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:14 +01:00
Borislav Petkov c8e518d567 amd64_edac: Sanitize f10_get_base_addr_offset
This function maps the system address to the normalized DCT address.
Document what the code does for more clarity and wrap insane bitmasks in
a more understandable macro which generates them. Also, reduce number of
arguments passed to the function. Finally, rename this function to what
it actually does.

No functional change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:13 +01:00
Borislav Petkov 229a7a11ac amd64_edac: Sanitize channel extraction
Cleanup and simplify f10_determine_channel(); make it more readable.
Also drop f10_map_intlv_en_to_shift() in favor of simply counting the
bits in F1x124[DramIntlvEn] which is equivalent.

There should be no functionality change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:13 +01:00
Borislav Petkov 11c75eadaf amd64_edac: Cleanup chipselect handling
Add a struct representing the DRAM chip select base/limit register
pairs. Concentrate all CS handling in a single function. Also, add CS
looping macros for cleaner, more readable code. While at it, adjust code
to F15h. Finally, do smaller macro names cleanups (remove family names
from register macros) and debug messages clarification.

No functional change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:12 +01:00
Borislav Petkov bc21fa5787 amd64_edac: Cleanup DHAR handling
Adjust to F15h, simplify code, fixup macros.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:12 +01:00
Borislav Petkov 7f19bf755c amd64_edac: Remove DRAM base/limit subfields caching
Add a struct representing the DRAM base/limit range pairs and remove all
cached subfields. Replace them with accessor functions, which actually
saves us some space:

   text    data     bss     dec     hex filename
  14712    1577     336   16625    40f1 drivers/edac/amd64_edac_mod.o.after
  14831    1609     336   16776    4188 drivers/edac/amd64_edac_mod.o.before

Also, it simplifies the code a lot allowing to merge the K8 and F10h
routines.

No functional change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:11 +01:00
Borislav Petkov b2b0c60543 amd64_edac: Add support for F15h DCT PCI config accesses
F15h "multiplexes" between the configuration space of the two DRAM
controllers by toggling D18F1x10C[DctCfgSel] while F10h has a different
set of registers for DCT0, and DCT1 in extended PCI config space.

Add DCT configuration space accessors per family thus wrapping all the
different access prerequisites. Clean up code while at it, shorten
names.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:11 +01:00
Borislav Petkov b6a280bb96 EDAC: Shut up sysfs registration debug code
Raise the debug level of these routines so that their output get issued
out only when the highest debug level is selected. Otherwise, don't
pollute driver debug output.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17 14:46:10 +01:00
Chris Metcalf 5c77075548 drivers/edac: provide support for tile architecture
Add tile support for the EDAC driver, which provides unified system
error (memory, PCI, etc.) reporting. For now, the TILEPro port
reports memory correctable error (CE) only.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-03-10 13:30:14 -05:00
Grant Likely 000061245a dt/powerpc: Eliminate users of of_platform_{,un}register_driver
Get rid of old users of of_platform_driver in arch/powerpc.  Most
of_platform_driver users can be converted to use the platform_bus
directly.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-28 01:36:39 -07:00
Arvind R cb60a42269 edac: correct i82975x error-info reported
to edac-core

fix the totally wrong info w.r.t page,row,dimm-label previously reported to
edac-core by i82975x driver

Signed-off-by: Arvind R. <arvino55@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-02-17 16:47:04 +01:00
Arvind R da95b3d21f edac: correct i82975x mci initialisation
corrected mtype, and added dev_name,scrubmode initialisers
in i82975x struct mem_ctl initialisation

Signed-off-by: Arvind R. <arvino55@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-02-17 16:46:22 +01:00
Arvind R 7ba9957581 edac: correct commented info
wrong comments in i82975x driver corrected

Signed-off-by: Arvind R. <arvino55@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-02-17 16:44:45 +01:00
Jiri Kosina 0a9d59a246 Merge branch 'master' into for-next 2011-02-15 10:24:31 +01:00
Borislav Petkov 4d7963648f amd64_edac: Fix DIMMs per DCTs output
amd64_debug_display_dimm_sizes() reports the distribution of the DIMMs
on each DRAM controller and its chip select sizes. Thus, the last don't
have anything to do with whether we're running in ganged DCT mode or not
- their sizes don't change all of a sudden. Fix that by removing the
ganged-check and dump DCT0's config for DCT1 when in ganged mode since
they're identical.

Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-02-10 14:41:49 +01:00
Arvind R 25527885e3 edac: i82975x author/maintainer email address change
edac-i82975x author/maintainer email address change

Signed-off-by: Arvind R. <arvino55@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-01-24 16:17:51 +01:00
Jesper Juhl 42b16b3fbb Kill off warning: ‘inline’ is not at beginning of declaration
Fix a bunch of
	warning: ‘inline’ is not at beginning of declaration
messages when building a 'make allyesconfig' kernel with -Wextra.

These warnings are trivial to kill, yet rather annoying when building with
-Wextra.
The more we can cut down on pointless crap like this the better (IMHO).

A previous patch to do this for a 'allnoconfig' build has already been
merged. This just takes the cleanup a little further.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-01-19 15:43:08 +01:00
Linus Torvalds 008d23e485 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  Documentation/trace/events.txt: Remove obsolete sched_signal_send.
  writeback: fix global_dirty_limits comment runtime -> real-time
  ppc: fix comment typo singal -> signal
  drivers: fix comment typo diable -> disable.
  m68k: fix comment typo diable -> disable.
  wireless: comment typo fix diable -> disable.
  media: comment typo fix diable -> disable.
  remove doc for obsolete dynamic-printk kernel-parameter
  remove extraneous 'is' from Documentation/iostats.txt
  Fix spelling milisec -> ms in snd_ps3 module parameter description
  Fix spelling mistakes in comments
  Revert conflicting V4L changes
  i7core_edac: fix typos in comments
  mm/rmap.c: fix comment
  sound, ca0106: Fix assignment to 'channel'.
  hrtimer: fix a typo in comment
  init/Kconfig: fix typo
  anon_inodes: fix wrong function name in comment
  fix comment typos concerning "consistent"
  poll: fix a typo in comment
  ...

Fix up trivial conflicts in:
 - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
 - fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13 10:05:56 -08:00
Linus Torvalds 128283a47e Merge branch 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
* 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, MCE: Fix NB error formatting
  EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit
  EDAC, MCE: Enable MCE decoding on F15h
  EDAC, MCE: Allow F15h bank 6 MCE injection
  EDAC, MCE: Shorten error report formatting
  EDAC, MCE: Overhaul error fields extraction macros
  EDAC, MCE: Add F15h FP MCE decoder
  EDAC, MCE: Add F15 EX MCE decoder
  EDAC, MCE: Add an F15h NB MCE decoder
  EDAC, MCE: No F15h LS MCE decoder
  EDAC, MCE: Add F15h CU MCE decoder
  EDAC, MCE: Add F15h IC MCE decoder
  EDAC, MCE: Add F15h DC MCE decoder
  EDAC, MCE: Select extended error code mask
2011-01-07 14:54:03 -08:00
Borislav Petkov 6d5db46687 EDAC, MCE: Fix NB error formatting
Minor formatting fixup since the information which core was associated
with the MCE is not always valid.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:26 +01:00
Randy Dunlap 50adbbd8a8 EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit
Building for X86_32 produces shift count warnings, so use BIT_64() to
eliminate the warnings.

drivers/edac/mce_amd.c:778: warning: left shift count >= width of type
drivers/edac/mce_amd.c:778: warning: left shift count >= width of type

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: bluesmoke-devel@lists.sourceforge.net
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:25 +01:00
Borislav Petkov bad11e0318 EDAC, MCE: Enable MCE decoding on F15h
Now that everything is inplace, enable MCE decoding on F15h. Make
initcall routine a bit more readable.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:24 +01:00
Borislav Petkov 1b07ca47ff EDAC, MCE: Allow F15h bank 6 MCE injection
F15h adds a sixth MCE bank: adjust bank number check in the injection
code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:23 +01:00
Borislav Petkov fa7ae8cc8c EDAC, MCE: Shorten error report formatting
Shorten up MCi_STATUS flags and add BD's new deferred and poison types.
Also, simplify formatting.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:22 +01:00
Borislav Petkov 6245288232 EDAC, MCE: Overhaul error fields extraction macros
Make macro names shorter thus making code shorter and more clear.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:21 +01:00
Borislav Petkov b8f85c477b EDAC, MCE: Add F15h FP MCE decoder
Add decoder for FP MCEs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:20 +01:00
Borislav Petkov 8259a7e572 EDAC, MCE: Add F15 EX MCE decoder
Integrate the single FIROB signature into an expanded table along with
the new BD MCE types.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:19 +01:00
Borislav Petkov 05cd667d66 EDAC, MCE: Add an F15h NB MCE decoder
by (almost) reusing the F10h one since the signatures are the same.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:18 +01:00
Borislav Petkov b18434cad1 EDAC, MCE: No F15h LS MCE decoder
F15h BD doesn't generate LS MCEs so warn about it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:17 +01:00
Borislav Petkov 70fdb494aa EDAC, MCE: Add F15h CU MCE decoder
MCE bank 2 is redefined from a BU to a CU (Combined Unit) bank on F15h.
Add a decoder function for CU MCEs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:16 +01:00
Borislav Petkov 86039cd401 EDAC, MCE: Add F15h IC MCE decoder
Add support for decoding F15h IC MCEs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:15 +01:00
Borislav Petkov 25a4f8b059 EDAC, MCE: Add F15h DC MCE decoder
Add a decoder for F15h DC MCEs to support the new types of DC MCEs
introduced by the BD microarchitecture.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:14 +01:00
Borislav Petkov 2be64bfac7 EDAC, MCE: Select extended error code mask
F15h enlarges the extended error code of an MCE to a 5-bit field
(MCi_STATUS[20:16]). Add a mask variable which default 0xf is overridden
on F15h.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:54:12 +01:00
Borislav Petkov a135cef79a amd64_edac: Disable DRAM ECC injection on K8
K8 does not allow for an atomic RMW to a cacheline as F10h does so
disable the error injection interface for it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:38:46 +01:00
Borislav Petkov 390944439f EDAC: Fixup scrubrate manipulation
Make the ->{get|set}_sdram_scrub_rate return the actual scrub rate
bandwidth it succeeded setting and remove superfluous arg pointer used
for that. A negative value returned still means that an error occurred
while setting the scrubrate. Document this for future reference.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:38:31 +01:00
Borislav Petkov 360b7f3c60 amd64_edac: Remove two-stage initialization
Now that all prerequisites are in place, drop the two-stage driver
instances initialization in favor of the following simple init sequence:

1. Probe PCI device: we only test ECC capabilities here and if none exit
early.

2. If the hw supports ECC and it is/can be enabled, we init the per-node
instance.

Remove "amd64_" prefix from static functions touched, while at it.

There actually should be no visible functional change resulting from
this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:34:03 +01:00
Borislav Petkov 2299ef7114 amd64_edac: Check ECC capabilities initially
Rework the code to check the hardware ECC capabilities at PCI probing
time. We do all further initialization only if we actually can/have ECC
enabled.

While at it:
0. Fix function naming.
1. Simplify/clarify debug output.
2. Remove amd64_ prefix from the static functions
3. Reorganize code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:34:02 +01:00
Borislav Petkov ae7bb7c679 amd64_edac: Carve out ECC-related hw settings
This is in preparation for the init path reorganization where we want
only to

1) test whether a particular node supports ECC
2) can it be enabled

and only then do the necessary allocation/initialization. For that,
we need to decouple the ECC settings of the node from the instance's
descriptor.

The should be no functional change introduced by this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:34:00 +01:00
Borislav Petkov f1db274e1b amd64_edac: Remove PCI ECS enabling functions
PCI ECS is being enabled by default since 2.6.26 on AMD so this code is
just superfluous now, remove it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:59 +01:00
Borislav Petkov 027dbd6f5d amd64_edac: Remove explicit Kconfig PCI dependency
AMD_NB pulls in the dependency on PCI. Clarify/fix help text while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:58 +01:00
Borislav Petkov cc4d8860fc amd64_edac: Allocate driver instances dynamically
Remove static allocation in favor of dynamically allocating space for as
many driver instances as northbridges present on the system.

There should be no functional change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:57 +01:00
Borislav Petkov 24f9a7fe3f amd64_edac: Rework printk macros
Add a macro per printk level, shorten up error messages. Add relevant
information to KERN_INFO level. No functional change.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:56 +01:00
Borislav Petkov 8d5b5d9c7b amd64_edac: Rename CPU PCI devices
Rename variables representing PCI devices to their BKDG names for faster
search and shorter, clearer code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:54 +01:00
Borislav Petkov b8cfa02f83 amd64_edac: Concentrate per-family init even more
Move the remaining per-family init code into the proper place and
simplify the rest of the initialization. Reorganize error handling in
amd64_init_one_instance().

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:53 +01:00
Borislav Petkov bbd0c1f675 amd64_edac: Cleanup the CPU PCI device reservation
Shorten code and clarify comments, return proper -E* values on error.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:52 +01:00
Borislav Petkov 0092b20d4c amd64_edac: Simplify CPU family detection
Concentrate CPU family detection in the per-family init function.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:51 +01:00
Borislav Petkov 395ae783b3 amd64_edac: Add per-family init function
Run a per-family init function which does all the settings based on
the family this driver instance is running on. Move the scrubrate
calculation in it and simplify code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:50 +01:00
Borislav Petkov 9f56da0e3c amd64_edac: Use cached extended CPU model
... instead of computing it needlessly again.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:49 +01:00
Borislav Petkov 3ab0e7dc2e amd64_edac: Remove F11h support
F11h doesn't support DRAM ECC so whack it away.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07 11:33:47 +01:00
Linus Torvalds 42cbd8efb0 Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, cacheinfo: Cleanup L3 cache index disable support
  x86, amd-nb: Cleanup AMD northbridge caching code
  x86, amd-nb: Complete the rename of AMD NB and related code
2011-01-06 10:50:28 -08:00
David Sterba e7bf068aa3 i7core_edac: fix typos in comments
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-12-28 01:20:51 +01:00
Jiri Kosina 4b7bd36470 Merge branch 'master' into for-next
Conflicts:
	MAINTAINERS
	arch/arm/mach-omap2/pm24xx.c
	drivers/scsi/bfa/bfa_fcpim.c

Needed to update to apply fixes for which the old branch was too
outdated.
2010-12-22 18:57:02 +01:00
Borislav Petkov e726f3c368 amd64_edac: Fix interleaving check
When matching error address to the range contained by one memory node,
we're in valid range when node interleaving

1. is disabled, or
2. enabled and when the address bits we interleave on match the
interleave selector on this node (see the "Node Interleaving" section in
the BKDG for an enlightening example).

Thus, when we early-exit, we need to reverse the compound logic
statement properly.

Cc: <stable@kernel.org>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-12-08 19:52:54 +01:00
Andrei Konovalov 76f04f2591 EDAC: Correct MiB_TO_PAGES() macro
This corrects the misprint introduced when moving '#if
PAGE_SHIFT' from i7core_edac.c to edac_core.h (commit
e9144601d3)

Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrei Konovalov <akonovalov@mvista.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-12-08 19:52:53 +01:00
Borislav Petkov bb31b3122c EDAC: Fix workqueue-related crashes
00740c5854 changed edac_core to
un-/register a workqueue item only if a lowlevel driver supplies a
polling routine. Normally, when we remove a polling low-level driver, we
go and cancel all the queued work. However, the workqueue unreg happens
based on the ->op_state setting, and edac_mc_del_mc() sets this to
OP_OFFLINE _before_ we cancel the work item, leading to NULL ptr oops on
the workqueue list.

Fix it by putting the unreg stuff in proper order.

Cc: <stable@kernel.org> #36.x
Reported-and-tested-by: Tobias Karnat <tobias.karnat@googlemail.com>
LKML-Reference: <1291201307.3029.21.camel@Tobias-Karnat>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-12-08 19:52:27 +01:00
Axel Lin df4b2a30e0 EDAC, MCE: Fix edac_init_mce_inject error handling
Otherwise, variable i will be -1 inside the latest iteration of the
while loop.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-11-22 15:35:32 +01:00
Tracey Dent f570e1dd84 EDAC: Remove deprecated kbuild goal definitions
Change EDAC's Makefile to use <modules>-y instead of
<modules>-objs because -objs is deprecated and not mentioned in
Documentation/kbuild/makefiles.txt.

 [bp: Fixup commit message]
 [bp: Fixup indentation]

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-11-22 15:35:31 +01:00
Hans Rosenfeld 9653a5c76c x86, amd-nb: Cleanup AMD northbridge caching code
Support more than just the "Misc Control" part of the northbridges.
Support more flags by turning "gart_supported" into a single bit flag
that is stored in a flags member. Clean up related code by using a set
of functions (amd_nb_num(), amd_nb_has_feature() and node_to_amd_nb())
instead of accessing the NB data structures directly. Reorder the
initialization code and put the GART flush words caching in a separate
function.

Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-11-18 15:53:05 +01:00
Hans Rosenfeld eec1d4fa00 x86, amd-nb: Complete the rename of AMD NB and related code
Not only the naming of the files was confusing, it was even more so for
the function and variable names.

Renamed the K8 NB and NUMA stuff that is also used on other AMD
platforms. This also renames the CONFIG_K8_NUMA option to
CONFIG_AMD_NUMA and the related file k8topology_64.c to
amdtopology_64.c. No functional changes intended.

Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-11-18 15:53:04 +01:00
Uwe Kleine-König b595076a18 tree-wide: fix comment/printk typos
"gadget", "through", "command", "maintain", "maintain", "controller", "address",
"between", "initiali[zs]e", "instead", "function", "select", "already",
"equal", "access", "management", "hierarchy", "registration", "interest",
"relative", "memory", "offset", "already",

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-01 15:38:34 -04:00
Linus Torvalds da62aa69c1 Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core
* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (34 commits)
  i7core_edac: return -ENODEV when devices were already probed
  i7core_edac: properly terminate pci_dev_table
  i7core_edac: Avoid PCI refcount to reach zero on successive load/reload
  i7core_edac: Fix refcount error at PCI devices
  i7core_edac: it is safe to i7core_unregister_mci() when mci=NULL
  i7core_edac: Fix an oops at i7core probe
  i7core_edac: Remove unused member channels in i7core_pvt
  i7core_edac: Remove unused arg csrow from get_dimm_config
  i7core_edac: Reduce args of i7core_register_mci
  i7core_edac: Introduce i7core_unregister_mci
  i7core_edac: Use saved pointers
  i7core_edac: Check probe counter in i7core_remove
  i7core_edac: Call pci_dev_put() when alloc_i7core_dev()  failed
  i7core_edac: Fix error path of i7core_register_mci
  i7core_edac: Fix order of lines in i7core_register_mci
  i7core_edac: Always do get/put for all devices
  i7core_edac: Introduce i7core_pci_ctl_create/release
  i7core_edac: Introduce free_i7core_dev
  i7core_edac: Introduce alloc_i7core_dev
  i7core_edac: Reduce args of i7core_get_onedevice
  ...
2010-10-26 10:13:48 -07:00
Linus Torvalds 229aebb873 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Update broken web addresses in arch directory.
  Update broken web addresses in the kernel.
  Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget
  Revert "Fix typo: configuation => configuration" partially
  ida: document IDA_BITMAP_LONGS calculation
  ext2: fix a typo on comment in ext2/inode.c
  drivers/scsi: Remove unnecessary casts of private_data
  drivers/s390: Remove unnecessary casts of private_data
  net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data
  drivers/infiniband: Remove unnecessary casts of private_data
  drivers/gpu/drm: Remove unnecessary casts of private_data
  kernel/pm_qos_params.c: Remove unnecessary casts of private_data
  fs/ecryptfs: Remove unnecessary casts of private_data
  fs/seq_file.c: Remove unnecessary casts of private_data
  arm: uengine.c: remove C99 comments
  arm: scoop.c: remove C99 comments
  Fix typo configue => configure in comments
  Fix typo: configuation => configuration
  Fix typo interrest[ing|ed] => interest[ing|ed]
  Fix various typos of valid in comments
  ...

Fix up trivial conflicts in:
	drivers/char/ipmi/ipmi_si_intf.c
	drivers/usb/gadget/rndis.c
	net/irda/irnet/irnet_ppp.c
2010-10-24 13:41:39 -07:00
Linus Torvalds 8de547e182 Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac
* 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac: (25 commits)
  i7300_edac: Properly initialize per-csrow memory size
  V4L/DVB: i7300_edac: better initialize page counts
  MAINTAINERS: Add maintainer for i7300-edac driver
  i7300-edac: CodingStyle cleanup
  i7300_edac: Improve comments
  i7300_edac: Cleanup: reorganize the file contents
  i7300_edac: Properly detect channel on CE errors
  i7300_edac: enrich FBD error info for corrected errors
  i7300_edac: enrich FBD error info for fatal errors
  i7300_edac: pre-allocate a buffer used to prepare err messages
  i7300_edac: Fix MTR x4/x8 detection logic
  i7300_edac: Make the debug messages coherent with the others
  i7300_edac: Cleanup: remove get_error_info logic
  i7300_edac: Add a code to cleanup error registers
  i7300_edac: Add support for reporting FBD errors
  i7300_edac: Properly detect the type of error correction
  i7300_edac: Detect if the device is on single mode
  i7300_edac: Adds detection for enhanced scrub mode on x8
  i7300_edac: Clear the error bit after reading
  i7300_edac: Add error detection code for global errors
  ...
2010-10-24 13:06:57 -07:00
Mauro Carvalho Chehab 76a7bd8113 i7core_edac: return -ENODEV when devices were already probed
Due to the nature of i7core, we need to probe and attach all PCI
devices used by this driver during the first time probe is called.
However, PCI core will call the probe routine one time for each CPU
socket. If we return -EINVAL to those calls, it would seem that the
driver fails, when, in fact, there's no more devices left to initialize.

Changing the return code to -ENODEV solves this issue.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:36:19 -02:00
Mauro Carvalho Chehab 3c52cc57cc i7core_edac: properly terminate pci_dev_table
At pci_xeon_fixup(), it waits for a null-terminated table, while at
i7core_get_all_devices, it just do a for 0..ARRAY_SIZE. As other tables
are zero-terminated, change it to be terminate with 0 as well, and fixes
a bug where it may be running out of the table elements.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:31:50 -02:00
Mauro Carvalho Chehab a3e1541637 i7core_edac: Avoid PCI refcount to reach zero on successive load/reload
That's a nasty bug that took me a lot of time to track, and whose
solution took just one line to solve. The best fragrances and the worse
poisons are shipped on the smalest bottles.

The drivers/pci/quick.c implements the pci_get_device function. The normal
behavior is that you call it, the function returns you a pdev pointer
and increment pdev->kobj.kref.refcount of the pci device. However,
if you want to keep searching an object, you need to pass the previous
pdev function to the search.

When you use a not null pointer to pdev "from" field, pci_get_device
will decrement pdev->kobj.kref.refcount, assuming that the driver won't
be using the previous pdev.

The solution is simple: we just need to call pci_dev_get() manually,
for the pdev's that the driver will actually use.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:41 -02:00
Mauro Carvalho Chehab 79daef2099 i7core_edac: Fix refcount error at PCI devices
Probably due to a bug or some testing logic at PCI level, device
refcount for <bus>:00.0 device is decremented at the end of the
pci_get_device, made by i7core_get_all_devices(). The fact is that
the first versions of the driver relied on those devices to probe
for Nehalem, but the current versions don't use it at all.

So, let's just remove those devices from the driver, making it simpler
and fixing the bug.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:41 -02:00
Mauro Carvalho Chehab 88ef5ea976 i7core_edac: it is safe to i7core_unregister_mci() when mci=NULL
i7core_unregister_mci() checks internally when mci=NULL. There's no
need to test it outside.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:41 -02:00
Mauro Carvalho Chehab 6d37d240f2 i7core_edac: Fix an oops at i7core probe
changeset c91d57ba9ce5b5c93a7077e2f72510eb1f9131c4 moved the init
of the priv pointer to the end of the probe routine. However, we need
them before that, otherwise, we hit an OOPS:

[   67.743453] EDAC DEBUG: mci_bind_devs: Associated fn 0.0, dev = ffff88011b46e000, socket 0
[   67.751861] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[   67.759685] IP: [<ffffffffa017e484>] i7core_probe+0x979/0x130c [i7core_edac]
[   67.766721] PGD 10bd38067 PUD 10bd37067 PMD 0
[   67.771178] Oops: 0000 [#1] SMP
[   67.774414] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[   67.782213] CPU 1
[   67.784042] Modules linked in: i7core_edac(+) edac_core cpufreq_ondemand binfmt_misc dm_multipath video output pci_slot snd_hda_codd

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:41 -02:00
Hidetoshi Seto 21b6806a8c i7core_edac: Remove unused member channels in i7core_pvt
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:41 -02:00
Hidetoshi Seto 2e5185f7ff i7core_edac: Remove unused arg csrow from get_dimm_config
A local is enough.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:41 -02:00
Hidetoshi Seto aace42831a i7core_edac: Reduce args of i7core_register_mci
We can check the number of channels in i7core_register_mci.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:40 -02:00
Hidetoshi Seto 1c6edbbe25 i7core_edac: Introduce i7core_unregister_mci
In i7core_probe, when setup of mci for 2nd or later socket failed,
we should cleanup prepared mci for 1st socket or so before "put" of
all devices.

So let have i7core_unregister_mci that can be shared between here
and i7core_remove.

While here fix a typo "hanler".

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:40 -02:00
Hidetoshi Seto 73589c80cd i7core_edac: Use saved pointers
We already have saved pointers.  Use shorter ones.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:40 -02:00
Hidetoshi Seto 71fe01706d i7core_edac: Check probe counter in i7core_remove
Prevent i7core_remove from running multiple times.
Otherwise value proved will be negative and something will be wrong.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:40 -02:00
Hidetoshi Seto 2896637b86 i7core_edac: Call pci_dev_put() when alloc_i7core_dev() failed
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:40 -02:00
Hidetoshi Seto 628c5ddfb0 i7core_edac: Fix error path of i7core_register_mci
Release resources properly.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:40 -02:00
Hidetoshi Seto 5939813b9c i7core_edac: Fix order of lines in i7core_register_mci
The flag is_registered is not initialized until mci_bind_devs()
is called.  Refer it properly.

The mci->dev and mci->edac_check is required in edac_mc_add_mc(),
so prepare them just before the call.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:39 -02:00
Hidetoshi Seto 64c10f6e0e i7core_edac: Always do get/put for all devices
We already do 'get' for all sockets at once. So do 'put' in the
same way.

And let args of the 'get' function to void since it handles
only the single, static and known size table pci_dev_table[].

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:39 -02:00
Hidetoshi Seto a3aa0a4ab5 i7core_edac: Introduce i7core_pci_ctl_create/release
Have a couple of method.
while here sort out lines in the i7core_register_mci() a bit.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:39 -02:00
Hidetoshi Seto 2aa9be448d i7core_edac: Introduce free_i7core_dev
Have a method to make a couple with alloc_i7core_dev() previously
introduced.  Using in pair will help proper resource handling.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:39 -02:00
Hidetoshi Seto 848b2f7ed6 i7core_edac: Introduce alloc_i7core_dev
It's nice to have a method for a single purpose.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:38 -02:00
Hidetoshi Seto b197cba071 i7core_edac: Reduce args of i7core_get_onedevice
Since we need to pass the index of the entry, pass the table itself
instead of passing individual members of the table.

While here make it static.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:38 -02:00
Hidetoshi Seto 45b7c981ae i7core_edac: Fix the logic in i7core_remove()
commit 47251b4d960bdfa648b0d06dbc6d445f41cb3906 have changed
the logic for unexplained reasons.  It looks strange that it
can release i7core_dev without calling i7core_put_devices()
that releases i7core_dev->pdev.

Fix the part.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab 54a08ab153 i7core_edac: Don't do the legacy PCI probe by default
The legacy PCI probe sometimes cause hangs. Better to have it
disabled by default, and have a parameter to enable it.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab accf74fff3 i7core_edac: don't use a freed mci struct
This is a nasty bug. Since kobject count will be reduced by zero by
edac_mc_del_mc(), and this triggers the kobj release method, the
mci memory will be freed automatically. So, all we have left is ctl_name,
as shown by enabling debug:

[   80.822186] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device()  remove_link
[   80.832590] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device()  remove_mci_instance
[   80.843776] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing
[   80.855163] EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:3f:03.0
[   80.862936] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 2089: (null): free structs
[   80.871134] EDAC DEBUG: in drivers/edac/edac_mc.c, line at 238: edac_mc_free()
[   80.878379] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 726: edac_mc_unregister_sysfs_main_kobj()
[   80.888043] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1232: drivers/edac/i7core_edac.c: i7core_put_devices()

Also, kfree(mci) shouldn't happen at the kobj.release, as it happens
when edac_remove_sysfs_mci_device() is called, but the logic is:
	edac_remove_sysfs_mci_device(mci);
	edac_printk(KERN_INFO, EDAC_MC,
		"Removed device %d for %s %s: DEV %s\n", mci->mc_idx,
		mci->mod_name, mci->ctl_name, edac_dev_name(mci));
So, as the edac_printk() needs the mci struct, this generates an OOPS.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab bbc560ae67 edac_core: Print debug messages at release calls
This is important to track a nasty bug at the free logic.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab ac99768c53 edac_core: Don't let free(mci) happen while using it
A very nasty bug were happening on edac core, due to the way mci objects are
freed. mci memory is freed when kobject count reaches zero, by
edac_mci_control_release(). However, from the logs, this is clearly happening
before the final usage of mci struct:

[15799.607454] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing
[15799.618773] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 769: edac_inst_grp_release()
[15799.627326] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 894: edac_remove_mci_instance_attributes() end of seeking for group all_channel_counts
[15799.640887] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 877: edac_remove_mci_instance_attributes() sysfs_attrib = ffffffffa01d7240
[15799.653412] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device()  remove_link
[15799.663753] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device()  remove_mci_instance

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab 6fe1108f14 edac_core: Do a better job with node removal
Make sure we remove groups at the right order

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab 39300e7143 i7core_edac: explicitly remove PCI devices from the devices list
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab 41ba6c1058 i7core_edac: MCE NMI handling should stop first
Otherwise, a NMI may happen causing a race condition and a panic.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab 6ee7dd5044 i7core_edac: Initialize all priv vars before start polling
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab 3cfd01468b i7core_edac: Improve debug to seek for register/remove errors
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab e9144601d3 i7core_edac: move #if PAGE_SHIFT to edac_core.h
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:36 -02:00
Mauro Carvalho Chehab 1288c18f48 i7core_edac: Properly mark const static vars as such
There are two groups of sysfs attributes: one for rdimm and another
for udimm. Instead of changing dynamically the unique static struct
for handling udimm's, declare two vars and make them constant.

This avoids the risk of having two or more memory controllers, each
needing a different set of attributes.

While here, use const on all places where it is applicable.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

edac_core: use const for constant sysfs arguments

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:14 -02:00
Mauro Carvalho Chehab 18c29002f9 i7core_edac: move static vars to the beginning of the file
While here, don't initialize probed with 0.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:12 -02:00
Mauro Carvalho Chehab 939747bd68 i7core_edac: Be sure that the edac pci handler will be properly released
With multi-sockets, more than one edac pci handler is enabled. Be sure to
un-register all instances.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 11:20:12 -02:00
Linus Torvalds c029e405bd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (21 commits)
  EDAC, MCE: Fix shift warning on 32-bit
  EDAC, MCE: Add a BIT_64() macro
  EDAC, MCE: Enable MCE decoding on F12h
  EDAC, MCE: Add F12h NB MCE decoder
  EDAC, MCE: Add F12h IC MCE decoder
  EDAC, MCE: Add F12h DC MCE decoder
  EDAC, MCE: Add support for F11h MCEs
  EDAC, MCE: Enable MCE decoding on F14h
  EDAC, MCE: Fix FR MCEs decoding
  EDAC, MCE: Complete NB MCE decoders
  EDAC, MCE: Warn about LS MCEs on F14h
  EDAC, MCE: Adjust IC decoders to F14h
  EDAC, MCE: Adjust DC decoders to F14h
  EDAC, MCE: Rename files
  EDAC, MCE: Rework MCE injection
  EDAC: Export edac sysfs class to users.
  EDAC, MCE: Pass complete MCE info to decoders
  EDAC, MCE: Sanitize error codes
  EDAC, MCE: Remove unused function parameter
  EDAC, MCE: Add HW_ERR prefix
  ...
2010-10-21 14:04:58 -07:00
Linus Torvalds 2f0384e5fc Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, amd_nb: Enable GART support for AMD family 0x15 CPUs
  x86, amd: Use compute unit information to determine thread siblings
  x86, amd: Extract compute unit information for AMD CPUs
  x86, amd: Add support for CPUID topology extension of AMD CPUs
  x86, nmi: Support NMI watchdog on newer AMD CPU families
  x86, mtrr: Assume SYS_CFG[Tom2ForceMemTypeWB] exists on all future AMD CPUs
  x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB
  x86, k8-gart: Decouple handling of garts and northbridges
  x86, cacheinfo: Fix dependency of AMD L3 CID
  x86, kvm: add new AMD SVM feature bits
  x86, cpu: Fix allowed CPUID bits for KVM guests
  x86, cpu: Update AMD CPUID feature bits
  x86, cpu: Fix renamed, not-yet-shipping AMD CPUID feature bit
  x86, AMD: Remove needless CPU family check (for L3 cache info)
  x86, tsc: Remove CPU frequency calibration on AMD
2010-10-21 13:01:08 -07:00
Borislav Petkov 525906bc89 EDAC, MCE: Fix shift warning on 32-bit
Fix

drivers/edac/mce_amd.c:262: warning: left shift count >= width of type

on 32-bit builds.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:07 +02:00
Borislav Petkov cf1d2200db EDAC, MCE: Add a BIT_64() macro
Add a macro for 64-bit vectors to use when accessing MSR contents.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:06 +02:00
Borislav Petkov fda7561f43 EDAC, MCE: Enable MCE decoding on F12h
Turn on MCE decoding on F12h.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:06 +02:00
Borislav Petkov cb9d5ecdff EDAC, MCE: Add F12h NB MCE decoder
F12h is completely covered by the generic path.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:05 +02:00
Borislav Petkov e7281eb37d EDAC, MCE: Add F12h IC MCE decoder
... which is the same as for K8 and F10h.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:05 +02:00
Borislav Petkov 9be0bb1072 EDAC, MCE: Add F12h DC MCE decoder
F12h DC MCE signatures are a subset of F10h's so reuse them.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:04 +02:00
Borislav Petkov f0157b3afd EDAC, MCE: Add support for F11h MCEs
F11h has almost the same MCE signatures as K8 except DRAM ECC and MC5
bank errors. Reuse functionality from the other families.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:04 +02:00
Borislav Petkov 9530d608ef EDAC, MCE: Enable MCE decoding on F14h
Now that all decoders have been taught about F14h, models < 0x10
MCEs, enable decoding on this family of CPUs. Also, issue a short
informational message upon boot that MCE decoding gets enabled.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:03 +02:00
Borislav Petkov fe4ea2623b EDAC, MCE: Fix FR MCEs decoding
Those are N/A on K8, so don't decode them there.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:03 +02:00
Borislav Petkov 5ce88f6ea6 EDAC, MCE: Complete NB MCE decoders
Add support for decoding F14h BU MCEs and improve decoding of the
remaining families.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:02 +02:00
Borislav Petkov ded5062328 EDAC, MCE: Warn about LS MCEs on F14h
F14h CPUs do not generate LS MCEs so exit early and warn the user in
case this path is ever hit that something else might be going haywire.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:02 +02:00
Borislav Petkov dd53bce4e8 EDAC, MCE: Adjust IC decoders to F14h
Add support for IC MCEs for F14h CPUs. K8 and F10h are almost identical
so use one function for both.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:01 +02:00
Borislav Petkov 888ab8e6eb EDAC, MCE: Adjust DC decoders to F14h
Add a per-family data cache decoders. Since there is a certain overlap
between the different DC MCE signatures, reuse functionality between the
families as far as possible.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:00 +02:00
Borislav Petkov 47ca08a40b EDAC, MCE: Rename files
Drop "edac_" string from the filenames since they're prefixed with edac/
in their pathname anyway.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:00 +02:00
Borislav Petkov 9cdeb404a1 EDAC, MCE: Rework MCE injection
Add sysfs injection facilities for testing of the MCE decoding code.
Remove large parts of amd64_edac_dbg.c, as a result, which did only
NB MCE injection anyway and the new injection code supports that
functionality already.

Add an injection module so that MCE decoding code in production kernels
like those in RHEL and SLES can be tested.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:59 +02:00
Borislav Petkov 30e1f7a812 EDAC: Export edac sysfs class to users.
Move toplevel sysfs class to the stub and make it available to
non-modularized code too. Add proper refcounting of its users and move
the registration functionality into the reference counting routines.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:59 +02:00
Borislav Petkov 7cfd4a8744 EDAC, MCE: Pass complete MCE info to decoders
... instead of the MCi_STATUS info only for improved handling of certain
types of errors later.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:58 +02:00
Borislav Petkov 6337583d7d EDAC, MCE: Sanitize error codes
Clean up error codes names, shorten to mnemonics, add RRRR boundary
checking.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:58 +02:00
Borislav Petkov 0ee8efa8f4 EDAC, MCE: Remove unused function parameter
Remove remains from previous functionality.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:57 +02:00
Borislav Petkov c9f281fd96 EDAC, MCE: Add HW_ERR prefix
.. so that the user knows what she's looking at there in dmesg. Also,
fix a minor cosmetic output inconsistency.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:57 +02:00
Borislav Petkov ca755e0a49 EDAC: Fix error return
We should return a negative value when we cannot get the toplevel edac
sysfs class.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:56 +02:00
Justin P. Mattock 631dd1a885 Update broken web addresses in the kernel.
The patch below updates broken web addresses in the kernel

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-18 11:03:14 +02:00
Marcin Slusarz 64aab720bd i7core_edac: fix panic in udimm sysfs attributes registration
Array of udimm sysfs attributes was not ended with NULL marker, leading to
dereference of random memory.

  EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0
  EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1
  EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2
  BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4
  IP: [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
  Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name
  RIP: 0010:[<ffffffff81330b36>]  [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
  (...)
  Call Trace:
   [<ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1
   [<ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2
   [<ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557
   [<ffffffff81428901>] i7core_probe+0xccf/0xec0
  RIP  [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
  ---[ end trace 20de320855b81d78 ]---
  Kernel panic - not syncing: Attempted to kill init!

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-01 10:50:58 -07:00
Borislav Petkov 00740c5854 amd64_edac: Fix driver module removal
f4347553b3 removed the edac polling
mechanism in favor of using a notifier chain for conveying MCE
information to edac. However, the module removal path didn't test
whether the driver had setup the polling function workqueue at all and
the rmmod process was hanging in the kernel at try_to_del_timer_sync()
in the cancel_delayed_work() path, trying to cancel an uninitialized
work struct.

Fix that by adding a balancing check to the workqueue removal path.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-09-27 12:52:58 +02:00
Mauro Carvalho Chehab e6649cc629 i7300_edac: Properly initialize per-csrow memory size
Due to the current edac-core limits, we cannot represent a per-channel
memory size, for FB-DIMM drivers. So, we need to sum-up all values
for each slot, in order to properly represent the total amount of
memory found by the i7300 driver.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-09-24 14:16:12 -03:00
Mauro Carvalho Chehab 1aa4a7b6b0 V4L/DVB: i7300_edac: better initialize page counts
It is still somewhat fake, as the pages may not be on this exact order,
and may even be used in mirror mode, but this is a best guess than the
other random fake values.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-09-24 14:16:12 -03:00
Andreas Herrmann 23ac4ae827 x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB
The file names are somehow misleading as the code is not specific to
AMD K8 CPUs anymore. The files accomodate code for other AMD CPU
northbridges as well.

Same is true for the config option which is valid for AMD CPU
northbridges in general and not specific to K8.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100917160343.GD4958@loge.amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-20 14:22:58 -07:00
Andreas Herrmann 900f9ac9f1 x86, k8-gart: Decouple handling of garts and northbridges
So far we only provide num_k8_northbridges. This is required in
different areas (e.g. L3 cache index disable, GART). But not all AMD
CPUs provide a GART. Thus it is useful to split off the GART handling
from the generic caching of AMD northbridge misc devices.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100917160254.GC4958@loge.amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-17 13:26:21 -07:00