OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Shiju Jose	b7765b0a03	ACPI: APEI: Fix AER info corruption when error status data has multiple sections [ Upstream commit e2abc47a5a1a9f641e7cacdca643fdd40729bf6e ] ghes_handle_aer() passes AER data to the PCI core for logging and recovery by calling aer_recover_queue() with a pointer to struct aer_capability_regs. The problem was that aer_recover_queue() queues the pointer directly without copying the aer_capability_regs data. The pointer was to the ghes->estatus buffer, which could be reused before aer_recover_work_func() reads the data. To avoid this problem, allocate a new aer_capability_regs structure from the ghes_estatus_pool, copy the AER data from the ghes->estatus buffer into it, pass a pointer to the new struct to aer_recover_queue(), and free it after aer_recover_work_func() has processed it. Reported-by: Bjorn Helgaas <helgaas@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:37 +00:00
Linus Torvalds	7adcadb984	- Make ghes_edac a simple module like the rest of the EDAC drivers and drop this forced built-in only configuration by disentangling it from GHES. Work by Jia He. - The usual small cleanups and improvements all over EDAC land -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOXPvoACgkQEsHwGGHe VUq98xAAmhz4u9e9pXG0Ixkx25ZtnZ+YxANeQ53Hsa2gWicbcoFgL2E30gi97c1y X9W361B2Q5dYq+J/YRUnEOXlI/KMWLxzNykSvVipUFNfxXZH+PijEAArz2V35/uE 6ZISRLUYVYEtHEoUXbTogeyBmBUnIaJfYheZCluDQlWPggsDESP1qmE+FTg25OBs rDl5y+zUZYPxrWustNodVThPyhdMwGyYAUS6qYKCoNs9SNkAjGnrXoPc9j/U+cV+ qMY2dNS3uKnCujKEssQhcHucyWgCEDvmEKWMH4ItryV2UBBjpNRoM6HDe7XFKwVJ riOKX8VDrpdSdlV1jbCx9KB47BUwFygOYsFdW7gIDJ1hb8usN4nSYQNDIlZKEIQG cHNpv2XGT+pCSvyc4Iv2Fgyvnp25XensSQwQAtk5Y4/lJL1yrgcPjMOkPmRS+mmH BclDWNbL+gwqkyWxgfoivDBOetLgwJYTr2ewBr6QbBtwLB8rL4BxXIdomcoFPuxi jAxixZnTbS+Xq5S7uYK4r6KbaHGcJtwolXMGjx13IHmPfvYtTTQzfRcrBlAtQ/pV BDLoygmDVlkhSVx6bi5V5QZ06rcWYR4cRsBQ54FnBGMr730ZljgFONOHFtUab28T C+YUOaeLEYEYI0cIkkyoSuiz6avB6YvQAiyEPM0EdHZrQFwhBBw= =DFp8 -----END PGP SIGNATURE----- Merge tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - Make ghes_edac a simple module like the rest of the EDAC drivers and drop the forced built-in only configuration by disentangling it from GHES (Jia He) - The usual small cleanups and improvements all over EDAC land * tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper() EDAC/i5400: Fix typo in comment: vaious -> various EDAC/mc_sysfs: Increase legacy channel support to 12 MAINTAINERS: Make Mauro EDAC reviewer MAINTAINERS: Make Manivannan Sadhasivam the maintainer of qcom_edac EDAC/igen6: Return the correct error type when not the MC owner apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg() EDAC: Check for GHES preference in the chipset-specific EDAC drivers EDAC/ghes: Make ghes_edac a proper module EDAC/ghes: Prepare to make ghes_edac a proper module EDAC/ghes: Add a notifier for reporting memory errors efi/cper: Export several helpers for ghes_edac to use EDAC/i5000: Mark as BROKEN	2022-12-12 14:47:31 -08:00
Jia He	802e7f1dfe	EDAC/ghes: Make ghes_edac a proper module Commit `dc4e8c07e9` ("ACPI: APEI: explicit init of HEST and GHES in apci_init()") introduced a bug leading to ghes_edac_register() to be invoked before edac_init(). Because at that time the bus "edac" hadn't been even registered, this created sysfs nodes as /devices/mc0 instead of /sys/devices/system/edac/mc/mc0 on an Ampere eMag server. Fix this by turning ghes_edac into a proper module. The list of GHES devices returned is not protected from being modified concurrently but it is pretty static as it gets created only during GHES init and latter is not a module so... [ bp: Massage. ] Fixes: `dc4e8c07e9` ("ACPI: APEI: explicit init of HEST and GHES in apci_init()") Co-developed-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Jia He <justin.he@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221010023559.69655-5-justin.he@arm.com	2022-10-21 21:59:19 +02:00
Jia He	9057a3f7ac	EDAC/ghes: Prepare to make ghes_edac a proper module To make ghes_edac a proper module, prepare to decouple its dependencies from GHES. Move the ghes_edac.force_load parameter to ghes.c in order to properly control whether ghes_edac should be force-loaded: In ghes_edac_register() it is too late to set the module flag. Introduce a helper ghes_get_devices(), which returns the list of GHES devices which got probed when the platform-check passes on the system. The previous force_load check is not needed in ghes_edac_unregister() since it will be checked in the module's init function of ghes_edac later. [ bp: Massage. ] Suggested-by: Toshi Kani <toshi.kani@hpe.com> Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Jia He <justin.he@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221010023559.69655-4-justin.he@arm.com	2022-10-21 19:32:38 +02:00
Jia He	8e40612f61	EDAC/ghes: Add a notifier for reporting memory errors In order to make it a proper module and disentangle it from facilities, add a notifier for reporting memory errors. Use an atomic notifier because calls sites like ghes_proc_in_irq() run in interrupt context. [ bp: Massage commit message. ] Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Jia He <justin.he@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221010023559.69655-3-justin.he@arm.com	2022-10-20 13:25:53 +02:00
Ashish Kalra	43d2748394	ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init() Change num_ghes from int to unsigned int, preventing an overflow and causing subsequent vmalloc() to fail. The overflow happens in ghes_estatus_pool_init() when calculating len during execution of the statement below as both multiplication operands here are signed int: len += (num_ghes * GHES_ESOURCE_PREALLOC_MAX_SIZE); The following call trace is observed because of this bug: [ 9.317108] swapper/0: vmalloc error: size 18446744071562596352, exceeds total pages, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0-1 [ 9.317131] Call Trace: [ 9.317134] <TASK> [ 9.317137] dump_stack_lvl+0x49/0x5f [ 9.317145] dump_stack+0x10/0x12 [ 9.317146] warn_alloc.cold+0x7b/0xdf [ 9.317150] ? __device_attach+0x16a/0x1b0 [ 9.317155] __vmalloc_node_range+0x702/0x740 [ 9.317160] ? device_add+0x17f/0x920 [ 9.317164] ? dev_set_name+0x53/0x70 [ 9.317166] ? platform_device_add+0xf9/0x240 [ 9.317168] __vmalloc_node+0x49/0x50 [ 9.317170] ? ghes_estatus_pool_init+0x43/0xa0 [ 9.317176] vmalloc+0x21/0x30 [ 9.317177] ghes_estatus_pool_init+0x43/0xa0 [ 9.317179] acpi_hest_init+0x129/0x19c [ 9.317185] acpi_init+0x434/0x4a4 [ 9.317188] ? acpi_sleep_proc_init+0x2a/0x2a [ 9.317190] do_one_initcall+0x48/0x200 [ 9.317195] kernel_init_freeable+0x221/0x284 [ 9.317200] ? rest_init+0xe0/0xe0 [ 9.317204] kernel_init+0x1a/0x130 [ 9.317205] ret_from_fork+0x22/0x30 [ 9.317208] </TASK> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-10-13 20:40:09 +02:00
Shiju Jose	9aa9cf3ee9	ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records CPER records describing a firmware-first error are identified by GUID. The ghes driver currently logs, but ignores any unknown CPER records. This prevents describing errors that can't be represented by a standard entry, that would otherwise allow a driver to recover from an error. The UEFI spec calls these 'Non-standard Section Body' (N.2.3 of version 2.8). Add a notifier chain for these non-standard/vendor-records. Callers must identify their type of records by GUID. Record data is copied to memory from the ghes_estatus_pool to allow us to keep it until after the notifier has run. Co-developed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200903123456.1823-2-shiju.jose@huawei.com Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>	2020-09-16 10:30:42 +01:00
James Morse	7f17b4a121	ACPI: APEI: Kick the memory_failure() queue for synchronous errors memory_failure() offlines or repairs pages of memory that have been discovered to be corrupt. These may be detected by an external component, (e.g. the memory controller), and notified via an IRQ. In this case the work is queued as not all of memory_failure()s work can happen in IRQ context. If the error was detected as a result of user-space accessing a corrupt memory location the CPU may take an abort instead. On arm64 this is a 'synchronous external abort', and on a firmware first system it is replayed using NOTIFY_SEA. This notification has NMI like properties, (it can interrupt IRQ-masked code), so the memory_failure() work is queued. If we return to user-space before the queued memory_failure() work is processed, we will take the fault again. This loop may cause platform firmware to exceed some threshold and reboot when Linux could have recovered from this error. For NMIlike notifications keep track of whether memory_failure() work was queued, and make task_work pending to flush out the queue. To save memory allocations, the task_work is allocated as part of the ghes_estatus_node, and free()ing it back to the pool is deferred. Signed-off-by: James Morse <james.morse@arm.com> Tested-by: Tyler Baicar <baicar@os.amperecomputing.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-19 19:51:11 +02:00
James Morse	5cc6c68287	ACPI / APEI: Don't update struct ghes' flags in read/clear estatus ghes_read_estatus() sets a flag in struct ghes if the buffer of CPER records needs to be cleared once the records have been processed. This flag value is a problem if a struct ghes can be processed concurrently, as happens at probe time if an NMI arrives for the same error source. The NMI clears the flag, meaning the interrupted handler may never do the ghes_estatus_clear() work. The GHES_TO_CLEAR flags is only set at the same time as buffer_paddr, which is now owned by the caller and passed to ghes_clear_estatus(). Use this value as the flag. A non-zero buf_paddr returned by ghes_read_estatus() means ghes_clear_estatus() should clear this address. ghes_read_estatus() already checks for a read of error_status_address being zero, so CPER records cannot be written here. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-07 23:10:45 +01:00
James Morse	eeb2555779	ACPI / APEI: Don't store CPER records physical address in struct ghes When CPER records are found the address of the records is stashed in the struct ghes. Once the records have been processed, this address is overwritten with zero so that it won't be processed again without being re-populated by firmware. This goes wrong if a struct ghes can be processed concurrently, as can happen at probe time when an NMI occurs. If the NMI arrives on another CPU, the probing CPU may call ghes_clear_estatus() on the records before the handler had finished with them. Even on the same CPU, once the interrupted handler is resumed, it will call ghes_clear_estatus() on the NMIs records, this memory may have already been re-used by firmware. Avoid this stashing by letting the caller hold the address. A later patch will do away with the use of ghes->flags in the read/clear code too. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-07 23:10:45 +01:00
James Morse	fb7be08f1a	ACPI / APEI: Make estatus pool allocation a static size Adding new NMI-like notifications duplicates the calls that grow and shrink the estatus pool. This is all pretty pointless, as the size is capped to 64K. Allocate this for each ghes and drop the code that grows and shrinks the pool. Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-07 23:10:45 +01:00
James Morse	e147133a42	ACPI / APEI: Make hest.c manage the estatus memory pool ghes.c has a memory pool it uses for the estatus cache and the estatus queue. The cache is initialised when registering the platform driver. For the queue, an NMI-like notification has to grow/shrink the pool as it is registered and unregistered. This is all pretty noisy when adding new NMI-like notifications, it would be better to replace this with a static pool size based on the number of users. As a precursor, move the call that creates the pool from ghes_init(), into hest.c. Later this will take the number of ghes entries and consolidate the queue allocations. Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put this. The pool is now initialised as part of ACPI's subsys_initcall(): (acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init()) Before this patch it happened later as a GHES specific device_initcall(). Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-07 23:10:45 +01:00
Dongjiu Geng	1035a07835	arm64 / ACPI: clean the additional checks before calling ghes_notify_sea() In order to remove the additional check before calling the ghes_notify_sea(), make stub definition when !CONFIG_ACPI_APEI_SEA. After this cleanup, we can simply call the ghes_notify_sea() to let APEI driver handle the SEA notification. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-08-09 10:55:18 +02:00
Alexandru Gagniuc	305d0e006a	EDAC, ghes: Remove unused argument to ghes_edac_report_mem_error() The use of the @ghes argument was removed in a previous commit, but function signature was not updated to reflect this. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Cc: linux-acpi@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180430213358.8319-1-mr.nuke.me@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2018-05-12 10:59:15 +02:00
Borislav Petkov	cc7f3f1326	ghes, EDAC: Fix ghes_edac registration Tony reported seeing "Internal error: Can't find EDAC structure" when injecting correctable errors due to the fact that ghes_edac would still load even if the whitelist won't hit. Drop the pr_err() in ghes_edac_report_mem_error() for now due to the hacky way how ghes_edac depends on ghes.c. While at it, make ghes_edac_register() return an error if it doesn't hit in the whitelist as it is the only sensible thing to do in that situation. Furthermore, move the call to it to happen last in ghes_probe() so that GHES initializing properly does not depend on ghes_edac init at all as latter is only reporting errors and not required for GHES's proper functioning. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Tested-by: Sughosh Ganu <sughosh.ganu@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20180420182015.zao3olss4tvvlxki@agluck-desk	2018-05-02 13:57:30 +02:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
gengdongjiu	c4335fdd38	ACPI: APEI: fix the wrong iteration of generic error status block The revision 0x300 generic error data entry is different from the old version, but currently iterating through the GHES estatus blocks does not take into account this difference. This will lead to failure to get the right data entry if GHES has revision 0x300 error data entry. Update the GHES estatus iteration macro to properly increment using acpi_hest_get_next(), and correct the iteration termination condition because the status block data length only includes error data length. Convert the CPER estatus checking and printing iteration logic to use same macro. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Tested-by: Tyler Baicar <tbaicar@codeaurora.org> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-24 03:29:46 +02:00
Tyler Baicar	621f48e40e	arm/arm64: KVM: add guest SEA support Currently external aborts are unsupported by the guest abort handling. Add handling for SEAs so that the host kernel reports SEAs which occur in the guest kernel. When an SEA occurs in the guest kernel, the guest exits and is routed to kvm_handle_guest_abort(). Prior to this patch, a print message of an unsupported FSC would be printed and nothing else would happen. With this patch, the code gets routed to the APEI handling of SEAs in the host kernel to report the SEA information. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-22 18:22:05 +01:00
Tyler Baicar	7edda0886b	acpi: apei: handle SEA notification type for ARMv8 ARM APEI extension proposal added SEA (Synchronous External Abort) notification type for ARMv8. Add a new GHES error source handling function for SEA. If an error source's notification type is SEA, then this function can be registered into the SEA exception handler. That way GHES will parse and report SEA exceptions when they occur. An SEA can interrupt code that had interrupts masked and is treated as an NMI. To aid this the page of address space for mapping APEI buffers while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is changed to use the helper methods to find the prot_t to map with in the same way as ghes_ioremap_pfn_irq(). Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-22 18:22:03 +01:00
Tyler Baicar	bbcc2e7b64	ras: acpi/apei: cper: add support for generic data v3 structure The ACPI 6.1 spec adds a new revision of the generic error data entry structure. Add support to handle the new structure as well as properly verify and iterate through the generic data entries. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-22 15:43:47 +01:00
Tyler Baicar	42aa560446	acpi: apei: read ack upon ghes record consumption A RAS (Reliability, Availability, Serviceability) controller may be a separate processor running in parallel with OS execution, and may generate error records for consumption by the OS. If the RAS controller produces multiple error records, then they may be overwritten before the OS has consumed them. The Generic Hardware Error Source (GHES) v2 structure introduces the capability for the OS to acknowledge the consumption of the error record generated by the RAS controller. A RAS controller supporting GHESv2 shall wait for the acknowledgment before writing a new error record, thus eliminating the race condition. Add support for parsing of GHESv2 sub-tables as well. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-22 15:43:47 +01:00
Lv Zheng	0a00fd5e20	ACPICA: Restore error table definitions to reduce code differences between Linux and ACPICA upstream. The following commit has changed ACPICA table header definitions: Commit: `88f074f487` Subject: ACPI, CPER: Update cper info While such definitions are currently maintained in ACPICA. As the modifications applying to the table definitions affect other OSPMs' drivers, it is very difficult for ACPICA to initiate a process to complete the merge. Thus this commit finally only leaves us divergences. Revert such naming modifications to reduce the source code differecnes between Linux and ACPICA upstream. No functional changes. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Cc: Bob Moore <robert.moore@intel.com> Cc: Chen, Gong <gong.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-06-16 22:33:50 +02:00
Chen, Gong	88f074f487	ACPI, CPER: Update cper info We have a lot of confusing names of functions and data structures in amongs the the error reporting code. In particular the "apei" prefix has been applied to many objects that are not part of APEI. Since we will be using these routines for extended error log reporting it will be clearer if we fix up the names first. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Reviewed-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2013-10-21 15:12:00 -07:00
Mauro Carvalho Chehab	21480547c8	ghes: add the needed hooks for EDAC error report In order to allow reporting errors via EDAC, add hooks for: 1) register an EDAC driver; 2) unregister an EDAC driver; 3) report errors via EDAC. As the EDAC driver will need to access the ghes structure, adds it as one of the parameters for ghes_do_proc. Acked-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2013-02-25 19:41:51 -03:00
Mauro Carvalho Chehab	40e0641538	ghes: move structures/enum to a header file As a ghes_edac driver will need to access ghes structures, in order to properly handle the errors, move those structures to a separate header file. No functional changes. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2013-02-21 14:16:28 -03:00

25 Commits