While commit 55f1ea1521 ("efi: Fix for_each_efi_memory_desc_in_map()
for empty memmaps") made an attempt to deal with empty memory maps, it
didn't address the case where the map field never gets set, as is
apparently the case when running under Xen.
Reported-by: <lists@ssl-mail.com>
Tested-by: <lists@ssl-mail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Jan Beulich <jbeulich@suse.com>
[ Guard the loop with a NULL check instead of pointer underflow ]
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Re-organize the GUID table so that every GUID takes a single line.
This makes each line super long, but if you have a large enough terminal
(or zoom out of a small terminal) then you can see the structure at
a glance - which is more readable than it was the case with the
multi-line layout.
Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20160627104920.GA9099@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nothing calls the efi_get_time() function on x86, but it does suffer
from the 32-bit time_t overflow in 2038.
This removes the function, we can always put it back in case we need
it later.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1466839230-12781-8-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit makes a few slight modifications to the efi_call_virt() macro
to get it to work with function pointers that are stored in locations
other than efi.systab->runtime, and renames the macro to
efi_call_virt_pointer(). The majority of the changes here are to pull
these macros up into header files so that they can be accessed from
outside of drivers/firmware/efi/runtime-wrappers.c.
The most significant change not directly related to the code move is to
add an extra "p" argument into the appropriate efi_call macros, and use
that new argument in place of the, formerly hard-coded,
efi.systab->runtime pointer.
The last piece of the puzzle was to add an efi_call_virt() macro back into
drivers/firmware/efi/runtime-wrappers.c to wrap around the new
efi_call_virt_pointer() macro - this was mainly to keep the code from
looking too cluttered by adding a bunch of extra references to
efi.systab->runtime everywhere.
Note that I also broke up the code in the efi_call_virt_pointer() macro a
bit in the process of moving it.
Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roy Franz <roy.franz@linaro.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1466839230-12781-5-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a comment documenting why EFI GUIDs are laid out like they are.
Ideally I'd like to change all the ", " to "," too, but right now the
format is such that checkpatch won't complain with new ones, and staring
at checkpatch didn't get me anywhere towards making that work.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1466839230-12781-3-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have a generic read_persistent_clock64 interface now, and can
change the ia64 implementation to provide that instead of
read_persistent_clock.
The main point of this is to avoid the use of struct timespec
in the global efi.h, which would cause build errors as soon
as we want to build a kernel without 'struct timespec' defined
on 32-bit architectures.
Aside from this, we get a little closer to removing the
__weak read_persistent_clock() definition, which relies on
converting all architectures to provide read_persistent_clock64
instead.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Commit:
78ce248faa ("efi: Iterate over efi.memmap in for_each_efi_memory_desc()")
introduced a regression for systems booted with the 'noefi' kernel option.
In particular, I observed an early kernel hang in efi_find_mirror()'s
for_each_efi_memory_desc() call. As we don't have efi memmap on this
system we enter this iterator with the following parameters:
efi.memmap.map = 0, efi.memmap.map_end = 0, efi.memmap.desc_size = 28
... then for_each_efi_memory_desc_in_map() does the following comparison:
(md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size);
... where md = 0, (m)->map_end = 0 and (m)->desc_size = 28 but when we subtract
something from a NULL pointer wrap around happens and we end up returning
invalid pointer and crash.
Fix it by using the correct pointer arithmetics.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 78ce248faa ("efi: Iterate over efi.memmap in for_each_efi_memory_desc()")
Link: http://lkml.kernel.org/r/1464690224-4503-2-git-send-email-matt@codeblueprint.co.uk
[ Made the changelog more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Generic UUID library defines structure type, macro to define UUID, and
the length of the UUID string. This patch removes duplicate data
structure definition, UUID string length constant as well as macro for
UUID handling.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The parameters atomic and duplicates of efivar_init always have opposite
values. Drop the parameter atomic, replace the uses of !atomic with
duplicates, and update the call sites accordingly.
The code using duplicates is slightly reorganized with an 'else', to avoid
duplicating the lock code.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Saurabh Sengar <saurabh.truth@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462570771-13324-5-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If an EFI capsule has been sent to the firmware we must match the type
of EFI reset against that required by the capsule to ensure it is
processed correctly.
Force an EFI reboot if a capsule is pending for the next reset.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kweh Hock Leong <hock.leong.kweh@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: joeyli <jlee@suse.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-29-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The EFI capsule mechanism allows data blobs to be passed to the EFI
firmware. A common use case is performing firmware updates. This patch
just introduces the main infrastructure for interacting with the
firmware, and a driver that allows users to upload capsules will come
in a later patch.
Once a capsule has been passed to the firmware, the next reboot must
be performed using the ResetSystem() EFI runtime service, which may
involve overriding the reboot type specified by reboot=. This ensures
the reset value returned by QueryCapsuleCapabilities() is used to
reset the system, which is required for the capsule to be processed.
efi_capsule_pending() is provided for this purpose.
At the moment we only allow a single capsule blob to be sent to the
firmware despite the fact that UpdateCapsule() takes a 'CapsuleCount'
parameter. This simplifies the API and shouldn't result in any
downside since it is still possible to send multiple capsules by
repeatedly calling UpdateCapsule().
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Kweh Hock Leong <hock.leong.kweh@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: joeyli <jlee@suse.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-28-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move efi_status_to_err() to the architecture independent code as it's
generally useful in all bits of EFI code where there is a need to
convert an efi_status_t to a kernel error value.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kweh Hock Leong <hock.leong.kweh@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: joeyli <jlee@suse.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-27-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This module installs a reboot callback, such that if reboot() is invoked
with a string argument NNN, "NNN" is copied to the "LoaderEntryOneShot"
EFI variable, to be read by the bootloader.
If the string matches one of the boot labels defined in its configuration,
the bootloader will boot once to that label. The "LoaderEntryRebootReason"
EFI variable is set with the reboot reason: "reboot", "shutdown".
The bootloader reads this reboot reason and takes particular action
according to its policy.
There are reboot implementations that do "reboot <reason>", such as
Android's reboot command and Upstart's reboot replacement, which pass
the reason as an argument to the reboot syscall. There is no
platform-agnostic way how those could be modified to pass the reason
to the bootloader, regardless of platform or bootloader.
Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stefan Stanacar <stefan.stanacar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-26-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to hand over the framebuffer described by the GOP protocol and
discovered by the UEFI stub, make struct screen_info accessible by the
stub. This involves allocating a loader data buffer and passing it to the
kernel proper via a UEFI Configuration Table, since the UEFI stub executes
in the context of the decompressor, and cannot access the kernel's copy of
struct screen_info directly.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-22-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The Graphics Output Protocol code executes in the stub, so create a generic
version based on the x86 version in libstub so that we can move other archs
to it in subsequent patches. The new source file gop.c is added to the
libstub build for all architectures, but only wired up for x86.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-18-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In preparation of moving this code to drivers/firmware/efi and reusing
it on ARM and arm64, apply any changes that will be required to make this
code build for other architectures. This should make it easier to track
down problems that this move may cause to its operation on x86.
Note that the generic version uses slightly different ways of casting the
protocol methods and some other variables to the correct types, since such
method calls are not loosely typed on ARM and arm64 as they are on x86.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-17-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This implements shared support for discovering the presence of the
Memory Attributes table, and for parsing and validating its contents.
The table is validated against the construction rules in the UEFI spec.
Since this is a new table, it makes sense to complain if we encounter
a table that does not follow those rules.
The parsing and validation routine takes a callback that can be specified
per architecture, that gets passed each unique validated region, with the
virtual address retrieved from the ordinary memory map.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[ Trim pr_*() strings to 80 cols and use EFI consistently. ]
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-14-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This declares the GUID and struct typedef for the new memory attributes
table which contains the permissions that can be used to apply stricter
permissions to UEFI Runtime Services memory regions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-13-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Abolish the poorly named EFI memory map, 'memmap'. It is shadowed by a
bunch of local definitions in various files and having two ways to
access the EFI memory map ('efi.memmap' vs. 'memmap') is rather
confusing.
Furthermore, IA64 doesn't even provide this global object, which has
caused issues when trying to write generic EFI memmap code.
Replace all occurrences with efi.memmap, and convert the remaining
iterator code to use for_each_efi_mem_desc().
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Luck, Tony <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-8-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Most of the users of for_each_efi_memory_desc() are equally happy
iterating over the EFI memory map in efi.memmap instead of 'memmap',
since the former is usually a pointer to the latter.
For those users that want to specify an EFI memory map other than
efi.memmap, that can be done using for_each_efi_memory_desc_in_map().
One such example is in the libstub code where the firmware is queried
directly for the memory map, it gets iterated over, and then freed.
This change goes part of the way toward deleting the global 'memmap'
variable, which is not universally available on all architectures
(notably IA64) and is rather poorly named.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-7-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The EFI_SYSTEM_TABLES status bit is set by all EFI supporting architectures
upon discovery of the EFI system table, but the bit is never tested in any
code we have in the tree. So remove it.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Luck, Tony <tony.luck@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull EFI updates from Ingo Molnar:
"The main changes are:
- Use separate EFI page tables when executing EFI firmware code.
This isolates the EFI context from the rest of the kernel, which
has security and general robustness advantages. (Matt Fleming)
- Run regular UEFI firmware with interrupts enabled. This is already
the status quo under other OSs. (Ard Biesheuvel)
- Various x86 EFI enhancements, such as the use of non-executable
attributes for EFI memory mappings. (Sai Praneeth Prakhya)
- Various arm64 UEFI enhancements. (Ard Biesheuvel)
- ... various fixes and cleanups.
The separate EFI page tables feature got delayed twice already,
because it's an intrusive change and we didn't feel confident about
it - third time's the charm we hope!"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU
x86/efi: Only map kernel text for EFI mixed mode
x86/efi: Map EFI_MEMORY_{XP,RO} memory region bits to EFI page tables
x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()
efi/arm*: Perform hardware compatibility check
efi/arm64: Check for h/w support before booting a >4 KB granular kernel
efi/arm: Check for LPAE support before booting a LPAE kernel
efi/arm-init: Use read-only early mappings
efi/efistub: Prevent __init annotations from being used
arm64/vmlinux.lds.S: Handle .init.rodata.xxx and .init.bss sections
efi/arm64: Drop __init annotation from handle_kernel_image()
x86/mm/pat: Use _PAGE_GLOBAL bit for EFI page table mappings
efi/runtime-wrappers: Run UEFI Runtime Services with interrupts enabled
efi: Reformat GUID tables to follow the format in UEFI spec
efi: Add Persistent Memory type name
efi: Add NV memory attribute
x86/efi: Show actual ending addresses in efi_print_memmap
x86/efi/bgrt: Don't ignore the BGRT if the 'valid' bit is 0
efivars: Use to_efivar_entry
efi: Runtime-wrapper: Get rid of the rtc_lock spinlock
...
- Initial page table creation reworked to avoid breaking large block
mappings (huge pages) into smaller ones. The ARM architecture requires
break-before-make in such cases to avoid TLB conflicts but that's not
always possible on live page tables
- Kernel virtual memory layout: the kernel image is no longer linked to
the bottom of the linear mapping (PAGE_OFFSET) but at the bottom of
the vmalloc space, allowing the kernel to be loaded (nearly) anywhere
in physical RAM
- Kernel ASLR: position independent kernel Image and modules being
randomly mapped in the vmalloc space with the randomness is provided
by UEFI (efi_get_random_bytes() patches merged via the arm64 tree,
acked by Matt Fleming)
- Implement relative exception tables for arm64, required by KASLR
(initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c but
actual x86 conversion to deferred to 4.7 because of the merge
dependencies)
- Support for the User Access Override feature of ARMv8.2: this allows
uaccess functions (get_user etc.) to be implemented using LDTR/STTR
instructions. Such instructions, when run by the kernel, perform
unprivileged accesses adding an extra level of protection. The
set_fs() macro is used to "upgrade" such instruction to privileged
accesses via the UAO bit
- Half-precision floating point support (part of ARMv8.2)
- Optimisations for CPUs with or without a hardware prefetcher (using
run-time code patching)
- copy_page performance improvement to deal with 128 bytes at a time
- Sanity checks on the CPU capabilities (via CPUID) to prevent
incompatible secondary CPUs from being brought up (e.g. weird
big.LITTLE configurations)
- valid_user_regs() reworked for better sanity check of the sigcontext
information (restored pstate information)
- ACPI parking protocol implementation
- CONFIG_DEBUG_RODATA enabled by default
- VDSO code marked as read-only
- DEBUG_PAGEALLOC support
- ARCH_HAS_UBSAN_SANITIZE_ALL enabled
- Erratum workaround Cavium ThunderX SoC
- set_pte_at() fix for PROT_NONE mappings
- Code clean-ups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW6u95AAoJEGvWsS0AyF7xMyoP/3x2O6bgreSQ84BdO4JChN4+
RQ9OVdX8u2ItO9sgaCY2AA6KoiBuEjGmPl/XRuK0I7DpODTtRjEXQHuNNhz8AelC
hn4AEVqamY6Z5BzHFIjs8G9ydEbq+OXcKWEdwSsBhP/cMvI7ss3dps1f5iNPT5Vv
50E/kUz+aWYy7pKlB18VDV7TUOA3SuYuGknWV8+bOY5uPb8hNT3Y3fHOg/EuNNN3
DIuYH1V7XQkXtF+oNVIGxzzJCXULBE7egMcWAm1ydSOHK0JwkZAiL7OhI7ceVD0x
YlDxBnqmi4cgzfBzTxITAhn3OParwN6udQprdF1WGtFF6fuY2eRDSH/L/iZoE4DY
OulL951OsBtF8YC3+RKLk908/0bA2Uw8ftjCOFJTYbSnZBj1gWK41VkCYMEXiHQk
EaN8+2Iw206iYIoyvdjGCLw7Y0oakDoVD9vmv12SOaHeQljTkjoN8oIlfjjKTeP7
3AXj5v9BDMDVh40nkVayysRNvqe48Kwt9Wn0rhVTLxwdJEiFG/OIU6HLuTkretdN
dcCNFSQrRieSFHpBK9G0vKIpIss1ZwLm8gjocVXH7VK4Mo/TNQe4p2/wAF29mq4r
xu1UiXmtU3uWxiqZnt72LOYFCarQ0sFA5+pMEvF5W+NrVB0wGpXhcwm+pGsIi4IM
LepccTgykiUBqW5TRzPz
=/oS+
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Here are the main arm64 updates for 4.6. There are some relatively
intrusive changes to support KASLR, the reworking of the kernel
virtual memory layout and initial page table creation.
Summary:
- Initial page table creation reworked to avoid breaking large block
mappings (huge pages) into smaller ones. The ARM architecture
requires break-before-make in such cases to avoid TLB conflicts but
that's not always possible on live page tables
- Kernel virtual memory layout: the kernel image is no longer linked
to the bottom of the linear mapping (PAGE_OFFSET) but at the bottom
of the vmalloc space, allowing the kernel to be loaded (nearly)
anywhere in physical RAM
- Kernel ASLR: position independent kernel Image and modules being
randomly mapped in the vmalloc space with the randomness is
provided by UEFI (efi_get_random_bytes() patches merged via the
arm64 tree, acked by Matt Fleming)
- Implement relative exception tables for arm64, required by KASLR
(initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c
but actual x86 conversion to deferred to 4.7 because of the merge
dependencies)
- Support for the User Access Override feature of ARMv8.2: this
allows uaccess functions (get_user etc.) to be implemented using
LDTR/STTR instructions. Such instructions, when run by the kernel,
perform unprivileged accesses adding an extra level of protection.
The set_fs() macro is used to "upgrade" such instruction to
privileged accesses via the UAO bit
- Half-precision floating point support (part of ARMv8.2)
- Optimisations for CPUs with or without a hardware prefetcher (using
run-time code patching)
- copy_page performance improvement to deal with 128 bytes at a time
- Sanity checks on the CPU capabilities (via CPUID) to prevent
incompatible secondary CPUs from being brought up (e.g. weird
big.LITTLE configurations)
- valid_user_regs() reworked for better sanity check of the
sigcontext information (restored pstate information)
- ACPI parking protocol implementation
- CONFIG_DEBUG_RODATA enabled by default
- VDSO code marked as read-only
- DEBUG_PAGEALLOC support
- ARCH_HAS_UBSAN_SANITIZE_ALL enabled
- Erratum workaround Cavium ThunderX SoC
- set_pte_at() fix for PROT_NONE mappings
- Code clean-ups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (99 commits)
arm64: kasan: Fix zero shadow mapping overriding kernel image shadow
arm64: kasan: Use actual memory node when populating the kernel image shadow
arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission
arm64: Fix misspellings in comments.
arm64: efi: add missing frame pointer assignment
arm64: make mrs_s prefixing implicit in read_cpuid
arm64: enable CONFIG_DEBUG_RODATA by default
arm64: Rework valid_user_regs
arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly
arm64: KVM: Move kvm_call_hyp back to its original localtion
arm64: mm: treat memstart_addr as a signed quantity
arm64: mm: list kernel sections in order
arm64: lse: deal with clobbered IP registers after branch via PLT
arm64: mm: dump: Use VA_START directly instead of private LOWEST_ADDR
arm64: kconfig: add submenu for 8.2 architectural features
arm64: kernel: acpi: fix ioremap in ACPI parking protocol cpu_postboot
arm64: Add support for Half precision floating point
arm64: Remove fixmap include fragility
arm64: Add workaround for Cavium erratum 27456
arm64: mm: Mark .rodata as RO
...
This exposes the firmware's implementation of EFI_RNG_PROTOCOL via a new
function efi_get_random_bytes().
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This makes it much easier to hunt for typos in the GUID definitions.
It also makes checkpatch complain less about efi.h GUID additions, so
that if you add another one with the same style, checkpatch won't
complain about it.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1455712566-16727-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
"rm -rf" is bricking some peoples' laptops because of variables being
used to store non-reinitializable firmware driver data that's required
to POST the hardware.
These are 100% bugs, and they need to be fixed, but in the mean time it
shouldn't be easy to *accidentally* brick machines.
We have to have delete working, and picking which variables do and don't
work for deletion is quite intractable, so instead make everything
immutable by default (except for a whitelist), and make tools that
aren't quite so broad-spectrum unset the immutable flag.
Signed-off-by: Peter Jones <pjones@redhat.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
All the variables in this list so far are defined to be in the global
namespace in the UEFI spec, so this just further ensures we're
validating the variables we think we are.
Including the guid for entries will become more important in future
patches when we decide whether or not to allow deletion of variables
based on presence in this list.
Signed-off-by: Peter Jones <pjones@redhat.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
The function efi_query_variable_store() may be invoked by
efivar_entry_set_nonblocking(), which itself takes care to only
call a non-blocking version of the SetVariable() runtime
wrapper. However, efi_query_variable_store() may call the
SetVariable() wrapper directly, as well as the wrapper for
QueryVariableInfo(), both of which could deadlock in the same
way we are trying to prevent by calling
efivar_entry_set_nonblocking() in the first place.
So instead, modify efi_query_variable_store() to use the
non-blocking variants of QueryVariableInfo() (and give up rather
than free up space if the available space is below
EFI_MIN_RESERVE) if invoked with the 'nonblocking' argument set
to true.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1454364428-494-5-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This introduces a new runtime wrapper for the
QueryVariableInfo() UEFI Runtime Service, which gives up
immediately rather than spins on failure to grab the efi_runtime
spinlock.
This is required in the non-blocking path of the efi-pstore
code.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1454364428-494-4-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is no need for a separate nonblocking prototype definition
for the SetVariable() UEFI Runtime Service, since it is
identical to the blocking version.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1454364428-494-3-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have been getting away with using a void* for the physical
address of the UEFI memory map, since, even on 32-bit platforms
with 64-bit physical addresses, no truncation takes place if the
memory map has been allocated by the firmware (which only uses
1:1 virtually addressable memory), which is usually the case.
However, commit:
0f96a99dab ("efi: Add "efi_fake_mem" boot option")
adds code that clones and modifies the UEFI memory map, and the
clone may live above 4 GB on 32-bit platforms.
This means our use of void* for struct efi_memory_map::phys_map has
graduated from 'incorrect but working' to 'incorrect and
broken', and we need to fix it.
So redefine struct efi_memory_map::phys_map as phys_addr_t, and
get rid of a bunch of casts that are now unneeded.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: izumi.taku@jp.fujitsu.com
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: linux-efi@vger.kernel.org
Cc: matt.fleming@intel.com
Link: http://lkml.kernel.org/r/1445593697-1342-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch introduces new boot option named "efi_fake_mem".
By specifying this parameter, you can add arbitrary attribute
to specific memory range.
This is useful for debugging of Address Range Mirroring feature.
For example, if "efi_fake_mem=2G@4G:0x10000,2G@0x10a0000000:0x10000"
is specified, the original (firmware provided) EFI memmap will be
updated so that the specified memory regions have
EFI_MEMORY_MORE_RELIABLE attribute (0x10000):
<original>
efi: mem36: [Conventional Memory| | | | | | |WB|WT|WC|UC] range=[0x0000000100000000-0x00000020a0000000) (129536MB)
<updated>
efi: mem36: [Conventional Memory| |MR| | | | |WB|WT|WC|UC] range=[0x0000000100000000-0x0000000180000000) (2048MB)
efi: mem37: [Conventional Memory| | | | | | |WB|WT|WC|UC] range=[0x0000000180000000-0x00000010a0000000) (61952MB)
efi: mem38: [Conventional Memory| |MR| | | | |WB|WT|WC|UC] range=[0x00000010a0000000-0x0000001120000000) (2048MB)
efi: mem39: [Conventional Memory| | | | | | |WB|WT|WC|UC] range=[0x0000001120000000-0x00000020a0000000) (63488MB)
And you will find that the following message is output:
efi: Memory: 4096M/131455M mirrored memory
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
UEFI v2.5 introduces a runtime memory protection feature that splits
PE/COFF runtime images into separate code and data regions. Since this
may require special handling by the OS, allocate a EFI_xxx bit to
keep track of whether this feature is currently active or not.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Version 2.5 of the UEFI spec introduces a new configuration table
called the 'EFI Properties table'. Currently, it is only used to
convey whether the Memory Protection feature is enabled, which splits
PE/COFF images into separate code and data memory regions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
As we now have a common debug infrastructure between core and arm64 efi,
drop the bit of the interface passing verbose output flags around.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The UEFI spec v2.5 introduces a new memory attribute
EFI_MEMORY_RO, which is now the preferred attribute to convey
that the nature of the contents of such a region allows it to be
mapped read-only (i.e., it contains .text and .rodata only).
The specification of the existing EFI_MEMORY_WP attribute has been
updated to align more closely with its common use as a
cacheability attribute rather than a permission attribute.
Add the #define and add the attribute to the memory map dumping
routine.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438936621-5215-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 drivers / enabling modules:
NFIT:
Instantiates an "nvdimm bus" with the core and registers memory devices
(NVDIMMs) enumerated by the ACPI 6.0 NFIT (NVDIMM Firmware Interface
table). After registering NVDIMMs the NFIT driver then registers
"region" devices. A libnvdimm-region defines an access mode and the
boundaries of persistent memory media. A region may span multiple
NVDIMMs that are interleaved by the hardware memory controller. In
turn, a libnvdimm-region can be carved into a "namespace" device and
bound to the PMEM or BLK driver which will attach a Linux block device
(disk) interface to the memory.
PMEM:
Initially merged in v4.1 this driver for contiguous spans of persistent
memory address ranges is re-worked to drive PMEM-namespaces emitted by
the libnvdimm-core. In this update the PMEM driver, on x86, gains the
ability to assert that writes to persistent memory have been flushed all
the way through the caches and buffers in the platform to persistent
media. See memcpy_to_pmem() and wmb_pmem().
BLK:
This new driver enables access to persistent memory media through "Block
Data Windows" as defined by the NFIT. The primary difference of this
driver to PMEM is that only a small window of persistent memory is
mapped into system address space at any given point in time. Per-NVDIMM
windows are reprogrammed at run time, per-I/O, to access different
portions of the media. BLK-mode, by definition, does not support DAX.
BTT:
This is a library, optionally consumed by either PMEM or BLK, that
converts a byte-accessible namespace into a disk with atomic sector
update semantics (prevents sector tearing on crash or power loss). The
sinister aspect of sector tearing is that most applications do not know
they have a atomic sector dependency. At least today's disk's rarely
ever tear sectors and if they do one almost certainly gets a CRC error
on access. NVDIMMs will always tear and always silently. Until an
application is audited to be robust in the presence of sector-tearing
the usage of BTT is recommended.
Thanks to: Ross Zwisler, Jeff Moyer, Vishal Verma, Christoph Hellwig,
Ingo Molnar, Neil Brown, Boaz Harrosh, Robert Elliott, Matthew Wilcox,
Andy Rudoff, Linda Knippers, Toshi Kani, Nicholas Moulin, Rafael
Wysocki, and Bob Moore.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVjZGBAAoJEB7SkWpmfYgC4fkP/j+k6HmSRNU/yRYPyo7CAWvj
3P5P1i6R6nMZZbjQrQArAXaIyLlFk4sEQDYsciR6dmslhhFZAkR2eFwVO5rBOyx3
QN0yxEpyjJbroRFUrV/BLaFK4cq2oyJAFFHs0u7/pLHBJ4MDMqfRKAMtlnBxEkTE
LFcqXapSlvWitSbjMdIBWKFEvncaiJ2mdsFqT4aZqclBBTj00eWQvEG9WxleJLdv
+tj7qR/vGcwOb12X5UrbQXgwtMYos7A6IzhHbqwQL8IrOcJ6YB8NopJUpLDd7ZVq
KAzX6ZYMzNueN4uvv6aDfqDRLyVL7qoxM9XIjGF5R8SV9sF2LMspm1FBpfowo1GT
h2QMr0ky1nHVT32yspBCpE9zW/mubRIDtXxEmZZ53DIc4N6Dy9jFaNVmhoWtTAqG
b9pndFnjUzzieCjX5pCvo2M5U6N0AQwsnq76/CasiWyhSa9DNKOg8MVDRg0rbxb0
UvK0v8JwOCIRcfO3qiKcx+02nKPtjCtHSPqGkFKPySRvAdb+3g6YR26CxTb3VmnF
etowLiKU7HHalLvqGFOlDoQG6viWes9Zl+ZeANBOCVa6rL2O7ZnXJtYgXf1wDQee
fzgKB78BcDjXH4jHobbp/WBANQGN/GF34lse8yHa7Ym+28uEihDvSD1wyNLnefmo
7PJBbN5M5qP5tD0aO7SZ
=VtWG
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm
Pull libnvdimm subsystem from Dan Williams:
"The libnvdimm sub-system introduces, in addition to the
libnvdimm-core, 4 drivers / enabling modules:
NFIT:
Instantiates an "nvdimm bus" with the core and registers memory
devices (NVDIMMs) enumerated by the ACPI 6.0 NFIT (NVDIMM Firmware
Interface table).
After registering NVDIMMs the NFIT driver then registers "region"
devices. A libnvdimm-region defines an access mode and the
boundaries of persistent memory media. A region may span multiple
NVDIMMs that are interleaved by the hardware memory controller. In
turn, a libnvdimm-region can be carved into a "namespace" device and
bound to the PMEM or BLK driver which will attach a Linux block
device (disk) interface to the memory.
PMEM:
Initially merged in v4.1 this driver for contiguous spans of
persistent memory address ranges is re-worked to drive
PMEM-namespaces emitted by the libnvdimm-core.
In this update the PMEM driver, on x86, gains the ability to assert
that writes to persistent memory have been flushed all the way
through the caches and buffers in the platform to persistent media.
See memcpy_to_pmem() and wmb_pmem().
BLK:
This new driver enables access to persistent memory media through
"Block Data Windows" as defined by the NFIT. The primary difference
of this driver to PMEM is that only a small window of persistent
memory is mapped into system address space at any given point in
time.
Per-NVDIMM windows are reprogrammed at run time, per-I/O, to access
different portions of the media. BLK-mode, by definition, does not
support DAX.
BTT:
This is a library, optionally consumed by either PMEM or BLK, that
converts a byte-accessible namespace into a disk with atomic sector
update semantics (prevents sector tearing on crash or power loss).
The sinister aspect of sector tearing is that most applications do
not know they have a atomic sector dependency. At least today's
disk's rarely ever tear sectors and if they do one almost certainly
gets a CRC error on access. NVDIMMs will always tear and always
silently. Until an application is audited to be robust in the
presence of sector-tearing the usage of BTT is recommended.
Thanks to: Ross Zwisler, Jeff Moyer, Vishal Verma, Christoph Hellwig,
Ingo Molnar, Neil Brown, Boaz Harrosh, Robert Elliott, Matthew Wilcox,
Andy Rudoff, Linda Knippers, Toshi Kani, Nicholas Moulin, Rafael
Wysocki, and Bob Moore"
* tag 'libnvdimm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm: (33 commits)
arch, x86: pmem api for ensuring durability of persistent memory updates
libnvdimm: Add sysfs numa_node to NVDIMM devices
libnvdimm: Set numa_node to NVDIMM devices
acpi: Add acpi_map_pxm_to_online_node()
libnvdimm, nfit: handle unarmed dimms, mark namespaces read-only
pmem: flag pmem block devices as non-rotational
libnvdimm: enable iostat
pmem: make_request cleanups
libnvdimm, pmem: fix up max_hw_sectors
libnvdimm, blk: add support for blk integrity
libnvdimm, btt: add support for blk integrity
fs/block_dev.c: skip rw_page if bdev has integrity
libnvdimm: Non-Volatile Devices
tools/testing/nvdimm: libnvdimm unit test infrastructure
libnvdimm, nfit, nd_blk: driver for BLK-mode access persistent memory
nd_btt: atomic sector updates
libnvdimm: infrastructure for btt devices
libnvdimm: write blk label set
libnvdimm: write pmem label set
libnvdimm: blk labels and namespace instantiation
...
UEFI GetMemoryMap() uses a new attribute bit to mark mirrored memory
address ranges. See UEFI 2.5 spec pages 157-158:
http://www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf
On EFI enabled systems scan the memory map and tell memblock about any
mirrored ranges.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xiexiuqi <xiexiuqi@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
So, I'm told this problem exists in the world:
> Subject: Build error in -next due to 'efi: Add esrt support'
>
> Building ia64:defconfig ... failed
> --------------
> Error log:
>
> drivers/firmware/efi/esrt.c:28:31: fatal error: asm/early_ioremap.h: No such file or directory
>
I'm not really sure how it's okay that we have things in asm-generic on
some platforms but not others - is having it the same everywhere not the
whole point of asm-generic?
That said, ia64 doesn't have early_ioremap.h . So instead, since it's
difficult to imagine new IA64 machines with UEFI 2.5, just don't build
this code there.
To me this looks like a workaround - doing something like:
generic-y += early_ioremap.h
in arch/ia64/include/asm/Kbuild would appear to be more correct, but
ia64 has its own early_memremap() decl in arch/ia64/include/asm/io.h ,
and it's a macro. So adding the above /and/ requiring that asm/io.h be
included /after/ asm/early_ioremap.h in all cases would fix it, but
that's pretty ugly as well. Since I'm not going to spend the rest of my
life rectifying ia64 headers vs "generic" headers that aren't generic,
it's much simpler to just not build there.
Note that I've only actually tried to build this patch on x86_64, but
esrt.o still gets built there, and that would seem to demonstrate that
the conditional building is working correctly at all the places the code
built before. I no longer have any ia64 machines handy to test that the
exclusion actually works there.
Signed-off-by: Peter Jones <pjones@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
(Compile-)Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
ACPI 6.0 formalizes e820-type-7 and efi-type-14 as persistent memory.
Mark it "reserved" and allow it to be claimed by a persistent memory
device driver.
This definition is in addition to the Linux kernel's existing type-12
definition that was recently added in support of shipping platforms with
NVDIMM support that predate ACPI 6.0 (which now classifies type-12 as
OEM reserved).
Note, /proc/iomem can be consulted for differentiating legacy
"Persistent Memory (legacy)" E820_PRAM vs standard "Persistent Memory"
E820_PMEM.
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add sysfs files for the EFI System Resource Table (ESRT) under
/sys/firmware/efi/esrt and for each EFI System Resource Entry under
entries/ as a subdir.
The EFI System Resource Table (ESRT) provides a read-only catalog of
system components for which the system accepts firmware upgrades via
UEFI's "Capsule Update" feature. This module allows userland utilities
to evaluate what firmware updates can be applied to this system, and
potentially arrange for those updates to occur.
The ESRT is described as part of the UEFI specification, in version 2.5
which should be available from http://uefi.org/specifications in early
2015. If you're a member of the UEFI Forum, information about its
addition to the standard is available as UEFI Mantis 1090.
For some hardware platforms, additional restrictions may be found at
http://msdn.microsoft.com/en-us/library/windows/hardware/jj128256.aspx ,
and additional documentation may be found at
http://download.microsoft.com/download/5/F/5/5F5D16CD-2530-4289-8019-94C6A20BED3C/windows-uefi-firmware-update-platform.docx
.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
... and hide the memory regions dump behind it. Make it default-off.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20141209095843.GA3990@pd.tnic
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
- reimplementation of the virtual remapping of UEFI Runtime Services in
a way that is stable across kexec
- emulation of the "setend" instruction for 32-bit tasks (user
endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
accordingly)
- compat_sys_call_table implemented in C (from asm) and made it a
constant array together with sys_call_table
- export CPU cache information via /sys (like other architectures)
- DMA API implementation clean-up in preparation for IOMMU support
- macros clean-up for KVM
- dropped some unnecessary cache+tlb maintenance
- CONFIG_ARM64_CPU_SUSPEND clean-up
- defconfig update (CPU_IDLE)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU25v3AAoJEGvWsS0AyF7xYjcP/j8ESvs+z0BPgeJ6XREfOnCh
cp+w/1rJ5BafJ5RRkibrciwTNOIJS4FGMivWyURtoh430lS0Rh7fxZ3Ouna3xjrT
Nf7AxenWoA8Lo6wHh+FlNUeGk3iWfX6WwA2tYrbKudK+LBJ1wHjwpE7cWQO0FgwJ
aFDahu+QD5/u45p/VcVctMtiEDvOxBdO8gfat6r+YkLm7pbRxQkZnpA/JE4Gps1p
Td5jvMNH9pXI5pffSbeR9Q+vs/r0yqKLXQg01Eb2bZgGDgwf9yzADrHuaKamZt35
X5flmLiTGC6swJCJvUkZC1Nuue33bXcvW5+vgvar+MNGyXsxv+B/wARLqGhiWhQZ
nLGwFpuNu6wdY9tGHb/XR8khcewkw1/lRH1hHKhchrmRyUqHvXcPgC5tamjLrY8C
BV3BAeQvRho8OKwWUmbXIlyON1vPux6CJdj4D/A5NL+qph2WHeVWJCXg6nVFx0Wc
Eb3bXbI4QRwTFL7pGRF8RyZJBAQtgYhQMKWMW2GHgUgn+r1EixG73BZoSwvpHrrw
FOR9AVNfVBqmNON8xiIb3DN4EViq76EF0jrsZh5I9EoWS2w5qtk60kJQgXE+M4EE
vOlmh3dhEVfCN2SxOn0bgoQmTulyjqGauTSSJKQbIBuinPFveukrJfGNFIWt0SZs
f38FBMo6sgU4VG85B+Fr
=X5x/
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"arm64 updates for 3.20:
- reimplementation of the virtual remapping of UEFI Runtime Services
in a way that is stable across kexec
- emulation of the "setend" instruction for 32-bit tasks (user
endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
accordingly)
- compat_sys_call_table implemented in C (from asm) and made it a
constant array together with sys_call_table
- export CPU cache information via /sys (like other architectures)
- DMA API implementation clean-up in preparation for IOMMU support
- macros clean-up for KVM
- dropped some unnecessary cache+tlb maintenance
- CONFIG_ARM64_CPU_SUSPEND clean-up
- defconfig update (CPU_IDLE)
The EFI changes going via the arm64 tree have been acked by Matt
Fleming. There is also a patch adding sys_*stat64 prototypes to
include/linux/syscalls.h, acked by Andrew Morton"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (47 commits)
arm64: compat: Remove incorrect comment in compat_siginfo
arm64: Fix section mismatch on alloc_init_p[mu]d()
arm64: Avoid breakage caused by .altmacro in fpsimd save/restore macros
arm64: mm: use *_sect to check for section maps
arm64: drop unnecessary cache+tlb maintenance
arm64:mm: free the useless initial page table
arm64: Enable CPU_IDLE in defconfig
arm64: kernel: remove ARM64_CPU_SUSPEND config option
arm64: make sys_call_table const
arm64: Remove asm/syscalls.h
arm64: Implement the compat_sys_call_table in C
syscalls: Declare sys_*stat64 prototypes if __ARCH_WANT_(COMPAT_)STAT64
compat: Declare compat_sys_sigpending and compat_sys_sigprocmask prototypes
arm64: uapi: expose our struct ucontext to the uapi headers
smp, ARM64: Kill SMP single function call interrupt
arm64: Emulate SETEND for AArch32 tasks
arm64: Consolidate hotplug notifier for instruction emulation
arm64: Track system support for mixed endian EL0
arm64: implement generic IOMMU configuration
arm64: Combine coherent and non-coherent swiotlb dma_ops
...
since that's a more logical and accurate place - Leif Lindholm
* Update efibootmgr URL in Kconfig help - Peter Jones
* Improve accuracy of EFI guid function names - Borislav Petkov
* Expose firmware platform size in sysfs for the benefit of EFI boot
loader installers and other utilities - Steve McIntyre
* Cleanup __init annotations for arm64/efi code - Ard Biesheuvel
* Mark the UIE as unsupported for rtc-efi - Ard Biesheuvel
* Fix memory leak in error code path of runtime map code - Dan Carpenter
* Improve robustness of get_memory_map() by removing assumptions on the
size of efi_memory_desc_t (which could change in future spec
versions) and querying the firmware instead of guessing about the
memmap size - Ard Biesheuvel
* Remove superfluous guid unparse calls - Ivan Khoronzhuk
* Delete unnecessary chosen@0 DT node FDT code since was duplicated
from code in drivers/of and is entirely unnecessary - Leif Lindholm
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUv69oAAoJEC84WcCNIz1VEYgP/1b27WRfCXs4q/8FP+UheSDS
nAFbGe9PjVPnxo5pA9VwPP6eNQ2zYiyNGEK1BlbQlFPZdSD1updIraA78CiF5iys
iSYyG9xVIcTB23RZI8aJLnBXbosIUKPJZ3FORv1LPhI6Mz1rCpraEaaUlv67rUKr
FLBG9cR7t9f/f+fJw6LOAAISGIG/4s0wQdA5/noaYkj5R5bICl2UTGtbwa0oNstb
NUO93aKDgaG/VljpIEeG6XV96Ioz7cHjQsEaX8sTrvT0n7nPNIqSDjFJOqWKJOXl
RsFrzyl8fFIbMuQatYv1f3efPvyH+iKOfHnHrvcjUNje0xhm7F0Bd86BkOw1a3JQ
pNb0YUWecI0Z/8GSzN8X0JQ7cowa3wI15Z/Hfs03odTXiM6VqwFAhuz/s5DEUdKS
U+rOPjU0ezt3G4oBB/VGgF9w5JWKfsMcsHgmLX9P+JYzKFrxggo1SXAtXUeRAqQp
agKmUB+k6Y1baQO8efkoM7rKL2F0q1SR9QiK+16BHCCkevD23v7IFGrHm2r1xKil
kvWlY4MkRVa4KGPxEFEDVty0HjXxImwYsxTaYVHTS7SMeoP41f6koHKB19NaB3No
5fqn/rT1KcJuhQj/I+vAixIX4WMJkX/MQVbtKfqSaKlAiRg3eRY6ONYr0jOglfF6
gaMuvmDd0HlV6UJvH/9L
=iPpM
-----END PGP SIGNATURE-----
Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/efi
Pull EFI updates from Matt Fleming:
" - Move efivarfs from the misc filesystem section to pseudo filesystem,
since that's a more logical and accurate place - Leif Lindholm
- Update efibootmgr URL in Kconfig help - Peter Jones
- Improve accuracy of EFI guid function names - Borislav Petkov
- Expose firmware platform size in sysfs for the benefit of EFI boot
loader installers and other utilities - Steve McIntyre
- Cleanup __init annotations for arm64/efi code - Ard Biesheuvel
- Mark the UIE as unsupported for rtc-efi - Ard Biesheuvel
- Fix memory leak in error code path of runtime map code - Dan Carpenter
- Improve robustness of get_memory_map() by removing assumptions on the
size of efi_memory_desc_t (which could change in future spec
versions) and querying the firmware instead of guessing about the
memmap size - Ard Biesheuvel
- Remove superfluous guid unparse calls - Ivan Khoronzhuk
- Delete unnecessary chosen@0 DT node FDT code since was duplicated
from code in drivers/of and is entirely unnecessary - Leif Lindholm
There's nothing super scary, mainly cleanups, and a merge from Ricardo who
kindly picked up some patches from the linux-efi mailing list while I
was out on annual leave in December.
Perhaps the biggest risk is the get_memory_map() change from Ard, which
changes the way that both the arm64 and x86 EFI boot stub build the
early memory map. It would be good to have it bake in linux-next for a
while.
"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Split of the remapping code from efi_config_init() so that the caller
can perform its own remapping. This is necessary to correctly handle
virtually remapped UEFI memory regions under kexec, as efi.systab will
have been updated to a virtual address.
Acked-by: Matt Fleming <matt.fleming@intel.com>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Call it what it does - "unparse" is plain-misleading.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
This adds support to the UEFI side for detecting the presence of
a SMBIOS 3.0 64-bit entry point. This allows the actual SMBIOS
structure table to reside at a physical offset over 4 GB, which
cannot be supported by the legacy SMBIOS 32-bit entry point.
Since the firmware can legally provide both entry points, store
the SMBIOS 3.0 entry point in a separate variable, and let the
DMI decoding layer decide which one will be used.
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
There are some circumstances that call for trying to write an EFI
variable in a non-blocking way. One such scenario is when writing pstore
data in efi_pstore_write() via the pstore_dump() kdump callback.
Now that we have an EFI runtime spinlock we need a way of aborting if
there is contention instead of spinning, since when writing pstore data
from the kdump callback, the runtime lock may already be held by the CPU
that's running the callback if we crashed in the middle of an EFI
variable operation.
The situation is sufficiently special that a new EFI variable operation
is warranted.
Introduce ->set_variable_nonblocking() for this use case. It is an
optional EFI backend operation, and need only be implemented by those
backends that usually acquire locks to serialize access to EFI
variables, as is the case for virt_efi_set_variable() where we now grab
the EFI runtime spinlock.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
At the moment, there are three architectures debug-printing the EFI memory
map at initialization: x86, ia64, and arm64. They all use different format
strings, plus the EFI memory type and the EFI memory attributes are
similarly hard to decode for a human reader.
Introduce a helper __init function that formats the memory type and the
memory attributes in a unified way, to a user-provided character buffer.
The array "memory_type_name" is copied from the arm64 code, temporarily
duplicating it. The (otherwise optional) braces around each string literal
in the initializer list are dropped in order to match the kernel coding
style more closely. The element size is tightened from 32 to 20 bytes
(maximum actual string length + 1) so that we can derive the field width
from the element size.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[ Dropped useless 'register' keyword, which compiler will ignore ]
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Add the following macro from the UEFI spec, for completeness:
EFI_MEMORY_UCE Memory cacheability attribute: The memory region
supports being configured as not cacheable, exported,
and supports the "fetch and add" semaphore mechanism.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
noefi param can be used for arches other than X86 later, thus move it
out of x86 platform code.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
We need a way to customize the behaviour of the EFI boot stub, in
particular, we need a way to disable the "chunking" workaround, used
when reading files from the EFI System Partition.
One of my machines doesn't cope well when reading files in 1MB chunks to
a buffer above the 4GB mark - it appears that the "chunking" bug
workaround triggers another firmware bug. This was only discovered with
commit 4bf7111f50 ("x86/efi: Support initrd loaded above 4G"), and
that commit is perfectly valid. The symptom I observed was a corrupt
initrd rather than any kind of crash.
efi= is now used to specify EFI parameters in two very different
execution environments, the EFI boot stub and during kernel boot.
There is also a slight performance optimization by enabling efi=nochunk,
but that's offset by the fact that you're more likely to run into
firmware issues, at least on x86. This is the rationale behind leaving
the workaround enabled by default.
Also provide some documentation for EFI_READ_CHUNK_SIZE and why we're
using the current value of 1MB.
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Roy Franz <roy.franz@linaro.org>
Cc: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
This patch does two things. It passes EFI run time mappings to second
kernel in bootparams efi_info. Second kernel parse this info and create
new mappings in second kernel. That means mappings in first and second
kernel will be same. This paves the way to enable EFI in kexec kernel.
This patch also prepares and passes EFI setup data through bootparams.
This contains bunch of information about various tables and their
addresses.
These information gathering and passing has been written along the lines
of what current kexec-tools is doing to make kexec work with UEFI.
[akpm@linux-foundation.org: s/get_efi/efi_get/g, per Matt]
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The comment describing how struct efivars->lock is used hasn't been
updated in sync with the code. Fix it.
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
efi_set_rtc_mmss() is never used to set RTC due to bugs found
on many EFI platforms. It is set directly by mach_set_rtc_mmss().
Hence, remove unused efi_set_rtc_mmss() function.
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Introduce EFI_PARAVIRT flag. If it is set then kernel runs
on EFI platform but it has not direct control on EFI stuff
like EFI runtime, tables, structures, etc. If not this means
that Linux Kernel has direct access to EFI infrastructure
and everything runs as usual.
This functionality is used in Xen dom0 because hypervisor
has full control on EFI stuff and all calls from dom0 to
EFI must be requested via special hypercall which in turn
executes relevant EFI code in behalf of dom0.
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
It appears that the BayTrail-T class of hardware requires EFI in order
to powerdown and reboot and no other reliable method exists.
This quirk is generally applicable to all hardware that has the ACPI
Hardware Reduced bit set, since usually ACPI would be the preferred
method.
Cc: Len Brown <len.brown@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Not only can EfiResetSystem() be used to reboot, it can also be used to
power down machines.
By and large, this functionality doesn't work very well across the range
of EFI machines in the wild, so it should definitely only be used as a
last resort. In an ideal world, this wouldn't be needed at all.
Unfortunately, we're starting to see machines where EFI is the *only*
reliable way to power down, and nothing else, not PCI, not ACPI, works.
efi_poweroff_required() should be implemented on a per-architecture
basis, since exactly when we should be using EFI runtime services is a
platform-specific decision. There's no analogue for reboot because each
architecture handles reboot very differently - the x86 code in
particular is pretty complex.
Patches to enable this for specific classes of hardware will be
submitted separately.
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Implement efi_reboot(), which is really just a wrapper around the
EfiResetSystem() EFI runtime service, but it does at least allow us to
funnel all callers through a single location.
It also simplifies the callsites since users no longer need to check to
see whether EFI_RUNTIME_SERVICES are enabled.
Cc: Tony Luck <tony.luck@intel.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
In order to move from the #include "../../../xxxxx.c" anti-pattern used
by both the x86 and arm64 versions of the stub to a static library
linked into either the kernel proper (arm64) or a separate boot
executable (x86), there is some prepatory work required.
This patch does the following:
- move forward declarations of functions shared between the arch
specific and the generic parts of the stub to include/linux/efi.h
- move forward declarations of functions shared between various .c files
of the generic stub code to a new local header file called "efistub.h"
- add #includes to all .c files which were formerly relying on the
#includor to include the correct header files
- remove all static modifiers from functions which will need to be
externally visible once we move to a static library
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
In order for other archs (such as arm64) to be able to reuse the virtual
mode function call wrappers, move them to drivers/firmware/efi/runtime-wrappers.c.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Both ARM and ARM64 stubs will update the device tree that they pass to
the kernel. In both cases they primarily need to add the same UEFI
related information, so the function can be shared. Create a new FDT
related file for this to avoid use of architecture #ifdefs in
efi-stub-helper.c.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
[ Fixed memory node deletion code. ]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
ARM and ARM64 architectures use the device tree to pass UEFI parameters
from stub to kernel. These parameters are things known to the stub but
not discoverable by the kernel after the stub calls ExitBootSerives().
There is a helper function in:
drivers/firmware/efi/fdt.c
which the stub uses to add the UEFI parameters to the device tree.
This patch adds a complimentary helper function which UEFI runtime
support may use to retrieve the parameters from the device tree.
If an architecture wants to use this helper, it should select
CONFIG_EFI_PARAMS_FROM_FDT.
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
In preparation for compat support, we can't assume that user variable
object is represented by a 'struct efi_variable'. Convert the validation
functions to take the variable name as an argument, which is the only
piece of the struct that was ever used anyway.
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
There are a lot of places in the kernel which iterate through an
EFI memory map. Most of these places use essentially the same
for-loop code. This patch adds a for_each_efi_memory_desc()
helper to clean up all of the existing duplicate code and avoid
more in the future.
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The traditional approach of using machine-specific types such as
'unsigned long' does not allow the kernel to interact with firmware
running in a different CPU mode, e.g. 64-bit kernel with 32-bit EFI.
Add distinct EFI structure definitions for both 32-bit and 64-bit so
that we can use them in the 32-bit and 64-bit code paths.
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
There's no good reason to keep efi_enabled() under CONFIG_X86 anymore,
since nothing about the implementation is specific to x86.
Set EFI feature flags in the ia64 boot path instead of claiming to
support all features. The old behaviour was actually buggy since
efi.memmap never points to a valid memory map, so we shouldn't be
claiming to support EFI_MEMMAP.
Fortunately, this bug was never triggered because EFI_MEMMAP isn't used
outside of arch/x86 currently, but that may not always be the case.
Reviewed-and-tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
As we grow support for more EFI architectures they're going to want the
ability to query which EFI features are available on the running system.
Instead of storing this information in an architecture-specific place,
stick it in the global 'struct efi', which is already the central
location for EFI state.
While we're at it, let's change the return value of efi_enabled() to be
bool and replace all references to 'facility' with 'feature', which is
the usual word used to describe the attributes of the running system.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
kexec kernel will need exactly same mapping for EFI runtime memory
ranges. Thus here export the runtime ranges mapping to sysfs,
kexec-tools will assemble them and pass to 2nd kernel via setup_data.
Introducing a new directory /sys/firmware/efi/runtime-map just like
/sys/firmware/memmap. Containing below attribute in each file of that
directory:
attribute num_pages phys_addr type virt_addr
Signed-off-by: Dave Young <dyoung@redhat.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Export fw_vendor, runtime and config table physical addresses to
/sys/firmware/efi/{fw_vendor,runtime,config_table} because kexec kernels
need them.
From EFI spec these 3 variables will be updated to virtual address after
entering virtual mode. But kernel startup code will need the physical
address.
Signed-off-by: Dave Young <dyoung@redhat.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Currently, when mounting pstore file system, a read callback of
efi_pstore driver runs mutiple times as below.
- In the first read callback, scan efivar_sysfs_list from head and pass
a kmsg buffer of a entry to an upper pstore layer.
- In the second read callback, rescan efivar_sysfs_list from the entry
and pass another kmsg buffer to it.
- Repeat the scan and pass until the end of efivar_sysfs_list.
In this process, an entry is read across the multiple read function
calls. To avoid race between the read and erasion, the whole process
above is protected by a spinlock, holding in open() and releasing in
close().
At the same time, kmemdup() is called to pass the buffer to pstore
filesystem during it. And then, it causes a following lockdep warning.
To make the dynamic memory allocation runnable without taking spinlock,
holding off a deletion of sysfs entry if it happens while scanning it
via efi_pstore, and deleting it after the scan is completed.
To implement it, this patch introduces two flags, scanning and deleting,
to efivar_entry.
On the code basis, it seems that all the scanning and deleting logic is
not needed because __efivars->lock are not dropped when reading from the
EFI variable store.
But, the scanning and deleting logic is still needed because an
efi-pstore and a pstore filesystem works as follows.
In case an entry(A) is found, the pointer is saved to psi->data. And
efi_pstore_read() passes the entry(A) to a pstore filesystem by
releasing __efivars->lock.
And then, the pstore filesystem calls efi_pstore_read() again and the
same entry(A), which is saved to psi->data, is used for resuming to scan
a sysfs-list.
So, to protect the entry(A), the logic is needed.
[ 1.143710] ------------[ cut here ]------------
[ 1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110()
[ 1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[ 1.144058] Modules linked in:
[ 1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2
[ 1.144058] 0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28
[ 1.144058] ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046
[ 1.144058] 00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78
[ 1.144058] Call Trace:
[ 1.144058] [<ffffffff816614a5>] dump_stack+0x54/0x74
[ 1.144058] [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0
[ 1.144058] [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50
[ 1.144058] [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0
[ 1.144058] [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110
[ 1.144058] [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280
[ 1.144058] [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170
[ 1.144058] [<ffffffff8115b260>] kmemdup+0x20/0x50
[ 1.144058] [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170
[ 1.144058] [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170
[ 1.144058] [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0
[ 1.144058] [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120
[ 1.144058] [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50
[ 1.144058] [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150
[ 1.158207] [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20
[ 1.158207] [<ffffffff8128ce30>] ? parse_options+0x80/0x80
[ 1.158207] [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0
[ 1.158207] [<ffffffff811ae7d2>] mount_single+0xa2/0xd0
[ 1.158207] [<ffffffff8128ccf8>] pstore_mount+0x18/0x20
[ 1.158207] [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0
[ 1.158207] [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20
[ 1.158207] [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0
[ 1.158207] [<ffffffff811cbb0e>] do_mount+0x23e/0xa20
[ 1.158207] [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0
[ 1.158207] [<ffffffff811cc373>] SyS_mount+0x83/0xc0
[ 1.158207] [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b
[ 1.158207] ---[ end trace 61981bc62de9f6f4 ]---
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Tested-by: Madper Xie <cxie@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
We map the EFI regions needed for runtime services non-contiguously,
with preserved alignment on virtual addresses starting from -4G down
for a total max space of 64G. This way, we provide for stable runtime
services addresses across kernels so that a kexec'd kernel can still use
them.
Thus, they're mapped in a separate pagetable so that we don't pollute
the kernel namespace.
Add an efi= kernel command line parameter for passing miscellaneous
options and chicken bits from the command line.
While at it, add a chicken bit called "efi=old_map" which can be used as
a fallback to the old runtime services mapping method in case there's
some b0rkage with a particular EFI implementation (haha, it is hard to
hold up the sarcasm here...).
Also, add the UEFI RT VA space to Documentation/x86/x86_64/mm.txt.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The x86/AMD64 EFI stubs must use a call wrapper to convert between
the Linux and EFI ABIs, so void pointers are sufficient. For ARM,
the ABIs are compatible, so we can directly invoke the function
pointers. The functions that are used by the ARM stub are updated
to match the EFI definitions.
Also add some EFI types used by EFI functions.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Acked-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
efi_lookup_mapped_addr() is a handy utility for other platforms than
x86. Move it from arch/x86 to drivers/firmware. Add memmap pointer
to global efi structure, and initialise it on x86.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Common to (U)EFI support on all platforms is the global "efi" data
structure, and the code that parses the System Table to locate
addresses to populate that structure with.
This patch adds both of these to the global EFI driver code and
removes the local definition of the global "efi" data structure from
the x86 and ia64 code.
Squashed into one big patch to avoid breaking bisection.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Pull timer core updates from Thomas Gleixner:
"The timer changes contain:
- posix timer code consolidation and fixes for odd corner cases
- sched_clock implementation moved from ARM to core code to avoid
duplication by other architectures
- alarm timer updates
- clocksource and clockevents unregistration facilities
- clocksource/events support for new hardware
- precise nanoseconds RTC readout (Xen feature)
- generic support for Xen suspend/resume oddities
- the usual lot of fixes and cleanups all over the place
The parts which touch other areas (ARM/XEN) have been coordinated with
the relevant maintainers. Though this results in an handful of
trivial to solve merge conflicts, which we preferred over nasty cross
tree merge dependencies.
The patches which have been committed in the last few days are bug
fixes plus the posix timer lot. The latter was in akpms queue and
next for quite some time; they just got forgotten and Frederic
collected them last minute."
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (59 commits)
hrtimer: Remove unused variable
hrtimers: Move SMP function call to thread context
clocksource: Reselect clocksource when watchdog validated high-res capability
posix-cpu-timers: don't account cpu timer after stopped thread runtime accounting
posix_timers: fix racy timer delta caching on task exit
posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()
selftests: add basic posix timers selftests
posix_cpu_timers: consolidate expired timers check
posix_cpu_timers: consolidate timer list cleanups
posix_cpu_timer: consolidate expiry time type
tick: Sanitize broadcast control logic
tick: Prevent uncontrolled switch to oneshot mode
tick: Make oneshot broadcast robust vs. CPU offlining
x86: xen: Sync the CMOS RTC as well as the Xen wallclock
x86: xen: Sync the wallclock when the system time is set
timekeeping: Indicate that clock was set in the pvclock gtod notifier
timekeeping: Pass flags instead of multiple bools to timekeeping_update()
xen: Remove clock_was_set() call in the resume path
hrtimers: Support resuming with two or more CPUs online (but stopped)
timer: Fix jiffies wrap behavior of round_jiffies_common()
...
... to void * like the boot services and lose all the void * casts. No
functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
All the virtualized platforms (KVM, lguest and Xen) have persistent
wallclocks that have more than one second of precision.
read_persistent_wallclock() and update_persistent_wallclock() allow
for nanosecond precision but their implementation on x86 with
x86_platform.get/set_wallclock() only allows for one second precision.
This means guests may see a wallclock time that is off by up to 1
second.
Make set_wallclock() and get_wallclock() take a struct timespec
parameter (which allows for nanosecond precision) so KVM and Xen
guests may start with a more accurate wallclock time and a Xen dom0
can maintain a more accurate wallclock for guests.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Seiji reported getting empty dmesg-* files, because the data was never
actually read in efi_pstore_read_func(), and so the memcpy() was copying
garbage data.
This patch necessitated adding __efivar_entry_get() which is callable
between efivar_entry_iter_{begin,end}(). We can also delete
__efivar_entry_size() because efi_pstore_read_func() was the only
caller.
Reported-by: Seiji Aguchi <seiji.aguchi@hds.com>
Tested-by: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
This registers /sys/firmware/efi/{,systab,efivars/} whenever EFI is enabled
and the system is booted with EFI.
This allows
*) userspace to check for the existence of /sys/firmware/efi as a way
to determine whether or it is running on an EFI system.
*) 'mount -t efivarfs none /sys/firmware/efi/efivars' without manually
loading any modules.
[ Also, move the efivar API into vars.c and unconditionally compile it.
This allows us to move efivars.c, which now only contains the sysfs
variable code, into the firmware/efi directory. Note that the efivars.c
filename is kept to maintain backwards compatability with the old
efivars.ko module. With this patch it is now possible for efivarfs
to be built without CONFIG_EFI_VARS - Matt ]
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Chun-Yi Lee <jlee@suse.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Tobias Powalowski <tpowa@archlinux.org>
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
efivars.c has grown far too large and needs to be divided up. Create a
new directory and move the persistence storage code to efi-pstore.c now
that it uses the new efivar API. This helps us to greatly reduce the
size of efivars.c and paves the way for moving other code out of
efivars.c.
Note that because CONFIG_EFI_VARS can be built as a module efi-pstore
must also include support for building as a module.
Reviewed-by: Tom Gundersen <teg@jklm.no>
Tested-by: Tom Gundersen <teg@jklm.no>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
There isn't really a formal interface for dealing with EFI variables
or struct efivar_entry. Historically, this has led to various bits of
code directly accessing the generic EFI variable ops, which inherently
ties it to specific EFI variable operations instead of indirectly
using whatever ops were registered with register_efivars(). This lead
to the efivarfs code only working with the generic EFI variable ops
and not CONFIG_GOOGLE_SMI.
Encapsulate everything that needs to access '__efivars' inside an
efivar_entry_* API and use the new API in the pstore, sysfs and
efivarfs code.
Much of the efivars code had to be rewritten to use this new API. For
instance, it is now up to the users of the API to build the initial
list of EFI variables in their efivar_init() callback function. The
variable list needs to be passed to efivar_init() which allows us to
keep work arounds for things like implementation bugs in
GetNextVariable() in a central location.
Allowing users of the API to use a callback function to build the list
greatly benefits the efivarfs code which needs to allocate inodes and
dentries for every variable. It previously did this in a racy way
because the code ran without holding the variable spinlock. Both the
sysfs and efivarfs code maintain their own lists which means the two
interfaces can be running simultaneously without interference, though
it should be noted that because no synchronisation is performed it is
very easy to create inconsistencies. efibootmgr doesn't currently use
efivarfs and users are likely to also require the old sysfs interface,
so it makes sense to allow both to be built.
Reviewed-by: Tom Gundersen <teg@jklm.no>
Tested-by: Tom Gundersen <teg@jklm.no>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
There are currently two implementations of the utf16 string functions.
Somewhat confusingly, they've got different names.
Centralise the functions in efi.h.
Reviewed-by: Tom Gundersen <teg@jklm.no>
Tested-by: Tom Gundersen <teg@jklm.no>
Reviewed-by: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Let's not burden ia64 with checks in the common efivars code that we're not
writing too much data to the variable store. That kind of thing is an x86
firmware bug, plain and simple.
efi_query_variable_store() provides platforms with a wrapper in which they can
perform checks and workarounds for EFI variable storage bugs.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
a system in the crash path. Plus a new mountpoint
(/sys/fs/pstore ... makes more sense then /dev/pstore).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRJlSgAAoJEKurIx+X31iBHp4P/iTzfOncyI65UaOkKWlBUzAt
bXoiWVAPll15lJuO0Eq4e58c2UEWy9b52MxgCqK38wqSu2Wkd2JLk/q7LFS3gIxb
3h2Lx4CdCCZSENfyGuOTs5WNNcekVFt0qReEXlRAQ6QIGSniO9HT1FklVsqzzDro
HUMoGm0/KSWKuq/ls2MD7y/iH3UYbDQ15ED8n1HxWEbpY0+vxCzCMUzNBSOWUsKb
HuXeHt06jkvw2DiWuyGBJ10wLHC9c4h37FTdrQFg8YeVBYLZT7boBsnB92Hf12YK
0P782RPiWCIKI23huVHHgS3Rh7hSlKMLLxvrNgIX+XxRMxxjFZyo1xk2NlhUw8ta
nGnz7qxqqz+0yCYpF/+0N8ucRH9do64GpCxFXFMTSM9pD4BT7QJAcToswtt3a+Jx
EJQF0BNQEZCFwlWLoHZD0znuFFL+jv67RKuiytOXRMgKkxca6RrdIIMeKXF4cl8I
z/qPslK0BDfAs2Go+ApMmJ5G4rGziy+IJDLjuxFIBBaKba+79NWDIFA6IV0BquVq
idp47OGPTiltb7hOOno+8zRHLQPra7KiWk6Mos2EEb8p+BrrnzspWu1foYEjB4pV
AIpC5ClEY+MZHXl30htff9PZgnFm1BOEs6OZmzhLTZevme8KhwdFtrWhY2gH0gvs
OM/0NpeTQJzFhS90AbeC
=abWF
-----END PGP SIGNATURE-----
Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull pstore patches from Tony Luck:
"A few fixes to reduce places where pstore might hang a system in the
crash path. Plus a new mountpoint (/sys/fs/pstore ... makes more
sense then /dev/pstore)."
Fix up trivial conflict in drivers/firmware/efivars.c
* tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
pstore: Create a convenient mount point for pstore
efi_pstore: Introducing workqueue updating sysfs
efivars: Disable external interrupt while holding efivars->lock
efi_pstore: Avoid deadlock in non-blocking paths
pstore: Avoid deadlock in panic and emergency-restart path
[Problem]
efi_pstore creates sysfs entries, which enable users to access to NVRAM,
in a write callback. If a kernel panic happens in an interrupt context,
it may fail because it could sleep due to dynamic memory allocations during
creating sysfs entries.
[Patch Description]
This patch removes sysfs operations from a write callback by introducing
a workqueue updating sysfs entries which is scheduled after the write
callback is called.
Also, the workqueue is kicked in a just oops case.
A system will go down in other cases such as panic, clean shutdown and emergency
restart. And we don't need to create sysfs entries because there is no chance for
users to access to them.
efi_pstore will be robust against a kernel panic in an interrupt context with this patch.
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Originally 'efi_enabled' indicated whether a kernel was booted from
EFI firmware. Over time its semantics have changed, and it now
indicates whether or not we are booted on an EFI machine with
bit-native firmware, e.g. 64-bit kernel with 64-bit firmware.
The immediate motivation for this patch is the bug report at,
https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
which details how running a platform driver on an EFI machine that is
designed to run under BIOS can cause the machine to become
bricked. Also, the following report,
https://bugzilla.kernel.org/show_bug.cgi?id=47121
details how running said driver can also cause Machine Check
Exceptions. Drivers need a new means of detecting whether they're
running on an EFI machine, as sadly the expression,
if (!efi_enabled)
hasn't been a sufficient condition for quite some time.
Users actually want to query 'efi_enabled' for different reasons -
what they really want access to is the list of available EFI
facilities.
For instance, the x86 reboot code needs to know whether it can invoke
the ResetSystem() function provided by the EFI runtime services, while
the ACPI OSL code wants to know whether the EFI config tables were
mapped successfully. There are also checks in some of the platform
driver code to simply see if they're running on an EFI machine (which
would make it a bad idea to do BIOS-y things).
This patch is a prereq for the samsung-laptop fix patch.
Cc: David Airlie <airlied@linux.ie>
Cc: Corentin Chary <corentincj@iksaif.net>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Peter Jones <pjones@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Steve Langasek <steve.langasek@canonical.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This reverts commit bd52276fa1 ("x86-64/efi: Use EFI to deal with
platform wall clock (again)"), and the two supporting commits:
da5a108d05b4: "x86/kernel: remove tboot 1:1 page table creation code"
185034e72d59: "x86, efi: 1:1 pagetable mapping for virtual EFI calls")
as they all depend semantically on commit 53b87cf088 ("x86, mm:
Include the entire kernel memory map in trampoline_pgd") that got
reverted earlier due to the problems it caused.
This was pointed out by Yinghai Lu, and verified by me on my Macbook Air
that uses EFI.
Pointed-out-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 EFI update from Peter Anvin:
"EFI tree, from Matt Fleming. Most of the patches are the new efivarfs
filesystem by Matt Garrett & co. The balance are support for EFI
wallclock in the absence of a hardware-specific driver, and various
fixes and cleanups."
* 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
efivarfs: Make efivarfs_fill_super() static
x86, efi: Check table header length in efi_bgrt_init()
efivarfs: Use query_variable_info() to limit kmalloc()
efivarfs: Fix return value of efivarfs_file_write()
efivarfs: Return a consistent error when efivarfs_get_inode() fails
efivarfs: Make 'datasize' unsigned long
efivarfs: Add unique magic number
efivarfs: Replace magic number with sizeof(attributes)
efivarfs: Return an error if we fail to read a variable
efi: Clarify GUID length calculations
efivarfs: Implement exclusive access for {get,set}_variable
efivarfs: efivarfs_fill_super() ensure we clean up correctly on error
efivarfs: efivarfs_fill_super() ensure we free our temporary name
efivarfs: efivarfs_fill_super() fix inode reference counts
efivarfs: efivarfs_create() ensure we drop our reference on inode on error
efivarfs: efivarfs_file_read ensure we free data in error paths
x86-64/efi: Use EFI to deal with platform wall clock (again)
x86/kernel: remove tboot 1:1 page table creation code
x86, efi: 1:1 pagetable mapping for virtual EFI calls
x86, mm: Include the entire kernel memory map in trampoline_pgd
...
Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay Pandarathil)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJQyKwSAAoJEPGMOI97Hn6zScgQAJZK2VDfCv74mKrgSDNokIzH
5nVDrc9AHKJm7CUODs6keJK5d4TD/za3Zao68zrYHsJJKes2ni2Z3W34HP2RXKK2
eOmePXOHYPPZMlimP9r9cVxNu1ZJCyp/yWSBcsPF4zUgWhBWLRaSj85I049gQ0sz
+05nZYfLjVd3HNiaXsG4CQyMrNF46XEsLhF9vs+Nr2GHPwrpzhfScgYv63oDS86C
3ICKsjmiRUZcNelxIFYmyxa5u89QdW5XHjzc9eHGQuus24Vxw+TZzsdfc17sUJEE
HTyXY+RjDpOVhdtwwUjrCEOiyZYvy3g9+3sKxoxgt/76ghdUaR7fxITwB97qVMFD
T0ESlKjSV/Qv5QYdyy5uP4zwNs/PXCWXkTg/L1m71F30BxKWDa7tgiA6uK7Z7fl5
1aokKBdk3mtJJJIDJG1YkxPXx/JItTGCNYrx7CcFj49rSjrUWLQdmrYahersRIsB
3wiD2xTi9e4dXeP/+VGzGOWB/sHk+73jvrvZe/REa1FCnMINDz4+9V9WaGROMqyq
MQ8kX0KfYcNVNxy1GOXjU5wLpMN/t/QbvI7gwzRP1DAUCJPoOgFy7AjvSTVG3zuy
8CtdOFttVkUn5dqsbQR0gVbyQVTS3PGSKz5XC/s8kVDWhja0xZTBYwrskM/4zdSD
Xf48OyYV5EjpC3FYUSiU
=OE3Q
-----END PGP SIGNATURE-----
Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI update from Bjorn Helgaas:
"Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo
Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay
Pandarathil)"
Fix up trivial conflicts.
* tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
PCI: Use phys_addr_t for physical ROM address
x86/PCI: Add NumaChip remote PCI support
ath9k: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: collapse wrapper for pcie_capability_read_word()
iwlegacy: Use standard #defines for PCIe Capability ASPM fields
iwlegacy: collapse wrapper for pcie_capability_read_word()
cxgb3: Use standard #defines for PCIe Capability ASPM fields
PCI: Add standard PCIe Capability Link ASPM field names
PCI/portdrv: Use PCI Express Capability accessors
PCI: Use standard PCIe Capability Link register field names
x86: Use PCI setup data
PCI: Add support for non-BAR ROMs
PCI: Add pcibios_add_device
EFI: Stash ROMs if they're not in the PCI BAR
PCI: Add and use standard PCI-X Capability register names
PCI/PM: Keep runtime PM enabled for unbound PCI devices
xen-pcifront: Handle backend CLOSED without CLOSING
PCI: SRIOV control and status via sysfs (documentation)
PCI/AER: Report success only when every device has AER-aware driver
...
EFI provides support for providing PCI ROMs via means other than the ROM
BAR. This support vanishes after we've exited boot services, so add support
for stashing copies of the ROMs in setup_data if they're not otherwise
available.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
[Issue]
As discussed in a thread below, Running out of space in EFI isn't a well-tested scenario.
And we wouldn't expect all firmware to handle it gracefully.
http://marc.info/?l=linux-kernel&m=134305325801789&w=2
On the other hand, current efi_pstore doesn't check a remaining space of storage at writing time.
Therefore, efi_pstore may not work if it tries to write a large amount of data.
[Patch Description]
To avoid handling the situation above, this patch checks if there is a space enough to log with
QueryVariableInfo() before writing data.
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Acked-by: Mike Waychison <mikew@google.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
We don't want someone who can write EFI variables to be able to
allocate arbitrarily large amounts of memory, so cap it to something
sensible like the amount of free space for EFI variables.
Acked-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Other than ix86, x86-64 on EFI so far didn't set the
{g,s}et_wallclock accessors to the EFI routines, thus
incorrectly using raw RTC accesses instead.
Simply removing the #ifdef around the respective code isn't
enough, however: While so far early get-time calls were done in
physical mode, this doesn't work properly for x86-64, as virtual
addresses would still need to be set up for all runtime regions
(which wasn't the case on the system I have access to), so
instead the patch moves the call to efi_enter_virtual_mode()
ahead (which in turn allows to drop all code related to calling
efi-get-time in physical mode).
Additionally the earlier calling of efi_set_executable()
requires the CPA code to cope, i.e. during early boot it must be
avoided to call cpa_flush_array(), as the first thing this
function does is a BUG_ON(irqs_disabled()).
Also make the two EFI functions in question here static -
they're not being referenced elsewhere.
History:
This commit was originally merged as bacef661ac ("x86-64/efi:
Use EFI to deal with platform wall clock") but it resulted in some
ASUS machines no longer booting due to a firmware bug, and so was
reverted in f026cfa82f. A pre-emptive fix for the buggy ASUS
firmware was merged in 03a1c254975e ("x86, efi: 1:1 pagetable
mapping for virtual EFI calls") so now this patch can be
reapplied.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Matt Fleming <matt.fleming@intel.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com> [added commit history]