The hypervisor may not have full access to the kernel data structures
and hence cannot safely use cpus_have_cap() helper for checking the
system capability. Add a safe helper for hypervisors to check a constant
system capability, which *doesn't* fall back to checking the bitmap
maintained by the kernel. With this, make the cpus_have_cap() only
check the bitmask and force constant cap checks to use the new API
for quicker checks.
Cc: Robert Ritcher <rritcher@cavium.com>
Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch moves arm64's struct thread_info from the task stack into
task_struct. This protects thread_info from corruption in the case of
stack overflows, and makes its address harder to determine if stack
addresses are leaked, making a number of attacks more difficult. Precise
detection and handling of overflow is left for subsequent patches.
Largely, this involves changing code to store the task_struct in sp_el0,
and acquire the thread_info from the task struct. Core code now
implements current_thread_info(), and as noted in <linux/sched.h> this
relies on offsetof(task_struct, thread_info) == 0, enforced by core
code.
This change means that the 'tsk' register used in entry.S now points to
a task_struct, rather than a thread_info as it used to. To make this
clear, the TI_* field offsets are renamed to TSK_TI_*, with asm-offsets
appropriately updated to account for the structural change.
Userspace clobbers sp_el0, and we can no longer restore this from the
stack. Instead, the current task is cached in a per-cpu variable that we
can safely access from early assembly as interrupts are disabled (and we
are thus not preemptible).
Both secondary entry and idle are updated to stash the sp and task
pointer separately.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Shortly we will want to load a percpu variable in the return from
userspace path. We can save an instruction by folding the addition of
the percpu offset into the load instruction, and this patch adds a new
helper to do so.
At the same time, we clean up this_cpu_ptr for consistency. As with
{adr,ldr,str}_l, we change the template to take the destination register
first, and name this dst. Secondly, we rename the macro to adr_this_cpu,
following the scheme of adr_l, and matching the newly added
ldr_this_cpu.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In the absence of CONFIG_THREAD_INFO_IN_TASK, core code maintains
thread_info::cpu, and low-level architecture code can access this to
build raw_smp_processor_id(). With CONFIG_THREAD_INFO_IN_TASK, core code
maintains task_struct::cpu, which for reasons of hte header soup is not
accessible to low-level arch code.
Instead, we can maintain a percpu variable containing the cpu number.
For both the old and new implementation of raw_smp_processor_id(), we
read a syreg into a GPR, add an offset, and load the result. As the
offset is now larger, it may not be folded into the load, but otherwise
the assembly shouldn't change much.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Subsequent patches will make smp_processor_id() use a percpu variable.
This will make smp_processor_id() dependent on the percpu offset, and
thus we cannot use smp_processor_id() to figure out what to initialise
the offset to.
Prepare for this by initialising the percpu offset based on
current::cpu, which will work regardless of how smp_processor_id() is
implemented. Also, make this relationship obvious by placing this code
together at the start of secondary_start_kernel().
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When returning from idle, we rely on the fact that thread_info lives at
the end of the kernel stack, and restore this by masking the saved stack
pointer. Subsequent patches will sever the relationship between the
stack and thread_info, and to cater for this we must save/restore sp_el0
explicitly, storing it in cpu_suspend_ctx.
As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
in the same way, which simplifies the code, avoiding pointer chasing on
the restore path (as we no longer need to load thread_info::cpu followed
by the relevant slot in __per_cpu_offset based on this).
This patch stashes both registers in cpu_suspend_ctx.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed
before a task is destroyed. To account for this, the stacks are
refcounted, and when manipulating the stack of another task, it is
necessary to get/put the stack to ensure it isn't freed and/or re-used
while we do so.
This patch reworks the arm64 stack walking code to account for this.
When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no
refcounting, and this should only be a structural change that does not
affect behaviour.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The walk_stackframe functions is architecture-specific, with a varying
prototype, and common code should not use it directly. None of its
current users can be built as modules. With THREAD_INFO_IN_TASK, users
will also need to hold a stack reference before calling it.
There's no reason for it to be exported, and it's very easy to misuse,
so unexport it for now.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In arm64's die and __die routines we pass around a thread_info, and
subsequently use this to determine the relevant task_struct, and the end
of the thread's stack. Subsequent patches will decouple thread_info from
the stack, and this approach will no longer work.
To figure out the end of the stack, we can use the new generic
end_of_stack() helper. As we only call __die() from die(), and die()
always deals with the current task, we can remove the parameter and have
both acquire current directly, which also makes it clear that __die
can't be called for arbitrary tasks.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We define current_stack_pointer in <asm/thread_info.h>, though other
files and header relying upon it do not have this necessary include, and
are thus fragile to changes in the header soup.
Subsequent patches will affect the header soup such that directly
including <asm/thread_info.h> may result in a circular header include in
some of these cases, so we can't simply include <asm/thread_info.h>.
Instead, factor current_thread_info into its own header, and have all
existing users include this explicitly.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Subsequent patches will move the thread_info::{task,cpu} fields, and the
current TI_{TASK,CPU} offset definitions are not used anywhere.
This patch removes the redundant definitions.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We have a comment claiming __switch_to() cares about where cpu_context
is located relative to cpu_domain in thread_info. However arm64 has
never had a thread_info::cpu_domain field, and neither __switch_to nor
cpu_switch_to care where the cpu_context field is relative to others.
Additionally, the init_thread_info alias is never used anywhere in the
kernel, and will shortly become problematic when thread_info is moved
into task_struct.
This patch removes both.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info()
macro relies on current having been defined prior to its use. However,
not all users of current_thread_info() include <asm/current.h>, and thus
current is not guaranteed to be defined.
When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that
get_current() / current are based upon current_thread_info(), and
<asm/current.h> includes <asm/thread_info.h>. Thus always including
<asm/current.h> would result in circular dependences on some platforms.
To ensure both cases work, this patch includes <asm/current.h>, but only
when CONFIG_THREAD_INFO_IN_TASK is selected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since commit f56141e3e2 ("all arches, signal: move restart_block
to struct task_struct"), thread_info and restart_block have been
logically distinct, yet struct restart_block is still defined in
<linux/thread_info.h>.
At least one architecture (erroneously) uses restart_block as part of
its thread_info, and thus the definition of restart_block must come
before the include of <asm/thread_info>. Subsequent patches in this
series need to shuffle the order of includes and definitions in
<linux/thread_info.h>, and will make this ordering fragile.
This patch moves the definition of restart_block out to its own header.
This serves as generic cleanup, logically separating thread_info and
restart_block, and also makes it easier to avoid fragility.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For several reasons it is preferable to use {READ,WRITE}_ONCE() rather than
ACCESS_ONCE(). For example, these handle aggregate types, result in shorter
source code, and better document the intended access (which may be useful for
instrumentation features such as the upcoming KTSAN).
Over a number of patches, most uses of ACCESS_ONCE() in arch/arm64 have been
migrated to {READ,WRITE}_ONCE(). For consistency, and the above reasons, this
patch migrates the final remaining uses.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The libhugetlbfs meets several failures since the following functions
do not use the correct address:
huge_ptep_get_and_clear()
huge_ptep_set_access_flags()
huge_ptep_set_wrprotect()
huge_ptep_clear_flush()
This patch fixes the wrong address for them.
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenaro:
1.) pmd entry is not present.
2.) the page fault occurs:
... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()
3.) set_huge_pte_at() will only set the first PMD entry, since the
find_num_contig just return 1 in this case. So the PMD entries
are all empty except the first one.
4.) when kernel accesses the address mapped by the second PMD entry,
a new page fault occurs:
... hugetlb_fault() --> huge_ptep_set_access_flags()
The second PMD entry is still empty now.
5.) When the kernel returns, the access will cause a page fault again.
The kernel will run like the "4)" above.
We will see a dead loop since here.
The dead loop is caught in the 32M hugetlb page (2M PMD + Contiguous bit).
This patch removes wrong pmd check, and fixes this dead loop.
This patch also removes the redundant checks for PGD/PUD in
the find_num_contig().
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The default hugepage size when 64K pages are enabled is set to 2MB using
the contiguous PTE bit. The add_default_hugepagesz(), however, uses
CONT_PMD_SHIFT instead of CONT_PTE_SHIFT. There is no functional change
since the values are the same.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When CONFIG_KPROBE is disabled but CONFIG_UPROBE_EVENT is enabled, we get
following compilation error:
In file included from
.../arch/arm64/kernel/probes/decode-insn.c:20:0:
.../arch/arm64/include/asm/kprobes.h:52:5: error:
conflicting types for 'kprobe_fault_handler'
int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
^~~~~~~~~~~~~~~~~~~~
In file included from
.../arch/arm64/kernel/probes/decode-insn.c:17:0:
.../include/linux/kprobes.h:398:90: note:
previous definition of 'kprobe_fault_handler' was here
static inline int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
^
.../scripts/Makefile.build:290: recipe for target
'arch/arm64/kernel/probes/decode-insn.o' failed
<asm/kprobes.h> is already included from <linux/kprobes.h> under #ifdef
CONFIG_KPROBE. So, this patch fixes the error by removing it from
decode-insn.c.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds support for uprobe on ARM64 architecture.
Unit tests for following have been done so far and they have been found
working
1. Step-able instructions, like sub, ldr, add etc.
2. Simulation-able like ret, cbnz, cbz etc.
3. uretprobe
4. Reject-able instructions like sev, wfe etc.
5. trapped and abort xol path
6. probe at unaligned user address.
7. longjump test cases
Currently it does not support aarch32 instruction probing.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We need to decide in some cases like uprobe instruction analysis that
whether the current mm context belongs to a 32 bit task or 64 bit.
This patch has introduced an unsigned flag variable in mm_context_t.
Currently, we set and clear TIF_32BIT depending on the condition that
whether an elf binary load sets personality for 32 bit or 64 bit
respectively.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
uprobe is registered at break_hook with a unique ESR code. So, when a
TRAP_BRKPT occurs, call_break_hook checks if it was for uprobe. If not,
then send a SIGTRAP to user.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
uprobe registers a handler at step_hook. So, single_step_handler now
checks for user mode as well if there is a valid hook.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ARM64 step exception does not have any syndrome information. So, it is
responsibility of exception handler to take care that they handle it
only if exception was raised for them.
Since kgdb_step_brk_fn() always returns 0, therefore we might have problem
when we will have other step handler registered as well.
This patch fixes kgdb_step_brk_fn() to return error in case of step handler
was not meant for kgdb.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
decode-insn code has to be reused by arm64 uprobe implementation as well.
Therefore, this patch protects some portion of kprobe code and renames few
other, so that decode-insn functionality can be reused by uprobe even when
CONFIG_KPROBES is not defined.
kprobe_opcode_t and struct arch_specific_insn are also defined by
linux/kprobes.h, when CONFIG_KPROBES is not defined. So, protect these
definitions in asm/probes.h.
linux/kprobes.h already includes asm/kprobes.h. Therefore, remove inclusion
of asm/kprobes.h from decode-insn.c.
There are some definitions like kprobe_insn and kprobes_handler_t etc can
be re-used by uprobe. So, it would be better to remove 'k' from their
names.
struct arch_specific_insn is specific to kprobe. Therefore, introduce a new
struct arch_probe_insn which will be common for both kprobe and uprobe, so
that decode-insn code can be shared. Modify kprobe code accordingly.
Function arm_probe_decode_insn() will be needed by uprobe as well. So make
it global.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Page mappings with full RWX permissions are a security risk. x86
has an option to walk the page tables and dump any bad pages.
(See e1a58320a3 ("x86/mm: Warn on W^X mappings")). Add a similar
implementation for arm64.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: folded fix for KASan out of bounds from Mark Rutland]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
max_addr was added as part of struct ptdump_info but has never actually
been used. Remove it.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The page table dumping code always assumes it will be dumping to a
seq_file to userspace. Future code will be taking advantage of
the page table dumping code but will not need the seq_file. Make
the seq_file optional for these cases.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ptdump_register currently initializes a set of page table information and
registers debugfs. There are uses for the ptdump option without wanting the
debugfs options. Split this out to make it a separate option.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now that we no longer allow live kernel PMDs to be split, it is safe to
start using the contiguous bit for kernel mappings. So set the contiguous
bit in the kernel page mappings for regions whose size and alignment are
suitable for this.
This enables the following contiguous range sizes for the virtual mapping
of the kernel image, and for the linear mapping:
granule size | cont PTE | cont PMD |
-------------+------------+------------+
4 KB | 64 KB | 32 MB |
16 KB | 2 MB | 1 GB* |
64 KB | 2 MB | 16 GB* |
* Only when built for 3 or more levels of translation. This is due to the
fact that a 2 level configuration only consists of PGDs and PTEs, and the
added complexity of dealing with folded PMDs is not justified considering
that 16 GB contiguous ranges are likely to be ignored by the hardware (and
16k/2 levels is a niche configuration)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation of adding support for contiguous PTE and PMD mappings,
let's replace 'block_mappings_allowed' with 'page_mappings_only', which
will be a more accurate description of the nature of the setting once we
add such contiguous mappings into the mix.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now that we take care not manipulate the live kernel page tables in a
way that may lead to TLB conflicts, the case where a table mapping is
replaced by a block mapping can no longer occur. So remove the handling
of this at the PUD and PMD levels, and instead, BUG() on any occurrence
of live kernel page table manipulations that modify anything other than
the permission bits.
Since mark_rodata_ro() is the only caller where the kernel mappings that
are being manipulated are actually live, drop the various conditional
flush_tlb_all() invocations, and add a single call to mark_rodata_ro()
instead.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We expect arch_teardown_dma_ops() to be called very late in a device's
life, after it has been removed from its bus, and thus after the IOMMU
bus notifier has run. As such, even if this funny little check did make
sense, it's unlikely to achieve what it thinks it's trying to do anyway.
It's a residual trace of an earlier implementation which didn't belong
here from the start; belatedly snuff it out.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Kprobes does not need its own homebrewed (and frankly inscrutable) sign
extension macro; just use the standard kernel functions instead. Since
the compiler actually recognises the sign-extension idiom of the latter,
we also get the small bonus of some nicer codegen, as each displacement
calculation helper then compiles to a single optimal SBFX instruction.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add a sysfs cpu_capacity attribute with which it is possible to read and
write (thus over-writing default values) CPUs capacity. This might be
useful in situations where values needs changing after boot.
The new attribute shows up as:
/sys/devices/system/cpu/cpu*/cpu_capacity
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
With the introduction of cpu capacity-dmips-mhz bindings, CPU capacities
can now be calculated from values extracted from DT and information
coming from cpufreq. Add parsing of DT information at boot time, and
complement it with cpufreq information. Also, store such information
using per CPU variables, as we do for arm.
Caveat: the information provided by this patch will start to be used in
the future. We need to #define arch_scale_cpu_capacity to something
provided in arch, so that scheduler's default implementation (which gets
used if arch_scale_cpu_capacity is not defined) is overwritten.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ARM systems may be configured to have cpus with different power/performance
characteristics within the same chip. In this case, additional information
has to be made available to the kernel (the scheduler in particular) for it
to be aware of such differences and take decisions accordingly.
Therefore, this patch aims at standardizing cpu capacities device tree
bindings for ARM platforms. Bindings define cpu capacity-dmips-mhz
parameter, to allow operating systems to retrieve such information from
the device tree and initialize related kernel structures, paving the way
for common code in the kernel to deal with heterogeneity.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull i2c fix from Wolfram Sang:
"A bugfix for the I2C core fixing a (rare) race condition"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: core: fix NULL pointer dereference under race condition
Pull stack vmap fixups from Thomas Gleixner:
"Two small patches related to sched_show_task():
- make sure to hold a reference on the task stack while accessing it
- remove the thread_saved_pc printout
.. and add a sanity check into release_task_stack() to catch problems
with task stack references"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Remove pointless printout in sched_show_task()
sched/core: Fix oops in sched_show_task()
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
fork: Add task stack refcounting sanity check and prevent premature task stack freeing
Pull MD fixes from Shaohua Li:
"There are several bug fixes queued:
- fix raid5-cache recovery bugs
- fix discard IO error handling for raid1/10
- fix array sync writes bogus position to superblock
- fix IO error handling for raid array with external metadata"
* tag 'md/4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md: be careful not lot leak internal curr_resync value into metadata. -- (all)
raid1: handle read error also in readonly mode
raid5-cache: correct condition for empty metadata write
md: report 'write_pending' state when array in sync
md/raid5: write an empty meta-block when creating log super-block
md/raid5: initialize next_checkpoint field before use
RAID10: ignore discard error
RAID1: ignore discard error
Two more important data integrity fixes related to RAID device drivers which
wrongly throw away the SYNCHRONIZE CACHE command in the non-RAID path and a
memory leak in the scsi_debug driver
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYHd5nAAoJEAVr7HOZEZN4BeIP/RmOchL8Xdm2GObAJYeeC5Jv
7jYqcjsV3LHz8ubebRk/GmrcXVmF52VJ0nc6IgcoAhG44kaY99kapah7wDioMci4
DC1m9twxQMfclEjk+8nL59iC4HR+A5TlMRnXf3XRTQ399w9KxGe1jGS2/OIOYpPd
goeQdSfSLxQX87c4eZldotQDY/9NUDe/O0Af3JboX5ySCDnqKiu+xqhE+kXKY7oY
bfsBurF875bER63YCeRIjmc/iO/klYGcm/7wsEJfxDZerY2/Sr6LaAd+bcComWX2
YAcoTwOGHwbjhKUbkHGjsQIaT+VFNOCDfXF1Bm37WTF5/AFiBfHRgQEClXm5I6kD
aRfcwfXeb6jDvUujCksIngSCeQc6/3np9gvmBV6hjKEmn07ny8j7vsDbI2gUL6rs
IVzMrFUw8O/InyooJD9CubnV7cgKnU+3/WIw3J92UudiEDRJSpCiBszoKL7JnOeA
aAeUl3hhQBr50w0nLCFcm65PnHjCY/4VuJ7ZXF6Z1e6y+yd81zrbzHYC4rb9sFsa
3KJ4UgIajhC0t5FxDbwFfOj/b0WhLzqJeMrOnTyI+mrjpHWexNW+iIMw6qRi6yv9
YuL9XvaaRblnmxOEma3A3xiTCQ6mFl4yYcMa4ppBlDgbTZSJff4kRB+Nma/qw1+v
VrKlOiKXC5wYp8jPlRwT
=xrlP
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two more important data integrity fixes related to RAID device drivers
which wrongly throw away the SYNCHRONIZE CACHE command in the non-RAID
path and a memory leak in the scsi_debug driver"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware
scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded
scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
- Add missing input validation to the firewire-net driver.
Invalid IP-over-1394 encapsulation headers could trigger
buffer overflows (CVE 2016-8633).
- IP-over-1394 link fragmentation headers were read and written
incorrectly, breaking fragmented RX/TX with other OS's stacks.
-----BEGIN PGP SIGNATURE-----
iQI3BAABCAAhBQJYHgSgGhxzdGVmYW5yQHM1cjYuaW4tYmVybGluLmRlAAoJEHnz
b7JUXXnQFGIP/AvXrsLFgO8vOZd967145Jz0FydmwRBsX1F90xugFIxJI8kRQSN4
MQ/WHPaHl55LqX88VrxDTfN04TAtb243CDPXDnMN6rEzaSY4O7J6JO/zaUKMGyaW
rocX3s3uufIqk/GXDk2+I0Ze6xHynbWEPaoDh0rHYkJcD0NHHi9SJPFZf8RnCx3g
s7GAiU5V9o1SZydY9dCGo66Zl3JSujOsxpbVllT2ux+FWRReea5+O9ntWpPcW+E0
Elc9v7Nt3BcOEDXrVYg8UIIX6RCt8IBDCaF6D7n8JrhU/ag+OH8+KondktU5P+BI
MXcsm92UbM+/739RC7V6JbDSodUn1DisoOTyNmH58ZZerKWWt+7E27WXWuRu9ch7
rBra6pcOhkhO9sHkrwf3DlP9nubAfVbznVxZFOI4O96fnpn284J0RX8brsMOeHcF
iPAIpGc5PLyvyhczZSfokKj9S4kUvbwhNwCnHs32ttdrGSHRIVFKHsipsFFtcts2
K7QAN9mxWtBeP8i3gd5tJO8FT7tvg4Ixtt/BzpsFpxaYk9b/k2RnrSt4b4/0mC2q
fQ6urSQalvffmbU9eMm78NbaFV26s217gZHqqD++tAVLLgkwRaC2k/MucOJJz4xb
5PYoGLDfCusReBP/MYdpqOhAUEBeeZ03IQT7gaGGDQ27x8k1UIpB57Sk
=PxAL
-----END PGP SIGNATURE-----
Merge tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull FireWire (IEEE 1394) fixes from Stefan Richter:
- add missing input validation to the firewire-net driver. Invalid
IP-over-1394 encapsulation headers could trigger buffer overflows
(CVE 2016-8633).
- IP-over-1394 link fragmentation headers were read and written
incorrectly, breaking fragmented RX/TX with other OS's stacks.
* tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: net: fix fragmented datagram_size off-by-one
firewire: net: guard against rx buffer overflows
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYGZuzAAoJEAhfPr2O5OEVEakQAJKu+4OwrzoWajdIibR9IVpP
1gbwqAeTDp4XuHC4WyA8U3no3I5RG+moo4GNv87LNZ/H2ix1EGQn6IwbpYG0YmOB
wdcZ2WViLc4tEYN/Rn7slwjY32dNplra6xNNSb0JHOT5tp6YOCLljqApd9FBvP25
Yo10z0pRL78ce6VbXyDB5JuqUsjtHivzU45/O5M5giDFIngdqFuu0zneQeMbvulF
rCz6HSqNutFaeRMdnbP6f2Vtmd2QjeCY4aYg5kQLqWiuXsdMplp9uJkeYaDgccDp
TF9z33cJlmyPXY6/YH95yfca156EZIVco3yLnNp9Ehmr4S12NV0D3xMKCXPyucEJ
A6FH60zqxe3qUv7sPi6w4MM7ufgq3F/i33lhhDLsNjw0R8m7ijohfIj7HI84XC+z
Jjr44A/7p4hqbfvkBePyLHcqaglWnc0E6LnS7lUgJC4/h7z3H2DMoPaUwkjedAI6
ynd5Ikfw+VvI0UQFyVBBWfH+ol+6BP7QO3TmHES9zfbPNMfhXc3ON4sk9yBzY50S
cH6/TagQVmuk4zA457oAU+rrYR4g+di97Wk3AMi+gKGWu2qZmHE4Tv0gQsNHoPrK
06VD1Ur04khRR3tOQ/OqDMGNyLHbCMIMXeUYc363uZ6wrcmVF5caDZ3oy3i1LNNO
ZALCj0FYwNFmDs7+Ckmq
=MRLH
-----END PGP SIGNATURE-----
Merge tag 'media/v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"A series of fixup patches meant to fix the usage of DMA on stack, plus
one warning fixup"
* tag 'media/v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (32 commits)
[media] radio-bcm2048: don't ignore errors
[media] pctv452e: fix semicolon.cocci warnings
[media] flexcop-usb: don't use stack for DMA
[media] stk-webcam: don't use stack for DMA
[media] s2255drv: don't use stack for DMA
[media] cpia2_usb: don't use stack for DMA
[media] digitv: handle error code on RC query
[media] dw2102: return error if su3000_power_ctrl() fails
[media] nova-t-usb2: handle error code on RC query
[media] technisat-usb2: use DMA buffers for I2C transfers
[media] pctv452e: don't call BUG_ON() on non-fatal error
[media] pctv452e: don't do DMA on stack
[media] nova-t-usb2: don't do DMA on stack
[media] gp8psk: don't go past the buffer size
[media] gp8psk: don't do DMA on stack
[media] dtv5100: don't do DMA on stack
[media] dtt200u: handle USB control message errors
[media] dtt200u: don't do DMA on stack
[media] dtt200u-fe: handle errors on USB control messages
[media] dtt200u-fe: don't do DMA on stack
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYHO/8AAoJEFmIoMA60/r8z/0P/1JT1lHNtDgC3RBr5NRy6x+K
9xAfxeHJjxS/6YbFCBgPlzQJBzZMiSHptb/Y44Cqpr8Gz5WWnn18gbdMoBGQFc6k
n7iVvEGlvf2YR3CulqdsUH3/B2hDjNbM45HT2Rwd1agq+qku6nMpXdUix+z7TNEg
Tht/a8XAs77/XOl/uhGSCy5hvGKErcLNrZ1qFWmiUJEsFFgzSx3eqtx2MNJSiJyv
/F9dzDIgNKAOdOv34hYndE+VLwyFAwqzvIgB5J4oLL23+FzRW68yvQgmt45cogTF
NA4uFCfnaSK2Dy7qFOfevRE2AfQcSSfvvsGukQQaFoKyY8Jb6Z8w6WWO0P/RBQsY
ZvmP10JfyjQ5z2SSAMcVNDXR2dL58zc6kuGZISUToX22mMSUsmFMhLq2350657C9
0A8BkfO86z4EfgmO8aiBfgE7A6RrCR6yfouQTTGJ91COYEG60D/mPhlhtkmHE4yd
3tqmBSEw11yv92OLU4DdoXFA7Cbm7DElEk6fPcw5TbbWCwbToTHNo/jHbe+0+Esy
je5AFZe8IIuBHVkN3tcsoaRw2KKCtnBrLbfdyLME6KDXj7eZMg3WWoU0E5uwzIO3
jLOSCBb2dtqrJmiE4sNiYob8wI+jOLsC/XHHczYWnIzLiYyvYeZP6cBhHao0KCLD
/ao0YY6m3eUtwSJZbU2q
=n738
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- fix for a Qualcomm driver issue that causes a use-before-set crash
- fix for DesignWare iATU unroll support that causes external aborts
when enabling the host bridge
* tag 'pci-v4.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: designware: Check for iATU unroll support after initializing host
PCI: qcom: Fix pp->dev usage before assignment
* MAINTAINERS updates to reflect some new maintainers/submaintainers -- we
have some great volunteers who've been developing and reviewing already.
We're going to try a group maintainership model, so eventually you'll
probably see pull requests from people besides me.
* NAND fixes from Boris:
"""
Three simple fixes:
- the first one is fixing a non-critical bug in the gpmi driver
- the second one is fixing a bug in the 'automatic NAND timings
selection' feature introduced in 4.9-rc1
- the last one is fixing a false positive uninitialized-var warning
"""
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYHSUHAAoJEFySrpd9RFgtnzYQAKROCvMsD8+2k2kxAQiR4HXk
HtAVi7Pma3zBxbNYXyr1ThGS+Woiy4Ln4xrFyo2M4WQBjbwxZJmQ6BZi0WJ1Hmo0
aZ0J+jxZHAqXFMlMqaD40w7khW97oTmQ7elCp7agpunQYo1QkbT/Kq/oO3Jet1GX
lDA3JIbdpdk0nhS5p61tzlgzr6YaXvKQjbUxbtPgMi/sfEBAlG9AaoQWgYrvy0YD
8JXV74Mo7tG/gNVhsNqTAnzgOHevaW1h2Oiy87Rn7os2eCVzSR0TkQ7AEMEBF55x
2PpMhxPvxFn/rwAVyecgtkw8SJODng/ROa7iALoEGJiqSdWjhqpWkqhw4UQiHR2J
mBHFL5+wzsNGyUCPtSmxP+QDK2pueQale3skZivz7twxrRI5OF4DLHMLqktoeqEL
QGXZUzR+2guK0GK70UfsBiNkVjNH0AMCO+AedwhC6cc2Gei2qhivfIdwWNIY9otn
2JMVW+pWYlCCtczatgMb1+7/ZlPH+iLpJZHcs/fAh/MGrSDEcXxP5jOxXo3ZS1sK
jo8CbyRu/QfwWmnkskWfnmPvfbUpIyDmVddYoDmjDvtsea3s3zxvmUb0JhHY8se7
594NRqEXThmf7LkbVIAS5260fBTELu6jh+y+Fsnpd73nUnrZTspDEYKX1CbNM2k7
qpEeyozBpihUF9C6hq7o
=7VFh
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20161104' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
- MAINTAINERS updates to reflect some new maintainers/submaintainers.
We have some great volunteers who've been developing and reviewing
already. We're going to try a group maintainership model, so
eventually you'll probably see pull requests from people besides me.
- NAND fixes from Boris:
"Three simple fixes:
- fix a non-critical bug in the gpmi driver
- fix a bug in the 'automatic NAND timings selection' feature
introduced in 4.9-rc1
- fix a false positive uninitialized-var warning"
* tag 'for-linus-20161104' of git://git.infradead.org/linux-mtd:
mtd: mtk: avoid warning in mtk_ecc_encode
mtd: nand: Fix data interface configuration logic
mtd: nand: gpmi: disable the clocks on errors
MAINTAINERS: add more people to the MTD maintainer team
MAINTAINERS: add a maintainer for the SPI NOR subsystem
- Fix a nasty file descriptor leak when getting line handles.
- A fix for a cleanup that seemed innocent but created a problem
for drivers instantiating several gpiochips for one single
OF node.
- Fix a unpredictable problem using irq_domain_simple() in the
mvebu driver by converting it to a lineas irqdomain.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYHIVAAAoJEEEQszewGV1zsPoP/1ZhUaz+w+KvKkAj0/6mPqiY
tdzuU+LvC9/jnD80EVfkRVITMGTxKFyK8mtKTdkVd5Y4ZZVpCi6dCxuVSYL7ZmRy
dQnjE+H+o3GuhhSsc1sjYPqG3QqWAF6f2bCqraI3HtbLonD1l7DUphfYSrgpDQoX
yVRG4bwti2vpYMuV4wjA/hKUonsyeVkuO/5QQVdG/xXurOcL9z0ByVC9g2vfRQKu
hw5CWx3XthhE/IWxKg9hjDcj4bYYaHlXfPKaBEzXm3wzF6MatJelVC/gIUZUS3wT
mQY7RdQ4flK+rKjMQkQpG6a+b2hufER687EA7LjQ90CFwWQGwpT7JS9ig8sEnvKd
DUtpk5oQJ99nZbVlMJ32AmFRSnSwUf3snbO8iUZvAa/tELdbBjDeXanzM4WMR4tZ
LExyOXQLksUZttkzUM1SF0G1I/QT83vjdyLec7ssOvxuC6FdmjAtp8x6r2deqijx
wkBMmuLayUFJu+lGCt0ssfOZ+14XquYax+1uVi3Uxb4MMrqEAz8YzBiM+Fyfr9tS
sIAT0g96htF3wdiDbi2WA2LttpAghYpjNj9Mkz7BEcvFQexg4+KneYXA33opLOu7
VVvTU5uJ5vegcAYxnEGjzaK4fgpNGgexyXzQZO3YMtMtnmXgfWmhKBV/7/fsB/E2
odeokW90GQumdU5lEGYS
=QmJl
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Some GPIO fixes for the v4.9 series:
- Fix a nasty file descriptor leak when getting line handles.
- A fix for a cleanup that seemed innocent but created a problem for
drivers instantiating several gpiochips for one single OF node.
- Fix a unpredictable problem using irq_domain_simple() in the mvebu
driver by converting it to a lineas irqdomain"
* tag 'gpio-v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio/mvebu: Use irq_domain_add_linear
gpio: of: fix GPIO drivers with multiple gpio_chip for a single node
gpio: GPIO_GET_LINE{HANDLE,EVENT}_IOCTL: Fix file descriptor leak
stack change (after which we can no longer encrypt stuff on the stack).
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYHNwpAAoJECebzXlCjuG++DMP/3mUUAF09DfFR/EHl7knDT1f
kZ53UVHYzr02w0wXfwxVLlp2H7TdSAufgsSvPT6qksA3eY7gL6nJ9zHkl+Nv5yCx
y6vsFWjO1QEUWFOZWCKcmT2dAI3Ddt9IhK13pfZEKN1XKvK2zWB16HEVzSg6fR2K
NwHlpMnQUI4HWThURzwTZb1M5YhxRCAnyiv8BTAAPjbEfzPzdL7j3jxwqtH8bOWp
qIcDDvjC744b9zy0YuAEY/NyGBhYZPdM6gWsBBes1TRzBWUL9qsUYTWDJTmg/F1l
Or0Jz7CUEN9uOHLGnkATPDc+eBg9YFV+bSsSnJu1/W4Er7dX1Af/lol79zEp/Zw1
Snd9FelSPj3vxmYAFTCLnHRTRgsyiDhbbb7gVrzH9bxnCrRNR6p2kY018s1Cl9Td
uWQoNNFQwwnYxWYEeZdO5PgX+pcgoCzhHACNk5oA93YaBE0GuLHHugwwIrYE8TM1
1iY20sLC5lJcnPqxdgnoprZnnHMuL6rx5KRbvBeflNZ4huK2PIcPJyeB83XH6s12
G67PjJ0rfWzSBF14O/ZtQA6he+kXvnH3pKqpNnaMiBxZZ2J8E1eQvrKTLLIwmtlP
18KKJpZIzh7jTTZ/99nAMAt/BGw97P9TToLdnI8dCxYygHEaywpEYtcsE8IWFAvA
3XkS5QdlJhhAaAUUYBXy
=oPbZ
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.9-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
"Fixes for some recent regressions including fallout from the vmalloc'd
stack change (after which we can no longer encrypt stuff on the
stack)"
* tag 'nfsd-4.9-1' of git://linux-nfs.org/~bfields/linux:
nfsd: Fix general protection fault in release_lock_stateid()
svcrdma: backchannel cannot share a page for send and rcv buffers
sunrpc: fix some missing rq_rbuffer assignments
sunrpc: don't pass on-stack memory to sg_set_buf
nfsd: move blocked lock handling under a dedicated spinlock