I wanted to revert my v3.1 commit d0823576bf ("mm: pincer in
truncate_inode_pages_range"), to keep truncate_inode_pages_range() in
synch with shmem_undo_range(); but have stepped back - a change to
hole-punching in truncate_inode_pages_range() is a change to
hole-punching in every filesystem (except tmpfs) that supports it.
If there's a logical proof why no filesystem can depend for its own
correctness on the pincer guarantee in truncate_inode_pages_range() - an
instant when the entire hole is removed from pagecache - then let's
revisit later. But the evidence is that only tmpfs suffered from the
livelock, and we have no intention of extending hole-punch to ramfs. So
for now just add a few comments (to match or differ from those in
shmem_undo_range()), and fix one silliness noticed in d0823576bf4b...
Its "index == start" addition to the hole-punch termination test was
incomplete: it opened a way for the end condition to be missed, and the
loop go on looking through the radix_tree, all the way to end of file.
Fix that pessimization by resetting index when detected in inner loop.
Note that it's actually hard to hit this case, without the obsessive
concurrent faulting that trinity does: normally all pages are removed in
the initial trylock_page() pass, and this loop finds nothing to do. I
had to "#if 0" out the initial pass to reproduce bug and test fix.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
shmem_fault() is the actual culprit in trinity's hole-punch starvation,
and the most significant cause of such problems: since a page faulted is
one that then appears page_mapped(), needing unmap_mapping_range() and
i_mmap_mutex to be unmapped again.
But it is not the only way in which a page can be brought into a hole in
the radix_tree while that hole is being punched; and Vlastimil's testing
implies that if enough other processors are busy filling in the hole,
then shmem_undo_range() can be kept from completing indefinitely.
shmem_file_splice_read() is the main other user of SGP_CACHE, which can
instantiate shmem pagecache pages in the read-only case (without holding
i_mutex, so perhaps concurrently with a hole-punch). Probably it's
silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
ought to be safe, but might bring surprises - not a change to be rushed.
shmem_read_mapping_page_gfp() is an internal interface used by
drivers/gpu/drm GEM (and next by uprobes): it should be okay. And
shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
called internally by the kernel (perhaps for a stacking filesystem,
which might rely on holes to be reserved): it's unclear whether it could
be provoked to keep hole-punch busy or not.
We could apply the same umbrella as now used in shmem_fault() to
shmem_file_splice_read() and the others; but it looks ugly, and use over
a range raises questions - should it actually be per page? can these get
starved themselves?
The origin of this part of the problem is my v3.1 commit d0823576bf
("mm: pincer in truncate_inode_pages_range"), once it was duplicated
into shmem.c. It seemed like a nice idea at the time, to ensure
(barring RCU lookup fuzziness) that there's an instant when the entire
hole is empty; but the indefinitely repeated scans to ensure that make
it vulnerable.
Revert that "enhancement" to hole-punch from shmem_undo_range(), but
retain the unproblematic rescanning when it's truncating; add a couple
of comments there.
Remove the "indices[0] >= end" test: that is now handled satisfactorily
by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
to be worth avoiding here.
But if we do not always loop indefinitely, we do need to handle the case
of swap swizzled back to page before shmem_free_swap() gets it: add a
retry for that case, as suggested by Konstantin Khlebnikov; and for the
case of page swizzled back to swap, as suggested by Johannes Weiner.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <stable@vger.kernel.org> [3.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit f00cdc6df7 ("shmem: fix faulting into a hole while it's
punched") was buggy: Sasha sent a lockdep report to remind us that
grabbing i_mutex in the fault path is a no-no (write syscall may already
hold i_mutex while faulting user buffer).
We tried a completely different approach (see following patch) but that
proved inadequate: good enough for a rational workload, but not good
enough against trinity - which forks off so many mappings of the object
that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
into serious starvation when concurrent faults force the puncher to fall
back to single-page unmap_mapping_range() searches of the i_mmap tree.
So return to the original umbrella approach, but keep away from i_mutex
this time. We really don't want to bloat every shmem inode with a new
mutex or completion, just to protect this unlikely case from trinity.
So extend the original with wait_queue_head on stack at the hole-punch
end, and wait_queue item on the stack at the fault end.
This involves further use of i_lock to guard against the races: lockdep
has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
i_lock around wake_up_bit(), which is comparable to what we do here.
i_lock is more convenient, but we could switch to shmem's info->lock.
This issue has been tagged with CVE-2014-4171, which will require commit
f00cdc6df7 and this and the following patch to be backported: we
suggest to 3.1+, though in fact the trinity forkbomb effect might go
back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
Anyone running trinity on 3.0 and earlier? I don't think we need care.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <stable@vger.kernel.org> [3.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo Korb reported that "repeated mapping of the same file on tmpfs
using remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when
the process exits".
He bisected the bug to d7c1755179 ("mm: implement ->map_pages for
shmem/tmpfs"), although the bug was actually added by commit
8c6e50b029 ("mm: introduce vm_ops->map_pages()").
The problem is caused by calling do_fault_around for a _non-linear_
fault. In this case pgoff is shifted and might become negative during
calculation.
Faulting around non-linear page-fault makes no sense and breaks the
logic in do_fault_around because pgoff is shifted.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Ingo Korb <ingo.korb@tu-dortmund.de>
Tested-by: Ingo Korb <ingo.korb@tu-dortmund.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Ning Qu <quning@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [3.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When compiling a SH2A kernel (e.g. se7206_defconfig or rsk7203_defconfig)
using sh4-linux-gcc, linking fails with:
net/built-in.o: In function `__sk_run_filter':
net/core/filter.c:566: undefined reference to `__fpscr_values'
net/core/filter.c:269: undefined reference to `__fpscr_values'
...
net/built-in.o:net/core/filter.c:580: more undefined references to `__fpscr_values' follow
This happens because sh4-linux-gcc doesn't support the "-m2a-nofpu",
which is thus filtered out by "$(call cc-option, ...)".
As compiling using sh4-linux-gcc is useful for compile coverage, also
try passing "-m4-nofpu" (which is presumably filtered out when using a
real sh2a-linux toolchain) to disable the generation of FPU instructions
and references to __fpscr_values[].
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Tony Breeds <tony@bakeyournoodle.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sasha reported lockdep warning [1] introduced by [2].
It could be fixed by doing disk revalidation out of the init_lock. It's
okay because disk capacity change is protected by init_lock so that
revalidate_disk always sees up-to-date value so there is no race.
[1] https://lkml.org/lkml/2014/7/3/735
[2] zram: revalidate disk after capacity change
Fixes 2e32baea46 ("zram: revalidate disk after capacity change").
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: "Alexander E. Patrakov" <patrakov@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I triggered VM_BUG_ON() in vma_address() when I tried to migrate an
anonymous hugepage with mbind() in the kernel v3.16-rc3. This is
because pgoff's calculation in rmap_walk_anon() fails to consider
compound_order() only to have an incorrect value.
This patch introduces page_to_pgoff(), which gets the page's offset in
PAGE_CACHE_SIZE.
Kirill pointed out that page cache tree should natively handle
hugepages, and in order to make hugetlbfs fit it, page->index of
hugetlbfs page should be in PAGE_CACHE_SIZE. This is beyond this patch,
but page_to_pgoff() contains the point to be fixed in a single function.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 079148b919 ("coredump: factor out the setting of PF_DUMPCORE")
cleaned up the setting of PF_DUMPCORE by removing it from all the
linux_binfmt->core_dump() and moving it to zap_threads().But this ended
up clearing all the previously set flags. This causes issues during
core generation when tsk->flags is checked again (eg. for PF_USED_MATH
to dump floating point registers). Fix this.
Signed-off-by: Silesh C V <svellattu@mvista.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert half of commit d151f9854f21: If isochronous I/O is attempted with
packets larget than 1 kByte, VIA VT6315 rev 01 immediately stops to generate
any interrupts if MSI are used. Fix this by going back to legacy interrupts.
[Thread "Isochronous streaming with VT6315 OHCI",
http://marc.info/?t=139049641500003]
With smaller packets, the loss of IRQs happens too but only very rarely ---
rarely eneough that it was not yet possible for me to determine whether
QUIRK_NO_MSI is an actual fix for this rare variation of this chip bug.
I am keeping QUIRK_CYCLE_TIMER off of VT6315 rev >= 1 because this has been
verified by myself with certainty. On the other hand, I am also keeping
QUIRK_CYCLE_TIMER on for VT6315 rev 0 because I don't know at this time
whether this revision accesses Cycle Timer non-atomically like most of the
other VIA OHCIs are known to do.
Reported-by: Rémy Bruno <remy-fw@remy.trinnov.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
We must mask out the overflow bit as well, otherwise
the wptr will never match the rptr again and the interrupt
handler will loop forever.
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
P4 systems with cpuid level < 4 can have SMT, but the cache topology
description available (cpuid2) does not include SMP information.
Now we know that SMT shares all cache levels, and therefore we can
mark all available cache levels as shared.
We do this by setting cpu_llc_id to ->phys_proc_id, since that's
the same for each SMT thread. We can do this unconditional since if
there's no SMT its still true, the one CPU shares cache with only
itself.
This fixes a problem where such CPUs report an incorrect LLC CPU mask.
This in turn fixes a crash in the scheduler where the topology was
build wrong, it assumes the LLC mask to include at least the SMT CPUs.
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140722133514.GM12054@laptop.lan
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Commit 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low
on space" forgot to free conf->data in nfsd4_encode_lockt and before
sign conf->data to NULL in nfsd4_encode_lock_denied, causing a leak.
Worse, kfree() can be called on an uninitialized pointer in the case of
a succesful lock (or one that fails for a reason other than a conflict).
(Note that lock->lk_denied.ld_owner.data appears it should be zero here,
until you notice that it's one arm of a union the other arm of which is
written to in the succesful case by the
memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid,
sizeof(stateid_t));
in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.)
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: 8c7424cff6 ""nfsd4: don't try to encode conflicting owner if low on space"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
An object can only have an active gtt mapping if it is currently bound
into the global gtt. Therefore we can simply walk the list of all bound
objects and check the flag upon those for an active gtt mapping.
From commit 48018a57a8
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri Dec 13 15:22:31 2013 -0200
drm/i915: release the GTT mmaps when going into D3
Also note that the WARN is inappropriate for this function as GPU
activity is orthogonal to GTT mmap status. Rather it is the caller that
relies upon this condition and so it should assert that the GPU is idle
itself.
References: https://bugs.freedesktop.org/show_bug.cgi?id=80081
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: cherry-pick from -next to -fixes.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
ZONE_DMA is created to allow 32-bit only devices to access memory in the
absence of an IOMMU. On systems where the memory starts above 4GB, it is
expected that some devices have a DMA offset hardwired to be able to
access the bottom of the memory. Linux currently supports DT bindings
for the DMA offsets but they are not (easily) available early during
boot.
This patch tries to guess a DMA offset and assumes that ZONE_DMA
corresponds to the 32-bit mask above the start of DRAM.
Fixes: 2d5a5612bc (arm64: Limit the CMA buffer to 32-bit if ZONE_DMA)
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Mark Salter <msalter@redhat.com>
Tested-by: Anup Patel <anup.patel@linaro.org>
This commit is a supplement to my previous patch.
http://mailman.alsa-project.org/pipermail/alsa-devel/2014-July/079190.html
The special_clk_ctl_put() still returns 0 in error handling case. It should
return -EINVAL.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Here some additional changes to set a capability flag so that clients can
detect when it's appropriate to return -ENOSYS from open.
This amends the following commit introduced in 3.14:
7678ac5061 fuse: support clients that don't implement 'open'
However we can only add the flag to 3.15 and later since there was no
protocol version update in 3.14.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: <stable@vger.kernel.org> # v3.15+
This commit is for correction of my misunderstanding about return value of
.put callback in ALSA Control interface.
According to 'Writing ALSA Driver' (*1), return value of the callback has
three patterns; 1: changed, 0: not changed, an negative value: fatal error.
But I misunderstood that it's boolean; zero or nonzero.
*1: Writing an ALSA Driver (2005, Takashi Iwai)
http://www.alsa-project.org/main/index.php/ALSA_Driver_Documentation
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This commit uses different labels for control elements of digital input/output
interfaces to correct my misunderstanding about M-Audio Firewire 1814 and
ProjectMix I/O.
According to user manuals for these two models, they have two modes for
digital input; one is S/PDIF in both of optical and coaxial interfaces,
another is ADAT in optical interface only.
But in current implementation, a control element for it reduced labels which
a control element for digital output uses because of my misunderstanding
that optical interface is not available for digital input with S/PDIF mode.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In error handling case, special_clk_ctl_put() returns without unlock_mutex(),
therefore the mutex is still locked. This commit moves mutex_lock() after
the error handling case.
This commit is my solution for this post.
[PATCH -next] ALSA: bebob: Fix missing unlock on error in special_clk_ctl_put()
https://lkml.org/lkml/2014/7/20/12
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 554086d ("x86_32, entry: Do syscall exit work on badsys
(CVE-2014-4508)") introduced a regression in the x86_32 syscall entry
code, resulting in syscall() not returning proper errors for undefined
syscalls on CPUs supporting the sysenter feature.
The following code:
> int result = syscall(666);
> printf("result=%d errno=%d error=%s\n", result, errno, strerror(errno));
results in:
> result=666 errno=0 error=Success
Obviously, the syscall return value is the called syscall number, but it
should have been an ENOSYS error. When run under ptrace it behaves
correctly, which makes it hard to debug in the wild:
> result=-1 errno=38 error=Function not implemented
The %eax register is the return value register. For debugging via ptrace
the syscall entry code stores the complete register context on the
stack. The badsys handlers only store the ENOSYS error code in the
ptrace register set and do not set %eax like a regular syscall handler
would. The old resume_userspace call chain contains code that clobbers
%eax and it restores %eax from the ptrace registers afterwards. The same
goes for the ptrace-enabled call chain. When ptrace is not used, the
syscall return value is the passed-in syscall number from the untouched
%eax register.
Use %eax as the return value register in syscall_badsys and
sysenter_badsys, like a real syscall handler does, and have the caller
push the value onto the stack for ptrace access.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1407221022380.31021@titan.int.lan.stealer.net
Reviewed-and-tested-by: Andy Lutomirski <luto@amacapital.net>
Cc: <stable@vger.kernel.org> # If 554086d is backported
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
x86_64 boots and displays fine, but booting x86_32 with CONFIG_HIGHMEM
has frozen with a blank screen throughout 3.16-rc on this ThinkPad T420s,
with i915 generation 6 graphics.
Fix 9d0a6fa6c5 ("drm/i915: add render state initialization"): kunmap()
takes struct page * argument, not virtual address. Which the compiler
kindly points out, if you use the appropriate u32 *batch, instead of
silencing it with a void *.
Why did bisection lead decisively to nearby 229b0489aa ("drm/i915:
add null render states for gen6, gen7 and gen8")? Because the u32
deposited at that virtual address by the previous stub failed the
PageHighMem test, and so did no harm.
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
memmove may be called from module code copy_pages(btrfs), and it may
call memcpy, which may call back to C code, so it needs to use
_GLOBAL_TOC to set up r2 correctly.
This fixes following error when I tried to boot an le guest:
Vector: 300 (Data Access) at [c000000073f97210]
pc: c000000000015004: enable_kernel_altivec+0x24/0x80
lr: c000000000058fbc: enter_vmx_copy+0x3c/0x60
sp: c000000073f97490
msr: 8000000002009033
dar: d000000001d50170
dsisr: 40000000
current = 0xc0000000734c0000
paca = 0xc00000000fff0000 softe: 0 irq_happened: 0x01
pid = 815, comm = mktemp
enter ? for help
[c000000073f974f0] c000000000058fbc enter_vmx_copy+0x3c/0x60
[c000000073f97510] c000000000057d34 memcpy_power7+0x274/0x840
[c000000073f97610] d000000001c3179c copy_pages+0xfc/0x110 [btrfs]
[c000000073f97660] d000000001c3c248 memcpy_extent_buffer+0xe8/0x160 [btrfs]
[c000000073f97700] d000000001be4be8 setup_items_for_insert+0x208/0x4a0 [btrfs]
[c000000073f97820] d000000001be50b4 btrfs_insert_empty_items+0xf4/0x140 [btrfs]
[c000000073f97890] d000000001bfed30 insert_with_overflow+0x70/0x180 [btrfs]
[c000000073f97900] d000000001bff174 btrfs_insert_dir_item+0x114/0x2f0 [btrfs]
[c000000073f979a0] d000000001c1f92c btrfs_add_link+0x10c/0x370 [btrfs]
[c000000073f97a40] d000000001c20e94 btrfs_create+0x204/0x270 [btrfs]
[c000000073f97b00] c00000000026d438 vfs_create+0x178/0x210
[c000000073f97b50] c000000000270a70 do_last+0x9f0/0xe90
[c000000073f97c20] c000000000271010 path_openat+0x100/0x810
[c000000073f97ce0] c000000000272ea8 do_filp_open+0x58/0xd0
[c000000073f97dc0] c00000000025ade8 do_sys_open+0x1b8/0x300
[c000000073f97e30] c00000000000a008 syscall_exit+0x0/0x7c
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 75b57ecf9 refactored device tree nodes to use kobjects such that they
can be exposed via /sysfs. A secondary commit 0829f6d1f furthered this rework
by moving the kobect initialization logic out of of_node_add into its own
of_node_init function. The inital commit removed the existing kref_init calls
in the pseries dlpar code with the assumption kobject initialization would
occur in of_node_add. The second commit had the side effect of triggering a
BUG_ON during DLPAR, migration and suspend/resume operations as a result of
dynamically added nodes being uninitialized.
This patch fixes this by adding of_node_init calls in place of the previously
removed kref_init calls.
Fixes: 0829f6d1f6 ("of: device_node kobject lifecycle fixes")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We now support TASK_SIZE of 16TB, hence the array should be 8.
Fixes the below crash:
Unable to handle kernel paging request for data at address 0x000100bd
Faulting instruction address: 0xc00000000004f914
cpu 0x13: Vector: 300 (Data Access) at [c000000fea75fa90]
pc: c00000000004f914: .sys_subpage_prot+0x2d4/0x5c0
lr: c00000000004fb5c: .sys_subpage_prot+0x51c/0x5c0
sp: c000000fea75fd10
msr: 9000000000009032
dar: 100bd
dsisr: 40000000
current = 0xc000000fea6ae490
paca = 0xc00000000fb8ab00 softe: 0 irq_happened: 0x00
pid = 8237, comm = a.out
enter ? for help
[c000000fea75fe30] c00000000000a164 syscall_exit+0x0/0x98
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This fixes some bugs in emulate_step(). First, the setting of the carry
bit for the arithmetic right-shift instructions was not correct on 64-bit
machines because we were masking with a mask of type int rather than
unsigned long. Secondly, the sld (shift left doubleword) instruction was
using the wrong instruction field for the register containing the shift
count.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These processors do not currently support doorbell IPIs, so remove them
from the feature list if we are at DD 1.xx for the 0x004d part.
This fixes a regression caused by d4e58e5928 (powerpc/powernv: Enable
POWER8 doorbell IPIs). With that patch the kernel would hang at boot
when calling smp_call_function_many, as the doorbell would not be
received by the target CPUs:
.smp_call_function_many+0x2bc/0x3c0 (unreliable)
.on_each_cpu_mask+0x30/0x100
.cpuidle_register_driver+0x158/0x1a0
.cpuidle_register+0x2c/0x110
.powernv_processor_idle_init+0x23c/0x2c0
.do_one_initcall+0xd4/0x260
.kernel_init_freeable+0x25c/0x33c
.kernel_init+0x1c/0x120
.ret_from_kernel_thread+0x58/0x7c
Fixes: d4e58e5928 (powerpc/powernv: Enable POWER8 doorbell IPIs)
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull networking fixes from David Miller:
1) Null termination fix in dns_resolver got the pointer dereferncing
wrong, fix from Ben Hutchings.
2) ip_options_compile() has a benign but real buffer overflow when
parsing options. From Eric Dumazet.
3) Table updates can crash in netfilter's nftables if none of the state
flags indicate an actual change, from Pablo Neira Ayuso.
4) Fix race in nf_tables dumping, also from Pablo.
5) GRE-GRO support broke the forwarding path because the segmentation
state was not fully initialized in these paths, from Jerry Chu.
6) sunvnet driver leaks objects and potentially crashes on module
unload, from Sowmini Varadhan.
7) We can accidently generate the same handle for several u32
classifier filters, fix from Cong Wang.
8) Several edge case bug fixes in fragment handling in xen-netback,
from Zoltan Kiss.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
ipv4: fix buffer overflow in ip_options_compile()
batman-adv: fix TT VLAN inconsistency on VLAN re-add
batman-adv: drop QinQ claim frames in bridge loop avoidance
dns_resolver: Null-terminate the right string
xen-netback: Fix pointer incrementation to avoid incorrect logging
xen-netback: Fix releasing header slot on error path
xen-netback: Fix releasing frag_list skbs in error path
xen-netback: Fix handling frag_list on grant op error path
net_sched: avoid generating same handle for u32 filters
net: huawei_cdc_ncm: add "subclass 3" devices
net: qmi_wwan: add two Sierra Wireless/Netgear devices
wan/x25_asy: integer overflow in x25_asy_change_mtu()
net: ppp: fix creating PPP pass and active filters
net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's
sunvnet: clean up objects created in vnet_new() on vnet_exit()
r8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40
net-gre-gro: Fix a bug that breaks the forwarding path
netfilter: nf_tables: 64bit stats need some extra synchronization
netfilter: nf_tables: set NLM_F_DUMP_INTR if netlink dumping is stale
netfilter: nf_tables: safe RCU iteration on list when dumping
...
Pull sparc fix from David Miller:
"Need to hook up the new renameat2 system call"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Hook up renameat2 syscall.
Pull IDE fixes from David Miller:
- fix interrupt registry for some Atari IDE chipsets.
- adjust Kconfig dependencies for x86_32 specific chips.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: Fix SC1200 dependencies
ide: Fix CS5520 and CS5530 dependencies
m68k/atari - ide: do not register interrupt if host->get_lock is set
as a counter was converted to nanoseconds (silly), and after 1 hour
11 minutes and 34 seconds, this monotonic clock would wrap, causing
havoc with the tracing system and making the clock useless.
He converted that clock to use jiffies_64 and made it into a counter
instead of nanosecond conversions, and displayed the clock with the
straight jiffy count, which works much better than it did in the past.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTzcD+AAoJEKQekfcNnQGuy4UH/2IaAlJov97GTlYcpA6CCOfQ
S6KYN93V/wbJpoRhcVIyk21ugolPCPCA+W8QEHU/yIzTDuy4VzkiYDfEZSNWL9bF
36dmYyNQdeFkfhVK0sHW7/OvF/YcbMsd70N1+NuwFu0m/sDFKlPWiGe8F0GDyRQb
mKDBiAVAAhDtMWycff3iUgA7eJffejf6Hs6Ve9UdxQ4FxvDaS9ISCRzzWkEktlnw
RPHvZZRCd+TvtugjGdfusHXhnKSdZSkt5c0R0DyyTebW1Wgrq9dJGXxuth+hdoww
9iQ6o2YhhoxIo49BwxkTJMsLsS4jC+2KmMEepQ3H7BjTUYg5tWMd9kWAAtxKuDI=
=3sF+
-----END PGP SIGNATURE-----
Merge tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull trace fix from Steven Rostedt:
"Tony Luck found that using the "uptime" trace clock that uses jiffies
as a counter was converted to nanoseconds (silly), and after 1 hour 11
minutes and 34 seconds, this monotonic clock would wrap, causing havoc
with the tracing system and making the clock useless.
He converted that clock to use jiffies_64 and made it into a counter
instead of nanosecond conversions, and displayed the clock with the
straight jiffy count, which works much better than it did in the past"
* tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix wraparound problems in "uptime" trace clock
- recognise and drop Bridge Loop Avoidance packets even if
they are encapsulated in the 802.1q header multiple times.
Forwarding them into the mesh creates issues on other
nodes.
- properly handle VLAN private objects in order to avoid race
conditions upon fast VLAN deletion-addition. Such conditions
create an unrecoverable inconsistency in the TT database of
the nodes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJTzMYfAAoJEJgn97Bh2u9eKLIP+wWwqvRe5hFleA7Xd7vHS769
20TrhDPZrQAcaK8dg8/VpqUZ4oGAi0WHhbhAdur1Vj3Ie5DDsqqu45lK9a/o+PAe
avWafxcPcK5LLoLbDKNxX98n6BN3aNFIp7rUy4CDO7Beix/PfQUYGbZ01IEueNlX
tvKz1oO7r3SvWFELltSU7bndU+0NoZRon5qXSaxnlYHMXcsJEJAKRPE9eLdwXUaF
9h0oIKkPVQt8YFn0w1zZRePSPWGQSAb20exgRGwPxI23xs7ui1i+s5Od9aSt8FcR
e6eNuMDsuHVeAmW+nsxF3WAyYGIGyaTb9sSkwrToXZge7BRFRfphKN1WHD1bp6A5
a0Lu3wkzCJbrS3LZkjt99jh+0XAaaoWkAt4Lu4+VUcMYtfITHHHN4kfmzoPE7Z8y
Qq64KL/ry6v2lqGk2+9G5/oHXMAYAyed+TPk/HSn5O0CS+zXxXFvrvbYyQyFg99X
BcuOD6dGLbfaPQh9XuCE9jJ2D5QHnkAXj2FlK5oFd7y6ASdLltratTYNKJ4T7cVR
+cyBkZ6cI3Ehzq1jrR8/9qqAal+a/jdzne6J7DPnWksDWxnTylANuWecVkETkpcL
mUp6Zv9SYISqQSPtrbE7xu1XW/ICoajc+6H0eEOFhKU+JEqKjxwSE2QoKvzxeC8Y
OHIbq99fItGwH7Vuldkg
=RdJM
-----END PGP SIGNATURE-----
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Antonio Quartulli says:
====================
pull request [net]: batman-adv 20140721
here you have two fixes that we have been testing for quite some time
(this is why they arrived a bit late in the rc cycle).
Patch 1) ensures that BLA packets get dropped and not forwarded to the
mesh even if they reach batman-adv within QinQ frames. Forwarding them
into the mesh means messing up with the TT database of other nodes which
can generate all kind of unexpected behaviours during route computation.
Patch 2) avoids a couple of race conditions triggered upon fast VLAN
deletion-addition. Such race conditions are pretty dangerous because
they not only create inconsistencies in the TT database of the nodes
in the network, but such scenario is also unrecoverable (unless
nodes are rebooted).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon/TN: only enable bapm on MSI systems
drm/radeon: fix VM IB handling
drm/radeon: fix handling of radeon_vm_bo_rmv v3
drm/radeon: let's use GB for vm_size (v2)
Pull media fixes from Mauro Carvalho Chehab:
"A series of driver fixes:
- fix DVB-S tuning with tda1071
- fix tuner probe on af9035 when the device has a bad eeprom
- some fixes for the new si2168/2157 drivers
- one Kconfig build fix (for omap4iss)
- fixes at vpif error path
- don't lock saa7134 ioctl at driver's base core level, as it now
uses V4L2 and VB2 locking schema
- fix audio at hdpvr driver
- fix the aspect ratio at the digital timings table
- one new USB ID (at gspca_pac7302): Genius i-Look 317 webcam"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] gspca_pac7302: Add new usb-id for Genius i-Look 317
[media] tda10071: fix returned symbol rate calculation
[media] tda10071: fix spec inversion reporting
[media] tda10071: add missing DVB-S2/PSK-8 FEC AUTO
[media] tda10071: force modulation to QPSK on DVB-S
[media] hdpvr: fix two audio bugs
[media] davinci: vpif: missing unlocks on error
[media] af9035: override tuner id when bad value set into eeprom
[media] saa7134: use unlocked_ioctl instead of ioctl
[media] media: v4l2-core: v4l2-dv-timings.c: Cleaning up code wrong value used in aspect ratio
[media] si2168: firmware download fix
[media] si2157: add one missing parenthesis
[media] si2168: add one missing parenthesis
[media] staging: tighten omap4iss dependencies
Pull block fixes from Jens Axboe:
"Final block fixes for 3.16
Four small fixes that should go into 3.16, have been queued up for a
bit and delayed due to vacation and other euro duties. But here they
are. The pull request contains:
- Fix for a reported crash with shared tagging on SCSI from Christoph
- A regression fix for drbd. From Lars Ellenberg.
- Hooking up the compat ioctl for BLKZEROOUT, which requires no
translation. From Mikulas.
- A fix for a regression where we woud crash on queue exit if the
root_blkg is gone/not there. From Tejun"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: provide compat ioctl for BLKZEROOUT
blkcg: don't call into policy draining if root_blkg is already gone
drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
block: don't assume last put of shared tags is for the host
Pull libata fixes from Tejun Heo:
"Late libata fixes.
The most important one is from Kevin Hao which makes sure that libata
only allocates tags inside the max tag number the controller supports.
libata always had this problem but the recent tag allocation change
and addition of support for sata_fsl which only supports queue depth
of 16 exposed the issue.
Hans de Goede agreed to become the maintainer of libahci_platform
which is under higher than usual development pressure from all the new
controllers popping up from the ARM world"
* 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
drivers/ata/pata_ep93xx.c: use signed int type for result of platform_get_irq()
libata: EH should handle AMNF error condition as a media error
libata: support the ata host which implements a queue depth less than 32
MAINTAINERS: Add Hans de Goede as ahci-platform maintainer
an x86 change too and it is a regression from 3.14. As it only affects
nested virtualization and there were other changes in this area in 3.16,
I am not nominating it for 3.15-stable.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJTyTTfAAoJEBvWZb6bTYbyytIQAJare/EWQmNBDK57EcJBIlJS
6MW2XnASEW+KCoUw0+u3sm9eaRXQdmJRb1Aw5zxTiUIR3ZSI8MDSQr1XxEgTAOtE
vFZjonPwlbnE8edLMhH3v/6/v9oO7bwNTDYeOE2pKPRfgPRjFmj1QUOJkvzRnRwj
kS5M4RtI+VqhdyJW8f4HaWqoRaOAISp3ZjQUJQdab3DWsf9ZpNjwLNjKzGZKNvIN
Klcpi7JH32zawUfqnAvph/7NsrBGrpFRE+j+JU9LLnD9PehuXwqZbWh01g2Anbq2
TKVrmXW+YnoD1IZsDw7r/14FaeRweV7yALA/eA9F4KfSyF2Qm9RbjVVdrUYz0CHV
aIl0cZeZM8xRCLy/ZWj+dOQ23RWelZaslHSpshKOznoRsuuvVwpx93zVtRwlw2dx
4WJ2A5gYA+ZUQ7eWjk83381JXkbRDUb3cO+NL8t9GnFctCJzT/gQHjqu15f7TJ2Q
gKhmeciKOS3xY4sQ+ti6gv8CwIFYqgdTzkxedxSgS9xpiAmw9v57V7WukXoXB6zl
AyjEAk9FFOeBZ5nXs0ObK5LKjI+MJoZ3X0bin7PCuT6dFrIA2yHvo5EgMvOcUua9
8Tu9L8sWv/JsKjuqebkKxekAKvv0CV35Q8OsLpEF6RI0eXyiXy2extk1LzUuK9cx
ZVYbN263++En/tgH2AJM
=Vdqn
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"These are mostly PPC changes for 3.16-new things. However, there is
an x86 change too and it is a regression from 3.14. As it only
affects nested virtualization and there were other changes in this
area in 3.16, I am not nominating it for 3.15-stable"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Check for nested events if there is an injectable interrupt
KVM: PPC: RTAS: Do byte swaps explicitly
KVM: PPC: Book3S PR: Fix ABIv2 on LE
KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
PPC: Add _GLOBAL_TOC for 32bit
KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value
KVM: PPC: Book3E: Unlock mmu_lock when setting caching atttribute
Pull s390 fixes from Martin Schwidefsky:
"A couple of last minute bug fixes for 3.16, including a fix for ptrace
to close a hole which allowed a user space program to write to the
kernel address space"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix restore of invalid floating-point-control
s390/zcrypt: improve device probing for zcrypt adapter cards
s390/ptrace: fix PSW mask check
s390/MSI: Use standard mask and unmask funtions
s390/3270: correct size detection with the read-partition command
s390: require mvcos facility, not tod clock steering facility
commit 4be173813e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jun 6 10:22:29 2014 +0100
drm/i915: Reorder semaphore deadlock check
did the majority of the work, but it missed one crucial detail:
The check for the unkickable deadlock on this ring must come after the
check whether the ring that we are waiting on has already passed its
target seqno.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80709
Tested-by: Stefan Huber <shuber@sthu.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Calling radeon_vm_bo_find on the IB BO during CS
is illegal and can lead to an crash.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v3: completely rewritten. We now just remember which areas
of the PT to clear and do so on the next command submission.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=79980
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VM sizes smaller than 1GB doesn't make much sense anyway.
v2: fix typo and grammer
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
BorisO reports that misc_register() fails often on xen. The current code
unregisters the CPU hotplug notifier in that case. If then a CPU is
offlined and onlined back again, we end up with a second timer running
on that CPU, leading to soft lockups and system hangs.
So let's leave the hotcpu notifier always registered - even if
mce_device_create failed for some cores and never unreg it so that we
can deal with the timer handling accordingly.
Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1403274493-1371-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>