When a memory error happens on an in-use page or (free and in-use)
hugepage, the victim page is isolated with its refcount set to one.
When you try to unpoison it later, unpoison_memory() calls put_page()
for it twice in order to bring the page back to free page pool (buddy or
free hugepage list). However, if another memory error occurs on the
page which we are unpoisoning, memory_failure() returns without
releasing the refcount which was incremented in the same call at first,
which results in memory leak and unconsistent num_poisoned_pages
statistics. This patch fixes it.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <stable@vger.kernel.org> [2.6.32+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit ad86622b47 ("wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide
EXIT_TRACE from user-space") the order of task state definitions were
changed: EXIT_DEAD and EXIT_ZOMBIE were swapped. Though the charterers
for the states in TASK_STATE_TO_CHAR_STR string were not updated. This
patch synchronizes the string to the order of definitions.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 284f39afea ("mm: memcg: push !mm handling out to page cache
charge function") explicitly checks for page cache charges without any
mm context (from kernel thread context[1]).
This seemed to be the only possible case where memory could be charged
without mm context so commit 03583f1a63 ("memcg: remove unnecessary
!mm check from try_get_mem_cgroup_from_mm()") removed the mm check from
get_mem_cgroup_from_mm(). This however caused another NULL ptr
dereference during early boot when loopback kernel thread splices to
tmpfs as reported by Stephan Kulow:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000360
IP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
Oops: 0000 [#1] SMP
Modules linked in: btrfs dm_multipath dm_mod scsi_dh multipath raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod parport_pc parport nls_utf8 isofs usb_storage iscsi_ibft iscsi_boot_sysfs arc4 ecb fan thermal nfs lockd fscache nls_iso8859_1 nls_cp437 sg st hid_generic usbhid af_packet sunrpc sr_mod cdrom ata_generic uhci_hcd virtio_net virtio_blk ehci_hcd usbcore ata_piix floppy processor button usb_common virtio_pci virtio_ring virtio edd squashfs loop ppa]
CPU: 0 PID: 97 Comm: loop1 Not tainted 3.15.0-rc5-5-default #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
__mem_cgroup_try_charge_swapin+0x40/0xe0
mem_cgroup_charge_file+0x8b/0xd0
shmem_getpage_gfp+0x66b/0x7b0
shmem_file_splice_read+0x18f/0x430
splice_direct_to_actor+0xa2/0x1c0
do_lo_receive+0x5a/0x60 [loop]
loop_thread+0x298/0x720 [loop]
kthread+0xc6/0xe0
ret_from_fork+0x7c/0xb0
Also Branimir Maksimovic reported the following oops which is tiggered
for the swapcache charge path from the accounting code for kernel threads:
CPU: 1 PID: 160 Comm: kworker/u8:5 Tainted: P OE 3.15.0-rc5-core2-custom #159
Hardware name: System manufacturer System Product Name/MAXIMUSV GENE, BIOS 1903 08/19/2013
task: ffff880404e349b0 ti: ffff88040486a000 task.ti: ffff88040486a000
RIP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
Call Trace:
__mem_cgroup_try_charge_swapin+0x45/0xf0
mem_cgroup_charge_file+0x9c/0xe0
shmem_getpage_gfp+0x62c/0x770
shmem_write_begin+0x38/0x40
generic_perform_write+0xc5/0x1c0
__generic_file_aio_write+0x1d1/0x3f0
generic_file_aio_write+0x4f/0xc0
do_sync_write+0x5a/0x90
do_acct_process+0x4b1/0x550
acct_process+0x6d/0xa0
do_exit+0x827/0xa70
kthread+0xc3/0xf0
This patch fixes the issue by reintroducing mm check into
get_mem_cgroup_from_mm. We could do the same trick in
__mem_cgroup_try_charge_swapin as we do for the regular page cache path
but it is not worth troubles. The check is not that expensive and it is
better to have get_mem_cgroup_from_mm more robust.
[1] - http://marc.info/?l=linux-mm&m=139463617808941&w=2
Fixes: 03583f1a63 ("memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()")
Reported-and-tested-by: Stephan Kulow <coolo@suse.com>
Reported-by: Branimir Maksimovic <branimir.maksimovic@gmail.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MADV_WILLNEED currently does not read swapped out shmem pages back in.
Commit 0cd6144aad ("mm + fs: prepare for non-page entries in page
cache radix trees") made find_get_page() filter exceptional radix tree
entries but failed to convert all find_get_page() callers that WANT
exceptional entries over to find_get_entry(). One of them is shmem swap
readahead in madvise, which now skips over any swap-out records.
Convert it to find_get_entry().
Fixes: 0cd6144aad ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some testing I ran today (some fio jobs that spread over two nodes),
we end up spending 40% of the time in filemap_check_errors(). That
smells fishy. Looking further, this is basically what happens:
blkdev_aio_read()
generic_file_aio_read()
filemap_write_and_wait_range()
if (!mapping->nr_pages)
filemap_check_errors()
and filemap_check_errors() always attempts two test_and_clear_bit() on
the mapping flags, thus dirtying it for every single invocation. The
patch below tests each of these bits before clearing them, avoiding this
issue. In my test case (4-socket box), performance went from 1.7M IOPS
to 4.0M IOPS.
Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For handling a free hugepage in memory failure, the race will happen if
another thread hwpoisoned this hugepage concurrently. So we need to
check PageHWPoison instead of !PageHWPoison.
If hwpoison_filter(p) returns true or a race happens, then we need to
unlock_page(hpage).
Signed-off-by: Chen Yucong <slaoub@gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: <stable@vger.kernel.org> [2.6.36+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I got a patch from the original author, Fred Brooks, to add a small
settling delay after setting the AI channel multiplexor. The lack of
delay resulted in unstable or scrambled data on faster processors.
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Reported-by: Fred Brooks <nsaspook@nsaspook.com>
Cc: <stable@vger.kernel.org> # 3.7.x - 3.15.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
22c9bcad85 contained a bad
substitution for ROOT_W => S_IRUSR|S_IRUGO instead of
S_IWUSR|S_IRUGO.
Fixes: 22c9bcad85
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'renameat2()' system call was incorrectly added as a ENTRY_COMP() in
the parisc system call table by commit 18e480aa07 ("parisc: add
renameat2 syscall"). That causes a link-time error due to there not
being any compat version of that system call:
arch/parisc/kernel/built-in.o: In function `sys_call_table':
(.rodata+0xad0): undefined reference to `compat_sys_renameat2'
make: *** [vmlinux] Error 1
Easily fixed by marking the system call as being the same for compat as
for native by using ENTRY_SAME() instead of ENTRY_COMP().
Reported-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- two fixes concerning iio ADC triggers for at91sam9260 and at91sam9g20
one for the "device" file, the other for the DT.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJTfjD/AAoJEAf03oE53VmQ1p0IAJL/FmsbXfHLLu8KvQEXMW8W
Y3c22v2mHSHpEFjqjFg+LbOkRbFaPTlszSXuKyw7HJPiOygUk/C1i2MUW4WLNlyY
fqu+tiAOrcfJ8rapM5ndFdeMZrkyRQIInFEPzOttQJSN4Aczpc6irfMqOOqw0y/s
Sf88MiK2YgeXnt5dH6mDYhy29glvspek3w1FmDN3DjmiAdxAUV8bSpOmw2FsFAX7
QJOwkevln+3Ddxnq+NsG/6KxdAl0fj+G3shl4GwALr+hizKypzsFsSGDuyKuAL2o
A7e+OFoUWN8eCTJBdP9YA/lVyl83uFMM3tBcHr4Oq8dJk6sQ0GNQcdJuCUBGTBw=
=ehhW
-----END PGP SIGNATURE-----
Merge tag 'at91-fixes2' of git://github.com/at91linux/linux-at91 into fixes
Second 3.15 fixes for AT91
- two fixes concerning iio ADC triggers for at91sam9260 and at91sam9g20
one for the "device" file, the other for the DT.
* tag 'at91-fixes2' of git://github.com/at91linux/linux-at91:
ARM: at91: sam9260: fix compilation issues
ARM: at91/dt: sam9260: correct external trigger value
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
call->async_workfn() can take an afs_call* arg rather than a work_struct* as
the functions assigned there are now called from afs_async_workfn() which has
to call container_of() anyway.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
At present, it is not possible to successfully unload the kafs module if there
are outstanding async outgoing calls (those made with afs_make_call()). This
appears to be due to the changes introduced by:
commit 059499453a
Author: Tejun Heo <tj@kernel.org>
Date: Fri Mar 7 10:24:50 2014 -0500
Subject: afs: don't use PREPARE_WORK
which didn't go far enough. The problem is due to:
(1) The aforementioned commit introduced a separate handler function pointer
in the call, call->async_workfn, in addition to the original workqueue
item, call->async_work, for asynchronous operations because workqueues
subsystem cannot handle the workqueue item pointer being changed whilst
the item is queued or being processed.
(2) afs_async_workfn() was introduced in that commit to be the callback for
call->async_work. Its sole purpose is to run whatever call->async_workfn
points to.
(3) call->async_workfn is only used from afs_async_workfn(), which is only
set on async_work by afs_collect_incoming_call() - ie. for incoming
calls.
(4) call->async_workfn is *not* set by afs_make_call() when outgoing calls are
made, and call->async_work is set afs_process_async_call() - and not
afs_async_workfn().
(5) afs_process_async_call() now changes call->async_workfn rather than
call->async_work to point to afs_delete_async_call() to clean up, but this
is only effective for incoming calls because call->async_work does not
point to afs_async_workfn() for outgoing calls.
(6) Because, for incoming calls, call->async_work remains pointing to
afs_process_async_call() this results in an infinite loop.
Instead, make the workqueue uniformly vector through call->async_workfn, via
afs_async_workfn() and simply initialise call->async_workfn to point to
afs_process_async_call() in afs_make_call().
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Split afs_end_call() into two pieces, one of which is identical to code in
afs_process_async_call(). Replace the latter with a call to the first part of
afs_end_call().
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
The recent Intel H97/Z97 chipsets need the similar setups like other
Intel chipsets for snooping, etc. Especially without snooping, the
audio playback stutters or gets corrupted. This fix patch just adds
the corresponding PCI ID entry with the proper flags.
Reported-and-tested-by: Arthur Borsboom <arthurborsboom@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Two issues:
o For beql_op, beql_op, bne_op, bnel_op, blez_op, blezl_op, bgtz_op and
bgtzl_op the wrong field was being checked for the instruction opcode.
o For blez_op / blezl_op and bgtz_op / bgtzl_op the test was testing
for the wrong opcode.
This bug got introduced by d8d4e3ae0b [MIPS
Kprobes: Refactor branch emulation].
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Acked-by: Victor Kamensky <kamensky@cisco.com>
The recent change in sysfs, bcdde7e221
"sysfs: make __sysfs_remove_dir() recursive" revealed an asymmetric
rphy device creation/deletion sequence in scsi_transport_sas:
modprobe mpt2sas
sas_rphy_add
device_add A rphy->dev
device_add B sas_device transport class
device_add C sas_end_device transport class
device_add D bsg class
rmmod mpt2sas
sas_rphy_delete
sas_rphy_remove
device_del B
device_del C
device_del A
sysfs_remove_group recursive sysfs dir removal
sas_rphy_free
device_del D warning
where device A is the parent of B, C, and D.
When sas_rphy_free tries to unregister the bsg request queue (device D
above), the ensuing sysfs cleanup discovers that its sysfs group has
already been removed and emits a warning, "sysfs group... not found for
kobject 'end_device-X:0'".
Since bsg creation is a side effect of sas_rphy_add, move its
complementary removal call into sas_rphy_remove. This imposes the
following tear-down order for the devices above: D, B, C, A.
Note the sas_device and sas_end_device transport class devices (B and C
above) are created and destroyed both via the list match traversal in
attribute_container_device_trigger, so the order in which they are
handled is fixed. This is fine as long as they are deleted before their
parent device.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Current code only touches the direction register when setting direction
to output, which breaks logic like
echo high > /sys/class/gpio/gpio0/direction
which is expected to also set the value. This patch also adds a call
to update the value register when setting direction to output.
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Acked-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
batman tries to search dev->iflink to check if it's a batman interface,
but ->iflink could be 0, which is not a valid ifindex. It should just
avoid iflink == 0 case.
Reported-by: Jet Chen <jet.chen@intel.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Antonio Quartulli <antonio@open-mesh.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the value of itag is a random value from stack, and may not be initiated by
fib_validate_source, which called fib_combine_itag if CONFIG_IP_ROUTE_CLASSID
is not set
This will make the cached dst uncertainty
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ALB learning packets are currentlyalways sent using the slave mac
address for all vlans configured on top of bond. This is not always
correct, as vlans may change their mac address.
This patch introduced a concept of strict matching where the
source of learning packets can either strictly match the address
passed in, or it can determine a more correct address to use.
There are 3 casese to consider:
1) Switchover. In this case, we have a new active slave and we need
tell the switch about all addresses available on the slave.
2) Monitor. We'll periodically refresh learning info for all slaves.
In this case, we refresh all addresses for current active, and just
the slave address for other slaves.
3) Teaching of disabled adddress. This happens as part of the
failover and in this case, we alwyas to use just the address
provided.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLB/ALB learning packets always assume 802.1Q vlan protocol, but
that is no longer the case since we now have support for Q-in-Q
on top of bonding. Pass the vlan protocol to alb_send_lp_vid()
so that the packets are properly tagged.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlN8ibsACgkQjTAFq1RaXHPXQQCdF6gWW/tCObrWO8cWHQJCRij+
FVwAn2cAfA0W8goptL45550Nhzd1ijQf
=HF4f
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-3.15-20140521' of git://gitorious.org/linux-can/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2014-05-21
this is a pull request for net/master, for the v3.15 release cycle, with a
single patch. Christopher R. Baker found a use after free during unloading of
the peak_pci driver. This is fixes in a patch by Stephane Grosjean.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 61b905da33 ("net: Rename skb->rxhash to skb->hash"), skb->rxhash
was renamed to skb->hash. Update references in Documentation
accordingly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stmmac_open call was calling clk_disable_unprepare on phy init
failure, but it never calls clk_prepare_enable, this causes
a WARN_ON in the clk framework to trigger if for some reason phy init
fails.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
tc_mode() can be called from interrupt context and thus must not call
clk_*prepare*() functions.
Signed-off-by: David Jander <david@protonic.nl>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
irqchip will reject the affinity set to CPUs which is not online
yet. but in the CPU1 wakeup stage, OS only sets CPU1 to be online
after local timer is set, so that causes the irq_set_affinity not
work. this patch moves to irq_force_affinity() for the low level
boot stage.
Signed-off-by: Zhiwu Song <Zhiwu.Song@csr.com>
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Use the hexadecimal values for the triggers to match what is done for the device
tree. This also fixes compilation issues as the defines have been moved
elsewhere.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Due a copy/paste error, the 'reg' values for the third PCIe interface
on Armada 380, and the third and fourth PCIe interfaces on Armada 385
are wrong: they are equal to the one of the second PCIe interface.
This patch fixes this by using the appropriate 'reg' values for those
PCIe interfaces.
Without this fix, the third and fourth PCIe interfaces are unusable on
those platforms.
Reported-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: https://lkml.kernel.org/r/1400597008-4148-1-git-send-email-thomas.petazzoni@free-electrons.com
Fixes: 0d3d96ab00 ("ARM: mvebu: add Device Tree description of the Armada 380/385 SoCs")
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Lai found that:
WARNING: CPU: 1 PID: 13 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x2d/0x4b()
...
migration_cpu_stop+0x1d/0x22
was caused by set_cpus_allowed_ptr() assuming that cpu_active_mask is
always a sub-set of cpu_online_mask.
This isn't true since 5fbd036b55 ("sched: Cleanup cpu_active madness").
So set active and online at the same time to avoid this particular
problem.
Fixes: 5fbd036b55 ("sched: Cleanup cpu_active madness")
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael wang <wangyun@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/53758B12.8060609@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tejun reported that his resume was failing due to order-3 allocations
from sched_domain building.
Replace the NR_CPUS arrays in there with a dynamically allocated
array.
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-7cysnkw1gik45r864t1nkudh@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tejun reported that his resume was failing due to order-3 allocations
from sched_domain building.
Replace the NR_CPUS arrays in there with a dynamically allocated
array.
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-kat4gl1m5a6dwy6nzuqox45e@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Michael Kerrisk noticed that creating SCHED_DEADLINE reservations
with certain parameters (e.g, a runtime of something near 2^64 ns)
can cause a system freeze for some amount of time.
The problem is that in the interface we have
u64 sched_runtime;
while internally we need to have a signed runtime (to cope with
budget overruns)
s64 runtime;
At the time we setup a new dl_entity we copy the first value in
the second. The cast turns out with negative values when
sched_runtime is too big, and this causes the scheduler to go crazy
right from the start.
Moreover, considering how we deal with deadlines wraparound
(s64)(a - b) < 0
we also have to restrict acceptable values for sched_{deadline,period}.
This patch fixes the thing checking that user parameters are always
below 2^63 ns (still large enough for everyone).
It also rewrites other conditions that we check, since in
__checkparam_dl we don't have to deal with deadline wraparounds
and what we have now erroneously fails when the difference between
values is too big.
Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Dario Faggioli<raistlin@linux.it>
Cc: Dave Jones <davej@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140513141131.20d944f81633ee937f256385@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The way we read POSIX one should only call sched_getparam() when
sched_getscheduler() returns either SCHED_FIFO or SCHED_RR.
Given that we currently return sched_param::sched_priority=0 for all
others, extend the same behaviour to SCHED_DEADLINE.
Requested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: linux-man <linux-man@vger.kernel.org>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20140512205034.GH13467@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The scheduler uses policy=-1 to preserve the current policy state to
implement sys_sched_setparam(), this got exposed to userspace by
accident through sys_sched_setattr(), cure this.
Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140509085311.GJ30445@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The documented[1] behavior of sched_attr() in the proposed man page text is:
sched_attr::size must be set to the size of the structure, as in
sizeof(struct sched_attr), if the provided structure is smaller
than the kernel structure, any additional fields are assumed
'0'. If the provided structure is larger than the kernel structure,
the kernel verifies all additional fields are '0' if not the
syscall will fail with -E2BIG.
As currently implemented, sched_copy_attr() returns -EFBIG for
for this case, but the logic in sys_sched_setattr() converts that
error to -EFAULT. This patch fixes the behavior.
[1] http://thread.gmane.org/gmane.linux.kernel/1615615/focus=1697760
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/536CEC17.9070903@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Also a minor one line fix for audio clock on 54xx.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTeopoAAoJEBvUPslcq6VzpekP/0fMNISCDPsyvAuarUrZWOay
yqkAQioDt5Bl6LKR9SYmnLCDqR9vPv7dr5kvL7TMc5Knph7faAdKSP0a6uQvyU+n
MpZaTQQ/9gsA/FJgD2YXIE25sOYOUX5WhCj+oqZZ96f6HUo9g8Ac5Mlyev21vpP+
92Q4AY28iH1035PTUYPaxs9PbHWK/Cmb2j/N3UppIppwoEob9xhl+KaPAkwE30LE
QePNY1aIqSjv+2kJB+IwDyf58eYxr1x8R/Te8rAdgVxBIc46CELrdkpG60okfWxv
V3oC5B+n1d9Qb179Dp3EsBOW4RnqtF2qtrLiEwIK2beBBiMVBEFRGLMuu1ua3m7y
18GVP90Uz5RT9ceCgiyvgwo9AkdZHhUxtjRTT51i48oSjW/tl/yn8YX0kLlr73x+
FM0m4Wb0Fe2ZlIbkJUL38jvEwxo5J5dt5/5WLAqi0YWFt2E4d2Jk+mYKj2eGFmyA
R19Fa8wW1ZLs0dqSTIk8s/HCSf9nyhuntmfYbkAO0ziKSM5l9h/CDx4j4bTkE3D9
RfMaICklLuM0gcK9KfEkSS47LKleWKEVOCX4uersg0ydsMtla3SmU6e0b+kDKus+
1XBqbpIeWKSfl95bMLiuNaoJZAsQsocihJwfsw0MGDiv/JXUlyk3/BsQTg9FdowW
4Ijtq2kmzTI3jtOY9dmu
=ioRt
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v3.15/fixes-v3-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Pull "omap fixes for v3.15-rc cycle" from Tony Lindgren:
Regression fixes for omaps for NAND, DMA, cpu_idle and audio.
Also a minor one line fix for audio clock on 54xx.
* tag 'omap-for-v3.15/fixes-v3-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP4: Fix the boot regression with CPU_IDLE enabled
ARM: OMAP2+: Fix DMA hang after off-idle
ARM: OMAP2+: nand: Fix NAND on OMAP2 and OMAP3 boards
ARM: omap5: hwmod_data: Correct IDLEMODE for McPDM
ARM: OMAP3: clock: Back-propagate rate change from cam_mclk to dpll4_m5 on all OMAP3 platforms
Signed-off-by: Olof Johansson <olof@lixom.net>
If we fail to allocate struct platform_device pdev we
dereference it after the goto label err.
This bug was found using coccinelle.
Fixes: afa77ef (ARM: mx3: dynamically allocate "ipu-core" devices)
Signed-off-by: Emil Goode <emilgoode@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
- Remove g2d_pd and mau_pd nodes on exynos5420.
Since the power domains are linked to the CMU blocks,
kernel panic happens during access clocks when the
power domains are disabled. Now this is a best solution.
- Enable HS-I2C on exynos5 by default
MMC partition cannot be mounted for RFS without the
enabling HS-I2C because regulators for MMC power are
connected to HS-I2C bus.
- Disable MDMA1 node on exynos5420
When MDMA1 runs in secure mode it makes kernel fault,
so need to disalbe it on exynos5420 by default instead
of each board.
- Fix the secondary CPU boot for exynos4212
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTc8ZZAAoJEA0Cl+kVi2xqhKgQAJdt3OjRt22NTqzSvvwYXEqd
/pLptj4JQ+M4LFlPLPpHakuDN8UBy8PR3C9C8ksWfqqKEeKTfZx0icQTCItXE4uF
ZCRIcwMHzYkpgXjkkB1oFFa0JFrJYUEtTi0TJCNuxyXqt3Llr77WKZ7LBYKzepKi
vnqVZeAzlWSvyRObGGLK8GpOgfdwFBORwwwiTO42tz5AYki6rbZKTq+9ffKC/iTT
SBipYtJ1XILoQgjKmbVgN4G3bHxtfHv91jrqfVfl+0mjl3c02WfO057OE6E9qwJ1
DuSirV0M/YhDAWgKpRBAKNosPRamPXoMG8UbMqIo+yk95ZNtXleXkJ7UU8c4xBbW
OxR5ivG6urUrDxCYqT9tlx1f4gAZUhM6yio3ikP3G+55OED8pgSxYgtore+wGzT9
hbqwYgTsS3f0HYt3Hn4e4NOzvpJyH7SjDdfiWv3pfloTwCS7x4X/CRTby5bde7Zc
5GxrEyQvZ2hSZEAP96jAM5BvW8AGuy0sCi4t44PLm/igqw/P880XBrr3RdiEva8W
bwwxhzcJwL1BsLxc7BGMaTAsd/+Tx0VLsp4FRgWseQW4xjHhaAK5s0IhPDUi8soH
1U/dHGIUGOl1VB8b3C/PQEq2pV7nc/VZcKzDrLIBG754ZmwYrpm5iMBmuPrMsgsn
4uLifrHoJRn2upWmdzOe
=kz4r
-----END PGP SIGNATURE-----
Merge tag 'samsung-fixes' of http://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes
Samsung fixes for 3.15 from Kukjin Kim:
- Remove g2d_pd and mau_pd nodes on exynos5420.
Since the power domains are linked to the CMU blocks,
kernel panic happens during access clocks when the
power domains are disabled. Now this is a best solution.
- Enable HS-I2C on exynos5 by default
MMC partition cannot be mounted for RFS without the
enabling HS-I2C because regulators for MMC power are
connected to HS-I2C bus.
- Disable MDMA1 node on exynos5420
When MDMA1 runs in secure mode it makes kernel fault,
so need to disalbe it on exynos5420 by default instead
of each board.
- Fix the secondary CPU boot for exynos4212
* tag 'samsung-fixes' of http://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: dts: Remove g2d_pd node for exynos5420
ARM: dts: Remove mau_pd node for exynos5420
ARM: exynos_defconfig: enable HS-I2C to fix for mmc partition mount
ARM: dts: disable MDMA1 node for exynos5420
ARM: EXYNOS: fix the secondary CPU boot of exynos4212
Signed-off-by: Olof Johansson <olof@lixom.net>
radeon fixes, VCE one is big but does fix a userspace crash.
* 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux:
drm/radeon/pm: don't allow debugfs/sysfs access when PX card is off (v2)
drm/radeon: avoid segfault on device open when accel is not working.
drm/radeon: fix typo in finding PLL params
drm/radeon: fix register typo on si
drm/radeon: fix buffer placement under memory pressure v2
drm/radeon: fix page directory update size estimation
drm/radeon: handle non-VGA class pci devices with ATRM
drm/radeon: fix DCE83 check for mullins
drm/radeon: check VCE relocation buffer range v3
drm/radeon: also try GART for CPU accessed buffers
fixes nasty panel bleeding bug.
* 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/gf119-/disp: fix nasty bug which can clobber SOR0's clock setup
drm/nvd9/therm: handle another kind of PWM fan
The oops can be triggered in qemu using -no-hpet (but not nohpet) by
running a 32-bit program and reading a couple of pages before the vdso.
This should send SIGBUS instead of OOPSing.
The bug was introduced by:
commit 7a59ed415f
Author: Stefani Seibold <stefani@seibold.net>
Date: Mon Mar 17 23:22:09 2014 +0100
x86, vdso: Add 32 bit VDSO time support for 32 bit kernel
which is new in 3.15.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/e99025d887d6670b6c4d81e6ccfeeb83770b21e9.1400109621.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
When GRE support was added in linux-3.14, CHECKSUM_COMPLETE handling
broke on GRE+IPv6 because we did not update/use the appropriate csum :
GRO layer is supposed to use/update NAPI_GRO_CB(skb)->csum instead of
skb->csum
Tested using a GRE tunnel and IPv6 traffic. GRO aggregation now happens
at the first level (ethernet device) instead of being done in gre
tunnel. Native IPv6+TCP is still properly aggregated.
Fixes: bf5a755f5e ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The count which is used to get_unmap_data maybe not the same as the
count computed in dmaengine_unmap which causes to free data in a
wrong pool.
This patch fixes this issue by keeping the map count with unmap_data
structure and use this count to get the pool.
Cc: <stable@vger.kernel.org>
Signed-off-by: Xuelin Shi <xuelin.shi@freescale.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We need to use writel() instead of writel_relaxed() when starting
a channel, to ensure all the descriptors have been flushed before
the activation.
While at it, remove the unneeded read-modify-write and make the
code simpler.
Cc: <stable@vger.kernel.org>
Signed-off-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>