Admission control is of key importance for SCHED_DEADLINE, since
it guarantees system schedulability (or tells us something about
the degree of guarantees we can provide to the user).
This patch improves and clarifies bits and pieces regarding AC,
both for UP and SMP systems.
Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Reviewed-by: Henrik Austad <henrik@austad.us>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1410256636-26171-4-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Section 4 intro was still describing the old interface. Rewrite
it.
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Henrik Austad <henrik@austad.us>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1410256636-26171-3-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Several small changes regarding SCHED_DEADLINE documentation
that fix terminology and improve clarity and readability:
- "current runtime" becomes "remaining runtime"
- readablity of an equation is improved by introducing more spacing
- clarify when admission control will certainly fail
- new URL for CBS technical report
- substitue "smallest" with "earliest"
Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Reviewed-by: Henrik Austad <henrik@austad.us>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1410256636-26171-2-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When running workloads on 2+ socket systems, based on perf profiles, the
update_cfs_rq_blocked_load() function often shows up as taking up a
noticeable % of run time.
Much of the contention is in __update_cfs_rq_tg_load_contrib() when we
update the tg load contribution stats. However, it turns out that in many
cases, they don't need to be updated and "tg_contrib" is 0.
This patch adds a check in __update_cfs_rq_tg_load_contrib() to skip updating
tg load contribution stats when nothing needs to be updated. This reduces the
cacheline contention that would be unnecessary.
Reviewed-by: Ben Segall <bsegall@google.com>
Reviewed-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: jason.low2@hp.com
Cc: Yuyang Du <yuyang.du@intel.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Chegu Vinod <chegu_vinod@hp.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1409643684.19197.15.camel@j-VirtualBox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Current code can fail to migrate a waking task (silently) when TTWU_QUEUE is
enabled.
When a task is waking, it is pending on the wake_list of the rq, but it is not
queued (task->on_rq == 0). In this case, set_cpus_allowed_ptr() and
__migrate_task() will not migrate it because its invisible to them.
This behavior is incorrect, because the task has been already woken, it will be
running on the wrong CPU without correct placement until the next wake-up or
update for cpus_allowed.
To fix this problem, we need to finish the wakeup (so they appear on
the runqueue) before we migrate them.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Tested-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/538ED7EB.5050303@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The functions task_cputime_adjusted and thread_group_cputime_adjusted()
can be called locklessly, as well as concurrently on many different CPUs.
This can occasionally lead to the utime and stime reported by times(), and
other syscalls like it, going backward. The cause for this appears to be
multiple threads racing in cputime_adjust(), both with values for utime or
stime that is larger than the original, but each with a different value.
Sometimes the larger value gets saved first, only to be immediately
overwritten with a smaller value by another thread.
Using atomic exchange prevents that problem, and ensures time
progresses monotonically.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: umgwanakikbuti@gmail.com
Cc: fweisbec@gmail.com
Cc: akpm@linux-foundation.org
Cc: srao@redhat.com
Cc: lwoodman@redhat.com
Cc: atheurer@redhat.com
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/1408133138-22048-4-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Both times() and clock_gettime(CLOCK_PROCESS_CPUTIME_ID) have scalability
issues on large systems, due to both functions being serialized with a
lock.
The lock protects against reporting a wrong value, due to a thread in the
task group exiting, its statistics reporting up to the signal struct, and
that exited task's statistics being counted twice (or not at all).
Protecting that with a lock results in times() and clock_gettime() being
completely serialized on large systems.
This can be fixed by using a seqlock around the events that gather and
propagate statistics. As an additional benefit, the protection code can
be moved into thread_group_cputime(), slightly simplifying the calling
functions.
In the case of posix_cpu_clock_get_task() things can be simplified a
lot, because the calling function already ensures that the task sticks
around, and the rest is now taken care of in thread_group_cputime().
This way the statistics reporting code can run lockless.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Daeseok Youn <daeseok.youn@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guillaume Morin <guillaume@morinfr.org>
Cc: Ionut Alexa <ionut.m.alexa@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Michal Schmidt <mschmidt@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: umgwanakikbuti@gmail.com
Cc: fweisbec@gmail.com
Cc: srao@redhat.com
Cc: lwoodman@redhat.com
Cc: atheurer@redhat.com
Link: http://lkml.kernel.org/r/20140816134010.26a9b572@annuminas.surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Oleg pointed out that wait_task_zombie adds a task's usage statistics
to the parent's signal struct, but the task's own signal struct should
also propagate the statistics at exit time.
This allows thread_group_cputime(reaped_zombie) to get the statistics
after __unhash_process() has made the task invisible to for_each_thread,
but before the thread has actually been rcu freed, making sure no
non-monotonic results are returned inside that window.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Guillaume Morin <guillaume@morinfr.org>
Cc: Ionut Alexa <ionut.m.alexa@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Michal Schmidt <mschmidt@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: umgwanakikbuti@gmail.com
Cc: fweisbec@gmail.com
Cc: srao@redhat.com
Cc: lwoodman@redhat.com
Cc: atheurer@redhat.com
Link: http://lkml.kernel.org/r/1408133138-22048-2-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUDOW+AAoJEHm+PkMAQRiGOXYH/00TPKm8PdM5cXXG2YYYv9eT
W99K7KD2i0/qiVtlGgjjvB7fO3K0HcZusTd2jmVd8IWntXvauq7Zpw5YZkjwu4KX
Y1HCwwCd2aw0FoqgrJhNP3+j5Cr1BD/HLtbffjCe+A3tppOIis4Bwt2wJOoYlXpS
hU9Jxxc4lcRo8YKbffouDo7PIneWeJy8N+WGpUR5BfJIEK0ZZtCUqn3/3WLX4FYu
fE6uiF/bACTpKXU/mo4dDbhZp439H/QdwQc9B0F8+8CBDMXKaNHrPV7kN36T2SWa
fD4boikTsi/yh9Ks1fvHbvNq2N0ihoMnja+vLRyvjAcAQv2fKG3OZtYgFWSdghU=
=Xknd
-----END PGP SIGNATURE-----
Merge tag 'v3.17-rc4' into sched/core, to prevent conflicts with upcoming patches, and to refresh the tree
Linux 3.17-rc4
new link for - How to piss off a Linux kernel subsystem maintainer
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The NFS/RDMA Kconfig symbol was split into separate options for client
and server in commit 2e8c12e1b7 ("xprtrdma: add separate Kconfig
options for NFSoRDMA client and server support").
Update the documentation to reflect this split.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "J. Bruce Fields" <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hpfall.c was renamed to freefall.c in 3.16, but this file still refer to
hpfall.c instead of freefall.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The example code provided with the i2c device interface documentation
won't compile since it uses the reserved word "register" to name a
variable.
The compiler fails with this error message:
error: expected identifier or '(' before '=' token
__u8 register = 0x20; /* Device register to access */
^
Rename the variable "register" to simply "reg" in the example code.
Another couple of typos has been fixed as well.
[Change "! =" to "!=".]
Signed-off-by: Jose Alarcon Roldan <jose.alarcon.roldan@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Despite the fact that these functions have been around for years, they
are little used (only 15 uses in 13 files at the preseht time) even
though many other files use work-arounds to achieve the same result.
By documenting them, hopefully they will become more widely used.
Signed-off-by: Rob Jones <rob.jones@codethink.co.uk>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fix for recently broken test_suspend= command line argument
(Rafael J Wysocki).
- Fixes for regressions related to the ACPI video driver caused
by switching the default to native backlight handling in 3.16
from Hans de Goede.
- Fix for a sysfs attribute of ACPI device objects that returns
stale values sometimes due to the fact that they are cached
instead of executing the appropriate method (_SUN) every time
(broken in 3.14). From Yasuaki Ishimatsu.
- Fix for a deadlock between cpuidle_lock and cpu_hotplug.lock
in the ACPI processor driver from Jiri Kosina.
- Runtime output validation for the ACPI _DSD device configuration
object missing from the support for it that has been introduced
recently. From Mika Westerberg.
- Fix for an unuseful and misleading RAPL (Running Average Power
Limit) domain detection message in the RAPL driver from Jacob Pan.
- New Intel Haswell CPU ID for the RAPL driver from Jason Baron.
- New Clevo W350etq blacklist entry for the ACPI EC driver
from Lan Tianyu.
- Cleanup for the intel_pstate driver and the core generic PM
domains code from Gabriele Mazzotta and Geert Uytterhoeven.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJUCcZNAAoJEILEb/54YlRxhAEP/1O6gUMzbEs1LNuMoUSP/Bcx
L+sAImXBsUsvEhEVSceXrM3Gr/TpTP7t4m+O05PC8QwpCEAAB5z6NXRK3uckwmR3
//jZKm5D5eXny4QkTaZl1yUmxdoX5DlwkPkhlNS6DxBn/cq+wvPxs0crGw+0arpi
Sylj8GFbVeibhD1Wz0wor95BRg+KcbTNy5jmECs5fSWmitMC62fYXpwybbxHg8Yt
4FIHiAZSsSDT+MFPnH68pwKN0D3HDVmK0FBzvexjiHQvDRh6QFUmjSCIbiV7lDj8
bZk84xmoMtiA4eIFiFk6MTx8BibumrbefG6TT8rFH7kCOfuHbxIOzslVVmYbSpvK
ldyndGueC4AIBRREJodt6jZ3j7CQeVmtxN/CL9PvA31p6Fz0R8vMgjPKNhNN0YWj
sILY2aHWACGxefCq2Jw4osvKzMucBsC/I8C14ErhKyLf1mH/AAiavefMvpIjLLKn
OOPB6XxnqBH8iadSbVpX2rgHvaMExzB9vDZPKK67CS04opTdqhS0VQR13dYw8EOk
KGuVzF18bQXHjm+FzeaYqfi24WkpVh8kHuXJ6msTnTGLMWdJkql41pNtkpw6s98m
oh92vI/CWKChC2jlsIOgdbTom5xbaiv8QLq0z+A22FNw3h6M3X5nIkJoIOUF0xTb
wXnTBZCQPRfUsK0KdbC3
=EzJF
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are regression fixes (ACPI sysfs, ACPI video, suspend test),
ACPI cpuidle deadlock fix, missing runtime validation of ACPI _DSD
output, a fix and a new CPU ID for the RAPL driver, new blacklist
entry for the ACPI EC driver and a couple of trivial cleanups
(intel_pstate and generic PM domains).
Specifics:
- Fix for recently broken test_suspend= command line argument (Rafael
Wysocki).
- Fixes for regressions related to the ACPI video driver caused by
switching the default to native backlight handling in 3.16 from
Hans de Goede.
- Fix for a sysfs attribute of ACPI device objects that returns stale
values sometimes due to the fact that they are cached instead of
executing the appropriate method (_SUN) every time (broken in
3.14). From Yasuaki Ishimatsu.
- Fix for a deadlock between cpuidle_lock and cpu_hotplug.lock in the
ACPI processor driver from Jiri Kosina.
- Runtime output validation for the ACPI _DSD device configuration
object missing from the support for it that has been introduced
recently. From Mika Westerberg.
- Fix for an unuseful and misleading RAPL (Running Average Power
Limit) domain detection message in the RAPL driver from Jacob Pan.
- New Intel Haswell CPU ID for the RAPL driver from Jason Baron.
- New Clevo W350etq blacklist entry for the ACPI EC driver from Lan
Tianyu.
- Cleanup for the intel_pstate driver and the core generic PM domains
code from Gabriele Mazzotta and Geert Uytterhoeven"
* tag 'pm+acpi-3.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock
ACPI / scan: not cache _SUN value in struct acpi_device_pnp
cpufreq: intel_pstate: Remove unneeded variable
powercap / RAPL: change domain detection message
powercap / RAPL: add support for CPU model 0x3f
PM / domains: Make generic_pm_domain.name const
PM / sleep: Fix test_suspend= command line option
ACPI / EC: Add msi quirk for Clevo W350etq
ACPI / video: Disable native_backlight on HP ENVY 15 Notebook PC
ACPI / video: Add a disable_native_backlight quirk
ACPI / video: Fix use_native_backlight selection logic
ACPICA: ACPI 5.1: Add support for runtime validation of _DSD package.
Pull filesystem fixes from Al Viro:
"Several bugfixes (all of them -stable fodder).
Alexey's one deals with double mutex_lock() in UFS (apparently, nobody
has tried to test "ufs: sb mutex merge + mutex_destroy" on something
like file creation/removal on ufs). Mine deal with two kinds of
umount bugs, in umount propagation and in handling of automounted
submounts, both resulting in bogus transient EBUSY from umount"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs: fix deadlocks introduced by sb mutex merge
fix EBUSY on umount() from MNT_SHRINKABLE
get rid of propagate_umount() mistakenly treating slaves as busy.
Pull RCU fix from Ingo Molnar:
"A boot hang fix for the offloaded callback RCU model (RCU_NOCB_CPU=y
&& (TREE_CPU=y || TREE_PREEMPT_RC)) in certain bootup scenarios"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Make nocb leader kthreads process pending callbacks after spawning
Pull timer fixes from Thomas Gleixner:
"Three fixlets from the timer departement:
- Update the timekeeper before updating vsyscall and pvclock. This
fixes the kvm-clock regression reported by Chris and Paolo.
- Use the proper irq work interface from NMI. This fixes the
regression reported by Catalin and Dave.
- Clarify the compat_nanosleep error handling mechanism to avoid
future confusion"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Update timekeeper before updating vsyscall and pvclock
compat: nanosleep: Clarify error handling
nohz: Restore NMI safe local irq work for local nohz kick
Commit 0244756edc ("ufs: sb mutex merge + mutex_destroy") introduces
deadlocks in ufs_new_inode() and ufs_free_inode().
Most callers of that functions acqure the mutex by themselves and
ufs_{new,free}_inode() do that via lock_ufs(),
i.e we have an unavoidable double lock.
The patch proposes to resolve the issue by making sure that
ufs_{new,free}_inode() are not called with the mutex held.
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org # 3.16
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
An overrun could happen in function start_hrtick_dl()
when a task with SCHED_DEADLINE runs in the microseconds
range.
For example, if a task with SCHED_DEADLINE has the following parameters:
Task runtime deadline period
P1 200us 500us 500us
The deadline and period from task P1 are less than 1ms.
In order to achieve microsecond precision, we need to enable HRTICK feature
by the next command:
PC#echo "HRTICK" > /sys/kernel/debug/sched_features
PC#trace-cmd record -e sched_switch &
PC#./schedtool -E -t 200000:500000:500000 -e ./test
The binary test is in an endless while(1) loop here.
Some pieces of trace.dat are as follows:
<idle>-0 157.603157: sched_switch: :R ==> 2481:4294967295: test
test-2481 157.603203: sched_switch: 2481:R ==> 0:120: swapper/2
<idle>-0 157.605657: sched_switch: :R ==> 2481:4294967295: test
test-2481 157.608183: sched_switch: 2481:R ==> 2483:120: trace-cmd
trace-cmd-2483 157.609656: sched_switch:2483:R==>2481:4294967295: test
We can get the runtime of P1 from the information above:
runtime = 157.608183 - 157.605657
runtime = 0.002526(2.526ms)
The correct runtime should be less than or equal to 200us at some point.
The problem is caused by a conditional judgment "delta > 10000"
in function start_hrtick_dl().
Because no hrtimer start up to control the rest of runtime
when the reset of runtime is less than 10us.
So the process will continue to run until tick-period is coming.
Move the code with the limit of the least time slice
from hrtick_start_fair() to hrtick_start() because the
EDF schedule class also needs this function in start_hrtick_dl().
To fix this problem, we call hrtimer_start() unconditionally in
start_hrtick_dl(), and make sure the scheduling slice won't be smaller
than 10us in hrtimer_start().
Signed-off-by: Xiaofeng Yan <xiaofeng.yan@huawei.com>
Reviewed-by: Li Zefan <lizefan@huawei.com>
Acked-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1409022941-5880-1-git-send-email-xiaofeng.yan@huawei.com
[ Massaged the changelog and the code. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJUCi7SAAoJEBvWZb6bTYbyuhgP+wQXwL1W4f6c219PMmz/Hpiz
OCRCTFpz8eOC/e1VG5zCcocu7FisG43auH5WqDnyr8RcC8RZlfKcpIwvIAgGk1oP
RusqDbhHXSo+mWCYAlVDwGeAagc1sgxjpHAKJ0uLT7Rsv7hJp88g/KE5JbOX+M6k
UfO+bMSvys5eFCct57O5ZgLWR58k++9CC0f6U5B/uMkwYCnPz32YQL83ebpBLPvT
vHcRVgUPXj7pg4ng8uVALvLJTewEKK8aG5o8LXtOmmOdYgccxSouy5wrX00g48pb
lCB3UZrKZ07AfEgdK/06oz9RwqUUpu58VZSE4DVlEhEcPTx0Uy6k1PMw6utAJi5r
WZ+Ws3IBQMfp3oqWJmdBLte0JAjK89glhqjrrseXjtux1piknyPfquPB/tGw6wxv
rBMq4r64KJwcpL2DMYZGbHpZ7vbfDTJ7aYZvHBp2YRFnR9YE7lqGt3VJpp9WHkZT
7SMvIVFEdF1SN4jXLZ4+3tno5hPlH+MbeteCT6ZweqVfSQzHEo7AvriKQS0wPoBm
rOMZvO7SMSctHkBBqnTkXHnZ8r0J+v89VZr85ayQC/FHUEp6nFdYNiqqO54VkTfE
BKuoepRfvjAK40hnWIWlPUMmzK1tzZwxB9vU+ghJ2yJWZMo4oIXjwg6fvYfa8Ar/
r68L21dou0TNpUpyIdvN
=vBFm
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"A smattering of bug fixes across most architectures"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
powerpc/kvm/cma: Fix panic introduces by signed shift operation
KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags
KVM: s390/mm: Fix storage key corruption during swapping
arm/arm64: KVM: Complete WFI/WFE instructions
ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU
KVM: s390/mm: try a cow on read only pages for key ops
KVM: s390: Fix user triggerable bug in dead code
Fix:
- a direct IO read/buffered read data corruption
- the associated fallout from the DIO data corruption fix
- collapse range bugs that are potential data corruption issues.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJUCkM+AAoJEK3oKUf0dfodUBgP+gJu50/XV4TFRLPlCRxhvN61
371i3ASls1y7ivhj40NzgbDDAZHM2q8Zqwd//318dFViHhWQDlH/1ga06kscRpZX
d8cQEbFHApgUGQL5Gdq2l2hvAzYa75H0m6cq3jveyrN2rscjCSmAXwtlcEmx3AR6
TnCpxuVL5asjEGZYb0KfQACq//rASHJbhukpo1gB4ccZ0boWHOVf5SxuS4remzs9
y+rlPFNl5RD/WVdnJSvu9zu/nP6op3Ax5r7jZanoKbisKHfd7QOa+k65+Vz0Vq9G
kxgfhz+yLfkOvcktq+41e1lVBln7fCIlcO9m3b53uxWPx5cla323893UiGYsA4F/
j/gGlh1qaT6C/1M1JBWDLDx931S78XiR1Y+WbtAU1PO+GuO0IEap3+iqtS2+oNAv
OrpThLOgqTspK6MJeToCzdn2lRT2BJpcKwxIyDK8g+p9N6qCpyw3DfiKyu0wipGH
D2D3mtE6drSHNaSceFAz8CrQvPOR7Ygj92QGpGSfkohxap9h6VJR/wNp/oGnpmN0
qgcxTrHvx3kw1hXssB4gjh6fBDnOUkac0isqxdow22Qt529t9sIanzMBvz+JxHQF
zeqeFSh96lOXmB7UFBU+QyOhbDp3cJWChHrtY3Esw/+FmG6fxEy8z6pdZsiJYELr
5tka2richPD+gyXzcZwP
=tRQo
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs fixes from Dave Chinner:
"The fixes all address recently discovered data corruption issues.
The original Direct IO issue was discovered by Chris Mason @ Facebook
on a production workload which mixed buffered reads with direct reads
and writes IO to the same file. The fix for that exposed other issues
with page invalidation (exposed by millions of fsx operations) failing
due to dirty buffers beyond EOF.
Finally, the collapse_range code could also cause problems due to
racing writeback changing the extent map while it was being shifted
around. The commits for that problem are simple mitigation fixes that
prevent the problem from occuring. A more robust fix for 3.18 that
addresses the underlying problem is currently being worked on by
Brian.
Summary of fixes:
- a direct IO read/buffered read data corruption
- the associated fallout from the DIO data corruption fix
- collapse range bugs that are potential data corruption issues"
* tag 'xfs-for-linus-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: trim eofblocks before collapse range
xfs: xfs_file_collapse_range is delalloc challenged
xfs: don't log inode unless extent shift makes extent modifications
xfs: use ranged writeback and invalidation for direct IO
xfs: don't zero partial page cache pages during O_DIRECT writes
xfs: don't zero partial page cache pages during O_DIRECT writes
xfs: don't dirty buffers beyond EOF
* A tiny comment tweak, to kill a bunch of DocBook warnings added during the
merge window
* A small fixup to the OTP routines' error handling
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUCkvYAAoJEFySrpd9RFgtoyQP/3xE3ZPOU/6F80VVAVZZqCNa
rUaaaB+Y25B+2ust2mWWv6hPSDsLOJdcbv5w2aR1PlcjUNuqp+hW+E61BQUEUAwl
9Rb8GRcduCBgsawv6jYlAgGCN/p8YMNgpdelpDp6OFLwCXrimH/46ZBi9m9vuoO+
/RqTQT2ZTisqISCApw73l0Cjbne7tIW4ttZ1E0WcREG8egxFTn9uzZ2qA58/QAvD
3kPfMnmrysBBXIk9w7izSFQV3jXl1amIDL+vHvafo2+0yu/f89Yc+esDp0swSYjF
ZQrfVLoYbgN3B4utveYxZkkdljBCD3CQRFlOCi3CBiwEqpxQnO4D+F9gUzxGBpYr
K9DAKlvmqFD3T5fWAwWaQItEu9fJBXRIjOY5Eb7UhXULhWXkN7/VNuWA5qmrydgi
ZSX7LaRKwdDDHPrrgjBCIS3k5KXFO/VbWIoVsqya8OHMgWbMvv3EWZD8V3hnosus
OBiAerglNBq658w8H3rKhlhBP5+VxRUJoxlrkHBNeNHlnMD1g4kbt8/SvUKy9jy8
VKBBiW7MhpfqXbsKjapXzXSNtx5awQU3qFEoo9Jtjs7dFDs90744Zb+T9EaOLKYP
Y0m+uXooWvhOJEQ9YYCclneizQYKJNpCvgL3tgEIG74UZkquxOc4ecgDwCca2PWK
Xr7G2kRBInCRTegQBFB6
=7uT9
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20140905' of git://git.infradead.org/linux-mtd
Pull mtd fixes from Brian Norris:
"Two trivial MTD updates for 3.17-rc4:
- a tiny comment tweak, to kill a bunch of DocBook warnings added
during the merge window
- a small fixup to the OTP routines' error handling"
* tag 'for-linus-20140905' of git://git.infradead.org/linux-mtd:
mtd: nand: fix DocBook warnings on nand_sdr_timings doc
mtd: cfi_cmdset_0002: check return code for get_chip()
The update_walltime() code works on the shadow timekeeper to make the
seqcount protected region as short as possible. But that update to the
shadow timekeeper does not update all timekeeper fields because it's
sufficient to do that once before it becomes life. One of these fields
is tkr.base_mono. That stays stale in the shadow timekeeper unless an
operation happens which copies the real timekeeper to the shadow.
The update function is called after the update calls to vsyscall and
pvclock. While not correct, it did not cause any problems because none
of the invoked update functions used base_mono.
commit cbcf2dd3b3 (x86: kvm: Make kvm_get_time_and_clockread()
nanoseconds based) changed that in the kvm pvclock update function, so
the stale mono_base value got used and caused kvm-clock to malfunction.
Put the update where it belongs and fix the issue.
Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409050000570.3333@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The error handling in compat_sys_nanosleep() is correct, but
completely non obvious. Document it and restrict it to the
-ERESTART_RESTARTBLOCK return value for clarity.
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull i2c bugfixes from Wolfram Sang:
"I2C driver bugfixes for the 3.17 release. Details can be found in the
commit messages, yet I think this is typical driver stuff"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Revert "i2c: rcar: remove spinlock"
i2c: at91: add bound checking on SMBus block length bytes
i2c: rk3x: fix bug that cause transfer fails in master receive mode
i2c: at91: Fix a race condition during signal handling in at91_do_twi_xfer.
i2c: mv64xxx: continue probe when clock-frequency is missing
i2c: rcar: fix MNR interrupt handling
The function cleaning up an initialized event
was called from the "event_del" handler, instead
of being used as the "destroy" callback. In case of
events group allocation this caused NULL pointer
dereference (as events are added and deleted
multiple times then). Fixed now.
Signed-off-by: Pawel Moll <mail@pawelmoll.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Pull m68k updates from Geert Uytterhoeven:
"Wire up new syscalls getrandom and memfd_create"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up memfd_create
m68k: Wire up getrandom
The atmel,clk-divisors property is taking 4 divisors, if less are
provided, the clock registration will fail.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Actually register clocks from device tree when using the common clock
framework.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
[nicolas.ferre@atmel.com: add at91 to function name]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
The at91sam9g20 SOC uses its own pllb implementation which is different
from the one inherited from at91sam9260 SOC.
Signed-off-by: Gaël PORTAY <gael.portay@gmail.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Dave Hansen reports a massive scalability regression in an uncontained
page fault benchmark with more than 30 concurrent threads, which he
bisected down to 05b8430123 ("mm: memcontrol: use root_mem_cgroup
res_counter") and pin-pointed on res_counter spinlock contention.
That change relied on the per-cpu charge caches to mostly swallow the
res_counter costs, but it's apparent that the caches don't scale yet.
Revert memcg back to bypassing res_counters on the root level in order
to restore performance for uncontained workloads.
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes sync_filesystem() to be EXPORT_SYMBOL().
The reason this is needed is that starting with 3.15 kernel, due to
Theodore Ts'o's commit 02b9984d64 ("fs: push sync_filesystem() down to
the file system's remount_fs()"), all file systems that have dirty data
to be written out need to call sync_filesystem() from their
->remount_fs() method when remounting read-only.
As this is now a generically required function rather than an internal
only function it should be EXPORT_SYMBOL() so that all file systems can
call it.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All the fixes people have found for the regulator API have been
documentation fixes, avoiding warnings while building the kerneldoc,
fixing some errors in one of the DT bindings documents and fixing some
typos in the header.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUCYlSAAoJELSic+t+oim9yvwP/2+v1KcVT+NHwUpNi4mbj4Zl
1/A3Pa9fsVrZ6mVMPvGHpa2ecdLRoauiLkpRKSFP85ngOtwb1p9YTVbY0FUO84iW
gSBfVBxsLmWw2pSK18lbcHkq1DLchMyF6bXn25BVQjR0xkrEZYHReCHXobcxAQpN
RcVAZIE849iJ46yNhmI1RfXAcJiH93RY9BPrkbO7DWEX6UNml5xkAi4j3TUvWAuV
fmjnHuzOgKkTL/ixj+clW/LEKwvk3hboAWA1mIcl9QvRaUDmfx4CmirKEgznu9uT
ImYKKOqeB/z+4XzhHZTkdVg5suQSV8WDm4bv+TNR6lBlCMHdKBZMh4l/Lwff/a41
pDLDzJSKunwxxOzsHmcAD/PI4kO3sO4HZS1XhzAHklqKsPHjX++BGcKoUFxYh8Iz
V5ZXvGRdX8Vqe1rF2mJWZ+lslx7wZ/pDfBslt6Y6V7CEmMN2cMX6flVPQ+g872Jt
pYdU91ujNd9Fb9rjzCnHx6+5B4LdZm0+RxHp51cF6DMwRCjUgtmHvTzSGuoO8bpH
kb/QhCO2x8fVZZzt+x4j0nSooe1vxTIbg7y/vrW5pJwwixfXMtA6rxhpd8UyssU7
IgaW8wwNJDkqOPl4Ho7MOB+ZnUTmKKcyKABQNmZ+eWzhCUpnJLISh0PvHYq7Bq1R
8Me7NhTpoQmZjXoZ/i5p
=mZk2
-----END PGP SIGNATURE-----
Merge tag 'regulator-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator documentation fixes from Mark Brown:
"All the fixes people have found for the regulator API have been
documentation fixes, avoiding warnings while building the kerneldoc,
fixing some errors in one of the DT bindings documents and fixing some
typos in the header"
* tag 'regulator-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: fix kernel-doc warnings in header files
regulator: Proofread documentation
regulator: tps65090: Fix tps65090 typos in example
properly on the new am437x and dra7 hardware for several devices
such as I2C, NAND, DDR3 and USB. There's also a clock fix for omap3.
And also included are two minor cosmetic fixes that are not
stictly fixes for the new hardware support added recently to
downgrade a GPMC warning into a debug statement, and fix the
confusing comments for dra7-evm spi1 mux.
Note that these are all .dts changes except for a GPMC change.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUCMlzAAoJEBvUPslcq6Vzt7YP/1bVH7aupL/OabRkbo8FTSoS
Z0LQTNi5u14Pe+YS31pLGtDEO62iTf53wweUv+Xf90F98wfWUz9bR487nzS49JD/
2htv6JZ5wcMAQgol74UN5wQR5W3nEBEKsPSc641YrZWjzCWe5gAbdfjdZqg7Ulm2
gMcavTML66Ok9lT0PxJhpre55XeEBz7QJcWT8iESvrHX2ZMXwl38Z6pVaabgRxEp
5usQOzK45tsWkDHD3iyWEyyM1JJ2nSD3O2O/UrbVyr/8uSPQZdQGXm4/KTBH5kEJ
XiwHLscCcC+u9YoQHrLAHE4RpfWSv6gzFjrkwEDcBXotuaLJdeQJNDxzrvyWVUfN
S0P8It+rguB6TNgqxhCqs+LqtfFZEuBJPxBKiU8QWzPSRYogfHpsEHZwThJxHqQx
94gd1tMFXVY+VgM/xe/JdBw8k5oukQTWJ9QxcD5/tRA+QH++8XOEJXrWlqAA/+uq
qFE8W3+U6FUsv39hM+KWdMdhMAvqiAP7gLoItUjGMUoZtSy800zDdOEhXX3rkJMH
SoQca4T/0iX6lDafyXn62kk/emAXf/cT2cCpd6v6Aj2JhMvKbD1Nm8nJwZxTQqh/
9c72CyqqAuxtasIz/hxDB9tOiXYPihHNliVe9zDKf0con0smw6CQZ6iAVKnM62IM
3E5bXvHeK5/5a9C9i6w9
=9VRv
-----END PGP SIGNATURE-----
Merge tag 'omap-fixes-against-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Merge "omap fixes against v3.17-rc3" from Tony Lindgren:
Few fixes for omaps mostly for various devices to get them working
properly on the new am437x and dra7 hardware for several devices
such as I2C, NAND, DDR3 and USB. There's also a clock fix for omap3.
And also included are two minor cosmetic fixes that are not
stictly fixes for the new hardware support added recently to
downgrade a GPMC warning into a debug statement, and fix the
confusing comments for dra7-evm spi1 mux.
Note that these are all .dts changes except for a GPMC change.
* tag 'omap-fixes-against-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (255 commits)
ARM: dts: dra7-evm: Add vtt regulator support
ARM: dts: dra7-evm: Fix spi1 mux documentation
ARM: dts: am43x-epos-evm: Disable QSPI to prevent conflict with GPMC-NAND
ARM: OMAP2+: gpmc: Don't complain if wait pin is used without r/w monitoring
ARM: dts: am43xx-epos-evm: Don't use read/write wait monitoring
ARM: dts: am437x-gp-evm: Don't use read/write wait monitoring
ARM: dts: am437x-gp-evm: Use BCH16 ECC scheme instead of BCH8
ARM: dts: am43x-epos-evm: Use BCH16 ECC scheme instead of BCH8
ARM: dts: am4372: fix USB regs size
ARM: dts: am437x-gp: switch i2c0 to 100KHz
ARM: dts: dra7-evm: Fix 8th NAND partition's name
ARM: dts: dra7-evm: Fix i2c3 pinmux and frequency
Linux 3.17-rc3
...
Signed-off-by: Kevin Hilman <khilman@linaro.org>
- Some documentation sync
- Resource leak in the bt8xx driver
- Again fix the way varargs are used to handle the
optional flags on the gpiod_* accessors. Now hopefully
nailed the entire problem.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUCXZqAAoJEEEQszewGV1zn5kP/19NJ4N7jUpaRIiL9yVEeOpB
oKDceHRu4jRddgNndaAPjON+PwMLgYft4uDvPeY1ulObjPEPQUg39iFtTCWeJjcT
Kzi1rhcJB5Qzepw78vcrFLfTPgXnObV1JweGnTtUhpBfbmVuCruSP6Iy2EsPMjR+
zS1UA6sCe8hevOsVrF+CRCVgWbm1dntOiOsZyjRPXQCyafMv5EyMdaFSIbzIEWsf
dxEFJ/3iFHtRrA2CUHpfg9vZO+v9rJ39tjRDNt3m0dOQkanys1zseolJUlCiM3Gf
SnPFXxpxOpW+NyXW/d6wwlSfaJ/4RZ43KCBckHDGsCnEOd2SR+7sln+irwiNQ/gh
tT0KcEixqT/06Tv1I6mC9JLTBffvqOEU8oMG5OyWNJvjq+WfBhyxo688q/pDURiJ
Q7kPsFdIvoY5bvkrQFQ46xNnYdNwBKLJIeaJIp+Kf/ScnDMPtxO/DR1LNamb8Cnz
YarxC1ZJb0OXqlFK3JG55gmjUEp8pnPjAMcMadqALmw4bKMWGvgmYfLp1U56TFZq
ovQSBiASH0buAlAkdY4L8X12wmj5ScUyvy+3leQz65Xo4vktXZ/vdkjGzaNWMsPC
o5MSkuUZkb76FuncXI/dicvJNlCe9GHceOfEMgflVZbGI+etDczI9/wDHgylKY9k
/VeZgli8XiDMo/LNGkch
=W9J2
-----END PGP SIGNATURE-----
Merge tag 'gpio-v3.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
- some documentation sync
- resource leak in the bt8xx driver
- again fix the way varargs are used to handle the optional flags on
the gpiod_* accessors. Now hopefully nailed the entire problem.
* tag 'gpio-v3.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: move varargs hack outside #ifdef GPIOLIB
gpio: bt8xx: fix release of managed resources
Documentation: gpio: documentation for optional getters functions
* acpica:
ACPICA: ACPI 5.1: Add support for runtime validation of _DSD package.
* acpi-processor:
ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock
* acpi-scan:
ACPI / scan: not cache _SUN value in struct acpi_device_pnp
The timeout may elapse without 0 being returned, such as when waiting
on an unused queue. Document this possibility.
Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1408241710070.6462@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.
The following Coccinelle semantic patch was used:
@@
@@
- rcu_assign_pointer
+ RCU_INIT_POINTER
(..., NULL)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: paulmck@linux.vnet.ibm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140822145043.GA580@ada
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes not being able to init fence subsystem when multiple boards are
present.
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pull aio bugfixes from Ben LaHaise:
"Two small fixes"
* git://git.kvack.org/~bcrl/aio-fixes:
aio: block exit_aio() until all context requests are completed
aio: add missing smp_rmb() in read_events_ring
It seems that exit_aio() also needs to wait for all iocbs to complete (like
io_destroy), but we missed the wait step in current implemention, so fix
it in the same way as we did in io_destroy.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org