Reading /proc/slabinfo or monitoring slabtop(1) can become very
expensive if there are many slab caches and if there are very lengthy
per-node partial and/or free lists.
Commit 07a63c41fa ("mm/slab: improve performance of gathering slabinfo
stats") addressed the per-node full lists which showed a significant
improvement when no objects were freed. This patch has the same
motivation and optimizes the remainder of the usecases where there are
very lengthy partial and free lists.
This patch maintains per-node active_slabs (full and partial) and
free_slabs rather than iterating the lists at runtime when reading
/proc/slabinfo.
When allocating 100GB of slab from a test cache where every slab page is
on the partial list, reading /proc/slabinfo (includes all other slab
caches on the system) takes ~247ms on average with 48 samples.
As a result of this patch, the same read takes ~0.856ms on average.
[rientjes@google.com: changelog]
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1611081505240.13403@chino.kir.corp.google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Verify that kmem_create_cache flags are not allocator specific. It is
done before removing flags that are not available with the current
configuration.
The current kmem_cache_create removes incorrect flags but do not
validate the callers are using them right. This change will ensure that
callers are not trying to create caches with flags that won't be used
because allocator specific.
Link: http://lkml.kernel.org/r/1478553075-120242-2-git-send-email-thgarnie@google.com
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The slub allocator gives us some incorrect warnings when
CONFIG_PROFILE_ANNOTATED_BRANCHES is set, as the unlikely() macro
prevents it from seeing that the return code matches what it was before:
mm/slub.c: In function `kmem_cache_free_bulk':
mm/slub.c:262:23: error: `df.s' may be used uninitialized in this function [-Werror=maybe-uninitialized]
mm/slub.c:2943:3: error: `df.cnt' may be used uninitialized in this function [-Werror=maybe-uninitialized]
mm/slub.c:2933:4470: error: `df.freelist' may be used uninitialized in this function [-Werror=maybe-uninitialized]
mm/slub.c:2943:3: error: `df.tail' may be used uninitialized in this function [-Werror=maybe-uninitialized]
I have not been able to come up with a perfect way for dealing with
this, the three options I see are:
- add a bogus initialization, which would increase the runtime overhead
- replace unlikely() with unlikely_notrace()
- remove the unlikely() annotation completely
I checked the object code for a typical x86 configuration and the last
two cases produce the same result, so I went for the last one, which is
the simplest.
Link: http://lkml.kernel.org/r/20161024155704.3114445-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
synchronize_sched() is a heavy operation and calling it per each cache
owned by a memory cgroup being destroyed may take quite some time. What
is worse, it's currently called under the slab_mutex, stalling all works
doing cache creation/destruction.
Actually, there isn't much point in calling synchronize_sched() for each
cache - it's enough to call it just once - after setting cpu_partial for
all caches and before shrinking them. This way, we can also move it out
of the slab_mutex, which we have to hold for iterating over the slab
cache list.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=172991
Link: http://lkml.kernel.org/r/0a10d71ecae3db00fb4421bcd3f82bcc911f4be4.1475329751.git.vdavydov.dev@gmail.com
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reported-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Creating a lot of cgroups at the same time might stall all worker
threads with kmem cache creation works, because kmem cache creation is
done with the slab_mutex held. The problem was amplified by commits
801faf0db8 ("mm/slab: lockless decision to grow cache") in case of
SLAB and 81ae6d0395 ("mm/slub.c: replace kick_all_cpus_sync() with
synchronize_sched() in kmem_cache_shrink()") in case of SLUB, which
increased the maximal time the slab_mutex can be held.
To prevent that from happening, let's use a special ordered single
threaded workqueue for kmem cache creation. This shouldn't introduce
any functional changes regarding how kmem caches are created, as the
work function holds the global slab_mutex during its whole runtime
anyway, making it impossible to run more than one work at a time. By
using a single threaded workqueue, we just avoid creating a thread per
each work. Ordering is required to avoid a situation when a cgroup's
work is put off indefinitely because there are other cgroups to serve,
in other words to guarantee fairness.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=172981
Link: http://lkml.kernel.org/r/20161004131417.GC1862@esperanza
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reported-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CURRENT_TIME is not y2038 safe.
Use y2038 safe ktime_get_real_seconds() here for timestamps. struct
heartbeat_block's hb_seq and deletetion time are already 64 bits wide
and accommodate times beyond y2038.
Also use y2038 safe ktime_get_real_ts64() for on disk inode timestamps.
These are also wide enough to accommodate time64_t.
Link: http://lkml.kernel.org/r/1475365298-29236-1-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct timespec is not y2038 safe. Use time64_t which is y2038 safe to
represent orphan scan times. time64_t is sufficient here as only the
seconds delta times are relevant.
Also use appropriate time functions that return time in time64_t format.
Time functions now return monotonic time instead of real time as only
delta scan times are relevant and these values are not persistent across
reboots.
The format string for the debug print is still using long as this is
only the time elapsed since the last scan and long is sufficient to
represent this value.
Link: http://lkml.kernel.org/r/1475365138-20567-1-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In ocfs2_lock_refcount_tree, if ocfs2_read_refcount_block() returns an
error, we do ocfs2_refcount_tree_put twice (once in
ocfs2_unlock_refcount_tree and once outside it), thereby reducing the
refcount of the refcount tree twice, but we dont delete the tree in this
case. This will make refcnt of the tree = 0 and the
ocfs2_refcount_tree_put will eventually call ocfs2_mark_lockres_freeing,
setting OCFS2_LOCK_FREEING for the refcount_tree->rf_lockres.
The error returned by ocfs2_read_refcount_block is propagated all the
way back and for next iteration of write, ocfs2_lock_refcount_tree gets
the same tree back from ocfs2_get_refcount_tree because we havent
deleted the tree. Now we have the same tree, but OCFS2_LOCK_FREEING is
set for rf_lockres and eventually, when _ocfs2_lock_refcount_tree is
called in this iteration, BUG_ON( __ocfs2_cluster_lock:1395 ERROR:
Cluster lock called on freeing lockres T00000000000000000386019775b08d!
flags 0x81) is triggerred.
Call stack:
(loop16,11155,0):ocfs2_lock_refcount_tree:482 ERROR: status = -5
(loop16,11155,0):ocfs2_refcount_cow_hunk:3497 ERROR: status = -5
(loop16,11155,0):ocfs2_refcount_cow:3560 ERROR: status = -5
(loop16,11155,0):ocfs2_prepare_inode_for_refcount:2111 ERROR: status = -5
(loop16,11155,0):ocfs2_prepare_inode_for_write:2190 ERROR: status = -5
(loop16,11155,0):ocfs2_file_write_iter:2331 ERROR: status = -5
(loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: bug expression:
lockres->l_flags & OCFS2_LOCK_FREEING
(loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: Cluster lock called on
freeing lockres T00000000000000000386019775b08d! flags 0x81
kernel BUG at fs/ocfs2/dlmglue.c:1395!
invalid opcode: 0000 [#1] SMP CPU 0
Modules linked in: tun ocfs2 jbd2 xen_blkback xen_netback xen_gntdev .. sd_mod crc_t10dif ext3 jbd mbcache
RIP: __ocfs2_cluster_lock+0x31c/0x740 [ocfs2]
RSP: e02b:ffff88017c0138a0 EFLAGS: 00010086
Process loop16 (pid: 11155, threadinfo ffff88017c010000, task ffff8801b5374300)
Call Trace:
ocfs2_refcount_lock+0xae/0x130 [ocfs2]
__ocfs2_lock_refcount_tree+0x29/0xe0 [ocfs2]
ocfs2_lock_refcount_tree+0xdd/0x320 [ocfs2]
ocfs2_refcount_cow_hunk+0x1cb/0x440 [ocfs2]
ocfs2_refcount_cow+0xa9/0x1d0 [ocfs2]
ocfs2_prepare_inode_for_refcount+0x115/0x200 [ocfs2]
ocfs2_prepare_inode_for_write+0x33b/0x470 [ocfs2]
ocfs2_file_write_iter+0x220/0x8c0 [ocfs2]
aio_write_iter+0x2e/0x30
Fix this by avoiding the second call to ocfs2_refcount_tree_put()
Link: http://lkml.kernel.org/r/1473984404-32011-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Eric Ren <zren@suse.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
'page' parameter in ocfs2_write_end_nolock() is never used.
Link: http://lkml.kernel.org/r/582FD91A.5000902@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When 'dispatch_assert' is set, 'response' must be DLM_MASTER_RESP_YES,
and 'res' won't be null, so execution can't reach these two branch.
Link: http://lkml.kernel.org/r/58174C91.3040004@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The variable `set_maybe' is redundant when the mle has been found in the
map. So it is ok to set the node_idx into mle's maybe_map directly.
Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4A3D490DD@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: Guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Mark Fasheh <mfasheh@versity.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The value of 'stage' must be between 1 and 2, so the switch can't reach
the default case.
Link: http://lkml.kernel.org/r/57FB5EB2.7050002@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If request_irq() fails it passes the error to the caller. The caller
now checks it and jumps to the common error path on failure.
Link: http://lkml.kernel.org/r/1474237304-897-3-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While building m32r allmodconfig we were getting warning:
drivers/pcmcia/m32r_pcc.c:331:2: warning: ignoring return value of 'request_irq', declared with attribute warn_unused_result
request_irq() can fail and we should always be checking the result from
it. Check the result and return it to the caller.
Link: http://lkml.kernel.org/r/1474237304-897-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While building m32r defconfig we got warnings:
arch/m32r/platforms/m32700ut/setup.c:249:24: warning: 'm32700ut_lcdpld_irq_type' defined but not used [-Wunused-variable]
m32700ut_lcdpld_irq_type is only used when CONFIG_USB is enabled.
Modify the code to declare the related variables and functions only when
CONFIG_USB is enabled.
Link: http://lkml.kernel.org/r/1479244406-7507-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some builds of m32r were failing as it tried to build few drivers which
needed dma but m32r is not having dma support. Objections were raised
when it was tried to make those drivers depend on HAS_DMA. So the next
best thing is to add dma support to m32r. dma_noop is a very simple dma
with 1:1 memory mapping.
Link: http://lkml.kernel.org/r/1475949198-31623-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When SUBARCH is "omap1" or "omap2", plat-omap/ directory must be
indexed. Handle this special case properly.
While at it, check if mach- directory exists at all.
Link: http://lkml.kernel.org/r/20161202122148.15001-1-joe.skb7@gmail.com
Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Michal Marek <mmarek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Every often used regex is better be compiled in Python.
Speedup is about ~9.8% (whee!)
$ perf stat -r 16 taskset -c 15 ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux >/dev/null
7.091202853 seconds time elapsed ( +- 0.15% )
+re.compile
6.397564973 seconds time elapsed ( +- 0.34% )
Link: http://lkml.kernel.org/r/20161119004417.GB1200@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
readlines() conses whole list before doing anything which is slower for
big object files. Use per line iterator.
Speed up is ~2% on "allyesconfig" type of kernel.
$ perf stat -r 16 taskset -c 15 ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux >/dev/null
...
Before: 7.247708646 seconds time elapsed ( +- 0.28% )
After: 7.091202853 seconds time elapsed ( +- 0.15% )
Link: http://lkml.kernel.org/r/20161119004143.GA1200@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This limitation came with the reason to remove "another way for
malicious code to obscure a compromised program and masquerade as a
benign process" by allowing "security-concious program can use this
prctl once during its early initialization to ensure the prctl cannot
later be abused for this purpose":
http://marc.info/?l=linux-kernel&m=133160684517468&w=2
This explanation doesn't look sufficient. The only thing "exe" link is
indicating is the file, used to execve, which is basically nothing and
not reliable immediately after process has returned from execve system
call.
Moreover, to use this feture, all the mappings to previous exe file have
to be unmapped and all the new exe file permissions must be satisfied.
Which means, that changing exe link is very similar to calling execve on
the binary.
The need to remove this limitations comes from migration of NFS mount
point, which is not accessible during restore and replaced by other file
system. Because of this exe link has to be changed twice.
[akpm@linux-foundation.org: fix up comment]
Link: http://lkml.kernel.org/r/20160927153755.9337.69650.stgit@localhost.localdomain
Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When commit fbae2d44aa ("kthread: add kthread_create_worker*()")
introduced some kthread_create_...() functions which were taking
printf-like parametter, it introduced __printf attributes to some
functions (e.g. kthread_create_worker()). Nevertheless some new
functions were forgotten (they have been detected thanks to
-Wmissing-format-attribute warning flag).
Add the missing __printf attributes to the newly-introduced functions in
order to detect formatting issues at build-time with -Wformat flag.
Link: http://lkml.kernel.org/r/20161126193543.22672-1-nicolas.iooss_linux@m4x.org
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pageflipping race fix.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE/JuuFDWp9/ZkuCBXtdYpNtH8nugFAlhLP/4ACgkQtdYpNtH8
nug5xA//T7E/tpDkKVsd/sP7OTVgxktUpBERNDr0oJi4f7i2zUswfH+3VzqO+a8J
5aQ2i2hdP0cST9c5/TJN95uL3nzyNhmbDV1J8ZAVMDxTYwuf1Nbx6+EYtSVtQqb9
pGLP5KkfoBMZpQGX8TbaXynoFPutff+SgE2O31ptZxqq80g/jERWBLal+6OWAi/O
lLRorWyxF4Fqrfs/lt0q637FLOCKRSHdvlIl9njD47aBlLQBUxVR+Q7/cu/ZLgCo
0TlqzMlFROl+AR1G+hAys60H+qBJM7NB64earzMu0AfPw9nWV+y7lvr1PTv2JvKM
fXTqn1vzyFEwREo7nKdmtJzVDffP2NHyvvdUE7NOiY8yffKT/xBfR0THGvIQs+MY
U3rOrsuws+PWUDFEsJjuHZXynwOUULs1VkOt3kovndaJsjGLK6Zd1Y6vaVsEA1wF
1217c7j2rK2OS354coG7srUf8CU9QkwOS60l1RRdMRJuQ5K39gYHT/UuTeEq57Q5
R4XcV9CkJMeuq04waxsqZaAydSMq6El0S4oftynK/7IIZwQNAcc29i6zdFuq4XuW
+XPrkUW9Fp1PchnfTD1uohSq0+h4vEO0iseK47sHnkYEY696fpxnTzwV66omIYzm
aLTPLv6JuL4bzA3hn9AHRhN7y8Mh+Sn8bbx+Z9nKqNlT8M/acwY=
=Sh9P
-----END PGP SIGNATURE-----
Merge tag 'drm-vc4-next-2016-12-09' of https://github.com/anholt/linux into drm-next
This pull request brings in VEC (TV-out) support for vc4, along with a
pageflipping race fix.
* tag 'drm-vc4-next-2016-12-09' of https://github.com/anholt/linux:
drm/vc4: Don't use drm_put_dev
drm/vc4: Document VEC DT binding
drm/vc4: Add support for the VEC (Video Encoder) IP
drm: Add TV connector states to drm_connector_state
drm: Turn DRM_MODE_SUBCONNECTOR_xx definitions into an enum
drm/vc4: Fix ->clock_select setting for the VEC encoder
drm/vc4: Fix race between page flip completion event and clean-up
The Apple GMUX is the one managing the backlight, so there is no need for
Nouveau to register its own backlight interface.
v2: Do not split information message on two lines as it prevents from grepping
it, as pointed out by Lukas Wunner
v3: Add a missing end-of-line character to the printed message
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Currently, every backlight interface created by Nouveau uses the same name,
nv_backlight. This leads to a sysfs warning as it tries to create an already
existing folder. This patch adds a incremented number to the name, but keeps
the initial name as nv_backlight, to avoid possibly breaking userspace; the
second interface will be named nv_backlight1, and so on.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86539
v2:
* Switch to using ida for generating unique IDs, as suggested by Ilia Mirkin;
* Allocate backlight name on the stack, as suggested by Ilia Mirkin;
* Move `nouveau_get_backlight_name()` to avoid forward declaration, as
suggested by Ilia Mirkin;
* Fix reference to bug report formatting, as reported by Nick Tenney.
v3:
* Define a macro for the size of the backlight name, to avoid defining
it multiple times;
* Use snprintf in place of sprintf.
v4:
* Do not create similarly named interfaces when reaching the maximum
amount of unique names, but fail instead, as pointed out by Lukas Wunner
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
TTM was changed a while back to allow for pipelining of buffer moves, and
part of this was the removal of waiting for a BO to idle before calling
move(), placing the responsibility on the driver to do this if required.
That's all well and good, except, we make use of move_notify() to handle
mapping/unmapping from the GPU VMM as move() isn't called on all paths.
This commit adds a wait before unmapping from a VMM in move_notify(), to
prevent GPU page faults where a buffer is still being accessed.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org [v4.8+]
This has been on the TODO list for a while now, recovering from things
such as attempting to execute a push buffer or touch a semaphore in an
unmapped memory area.
The only thing required on the HW side here is that the offending
channel is removed from the runlist, and *not* a full reset of PFIFO.
This used to be a bit messier to handle before the rework to make use
of engine topology info, but is apparently now trivial.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pull x86 platform updates from Ingo Molnar:
"Two changes:
- implement various VMWare guest OS improvements/fixes (Alexey
Makhalov)
- unexport a spurious export from the intel-mid platform driver
(Lukas Wunner)"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vmware: Add paravirt sched clock
x86/vmware: Add basic paravirt ops support
x86/vmware: Use tsc_khz value for calibrate_cpu()
x86/platform/intel-mid: Unexport intel_mid_pci_set_power_state()
x86/vmware: Read tsc_khz only once at boot time
Pull x86 microcode update from Ingo Molnar:
"The biggest change (by Borislav Petkov) is a thorough rewrite of the
Intel microcode loader and its interactions with the core code.
The biggest conceptual change is the decoupling of the microcode
loading on boot and application processors (which load the microcode
in different scenarios), so that both parse the input patches with as
few assumptions as possible - this also fixes various kernel address
space randomization bugs. (The AP side then goes on and caches the
result to improve boot performance.)
Since the AMD side already did this, this change also opened up the
path towards more unification/simplification of the core microcode
loading infrastructure:
10 files changed, 647 insertions(+), 940 deletions(-)
which speaks for itself"
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Bump driver version, update copyrights
x86/microcode: Rework microcode loading
x86/microcode/intel: Remove intel_lib.c
x86/microcode/amd: Move private inlines to .c and mark local functions static
x86/microcode: Collect CPU info on resume
x86/microcode: Issue the debug printk on resume only on success
x86/microcode/amd: Hand down the CPU family
x86/microcode: Export the microcode cache linked list
x86/microcode: Remove one #ifdef clause
x86/microcode/intel: Simplify generic_load_microcode()
x86/microcode: Move driver authors to CREDITS
x86/microcode: Run the AP-loading routine only on the application processors
Pull x86 idle updates from Ingo Molnar:
"There were two bigger changes in this development cycle:
- remove idle notifiers:
32 files changed, 74 insertions(+), 803 deletions(-)
These notifiers were of questionable value and the main usecase,
the i7300 driver, was essentially unmaintained and can be removed,
plus modern power management concepts don't need the callback - so
use this golden opportunity and get rid of this opaque and fragile
callback from a latency sensitive code path.
(Len Brown, Thomas Gleixner)
- improve the AMD Erratum 400 workaround that used high overhead MSR
polling in the idle loop (Borisla Petkov, Thomas Gleixner)"
* 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Remove empty idle.h header
x86/amd: Simplify AMD E400 aware idle routine
x86/amd: Check for the C1E bug post ACPI subsystem init
x86/bugs: Separate AMD E400 erratum and C1E bug
x86/cpufeature: Provide helper to set bugs bits
x86/idle: Remove enter_idle(), exit_idle()
x86: Remove x86_test_and_clear_bit_percpu()
x86/idle: Remove is_idle flag
x86/idle: Remove idle_notifier
i7300_idle: Remove this driver
Pull x86 header fixlet from Ingo Molnar:
"Remove unnecessary module.h inclusion from core code (Paul Gortmaker)"
* 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/percpu: Remove unnecessary include of module.h, add asm/desc.h
Pull x86 FPU updates from Ingo Molnar:
"The main changes in this cycle were:
- do a large round of simplifications after all CPUs do 'eager' FPU
context switching in v4.9: remove CR0 twiddling, remove leftover
eager/lazy bts, etc (Andy Lutomirski)
- more FPU code simplifications: remove struct fpu::counter, clarify
nomenclature, remove unnecessary arguments/functions and better
structure the code (Rik van Riel)"
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Remove clts()
x86/fpu: Remove stts()
x86/fpu: Handle #NM without FPU emulation as an error
x86/fpu, lguest: Remove CR0.TS support
x86/fpu, kvm: Remove host CR0.TS manipulation
x86/fpu: Remove irq_ts_save() and irq_ts_restore()
x86/fpu: Stop saving and restoring CR0.TS in fpu__init_check_bugs()
x86/fpu: Get rid of two redundant clts() calls
x86/fpu: Finish excising 'eagerfpu'
x86/fpu: Split old_fpu & new_fpu handling into separate functions
x86/fpu: Remove 'cpu' argument from __cpu_invalidate_fpregs_state()
x86/fpu: Split old & new FPU code paths
x86/fpu: Remove __fpregs_(de)activate()
x86/fpu: Rename lazy restore functions to "register state valid"
x86/fpu, kvm: Remove KVM vcpu->fpu_counter
x86/fpu: Remove struct fpu::counter
x86/fpu: Remove use_eager_fpu()
x86/fpu: Remove the XFEATURE_MASK_EAGER/LAZY distinction
x86/fpu: Hard-disable lazy FPU mode
x86/crypto, x86/fpu: Remove X86_FEATURE_EAGER_FPU #ifdef from the crc32c code
Pull x86 CPU updates from Ingo Molnar:
"The changes in this development cycle were:
- AMD CPU topology enhancements that are cleanups on current CPUs but
which enable future Fam17 hardware. (Yazen Ghannam)
- unify bugs.c and bugs_64.c (Borislav Petkov)
- remove the show_msr= boot option (Borislav Petkov)
- simplify a boot message (Borislav Petkov)"
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature
x86/cpu: Get rid of the show_msr= boot option
x86/cpu: Merge bugs.c and bugs_64.c
x86/cpu: Remove the printk format specifier in "CPU0: "
Pull x86 cleanups from Ingo Molnar:
"Two cleanups in the LDT handling code, by Dan Carpenter and Thomas
Gleixner"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ldt: Make all size computations unsigned
x86/ldt: Make a size argument unsigned
Pull x86 build updates from Ingo Molnar:
"The main changes in this cycle were:
- Makefile improvements (Paul Bolle)
- KConfig cleanups to better separate 32-bit only, 64-bit only and
generic feature enablement sections (Ingo Molnar)"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Remove three unneeded genhdr-y entries
x86/build: Don't use $(LINUXINCLUDE) twice
x86/kconfig: Sort the 'config X86' selects alphabetically
x86/kconfig: Clean up 32-bit compat options
x86/kconfig: Clean up IA32_EMULATION select
x86/kconfig, x86/pkeys: Move pkeys selects to X86_INTEL_MEMORY_PROTECTION_KEYS
x86/kconfig: Move 64-bit only arch Kconfig selects to 'config X86_64'
x86/kconfig: Move 32-bit only arch Kconfig selects to 'config X86_32'
Pull x86 boot updates from Ingo Molnar:
"Misc cleanups/simplifications by Borislav Petkov, Paul Bolle and Wei
Yang"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/64: Optimize fixmap page fixup
x86/boot: Simplify the GDTR calculation assembly code a bit
x86/boot/build: Remove always empty $(USERINCLUDE)
Pull x86 asm updates from Ingo Molnar:
"The main changes in this development cycle were:
- a large number of call stack dumping/printing improvements: higher
robustness, better cross-context dumping, improved output, etc.
(Josh Poimboeuf)
- vDSO getcpu() performance improvement for future Intel CPUs with
the RDPID instruction (Andy Lutomirski)
- add two new Intel AVX512 features and the CPUID support
infrastructure for it: AVX512IFMA and AVX512VBMI. (Gayatri Kammela,
He Chen)
- more copy-user unification (Borislav Petkov)
- entry code assembly macro simplifications (Alexander Kuleshov)
- vDSO C/R support improvements (Dmitry Safonov)
- misc fixes and cleanups (Borislav Petkov, Paul Bolle)"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
scripts/decode_stacktrace.sh: Fix address line detection on x86
x86/boot/64: Use defines for page size
x86/dumpstack: Make stack name tags more comprehensible
selftests/x86: Add test_vdso to test getcpu()
x86/vdso: Use RDPID in preference to LSL when available
x86/dumpstack: Handle NULL stack pointer in show_trace_log_lvl()
x86/cpufeatures: Enable new AVX512 cpu features
x86/cpuid: Provide get_scattered_cpuid_leaf()
x86/cpuid: Cleanup cpuid_regs definitions
x86/copy_user: Unify the code by removing the 64-bit asm _copy_*_user() variants
x86/unwind: Ensure stack grows down
x86/vdso: Set vDSO pointer only after success
x86/prctl/uapi: Remove #ifdef for CHECKPOINT_RESTORE
x86/unwind: Detect bad stack return address
x86/dumpstack: Warn on stack recursion
x86/unwind: Warn on bad frame pointer
x86/decoder: Use stderr if insn sanity test fails
x86/decoder: Use stdout if insn decoder test is successful
mm/page_alloc: Remove kernel address exposure in free_reserved_area()
x86/dumpstack: Remove raw stack dump
...
Pull x86 apic updates from Ingo Molnar:
"Misc changes:
- optimize (reduce) IRQ handler tracing overhead (Wanpeng Li)
- clean up MSR helpers (Borislav Petkov)
- fix build warning on some configs (Sebastian Andrzej Siewior)"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/msr: Cleanup/streamline MSR helpers
x86/apic: Prevent tracing on apic_msr_write_eoi()
x86/msr: Add wrmsr_notrace()
x86/apic: Get rid of "warning: 'acpi_ioapic_lock' defined but not used"
Pull x86 RAS updates from Ingo Molnar:
"The main changes in this development cycle were:
- more AMD northbridge support work, mostly in preparation for Fam17h
CPUs (Yazen Ghannam, Borislav Petkov)
- cleanups/refactorings and fixes (Borislav Petkov, Tony Luck,
Yinghai Lu)"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Include the PPIN in MCE records when available
x86/mce/AMD: Add system physical address translation for AMD Fam17h
x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h
x86/amd_nb: Add Fam17h Data Fabric as "Northbridge"
x86/amd_nb: Make all exports EXPORT_SYMBOL_GPL
x86/amd_nb: Make amd_northbridges internal to amd_nb.c
x86/mce/AMD: Reset Threshold Limit after logging error
x86/mce/AMD: Fix HWID_MCATYPE calculation by grouping arguments
x86/MCE: Correct TSC timestamping of error records
x86/RAS: Hide SMCA bank names
x86/RAS: Rename smca_bank_names to smca_names
x86/RAS: Simplify SMCA HWID descriptor struct
x86/RAS: Simplify SMCA bank descriptor struct
x86/MCE: Dump MCE to dmesg if no consumers
x86/RAS: Add TSC timestamp to the injected MCE
x86/MCE: Do not look at panic_on_oops in the severity grading
Pull hotplug API fix from Ingo Molnar:
"Late breaking fix from the v4.9 cycle: fix a hotplug register/
unregister notifier API asymmetry bug that can cause kernel warnings
(and worse) with certain Kconfig combinations"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hotplug: Make register and unregister notifier API symmetric
Pull scheduler updates from Ingo Molnar:
"The main scheduler changes in this cycle were:
- support Intel Turbo Boost Max Technology 3.0 (TBM3) by introducig a
notion of 'better cores', which the scheduler will prefer to
schedule single threaded workloads on. (Tim Chen, Srinivas
Pandruvada)
- enhance the handling of asymmetric capacity CPUs further (Morten
Rasmussen)
- improve/fix load handling when moving tasks between task groups
(Vincent Guittot)
- simplify and clean up the cputime code (Stanislaw Gruszka)
- improve mass fork()ed task spread a.k.a. hackbench speedup (Vincent
Guittot)
- make struct kthread kmalloc()ed and related fixes (Oleg Nesterov)
- add uaccess atomicity debugging (when using access_ok() in the
wrong context), under CONFIG_DEBUG_ATOMIC_SLEEP=y (Peter Zijlstra)
- implement various fixes, cleanups and other enhancements (Daniel
Bristot de Oliveira, Martin Schwidefsky, Rafael J. Wysocki)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
sched/core: Use load_avg for selecting idlest group
sched/core: Fix find_idlest_group() for fork
kthread: Don't abuse kthread_create_on_cpu() in __kthread_create_worker()
kthread: Don't use to_live_kthread() in kthread_[un]park()
kthread: Don't use to_live_kthread() in kthread_stop()
Revert "kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function"
kthread: Make struct kthread kmalloc'ed
x86/uaccess, sched/preempt: Verify access_ok() context
sched/x86: Make CONFIG_SCHED_MC_PRIO=y easier to enable
sched/x86: Change CONFIG_SCHED_ITMT to CONFIG_SCHED_MC_PRIO
x86/sched: Use #include <linux/mutex.h> instead of #include <asm/mutex.h>
cpufreq/intel_pstate: Use CPPC to get max performance
acpi/bus: Set _OSC for diverse core support
acpi/bus: Enable HWP CPPC objects
x86/sched: Add SD_ASYM_PACKING flags to x86 ITMT CPU
x86/sysctl: Add sysctl for ITMT scheduling feature
x86: Enable Intel Turbo Boost Max Technology 3.0
x86/topology: Define x86's arch_update_cpu_topology
sched: Extend scheduler's asym packing
sched/fair: Clean up the tunable parameter definitions
...