OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Maarten Lankhorst	040a0a3710	mutex: Add support for wound/wait style locks Wound/wait mutexes are used when multiple lock acquisitions of a similar type can be done in an arbitrary order. The deadlock handling used here is called wait/wound in the RDBMS literature: The older task waits until it can acquire the contended lock. The younger task needs to back off and drop all the locks it is currently holding, i.e. the younger task is wounded. For full documentation please read Documentation/ww-mutex-design.txt. References: https://lwn.net/Articles/548909/ Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: rostedt@goodmis.org Cc: daniel@ffwll.ch Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/51C8038C.9000106@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-06-26 12:10:56 +02:00
Markos Chandras	25c87eae17	lib/Kconfig.debug: Restrict FRAME_POINTER for MIPS FAULT_INJECTION_STACKTRACE_FILTER selects FRAME_POINTER but that symbol is not available for MIPS. Fixes the following problem on a randconfig: warning: (LOCKDEP && FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP && KMEMCHECK) selects FRAME_POINTER which has unmet direct dependencies (DEBUG_KERNEL && (CRIS \|\| M68K \|\| FRV \|\| UML \|\| AVR32 \|\| SUPERH \|\| BLACKFIN \|\| MN10300 \|\| METAG) \|\| ARCH_WANT_FRAME_POINTERS) Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5441/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2013-06-21 18:07:01 +02:00
Arnd Bergmann	a857c6e7d5	X.509: do not emit any informational output When building a kernel using 'make -s', I expect to see an empty output, except for build warnings and errors. The build_OID_registry code always prints one line when run, which is not helpful to most people building the kernels, and which makes it harder to automatically check for build warnings. Let's just remove the one line output. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: David Howells <dhowells@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au>	2013-06-19 17:54:06 +02:00
Masanari Iida	278cee0515	treewide: Fix typo in printk Correct spelling typo in printk within various drivers. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-06-18 13:48:45 +02:00
Greg Kroah-Hartman	bb07b00be7	Merge 3.10-rc6 into driver-core-next We want these fixes here too. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-17 16:57:20 -07:00
Greg Kroah-Hartman	0e496b8e84	Merge 3.10-rc6 into char-misc-next We want the fixes in here.	2013-06-17 11:54:25 -07:00
Tejun Heo	a4244454df	percpu-refcount: use RCU-sched instead of normal RCU percpu-refcount was incorrectly using preempt_disable/enable() for RCU critical sections against call_rcu(). `6a24474da8` ("percpu-refcount: consistently use plain (non-sched) RCU") fixed it by replacing the preemption operations with rcu_read_[un]lock(), citing that there isn't any advantage in using sched-RCU over using the usual one; however, rcu_read_[un]lock() for the preemptible RCU implementation - CONFIG_TREE_PREEMPT_RCU, chosen when CONFIG_PREEMPT - are slightly more expensive than preempt_disable/enable(). In a contrived microbench which repeats the following, - percpu_ref_get() - copy 32 bytes of data into percpu buffer - percpu_ref_put() - copy 32 bytes of data into percpu buffer rcu_read_[un]lock() used in percpu_ref_get/put() makes it go slower by about 15% when compared to using sched-RCU. As the RCU critical sections are extremely short, using sched-RCU shouldn't have any latency implications. Convert to RCU-sched. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Kent Overstreet <koverstreet@google.com> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Rusty Russell <rusty@rustcorp.com.au>	2013-06-16 16:12:26 -07:00
Tejun Heo	dbece3a0f1	percpu-refcount: implement percpu_tryget() along with percpu_ref_kill_and_confirm() Implement percpu_tryget() which stops giving out references once the percpu_ref is visible as killed. Because the refcnt is per-cpu, different CPUs will start to see a refcnt as killed at different points in time and tryget() may continue to succeed on subset of cpus for a while after percpu_ref_kill() returns. For use cases where it's necessary to know when all CPUs start to see the refcnt as dead, percpu_ref_kill_and_confirm() is added. The new function takes an extra argument @confirm_kill which is invoked when the refcnt is guaranteed to be viewed as killed on all CPUs. While this isn't the prettiest interface, it doesn't force synchronous wait and is much safer than requiring the caller to do its own call_rcu(). v2: Patch description rephrased to emphasize that tryget() may continue to succeed on some CPUs after kill() returns as suggested by Kent. v3: Function comment in percpu_ref_kill_and_confirm() updated warning people to not depend on the implied RCU grace period from the confirm callback as it's an implementation detail. Signed-off-by: Tejun Heo <tj@kernel.org> Slightly-Grumpily-Acked-by: Kent Overstreet <koverstreet@google.com>	2013-06-13 19:23:53 -07:00
Tejun Heo	bc497bd33b	percpu-refcount: implement percpu_ref_cancel_init() Normally, percpu_ref_init() initializes and percpu_ref_kill() initiates destruction which completes asynchronously. The asynchronous destruction can be problematic in init failure path where the caller wants to destroy half-constructed object - distinguishing half-constructed objects from the usual release method can be painful for complex objects. This patch implements percpu_ref_cancel_init() which synchronously destroys the percpu_ref without invoking release. To avoid unintentional misuses, the function requires the ref to have finished percpu_ref_init() but never used and triggers WARN otherwise. v2: Explain the weird name and usage restriction in the function comment. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Kent Overstreet <koverstreet@google.com>	2013-06-13 11:08:27 -07:00
Tejun Heo	acac7883ee	percpu-refcount: add __must_check to percpu_ref_init() and don't use ACCESS_ONCE() in percpu_ref_kill_rcu() Two small changes. * Unlike most init functions, percpu_ref_init() allocates memory and may fail. Let's mark it with __must_check in case the caller forgets. * percpu_ref_kill_rcu() is unnecessarily using ACCESS_ONCE() to dereference @ref->pcpu_count, which can be misleading. The pointer is guaranteed to be valid and visible and can't change underneath the function. Drop ACCESS_ONCE(). Signed-off-by: Tejun Heo <tj@kernel.org>	2013-06-13 11:08:26 -07:00
Tejun Heo	ac899061a9	percpu-refcount: cosmetic updates * s/percpu_ref_release/percpu_ref_func_t/ as it's customary to have _t postfix for types and the type is gonna be used for a different type of callback too. * Add @ARG to function comments. * Drop unnecessary and unaligned indentation from percpu_ref_init() function comment. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Kent Overstreet <koverstreet@google.com>	2013-06-12 20:43:06 -07:00
Chen Gang	5402b8047b	lib/mpi/mpicoder.c: looping issue, need stop when equal to zero, found by 'EXTRA_CFLAGS=-W'. The 'while' loop needs to stop when 'nbytes == 0'; otherwise the length check never ends the loop, because 'nbytes' is a size_t and is therefore always greater than or equal to zero. The related warning: (with EXTRA_CFLAGS=-W) lib/mpi/mpicoder.c:40:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-06-12 16:29:44 -07:00
Kees Cook	b7165ebbf0	kobject: sanitize argument for format string Unlike kobject_set_name(), the kset_create_and_add() interface does not provide a way to use format strings, so make sure that the interface cannot be abused accidentally. It looks like all current callers use static strings, so there's no existing flaw. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-07 16:05:50 -07:00
Andy Shevchenko	4cd5773a2a	net: core: move mac_pton() to lib/net_utils.c Since we have at least one user of this function outside of CONFIG_NET scope, we have to provide this function independently. The proposed solution is to move it under lib/net_utils.c with corresponding configuration variable and select wherever it is needed. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-05 12:00:27 -07:00
Herbert Xu	67822649d7	crypto: crct10dif - Use PTR_RET lib/crc-t10dif.c:42:1-3: WARNING: PTR_RET can be used Use PTR_RET rather than if(IS_ERR(...)) + PTR_ERR Generated by: coccinelle/api/ptr_ret.cocci Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2013-06-05 16:27:51 +08:00
Kent Overstreet	c1ae6e9b4d	percpu-refcount: Don't use silly cmpxchg() The cmpxchg() was just to ensure the debug check didn't race, which was a bit excessive. The caller is supposed to do the appropriate synchronization, which means percpu_ref_kill() can just do a simple store. Signed-off-by: Kent Overstreet <koverstreet@google.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-06-03 16:04:04 -07:00
Kent Overstreet	215e262f2a	percpu: implement generic percpu refcounting This implements a refcount with similar semantics to atomic_get()/atomic_dec_and_test() - but percpu. It also implements two stage shutdown, as we need it to tear down the percpu counts. Before dropping the initial refcount, you must call percpu_ref_kill(); this puts the refcount in "shutting down mode" and switches back to a single atomic refcount with the appropriate barriers (synchronize_rcu()). It's also legal to call percpu_ref_kill() multiple times - it only returns true once, so callers don't have to reimplement shutdown synchronization. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: coding-style tweak] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-06-03 15:36:41 -07:00
Seth Jennings	3a76e5e09f	debugfs: add get/set for atomic types debugfs currently lacks the ability to create attributes that set/get atomic_t values. This patch adds support for this through a new debugfs_create_atomic_t() function. Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-03 13:55:01 -07:00
Steven Rostedt	360603a1be	sprintf: hex_string(): fix comment hex_string() had a typo in a comment. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-05-29 01:14:46 +02:00
Helge Deller	70ef5578dd	MPILIB: disable usage of floating point registers on parisc The umul_ppmm() macro for parisc uses the xmpyu assembler statement, which does its calculation via a floating point register. But usage of floating point registers inside the Linux kernel is not allowed, and gcc will stop compilation due to the -mdisable-fpregs compiler option. Fix this by disabling the umul_ppmm() and udiv_qrnnd() macros. The mpilib will then use the generic built-in implementations instead. Signed-off-by: Helge Deller <deller@gmx.de>	2013-05-24 22:30:11 +02:00
Linus Torvalds	c7153d0643	Driver core fixes for 3.10-rc2 Here are 3 tiny driver core fixes for 3.10-rc2. A needed symbol export, a change to make it easier to track down offending sysfs files with incorrect attributes, and a klist bugfix. All have been in linux-next for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlGePdAACgkQMUfUDdst+ynX3wCfbodTGeimy2GTnc5psVgXV/x4 bz8AnR6G/JNCw54meAJ5UlYJRj7Dwo09 =MNP/ -----END PGP SIGNATURE----- Merge tag 'driver-core-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg Kroah-Hartman: "Here are 3 tiny driver core fixes for 3.10-rc2. A needed symbol export, a change to make it easier to track down offending sysfs files with incorrect attributes, and a klist bugfix. All have been in linux-next for a while" * tag 'driver-core-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: klist: del waiter from klist_remove_waiters before wakeup waiting process driver core: print sysfs attribute name when warning about bogus permissions driver core: export subsys_virtual_register	2013-05-23 09:27:08 -07:00
Randy Dunlap	b4d3ba3346	lib: make iovec obj instead of lib Fix build error in vmw_vmci.ko when CONFIG_VMWARE_VMCI=m by changing iovec.o from lib-y to obj-y. ERROR: "memcpy_toiovec" [drivers/misc/vmw_vmci/vmw_vmci.ko] undefined! ERROR: "memcpy_fromiovec" [drivers/misc/vmw_vmci/vmw_vmci.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-23 09:17:11 -07:00
wang, biao	ac5a2962b0	klist: del waiter from klist_remove_waiters before wakeup waiting process There is a race between klist_remove and klist_release. klist_remove uses a local variable, waiter, stored on its stack. When klist_release calls wake_up_process(waiter->process) to wake up the waiter, the waiter might run immediately and reuse the stack. klist_release then calls list_del(&waiter->list), which modifies the stale wait data and corrupts the stack of the thread that was waiting. The patch fixes it against kernel 3.9. Signed-off-by: wang, biao <biao.wang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-05-21 10:16:39 -07:00
Tim Chen	2d31e518a4	crypto: crct10dif - Wrap crc_t10dif function call to use crypto transform framework When CRC T10 DIF is calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2013-05-20 20:11:01 +08:00
Rusty Russell	d2f83e9078	Hoist memcpy_fromiovec/memcpy_toiovec into lib/ ERROR: "memcpy_fromiovec" [drivers/vhost/vhost_scsi.ko] undefined! That function is only present with CONFIG_NET. Turns out that crypto/algif_skcipher.c also uses that outside net, but it actually needs sockets anyway. In addition, commit `6d4f0139d6` added CONFIG_NET dependency to CONFIG_VMCI for memcpy_toiovec, so hoist that function and revert that commit too. socket.h already includes uio.h, so no callers need updating; trying only broke things for x86_64 randconfig (thanks Fengguang!). Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-05-20 10:24:22 +09:30
Linus Torvalds	ebb3727779	Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-block Pull block driver updates from Jens Axboe: "It might look big in volume, but when categorized, not a lot of drivers are touched. The pull request contains: - mtip32xx fixes from Micron. - A slew of drbd updates, this time in a nicer series. - bcache, a flash/ssd caching framework from Kent. - Fixes for cciss" * 'for-3.10/drivers' of git://git.kernel.dk/linux-block: (66 commits) bcache: Use bd_link_disk_holder() bcache: Allocator cleanup/fixes cciss: bug fix to prevent cciss from loading in kdump crash kernel cciss: add cciss_allow_hpsa module parameter drivers/block/mg_disk.c: add CONFIG_PM_SLEEP to suspend/resume functions mtip32xx: Workaround for unaligned writes bcache: Make sure blocksize isn't smaller than device blocksize bcache: Fix merge_bvec_fn usage for when it modifies the bvm bcache: Correctly check against BIO_MAX_PAGES bcache: Hack around stuff that clones up to bi_max_vecs bcache: Set ra_pages based on backing device's ra_pages bcache: Take data offset from the bdev superblock. mtip32xx: mtip32xx: Disable TRIM support mtip32xx: fix a smatch warning bcache: Disable broken btree fuzz tester bcache: Fix a format string overflow bcache: Fix a minor memory leak on device teardown bcache: Documentation updates bcache: Use WARN_ONCE() instead of __WARN() bcache: Add missing #include <linux/prefetch.h> ...	2013-05-08 11:51:05 -07:00
Davidlohr Bueso	9607a85b67	rwsem: check counter to avoid cmpxchg calls This patch tries to reduce the number of cmpxchg calls in the writer failed path by checking the counter value first before issuing the instruction. If ->count is not set to RWSEM_WAITING_BIAS then there is no point wasting a cmpxchg call. Furthermore, Michel states "I suppose it helps due to the case where someone else steals the lock while we're trying to acquire sem->wait_lock." Two very different workloads and machines were used to see how this patch improves throughput: pgbench on a quad-core laptop and aim7 on a large 8 socket box with 80 cores. Some results comparing Michel's fast-path write lock stealing (tps-rwsem) on a quad-core laptop running pgbench: \| db_size \| clients \| tps-rwsem \| tps-patch \| +---------+----------+----------------+--------------+ \| 160 MB \| 1 \| 6906 \| 9153 \| + 32.5% \| 160 MB \| 2 \| 15931 \| 22487 \| + 41.1% \| 160 MB \| 4 \| 33021 \| 32503 \| \| 160 MB \| 8 \| 34626 \| 34695 \| \| 160 MB \| 16 \| 33098 \| 34003 \| \| 160 MB \| 20 \| 31343 \| 31440 \| \| 160 MB \| 30 \| 28961 \| 28987 \| \| 160 MB \| 40 \| 26902 \| 26970 \| \| 160 MB \| 50 \| 25760 \| 25810 \| ------------------------------------------------------ \| 1.6 GB \| 1 \| 7729 \| 7537 \| \| 1.6 GB \| 2 \| 19009 \| 23508 \| + 23.7% \| 1.6 GB \| 4 \| 33185 \| 32666 \| \| 1.6 GB \| 8 \| 34550 \| 34318 \| \| 1.6 GB \| 16 \| 33079 \| 32689 \| \| 1.6 GB \| 20 \| 31494 \| 31702 \| \| 1.6 GB \| 30 \| 28535 \| 28755 \| \| 1.6 GB \| 40 \| 27054 \| 27017 \| \| 1.6 GB \| 50 \| 25591 \| 25560 \| ------------------------------------------------------ \| 7.6 GB \| 1 \| 6224 \| 7469 \| + 20.0% \| 7.6 GB \| 2 \| 13611 \| 12778 \| \| 7.6 GB \| 4 \| 33108 \| 32927 \| \| 7.6 GB \| 8 \| 34712 \| 34878 \| \| 7.6 GB \| 16 \| 32895 \| 33003 \| \| 7.6 GB \| 20 \| 31689 \| 31974 \| \| 7.6 GB \| 30 \| 29003 \| 28806 \| \| 7.6 GB \| 40 \| 26683 \| 26976 \| \| 7.6 GB \| 50 \| 25925 \| 25652 \| ------------------------------------------------------ For the aim7 workloads, they overall improved on top of Michel's patchset. For full graphs on how the rwsem series plus this patch behaves on a large 8 socket machine against a vanilla kernel: http://stgolabs.net/rwsem-aim7-results.tar.gz Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 16:11:51 -07:00
Anatol Pomozov	2d864e4171	kref: minor cleanup - make warning smp-safe - result of atomic _unless_zero functions should be checked by caller to avoid use-after-free error - trivial whitespace fix. Link: https://lkml.org/lkml/2013/4/12/391 Tested: compile x86, boot machine and run xfstests Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> [ Removed line-break, changed to use WARN_ON_ONCE() - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 16:09:00 -07:00
Linus Torvalds	c8de2fa4dc	Merge branch 'rwsem-optimizations' Merge rwsem optimizations from Michel Lespinasse: "These patches extend Alex Shi's work (which added write lock stealing on the rwsem slow path) in order to provide rwsem write lock stealing on the fast path (that is, without taking the rwsem's wait_lock). I have unfortunately been unable to push this through -next before due to Ingo Molnar / David Howells / Peter Zijlstra being busy with other things. However, this has gotten some attention from Rik van Riel and Davidlohr Bueso who both commented that they felt this was ready for v3.10, and Ingo Molnar has said that he was OK with me pushing directly to you. So, here goes :) Davidlohr got the following test results from pgbench running on a quad-core laptop: \| db_size \| clients \| tps-vanilla \| tps-rwsem \| +---------+----------+----------------+--------------+ \| 160 MB \| 1 \| 5803 \| 6906 \| + 19.0% \| 160 MB \| 2 \| 13092 \| 15931 \| \| 160 MB \| 4 \| 29412 \| 33021 \| \| 160 MB \| 8 \| 32448 \| 34626 \| \| 160 MB \| 16 \| 32758 \| 33098 \| \| 160 MB \| 20 \| 26940 \| 31343 \| + 16.3% \| 160 MB \| 30 \| 25147 \| 28961 \| \| 160 MB \| 40 \| 25484 \| 26902 \| \| 160 MB \| 50 \| 24528 \| 25760 \| ------------------------------------------------------ \| 1.6 GB \| 1 \| 5733 \| 7729 \| + 34.8% \| 1.6 GB \| 2 \| 9411 \| 19009 \| + 101.9% \| 1.6 GB \| 4 \| 31818 \| 33185 \| \| 1.6 GB \| 8 \| 33700 \| 34550 \| \| 1.6 GB \| 16 \| 32751 \| 33079 \| \| 1.6 GB \| 20 \| 30919 \| 31494 \| \| 1.6 GB \| 30 \| 28540 \| 28535 \| \| 1.6 GB \| 40 \| 26380 \| 27054 \| \| 1.6 GB \| 50 \| 25241 \| 25591 \| ------------------------------------------------------ \| 7.6 GB \| 1 \| 5779 \| 6224 \| \| 7.6 GB \| 2 \| 10897 \| 13611 \| + 24.9% \| 7.6 GB \| 4 \| 32683 \| 33108 \| \| 7.6 GB \| 8 \| 33968 \| 34712 \| \| 7.6 GB \| 16 \| 32287 \| 32895 \| \| 7.6 GB \| 20 \| 27770 \| 31689 \| + 14.1% \| 7.6 GB \| 30 \| 26739 \| 29003 \| \| 7.6 GB \| 40 \| 24901 \| 26683 \| \| 7.6 GB \| 50 \| 17115 \| 25925 \| + 51.5% ------------------------------------------------------ (Davidlohr also has one additional patch which further improves throughput, though I will ask him to send it directly to you as I have suggested some minor changes)." * emailed patches from Michel Lespinasse <walken@google.com>: rwsem: no need for explicit signed longs x86 rwsem: avoid taking slow path when stealing write lock rwsem: do not block readers at head of queue if other readers are active rwsem: implement support for write lock stealing on the fastpath rwsem: simplify __rwsem_do_wake rwsem: skip initial trylock in rwsem_down_write_failed rwsem: avoid taking wait_lock in rwsem_down_write_failed rwsem: use cmpxchg for trying to steal write lock rwsem: more aggressive lock stealing in rwsem_down_write_failed rwsem: simplify rwsem_down_write_failed rwsem: simplify rwsem_down_read_failed rwsem: move rwsem_down_failed_common code into rwsem_down_{read,write}_failed rwsem: shorter spinlocked section in rwsem_down_failed_common() rwsem: make the waiter type an enumeration rather than a bitmask	2013-05-07 09:22:03 -07:00
Davidlohr Bueso	b5f541810e	rwsem: no need for explicit signed longs Change explicit "signed long" declarations into plain "long" as suggested by Peter Hurley. Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Reviewed-by: Michel Lespinasse <walken@google.com> Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:17 -07:00
Michel Lespinasse	25c3932596	rwsem: do not block readers at head of queue if other readers are active This change fixes a race condition where a reader might determine it needs to block, but by the time it acquires the wait_lock the rwsem has active readers and no queued waiters. In this situation the reader can run in parallel with the existing active readers; it does not need to block until the active readers complete. Thanks to Peter Hurley for noticing this possible race. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:17 -07:00
Michel Lespinasse	fe6e674c61	rwsem: implement support for write lock stealing on the fastpath When we decide to wake up readers, we must first grant them as many read locks as necessary, and then actually wake up all these readers. But in order to know how many read shares to grant, we must first count the readers at the head of the queue. This might take a while if there are many readers, and we want to be protected against a writer stealing the lock while we're counting. To that end, we grant the first reader lock before counting how many more readers are queued. We also require some adjustments to the wake_type semantics. RWSEM_WAKE_NO_ACTIVE used to mean that we had found the count to be RWSEM_WAITING_BIAS, in which case the rwsem was known to be free as nobody could steal it while we hold the wait_lock. This doesn't make sense once we implement fastpath write lock stealing, so we now use RWSEM_WAKE_ANY in that case. Similarly, when rwsem_down_write_failed found that a read lock was active, it would use RWSEM_WAKE_READ_OWNED which signalled that new readers could be woken without checking first that the rwsem was available. We can't do that anymore since the existing readers might release their read locks, and a writer could steal the lock before we wake up additional readers. So, we have to use a new RWSEM_WAKE_READERS value to indicate we only want to wake readers, but we don't currently hold any read lock. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	8cf5322ce6	rwsem: simplify __rwsem_do_wake This is mostly for cleanup value: - We don't need several gotos to handle the case where the first waiter is a writer. Two simple tests will do (and generate very similar code). - In the remainder of the function, we know the first waiter is a reader, so we don't have to double check that. We can use do..while loops to iterate over the readers to wake (generates slightly better code). Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	9b0fc9c09f	rwsem: skip initial trylock in rwsem_down_write_failed We can skip the initial trylock in rwsem_down_write_failed() if there are known active lockers already, thus saving one likely-to-fail cmpxchg. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	a7d2c573ae	rwsem: avoid taking wait_lock in rwsem_down_write_failed In rwsem_down_write_failed(), if there are active locks after we wake up (i.e. the lock got stolen from us), skip taking the wait_lock and go back to sleep immediately. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	5ede972df1	rwsem: use cmpxchg for trying to steal write lock Using rwsem_atomic_update to try stealing the write lock forced us to undo the adjustment in the failure path. We can have simpler and faster code by using cmpxchg instead. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	ed00f64346	rwsem: more aggressive lock stealing in rwsem_down_write_failed Some small code simplifications can be achieved by doing more aggressive lock stealing: - When rwsem_down_write_failed() notices that there are no active locks (and thus no thread to wake us if we decided to sleep), it used to wake the first queued process. However, stealing the lock is also sufficient to deal with this case, so we don't need this check anymore. - In try_get_writer_sem(), we can steal the lock even when the first waiter is a reader. This is correct because the code path that wakes readers is protected by the wait_lock. As to the performance effects of this change, they are expected to be minimal: readers are still granted the lock (rather than having to acquire it themselves) when they reach the front of the wait queue, so we have essentially the same behavior as in rwsem-spinlock. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	023fe4f712	rwsem: simplify rwsem_down_write_failed When waking writers, we never grant them the lock - instead, they have to acquire it themselves when they run, and remove themselves from the wait_list when they succeed. As a result, we can do a few simplifications in rwsem_down_write_failed(): - We don't need to check for !waiter.task since __rwsem_do_wake() doesn't remove writers from the wait_list - There is no point releasing the wait_lock before entering the wait loop, as we will need to reacquire it immediately. We can change the loop so that the lock is always held at the start of each loop iteration. - We don't need to get a reference on the task structure, since the task is responsible for removing itself from the wait_list. There is no risk, like in the rwsem_down_read_failed() case, that a task would wake up and exit (thus destroying its task structure) while __rwsem_do_wake() is still running - wait_lock protects against that. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	da16922cc0	rwsem: simplify rwsem_down_read_failed When trying to acquire a read lock, the RWSEM_ACTIVE_READ_BIAS adjustment doesn't cause other readers to block, so we never have to worry about waking them back after canceling this adjustment in rwsem_down_read_failed(). We also never want to steal the lock in rwsem_down_read_failed(), so we don't have to grab the wait_lock either. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	1e78277ccb	rwsem: move rwsem_down_failed_common code into rwsem_down_{read,write}_failed Remove the rwsem_down_failed_common function and replace it with two identical copies of its code in rwsem_down_{read,write}_failed. This is because we want to make different optimizations in rwsem_down_{read,write}_failed; we are adding this pure-duplication step as a separate commit in order to make it easier to check the following steps. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	f7dd1cee9a	rwsem: shorter spinlocked section in rwsem_down_failed_common() This change reduces the size of the spinlocked and TASK_UNINTERRUPTIBLE sections in rwsem_down_failed_common(): - We only need the sem->wait_lock to insert ourselves on the wait_list; the waiter node can be prepared outside of the wait_lock. - The task state only needs to be set to TASK_UNINTERRUPTIBLE immediately before checking if we actually need to sleep; it doesn't need to protect the entire function. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:16 -07:00
Michel Lespinasse	e2d57f782c	rwsem: make the waiter type an enumeration rather than a bitmask We are not planning to add any new waiter flags, so we can convert the waiter type into an enumeration. Background: David Howells suggested I do this back when I tried adding a new waiter type for unfair readers. However, I believe the cleanup applies regardless of that use case. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:15 -07:00
David Howells	9e6879460c	Give the OID registry file module info to avoid kernel tainting Give the OID registry file module information so that it doesn't taint the kernel when compiled as a module and loaded. Reported-by: Dros Adamson <Weston.Adamson@netapp.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Trond Myklebust <Trond.Myklebust@netapp.com> cc: stable@vger.kernel.org cc: linux-nfs@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-05 14:38:00 -07:00
Linus Torvalds	20a2078ce7	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm updates from Dave Airlie: "This is the main drm pull request for 3.10. Weird bits: - OMAP drm changes required OMAP dss changes, in drivers/video, so I took them in here. - one more fbcon fix for font handover - VT switch avoidance in pm code - scatterlist helpers for gpu drivers - have acks from akpm Highlights: - qxl kms driver - driver for the spice qxl virtual GPU Nouveau: - fermi/kepler VRAM compression - GK110/nvf0 modesetting support. Tegra: - host1x core merged with 2D engine support i915: - vt switchless resume - more valleyview support - vblank fixes - modesetting pipe config rework radeon: - UVD engine support - SI chip tiling support - GPU registers initialisation from golden values. exynos: - device tree changes - fimc block support Otherwise: - bunches of fixes all over the place." * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (513 commits) qxl: update to new idr interfaces. drm/nouveau: fix build with nv50->nvc0 drm/radeon: fix handling of v6 power tables drm/radeon: clarify family checks in pm table parsing drm/radeon: consolidate UVD clock programming drm/radeon: fix UPLL_REF_DIV_MASK definition radeon: add bo tracking debugfs drm/radeon: add new richland pci ids drm/radeon: add some new SI PCI ids drm/radeon: fix scratch reg handling for UVD fence drm/radeon: allocate SA bo in the requested domain drm/radeon: fix possible segfault when parsing pm tables drm/radeon: fix endian bugs in atom_allocate_fb_scratch() OMAPDSS: TFP410: return EPROBE_DEFER if the i2c adapter not found OMAPDSS: VENC: Add error handling for venc_probe_pdata OMAPDSS: HDMI: Add error handling for hdmi_probe_pdata OMAPDSS: RFBI: Add error handling for rfbi_probe_pdata OMAPDSS: DSI: Add error handling for dsi_probe_pdata OMAPDSS: SDI: Add error handling for sdi_probe_pdata OMAPDSS: DPI: Add error handling for dpi_probe_pdata ...	2013-05-02 19:40:34 -07:00
Linus Torvalds	0279b3c0ad	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "This fixes the cputime scaling overflow problems for good without having bad 32-bit overhead, and gets rid of the div64_u64_rem() helper as well." * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "math64: New div64_u64_rem helper" sched: Avoid prev->stime underflow sched: Do not account bogus utime sched: Avoid cputime scaling overflow	2013-05-02 14:56:31 -07:00
Linus Torvalds	20b4fb4852	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...	2013-05-01 17:51:54 -07:00
Linus Torvalds	5f56886521	Merge branch 'akpm' (incoming from Andrew) Merge third batch of fixes from Andrew Morton: "Most of the rest. I still have two large patchsets against AIO and IPC, but they're a bit stuck behind other trees and I'm about to vanish for six days. - random fixlets - inotify - more of the MM queue - show_stack() cleanups - DMI update - kthread/workqueue things - compat cleanups - epoll udpates - binfmt updates - nilfs2 - hfs - hfsplus - ptrace - kmod - coredump - kexec - rbtree - pids - pidns - pps - semaphore tweaks - some w1 patches - relay updates - core Kconfig changes - sysrq tweaks" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits) Documentation/sysrq: fix inconstistent help message of sysrq key ethernet/emac/sysrq: fix inconstistent help message of sysrq key sparc/sysrq: fix inconstistent help message of sysrq key powerpc/xmon/sysrq: fix inconstistent help message of sysrq key ARM/etm/sysrq: fix inconstistent help message of sysrq key power/sysrq: fix inconstistent help message of sysrq key kgdb/sysrq: fix inconstistent help message of sysrq key lib/decompress.c: fix initconst notifier-error-inject: fix module names in Kconfig kernel/sys.c: make prctl(PR_SET_MM) generally available UAPI: remove empty Kbuild files menuconfig: print more info for symbol without prompts init/Kconfig: re-order CONFIG_EXPERT options to fix menuconfig display kconfig menu: move Virtualization drivers near other virtualization options Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS relay: use macro PAGE_ALIGN instead of FIX_SIZE kernel/relay.c: move FIX_SIZE macro into relay.c kernel/relay.c: remove unused function argument actor drivers/w1/slaves/w1_ds2760.c: fix the error handling in w1_ds2760_add_slave() drivers/w1/slaves/w1_ds2781.c: fix the error handling in w1_ds2781_add_slave() ...	2013-04-30 17:37:43 -07:00
Andi Kleen	6f9982bdde	lib/decompress.c: fix initconst Signed-off-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:09 -07:00
Akinobu Mita	e12a95f40a	notifier-error-inject: fix module names in Kconfig The Kconfig help text for MEMORY_NOTIFIER_ERROR_INJECT and OF_RECONFIG_NOTIFIER_ERROR_INJECT has mismatched module names. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:09 -07:00
Stephen Boyd	446f24d119	Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS The help text for this config is duplicated across the x86, parisc, and s390 Kconfig.debug files. Arnd Bergmann noted that the help text was slightly misleading and should be fixed to state that enabling this option isn't a problem when using pre 4.4 gcc. To simplify the rewording, consolidate the text into lib/Kconfig.debug and modify it there to be more explicit about when you should say N to this config. Also, make the text a bit more generic by stating that this option enables compile time checks so we can cover architectures which emit warnings vs. ones which emit errors. The details of how an architecture decided to implement the checks aren't as important as the concept of compile time checking of copy_from_user() calls. While we're doing this, remove all the copy_from_user_overflow() code that's duplicated many times and place it into lib/ so that any architecture supporting this option can get the function for free. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Helge Deller <deller@gmx.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:09 -07:00
Davidlohr Bueso	c75aaa8ed0	rbtree_test: add __init/__exit annotations Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Reviewed-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:07 -07:00
Davidlohr Bueso	4130f0efbf	rbtree_test: add extra rbtree integrity check Account for the rbtree having 2**bh(v)-1 internal nodes. While this can be seen as a consequence of other checks, Michel states that it nicely sums up what the other properties are for. Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Reviewed-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:07 -07:00
Andy Shevchenko	d338b1379f	dynamic_debug: reuse generic string_unescape function There is a kernel function that does the job in a generic way. Let's use it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:03 -07:00
Andy Shevchenko	16c7fa0582	lib/string_helpers: introduce generic string_unescape There are several places in the kernel where modules unescape input to convert C-style escape sequences into byte codes. The patch provides a generic implementation of this approach. Test cases are also included in the patch. [akpm@linux-foundation.org: clarify comment] [akpm@linux-foundation.org: export get_random_int() to modules] Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: William Hubbs <w.d.hubbs@gmail.com> Cc: Chris Brannon <chris@the-brannons.com> Cc: Kirk Reiser <kirk@braille.uwo.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:03 -07:00
Tejun Heo	196779b9b4	dump_stack: consolidate dump_stack() implementations and unify their behaviors Both dump_stack() and show_stack() are currently implemented by each architecture. show_stack(NULL, NULL) dumps the backtrace for the current task as does dump_stack(). On some archs, dump_stack() prints extra information - pid, utsname and so on - in addition to the backtrace while the two are identical on other archs. The usages in arch-independent code of the two functions indicate show_stack(NULL, NULL) should print out bare backtrace while dump_stack() is used for debugging purposes when something went wrong, so it does make sense to print additional information on the task which triggered dump_stack(). There's no reason to require archs to implement two separate but mostly identical functions. It leads to unnecessary subtle information. This patch expands the dummy fallback dump_stack() implementation in lib/dump_stack.c such that it prints out debug information (taken from x86) and invokes show_stack(NULL, NULL) and drops arch-specific dump_stack() implementations in all archs except blackfin. Blackfin's dump_stack() does something wonky that I don't understand. Debug information can be printed separately by calling dump_stack_print_info() so that arch-specific dump_stack() implementation can still emit the same debug information. This is used in blackfin. This patch brings the following behavior changes. * On some archs, an extra level in backtrace for show_stack() could be printed. This is because the top frame was determined in dump_stack() on those archs while generic dump_stack() can't do that reliably. It can be compensated by inlining dump_stack() but not sure whether that'd be necessary. * Most archs didn't use to print debug info on dump_stack(). They do now. An example WARN dump follows. 
WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505() Hardware name: empty Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9 0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48 ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c 0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58 Call Trace: [<ffffffff81c614dc>] dump_stack+0x19/0x1b [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8234a071>] init_workqueues+0x35/0x505 ... v2: CPU number added to the generic debug info as requested by s390 folks and dropped the s390 specific dump_stack(). This loses %ksp from the debug message which the maintainers think isn't important enough to keep the s390-specific dump_stack() implementation. dump_stack_print_info() is moved to kernel/printk.c from lib/dump_stack.c. Because linkage is per object file, dump_stack_print_info() living in the same lib file as generic dump_stack() means that archs which implement custom dump_stack() - at this point, only blackfin - can't use dump_stack_print_info() as that will bring in the generic version of dump_stack() too. v1 The v1 patch broke build on blackfin due to this issue. The build breakage was reported by Fengguang Wu. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390 bits] Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:02 -07:00
Linus Torvalds	3094566959	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull fixup for trivial branch from Jiri Kosina: "Unfortunately I made a mistake when merging into for-linus branch, and omitted one pre-requisity patch for a few other patches (which have been Acked by the appropriate maintainers) in the series. Mea culpa maxima, sorry for that." The trivial branch added %pSR usage before actually teaching vsnprintf() about the 'R' part of %pSR. The 'R' causes the symbol translation to do a "__builtin_extract_return_addr()" before symbol lookup. That said, on most architectures __builtin_extract_return_addr() isn't likely to do anything special, so it probably is not normally noticeable. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: vsprintf: Add extension %pSR - print_symbol replacement	2013-04-30 13:47:37 -07:00
Joe Perches	b0d33c2bd7	vsprintf: Add extension %pSR - print_symbol replacement print_symbol takes a long and converts it to a function name and offset. %pS does something similar, but doesn't translate the address via __builtin_extract_return_addr. %pSR does the translation. This will enable replacing multiple calls like printk(...); printk_symbol(addr); printk("\n"); with a single non-interleavable in dmesg printk("... %pSR\n", (void *)addr); Update documentation too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-04-30 22:31:16 +02:00
Stanislaw Gruszka	f300213415	Revert "math64: New div64_u64_rem helper" This reverts commit `f792685006`. The cputime scaling code was changed/fixed and does not need the div64_u64_rem() primitive anymore. It has no other users, so let's remove them. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1367314507-9728-4-git-send-email-sgruszka@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-04-30 19:13:05 +02:00
Linus Torvalds	16fa94b532	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The main changes in this development cycle were: - full dynticks preparatory work by Frederic Weisbecker - factor out the cpu time accounting code better, by Li Zefan - multi-CPU load balancer cleanups and improvements by Joonsoo Kim - various smaller fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) sched: Fix init NOHZ_IDLE flag sched: Prevent to re-select dst-cpu in load_balance() sched: Rename load_balance_tmpmask to load_balance_mask sched: Move up affinity check to mitigate useless redoing overhead sched: Don't consider other cpus in our group in case of NEWLY_IDLE sched: Explicitly cpu_idle_type checking in rebalance_domains() sched: Change position of resched_cpu() in load_balance() sched: Fix wrong rq's runnable_avg update with rt tasks sched: Document task_struct::personality field sched/cpuacct/UML: Fix header file dependency bug on the UML build cgroup: Kill subsys.active flag sched/cpuacct: No need to check subsys active state sched/cpuacct: Initialize cpuacct subsystem earlier sched/cpuacct: Initialize root cpuacct earlier sched/cpuacct: Allocate per_cpu cpuusage for root cpuacct statically sched/cpuacct: Clean up cpuacct.h sched/cpuacct: Remove redundant NULL checks in cpuacct_acount_field() sched/cpuacct: Remove redundant NULL checks in cpuacct_charge() sched/cpuacct: Add cpuacct_acount_field() sched/cpuacct: Add cpuacct_init() ...	2013-04-30 07:43:28 -07:00
Akinobu Mita	f39fee5f11	lib/: rename random32() to prandom_u32() Use preferable function name which implies using a pseudo-random number generator. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:42 -07:00
Akinobu Mita	cedddb0002	uuid: use prandom_bytes() Use prandom_bytes() to generate 16 bytes of pseudo-random bytes. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:42 -07:00
Jeff Layton	3e6628c4b3	idr: introduce idr_alloc_cyclic() As Tejun points out, there are several users of the IDR facility that attempt to use it in a cyclic fashion. These users are likely to see -ENOSPC errors after the counter wraps one or more times however. This patchset adds a new idr_alloc_cyclic routine and converts several of these users to it. Many of these users are in obscure parts of the kernel, and I don't have a good way to test some of them. The change is pretty straightforward though, so hopefully it won't be an issue. There is one other cyclic user of idr_alloc that I didn't touch in ipc/util.c. That one is doing some strange stuff that I didn't quite understand, but it looks like it should probably be converted later somehow. This patch: Thus spake Tejun Heo: Ooh, BTW, the cyclic allocation is broken. It's prone to -ENOSPC after the first wraparound. There are several cyclic users in the kernel and I think it probably would be best to implement cyclic support in idr. This patch does that by adding new idr_alloc_cyclic function that such users in the kernel can use. With this, there's no need for a caller to keep track of the last value used as that's now tracked internally. This should prevent the ENOSPC problems that can hit when the "last allocated" counter exceeds INT_MAX. Later patches will convert existing cyclic users to the new interface. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Eric Paris <eparis@parisplace.org> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: John McCutchan <john@johnmccutchan.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Robert Love <rlove@rlove.org> Cc: Roland Dreier <roland@purestorage.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Tom Tucker <tom@opengridcomputing.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:41 -07:00
Andy Shevchenko	2e0fb404c8	lib, net: make isodigit() public and use it There are at least two users of isodigit(). Let's make it a public function of ctype.h. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:19 -07:00
Oleg Nesterov	095d141b2e	argv_split(): teach it to handle mutable strings argv_split() allocates argv[count_argc(str)] array and assumes that it will find the same number of arguments later. This is obviously wrong if this string can be changed, say, by sysctl. With this patch argv_split() kstrndup's the whole string and does not split it, we simply replace the spaces with zeroes and keep the allocated memory in argv[-1] for argv_free(arg). We do not use argv[0] because: - str can be all-spaces or empty. In fact this case is fine, we could kfree() it before return, but: - str can have a space at the start, and we can not rely on kstrndup(skip_spaces(str)) because it can equally race if this string is mutable. Also, simplify count_argc() and kill the no longer used skip_arg(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:19 -07:00
Davidlohr Bueso	30493cc9dd	lib/int_sqrt.c: optimize square root algorithm Optimize the current version of the shift-and-subtract (hardware) algorithm, described by John von Neumann[1] and Guy L Steele. Iterating 1,000,000 times, perf shows for the current version: Performance counter stats for './sqrt-curr' (10 runs): 27.170996 task-clock # 0.979 CPUs utilized ( +- 3.19% ) 3 context-switches # 0.103 K/sec ( +- 4.76% ) 0 cpu-migrations # 0.004 K/sec ( +-100.00% ) 104 page-faults # 0.004 M/sec ( +- 0.16% ) 64,921,199 cycles # 2.389 GHz ( +- 0.03% ) 28,967,789 stalled-cycles-frontend # 44.62% frontend cycles idle ( +- 0.18% ) <not supported> stalled-cycles-backend 104,502,623 instructions # 1.61 insns per cycle # 0.28 stalled cycles per insn ( +- 0.00% ) 34,088,368 branches # 1254.587 M/sec ( +- 0.00% ) 4,901 branch-misses # 0.01% of all branches ( +- 1.32% ) 0.027763015 seconds time elapsed ( +- 3.22% ) And for the new version: Performance counter stats for './sqrt-new' (10 runs): 0.496869 task-clock # 0.519 CPUs utilized ( +- 2.38% ) 0 context-switches # 0.000 K/sec 0 cpu-migrations # 0.403 K/sec ( +-100.00% ) 104 page-faults # 0.209 M/sec ( +- 0.15% ) 590,760 cycles # 1.189 GHz ( +- 2.35% ) 395,053 stalled-cycles-frontend # 66.87% frontend cycles idle ( +- 3.67% ) <not supported> stalled-cycles-backend 398,963 instructions # 0.68 insns per cycle # 0.99 stalled cycles per insn ( +- 0.39% ) 70,228 branches # 141.341 M/sec ( +- 0.36% ) 3,364 branch-misses # 4.79% of all branches ( +- 5.45% ) 0.000957440 seconds time elapsed ( +- 2.42% ) Furthermore, this saves space in instruction text: text data bss dec hex filename 111 0 0 111 6f lib/int_sqrt-baseline.o 89 0 0 89 59 lib/int_sqrt.o [1] http://en.wikipedia.org/wiki/First_Draft_of_a_Report_on_the_EDVAC Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Reviewed-by: Jonathan Gonzalez <jgonzlez@linets.cl> Tested-by: Jonathan Gonzalez <jgonzlez@linets.cl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:19 -07:00
Philipp Zabel	9375db07ad	genalloc: add devres support, allow to find a managed pool by device This patch adds three exported functions to lib/genalloc.c: devm_gen_pool_create, dev_get_gen_pool, and of_get_named_gen_pool. devm_gen_pool_create is a managed version of gen_pool_create that keeps track of the pool via devres and allows the management code to automatically destroy it after device removal. dev_get_gen_pool retrieves the gen_pool for a given device, if it was created with devm_gen_pool_create, using devres_find. of_get_named_gen_pool retrieves the gen_pool for a given device node and property name, where the property must contain a phandle pointing to a platform device node. The corresponding platform device is then fed into dev_get_gen_pool and the resulting gen_pool is returned. [akpm@linux-foundation.org: make the of_get_named_gen_pool() stub static, fixing a zillion link errors] [akpm@linux-foundation.org: squish "struct device declared inside parameter list" warning] Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Michal Simek <monstr@monstr.eu> Cc: Fabio Estevam <fabio.estevam@freescale.com> Cc: Matt Porter <mporter@ti.com> Cc: Dong Aisheng <dong.aisheng@linaro.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Javier Martin <javier.martin@vista-silicon.com> Cc: Huang Shijie <shijie8@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:13 -07:00
David Rientjes	4b59e6c473	mm, show_mem: suppress page counts in non-blockable contexts On large systems with a lot of memory, walking all RAM to determine page types may take a half second or even more. In non-blockable contexts, the page allocator will emit a page allocation failure warning unless __GFP_NOWARN is specified. In such contexts, irqs are typically disabled and such a lengthy delay may even result in NMI watchdog timeouts. To fix this, suppress the page walk in such contexts when printing the page allocation failure warning. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 15:54:28 -07:00
Linus Torvalds	830ac8524f	Merge branch 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kdump fixes from Peter Anvin: "The kexec/kdump people have found several problems with the support for loading over 4 GiB that was introduced in this merge cycle. This is partly due to a number of design problems inherent in the way the various pieces of kdump fit together (it is pretty horrifically manual in many places.) After a lot of iterations this is the patchset that was agreed upon, but of course it is now very late in the cycle. However, because it changes both the syntax and semantics of the crashkernel option, it would be desirable to avoid a stable release with the broken interfaces." I'm not happy with the timing, since originally the plan was to release the final 3.9 tomorrow. But apparently I'm doing an -rc8 instead... * 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kexec: use Crash kernel for Crash kernel low x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low x86, kdump: Retore crashkernel= to allocate under 896M x86, kdump: Set crashkernel_low automatically	2013-04-20 18:40:36 -07:00
Linus Torvalds	db93f8b420	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Three groups of fixes: 1. Make sure we don't execute the early microcode patching if family < 6, since it would touch MSRs which don't exist on those families, causing crashes. 2. The Xen partial emulation of HyperV can be dealt with more gracefully than just disabling the driver. 3. More EFI variable space magic. In particular, variables hidden from runtime code need to be taken into account too." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: Verify the family before dispatching microcode patching x86, hyperv: Handle Xen emulation of Hyper-V more gracefully x86,efi: Implement efi_no_storage_paranoia parameter efi: Export efi_query_variable_store() for efivars.ko x86/Kconfig: Make EFI select UCS2_STRING efi: Distinguish between "remaining space" and actually used space efi: Pass boot services variable info to runtime code Move utf16 functions to kernel core and rename x86,efi: Check max_size only if it is non-zero. x86, efivars: firmware bug workarounds should be in platform code	2013-04-20 18:38:48 -07:00
H. Peter Anvin	c0a9f451e4	Merge remote-tracking branch 'efi/urgent' into x86/urgent Matt Fleming (1): x86, efivars: firmware bug workarounds should be in platform code Matthew Garrett (3): Move utf16 functions to kernel core and rename efi: Pass boot services variable info to runtime code efi: Distinguish between "remaining space" and actually used space Richard Weinberger (2): x86,efi: Check max_size only if it is non-zero. x86,efi: Implement efi_no_storage_paranoia parameter Sergey Vlasov (2): x86/Kconfig: Make EFI select UCS2_STRING efi: Export efi_query_variable_store() for efivars.ko Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-04-19 17:09:03 -07:00
Yinghai Lu	c729de8fce	x86, kdump: Set crashkernel_low automatically Chao said that kdump works well on his system on 3.8 without any extra parameter, even though iommu does not work with kdump, and that he now has to append crashkernel_low=Y in the first kernel to make kdump work. We have now modified crashkernel=X to allocate memory beyond 4G (if available) and do not allocate a low range for crashkernel if the user does not specify that with crashkernel_low=Y. This causes a regression if iommu is not enabled. Without iommu, swiotlb needs to be set up in the first 4G and there is no low memory available to the second kernel. Set crashkernel_low automatically if the user does not specify it. For systems that do support IOMMU with kdump properly, the user can specify crashkernel_low=0 to save that 72M of low RAM. -v3: add swiotlb_size() according to Konrad. -v4: add comments what 8M is for according to hpa. also update more crashkernel_low= in kernel-parameters.txt -v5: update changelog according to Vivek. -v6: Change description about swiotlb referring according to HATAYAMA. Reported-by: WANG Chao <chaowang@redhat.com> Tested-by: WANG Chao <chaowang@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-2-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-04-17 12:35:32 -07:00
Matthew Garrett	0635eb8a54	Move utf16 functions to kernel core and rename We want to be able to use the utf16 functions that are currently present in the EFI variables code in platform-specific code as well. Move them to the kernel core, and in the process rename them to accurately describe what they do - they don't handle UTF16, only UCS2. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-04-15 21:23:03 +01:00
Linus Torvalds	a49b7e82ca	kobject: fix kset_find_obj() race with concurrent last kobject_put() Anatol Pomozov identified a race condition that hits module unloading and re-loading. To quote Anatol: "This is a race condition that exists between kset_find_obj() and kobject_put(). kset_find_obj() might return kobject that has refcount equal to 0 if this kobject is being freed by kobject_put() in another thread. Here is timeline for the crash in case if kset_find_obj() searches for an object that nobody holds and other thread is doing kobject_put() on the same kobject: THREAD A (calls kset_find_obj()) THREAD B (calls kobject_put()) spin_lock() atomic_dec_return(kobj->kref), counter gets zero here ... starts kobject cleanup .... spin_lock() // WAIT thread A in kobj_kset_leave() iterate over kset->list atomic_inc(kobj->kref) (counter becomes 1) spin_unlock() spin_lock() // taken // it does not know that thread A increased counter so it remove obj from list spin_unlock() vfree(module) // frees module object with containing kobj // kobj points to freed memory area!! kobject_put(kobj) // OOPS!!!! The race above happens because module.c tries to use kset_find_obj() when somebody unloads module. The module.c code was introduced in commit 6494a93d55fa" Anatol supplied a patch specific for module.c that worked around the problem by simply not using kset_find_obj() at all, but rather than make a local band-aid, this just fixes kset_find_obj() to be thread-safe using the proper model of refusing to get a new reference if the refcount has already dropped to zero. See examples of this proper refcount handling not only in the kref documentation, but in various other equivalent uses of this pattern by grepping for atomic_inc_not_zero(). [ Side note: the module race does indicate that module loading and unloading is not properly serialized wrt sysfs information using the module mutex. That may require further thought, but this is the correct fix at the kobject layer regardless. ] Reported-analyzed-and-tested-by: Anatol Pomozov <anatol.pomozov@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-13 15:15:30 -07:00
Al Viro	0ecc833bac	mode_t, whack-a-mole at 11... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-04-09 14:13:05 -04:00
Daniel Vetter	ecb135a1a1	Linux 3.9-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJRWLTrAAoJEHm+PkMAQRiGe8oH/iMy48mecVWvxVZn74Tx3Cef xmW/PnAIj28EhSPqK49N/Ow6AfQToFKf7AP0ge20KAf5teTq95AY+tH74DAANt8F BjKXXTZiR5xwBvRkq7CR5wDcCvEcBAAz8fgTEd6SEDB2d2VXFf5eKdKUqt1avTCh Z6Hup5kuwX+ddtwY2DCBXtp2n6fL0Rm5yLzY1A3OOBye1E7VyLTF7M5BR603Q44P 4kRLxn8+R7jy3hTuZIhAeoS8TKUoBwVk7DmKxEzrhTHZVOmvwE9lEHybRnIyOpd/ k1JnbRbiPsLsCVFOn10SQkGDAIk00lro3tuWP2C1ljERiD/OOh5Ui9nXYAhMkbI= =q15K -----END PGP SIGNATURE----- Merge tag 'v3.9-rc5' into drm-intel-next-queued Backmerge Linux 3.9-rc5 since I want to merge a few dp clock cleanups for -next, but they will conflict all over the place with commit `9d1a455b0c` Author: Takashi Iwai <tiwai@suse.de> Date: Mon Mar 18 11:25:36 2013 +0100 drm/i915: Use the fixed pixel clock for eDP in intel_dp_set_m_n() from -fixes. Conflicts: drivers/gpu/drm/i915/intel_dp.c: Simply adjacent lines changed. drivers/gpu/drm/i915/intel_panel.c: A field rename in -next conflicts with a bugfix in -fixes. Take the version from -fixes and apply the rename. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-03 11:28:48 +02:00
Imre Deak	2db76d7c3c	lib/scatterlist: sg_page_iter: support sg lists w/o backing pages The i915 driver uses sg lists for memory without backing 'struct page' pages, similarly to other IO memory regions, setting only the DMA address for these. It does this, so that it can program the HW MMU tables in a uniform way both for sg lists with and without backing pages. Without a valid page pointer we can't call nth_page to get the current page in __sg_page_iter_next, so add a helper that relevant users can call separately. Also add a helper to get the DMA address of the current page (idea from Daniel). Convert all places in i915, to use the new API. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-27 17:13:44 +01:00
Lars Ellenberg	cbe5e61095	lru_cache: introduce lc_get_cumulative() New helper to be able to consolidate more updates into a single transaction. Without this, we can only grab a single refcount on an updated element while preparing a transaction. lc_get_cumulative - like lc_get; also finds to-be-changed elements @lc: the lru cache to operate on @enr: the label to look up Unlike lc_get this also returns the element for @enr, if it is belonging to a pending transaction, so the return values are like for lc_get(), plus: pointer to an element already on the "to_be_changed" list. In this case, the cache was already marked %LC_DIRTY. Caller needs to make sure that the pending transaction is completed, before proceeding to actually use this element. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Fixed up by Jens to export lc_get_cumulative(). Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 22:17:36 -06:00
Alexander Duyck	96e7d7a1e0	dma-debug: update DMA debug API to better handle multiple mappings of a buffer There were reports of the igb driver unmapping buffers without calling dma_mapping_error. On closer inspection issues were found in the DMA debug API and how it handled multiple mappings of the same buffer. The issue I found is the fact that the debug_dma_mapping_error would only set the map_err_type to MAP_ERR_CHECKED in the case that there was only one match for device and device address. However in the case of non-IOMMU, multiple addresses existed and as a result it was not setting this field once a second mapping was instantiated. I have resolved this by changing the search so that it instead will now set MAP_ERR_CHECKED on the first buffer that matches the device and DMA address that is currently in the state MAP_ERR_NOT_CHECKED. A secondary side effect of this patch is that in the case of multiple buffers using the same address only the last mapping will have a valid map_err_type. The previous mappings will all end up with map_err_type set to MAP_ERR_CHECKED because of the dma_mapping_error call in debug_dma_map_page. However this behavior may be preferable as it means you will likely only see one real error per multi-mapped buffer, versus the current behavior of multiple false errors per multi-mapped buffer. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Reviewed-by: Shuah Khan <shuah.khan@hp.com> Tested-by: Shuah Khan <shuah.khan@hp.com> Cc: Jakub Kicinski <kubakici@wp.pl> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-22 16:41:20 -07:00
Alexander Duyck	8d640a51ec	dma-debug: fix locking bug in check_unmap() In check_unmap() it is possible to get into a dead-locked state if dma_mapping_error is called. The problem is that the bucket is locked in check_unmap, and locked again by debug_dma_mapping_error which is called by dma_mapping_error. To resolve that we must release the lock on the bucket before making the call to dma_mapping_error. [akpm@linux-foundation.org: restore 80-col trickery to be consistent with the rest of the file] Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Reviewed-by: Shuah Khan <shuah.khan@hp.com> Tested-by: Shuah Khan <shuah.khan@hp.com> Cc: Jakub Kicinski <kubakici@wp.pl> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-22 16:41:20 -07:00
Frederic Weisbecker	dc72c32e1f	printk: Provide a wake_up_klogd() off-case wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk() nor printk_sched() is in use and there are actually no waiters on the log_wait waitqueue. It should be a stub in this case for users like bust_spinlocks(). Otherwise this results in this warning when CONFIG_PRINTK=n and CONFIG_IRQ_WORK=n: kernel/built-in.o In function `wake_up_klogd': (.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue' To fix this, provide an off-case for wake_up_klogd() when CONFIG_PRINTK=n. There is much more from console_unlock() and other console related code in printk.c that should be moved under CONFIG_PRINTK. But for now, focus on a minimal fix as we passed the merge window already. [akpm@linux-foundation.org: include printk.h in bust_spinlocks.c] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reported-by: James Hogan <james.hogan@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-22 16:41:20 -07:00
Tejun Heo	59bfbcf019	idr: idr_alloc() shouldn't trigger lowmem warning when preloaded GFP_NOIO is often used for idr_alloc() inside preloaded section as the allocation mask doesn't really matter. If the idr tree needs to be expanded, idr_alloc() first tries to allocate using the specified allocation mask and, if it fails, falls back to the preloaded buffer. This order prevents non-preloading idr_alloc() users from taking advantage of preloading ones by using the preload buffer without filling it, shifting the burden of allocation to the preload users. Unfortunately, this allowed/expected-to-fail kmem_cache allocation ends up generating a spurious slab lowmem warning before the request is satisfied from the preload buffer. This patch makes idr_layer_alloc() add __GFP_NOWARN to the first kmem_cache attempt and try kmem_cache again w/o __GFP_NOWARN after allocation from preload_buffer fails so that lowmem warning is generated if not suppressed by the original @gfp_mask. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Teigland <teigland@redhat.com> Tested-by: David Teigland <teigland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-13 15:21:49 -07:00
Paul Bolle	97da55fcec	decompressors: fix typo "POWERPC" Commit `5dc49c75a2` ("decompressors: make the default XZ_DEC_* config match the selected architecture") added default y if POWERPC to lib/xz/Kconfig. But there is no Kconfig symbol POWERPC. The most general Kconfig symbol for the powerpc architecture is PPC. So let's use that. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Florian Fainelli <florian@openwrt.org> Cc: Lasse Collin <lasse.collin@tukaani.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-13 15:21:48 -07:00
Tejun Heo	c8615d3716	idr: deprecate idr_pre_get() and idr_get_new[_above]() Now that all in-kernel users are converted to use the new alloc interface, mark the old interface deprecated. We should be able to remove these in a few releases. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-13 15:21:47 -07:00
Frederic Weisbecker	f792685006	math64: New div64_u64_rem helper Provide an extended version of div64_u64() that also returns the remainder of the division. We are going to need this to refine the cputime scaling code. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-03-13 18:03:27 +01:00
Randy Dunlap	5857f70c8a	idr: fix new kernel-doc warnings Fix new kernel-doc warnings in idr: Warning(include/linux/idr.h:113): No description found for parameter 'idr' Warning(include/linux/idr.h:113): Excess function parameter 'idp' description in 'idr_find' Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc' Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-12 20:42:09 -07:00
Tejun Heo	2e1c9b2867	idr: remove WARN_ON_ONCE() on negative IDs idr_find(), idr_remove() and idr_replace() used to silently ignore the sign bit and perform lookup with the rest of the bits. The weird behavior has been changed such that negative IDs are treated as invalid. As the behavior change was subtle, WARN_ON_ONCE() was added in the hope of determining who's calling idr functions with negative IDs so that they can be examined for problems. Up until now, both reported cases are ID numbers coming directly from userland and getting fed into idr_find(), and the warnings seem to cause more problems than they solve. Drop the WARN_ON_ONCE()s. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: <markus@trippelsdorf.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-08 15:05:34 -08:00
Linus Torvalds	8fd5e7a2d9	ImgTec Meta architecture changes for v3.9-rc1 This adds core architecture support for Imagination's Meta processor cores, followed by some later miscellaneous arch/metag cleanups and fixes which I kept separate to ease review: - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture - A few fixes all over, particularly for symbol prefixes - A few privilege protection fixes - Several cleanups (setup.c includes, split out a lot of metag_ksyms.c) - Fix some missing exports - Convert hugetlb to use vm_unmapped_area() - Copy device tree to non-init memory - Provide dma_get_sgtable() Signed-off-by: James Hogan <james.hogan@imgtec.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJRMmVXAAoJEKHZs+irPybfivgP/inEXqJyfw59omQdjwvYcU/a /u0MJ3UKSNS3U+HknfaFCy/Nwk1dqPLjqqyVC1V6AbUPBXlaEwGcimlNRx2uRjdq Uh96upMLHsNuF/xiiR477g3RwY0egIJdM1R1bGi3mZ3vVrNQGF+wbni6f61xCWGz M/4rDglpQvE79oLhYdgj6tidZtHQT0YWtERA9W90zkQWXGYmpFPKBKbfZAi5+rKQ U6Gpg26orUugzXNaxltJEYKE8gjLTppEabx8DARnItZ4zCMy4dw5RBJ35RFvQw6e eSmfgTy9w9WqBMY2+QMSgU0KQt1IITCzX7OlOXC0jALQJXoU0WWbOELlBVQLCwF1 T0OcR/5ZP/hIlOk5Dh+e9U3AtbASXdMtqA0ZUe78woH1CBf7Nc/0c0vRg23EdMh8 lnHDJxT/UqskoOYLI4kgWbEdLDy4uTh19U2pVi7VCo7ksLB9Bj9Xc8VSKgscSXTl OwzN+c4Jgtu8FDFTp+Af4AT8pYGJ08j8L2ErsV2sOv3Q44U5WXdrMz3GSgwXj8+4 wZk3HvdkQVkMD5sJCUZgAswaN6BnbB0pHdCz4wMQ8jR/Ogs015Ipk64Ecym9S/4n uES7PnDtt/4lb5EyX2ScbvdnZTAFTaaP7OOhC77BOQvbQjIW1tkAcxWJqRry86uS iM0BFgK6Ohx3geqa5Ft0 =65cR -----END PGP SIGNATURE----- Merge tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull new ImgTec Meta architecture from James Hogan: "This adds core architecture support for Imagination's Meta processor cores, followed by some later miscellaneous arch/metag cleanups and fixes which I kept separate to ease review: - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture - A few fixes all over, particularly for symbol prefixes - A few privilege protection fixes - 
Several cleanups (setup.c includes, split out a lot of metag_ksyms.c) - Fix some missing exports - Convert hugetlb to use vm_unmapped_area() - Copy device tree to non-init memory - Provide dma_get_sgtable()" * tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits) metag: Provide dma_get_sgtable() metag: prom.h: remove declaration of metag_dt_memblock_reserve() metag: copy devicetree to non-init memory metag: cleanup metag_ksyms.c includes metag: move mm/init.c exports out of metag_ksyms.c metag: move usercopy.c exports out of metag_ksyms.c metag: move setup.c exports out of metag_ksyms.c metag: move kick.c exports out of metag_ksyms.c metag: move traps.c exports out of metag_ksyms.c metag: move irq enable out of irqflags.h on SMP genksyms: fix metag symbol prefix on crc symbols metag: hugetlb: convert to vm_unmapped_area() metag: export clear_page and copy_page metag: export metag_code_cache_flush_all metag: protect more non-MMU memory regions metag: make TXPRIVEXT bits explicit metag: kernel/setup.c: sort includes perf: Enable building perf tools for Meta metag: add boot time LNKGET/LNKSET check metag: add __init to metag_cache_probe() ...	2013-03-03 12:06:09 -08:00
James Hogan	79f83c0294	Kconfig.debug: add METAG to dependency lists Add [!]METAG to a couple of Kconfig dependencies in lib/Kconfig.debug. Don't allow stack utilization instrumentation on metag, and allow building with frame pointers. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Paul E. McKenney" <paul.mckenney@linaro.org> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Michel Lespinasse <walken@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com>	2013-03-02 20:09:53 +00:00
Linus Torvalds	3cfb07743a	KGDB/KDB fixes and cleanups Cleanups Remove kdb ssb command - there is no in kernel disassembler to support it Remove kdb ll command - Always caused a kernel oops and there were no bug reports so no one was using this command Use kernel ARRAY_SIZE macro instead of array computations Fixes Stop oops in kdb if user executes kdb_defcmd with args kdb help command truncated text ppc64 support for kgdbts Add missing kconfig option from original kdb port for dealing with catastrophic kernel crashes such that you can reboot automatically on continue from kdb -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJRMhJPAAoJEIciOldedpOj25gP+wZn/u3YxOGkhL3CIBOegEwW KFte7FFNeLNzw8tyEzBMVXok85PJ+/lvm+Xi+a4hhtV7/S+WvwGoWeIg0zCDtrAu im4m2X1u00Egcztcr4tCk8Lwc6vNm5KZsBsUsbKKi0K8twaWifkxLMEWeq+3WLzh +1d4TDkziwdZfcLpUHvohmCsCyg1EdjUxTYnWpSJvLrq5lhNvTQBowHWZH06GBcI MmhC+aHp3myk2/Vwd1AuDq+jIRbH3ORV50wfRONBR7OJ58sOoD9nV8Mr+fy8QLRN BPv8WxeRSm7X/Hl6nPm9PX7oNQSg+Ba1+5helIIJMdGGWhJbl9AZslFEU40Zn3yS V7FmS3mwRSA8NCgfZ+CO6311tirSCvtf34+5ttXEntyK1ujaplSPffBP4Ya26y6q TB2wLr91+3L0LhzrVRj20P13qexsWUW4t9inLzqtDApD3Zljl9Kuwd0febCiLg0r mgGkpZjK0VI7S7RfkxpmEyU4bFP2lCTzBOKAKOfIV6O88bfqFdv5QCTHODJArf22 ugDGPX3tw++Lyc9otGxyPbxmc7npYGqTD1UEM8xK/7sEBiWJm71jjwHN1bwRwB3o EC5bcZ6TJgDJnnvYGBTQec74PnUiCu/UnQVFFabzhowQCJTIWSJkPxjnI2BKK1V5 3exPqogclusAhhwopOtG =Hwbq -----END PGP SIGNATURE----- Merge tag 'for_linux-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb Pull KGDB/KDB fixes and cleanups from Jason Wessel: "For a change we removed more code than we added. If people aren't using it we shouldn't be carrying it. 
:-) Cleanups: - Remove kdb ssb command - there is no in kernel disassembler to support it - Remove kdb ll command - Always caused a kernel oops and there were no bug reports so no one was using this command - Use kernel ARRAY_SIZE macro instead of array computations Fixes: - Stop oops in kdb if user executes kdb_defcmd with args - kdb help command truncated text - ppc64 support for kgdbts - Add missing kconfig option from original kdb port for dealing with catastrophic kernel crashes such that you can reboot automatically on continue from kdb" * tag 'for_linux-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb: kdb: Remove unhandled ssb command kdb: Prevent kernel oops with kdb_defcmd kdb: Remove the ll command kdb_main: fix help print kdb: Fix overlap in buffers with strcpy Fixed dead ifdef block by adding missing Kconfig option. kdb: Setup basic kdb state before invoking commands via kgdb kdb: use ARRAY_SIZE where possible kgdb/kgdbts: support ppc64 kdb: A fix for kdb command table expansion	2013-03-02 08:31:39 -08:00
Linus Torvalds	e23b62256a	Initial ARC Linux port with some fixes on top for 3.9-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJRMZYbAAoJEGnX8d3iisJeFj8P/R1hWohDDUc8pG3+ov9Y2Brt g7oIVw1udlKIk3HhVwsyT14/UHunfcTCONKKKGmmbfRrLJSMMsSlXvYoAQLozokf TuaO3Xt5IfERROqTrCDSwdNaAmwIZGsIuI9jWKo4qsXovAL0nc3sR527qMI1o7OE 9X5eIqOJ/rPvOjXYrPqXmvO/DZKYG8PNTH4PYqePes31CsAmDXWBIlgYmgvF3th3 ptwKPgRU/c2wKNpDDJXVQg/bcg9NI2cCnndNrjgXZgyUQrC37ZTdt/IOF5w6FgIW 6i6UbDKXn8MgQAhrXx0Ns/+0kSJZ7eBWmj8hLyrxUzOYlF4rCs/il6ofDRaMO6fv 9LmbNZXYnGICzm1YAxZRK7dm13IbDnltmMc81vISBpJSMTBgqzLWobHnq5/67Wh4 2oUkoc2Tfaw70FnRCewX0x4Qop2YXmXl1KBwdecvzdcKi6Yg+rRH08ur/0yyCyx7 +vAQpPVIuVqCc916qwmCPFaf1UMNnmMStxNH7D1AQHvi1G372NxfXizdYyKFRY9N f5Q+6DTo1xh2AxuGieSZxBoeK0Rlp4DWTOBD4MMz29y7BRX7LK1U2iS+nW0g8uir 3RdYeAqyCxlJtjJNQX9U8ZT54jUPZgvJWU0udesRN1CBdOSQMjM9OyZsLLRtLeX1 ww2tc7zqhBUyjBej6Itg =NKkW -----END PGP SIGNATURE----- Merge tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull new ARC architecture from Vineet Gupta: "Initial ARC Linux port with some fixes on top for 3.9-rc1: I would like to introduce the Linux port to ARC Processors (from Synopsys) for 3.9-rc1. The patch-set has been discussed on the public lists since Nov and has received a fair bit of review, specially from Arnd, tglx, Al and other subsystem maintainers for DeviceTree, kgdb... The arch bits are in arch/arc, some asm-generic changes (acked by Arnd), a minor change to PARISC (acked by Helge). The series is a touch bigger for a new port for 2 main reasons: 1. It enables a basic kernel in first sub-series and adds ptrace/kgdb/.. later 2. Some of the fallout of review (DeviceTree support, multi-platform- image support) were added on top of orig series, primarily to record the revision history. This updated pull request additionally contains - fixes due to our GNU tools catching up with the new syscall/ptrace ABI - some (minor) cross-arch Kconfig updates." 
* tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (82 commits) ARC: split elf.h into uapi and export it for userspace ARC: Fixup the current ABI version ARC: gdbserver using regset interface possibly broken ARC: Kconfig cleanup tracking cross-arch Kconfig pruning in merge window ARC: make a copy of flat DT ARC: [plat-arcfpga] DT arc-uart bindings change: "baud" => "current-speed" ARC: Ensure CONFIG_VIRT_TO_BUS is not enabled ARC: Fix pt_orig_r8 access ARC: [3.9] Fallout of hlist iterator update ARC: 64bit RTSC timestamp hardware issue ARC: Don't fiddle with non-existent caches ARC: Add self to MAINTAINERS ARC: Provide a default serial.h for uart drivers needing BASE_BAUD ARC: [plat-arcfpga] defconfig for fully loaded ARC Linux ARC: [Review] Multi-platform image #8: platform registers SMP callbacks ARC: [Review] Multi-platform image #7: SMP common code to use callbacks ARC: [Review] Multi-platform image #6: cpu-to-dma-addr optional ARC: [Review] Multi-platform image #5: NR_IRQS defined by ARC core ARC: [Review] Multi-platform image #4: Isolate platform headers ARC: [Review] Multi-platform image #3: switch to board callback ...	2013-03-02 07:58:56 -08:00
Robert Obermeier	3b0eb71ec9	Fixed dead ifdef block by adding missing Kconfig option. Added missing Kconfig option KDB_CONTINUE_CATASTROPHIC which led to a dead ifdef block in kernel/debug/kdb/kdb_main.c:73-75. The code using KDB_CONTINUE_CATASTROPHIC was originally introduced in commit '5d5314d6795f3c1c0f415348ff8c51f7de042b77' by Jason Wessel. This patchset ("kdb: core for kgdb back end (1 of 2)") added the platform-independent part of kdb to the Linux kernel. The kernel option however, even though it had the same options and behaviour on all supported architectures, was part of the x86 and ia64 patchset of KDB and therefore not pulled into the mainline kernel tree. I actually took the originally written Kconfig by Keith Owens <kaos@sgi.com> (2003-06-20 according to KDB changelog) and changed it to reflect the correct behaviour, as the KDUMP patchset is not part of the kernel and the expected functionality is missing from it. Signed-off-by: Robert Obermeier <obbi89@googlemail.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>	2013-03-02 08:52:18 -06:00
Linus Torvalds	b0af9cd9aa	lzo-update-signature-20130226 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iD8DBQBRLQ9iTWFXqwsgQ8kRAvK/AJ9HUkhHJsukjw35XEQPkhfwNs/XPgCglXB1 dF0wRbbL4d6pE9IloUsgYLg= =JSSN -----END PGP SIGNATURE----- Merge tag 'lzo-update-signature-20130226' of git://github.com/markus-oberhumer/linux Pull LZO compression update from Markus Oberhumer: "Summary: ======== Update the Linux kernel LZO compression and decompression code to the current upstream version which features significant performance improvements on modern machines. Some synthetic benchmarks: ============================ x86_64 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size: compression speed decompression speed LZO-2005 : 150 MB/sec 468 MB/sec LZO-2012 : 434 MB/sec 1210 MB/sec i386 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size: compression speed decompression speed LZO-2005 : 143 MB/sec 409 MB/sec LZO-2012 : 372 MB/sec 1121 MB/sec armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size: compression speed decompression speed LZO-2005 : 27 MB/sec 84 MB/sec LZO-2012 : 44 MB/sec 117 MB/sec LZO-2013-UA : 47 MB/sec 167 MB/sec Legend: LZO-2005 : LZO version in current 3.8 kernel (which is based on the LZO 2.02 release from 2005) LZO-2012 : updated LZO version available in linux-next LZO-2013-UA : updated LZO version available in linux-next plus experimental ARM Unaligned Access patch. This needs approval from some ARM maintainer ist NOT YET INCLUDED." Andrew Morton <akpm@linux-foundation.org> acks it and says: "There's a new LZ4 on the block which is even faster than the sped-up LZO, but various filesystems and things use LZO" * tag 'lzo-update-signature-20130226' of git://github.com/markus-oberhumer/linux: crypto: testmgr - update LZO compression test vectors lib/lzo: Update LZO compression to current upstream version lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c	2013-02-28 20:45:52 -08:00
Sasha Levin	b67bfe0d42	hlist: drop the node parameter from iterators I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S \| hlist_for_each_entry_continue(a, - b, c) S \| hlist_for_each_entry_from(a, - b, c) S \| hlist_for_each_entry_rcu(a, - b, c, d) S \| hlist_for_each_entry_rcu_bh(a, - b, c, d) S \| hlist_for_each_entry_continue_rcu_bh(a, - b, c) S \| for_each_busy_worker(a, c, - b, d) S \| ax25_uid_for_each(a, - b, c) S \| ax25_for_each(a, - b, c) S \| inet_bind_bucket_for_each(a, - b, c) S \| sctp_for_each_hentry(a, - b, c) S \| sk_for_each(a, - b, c) S \| sk_for_each_rcu(a, - b, c) S \| sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S \| sk_for_each_safe(a, - b, c, d) S \| sk_for_each_bound(a, - b, c) S \| hlist_for_each_entry_safe(a, - b, c, d, e) S \| hlist_for_each_entry_continue_rcu(a, - b, c) S \| nr_neigh_for_each(a, - b, c) S \| nr_neigh_for_each_safe(a, - b, c, d) S \| nr_node_for_each(a, - b, c) S \| nr_node_for_each_safe(a, - b, c, d) S \| - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S \| - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S \| for_each_host(a, - b, c) S \| for_each_host_safe(a, - b, c, d) S \| for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:24 -08:00
Stefani Seibold	dfe2a77fd2	kfifo: fix kfifo_alloc() and kfifo_init() Fix kfifo_alloc() and kfifo_init() to alloc at least the requested number of elements. Since the kfifo operates on power of 2 the request size will be rounded up to the next power of two. Signed-off-by: Stefani Seibold <stefani@seibold.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:23 -08:00
Stefani Seibold	c759b35e64	kfifo: move kfifo.c from kernel/ to lib/ Move kfifo.c from kernel/ to lib/ Signed-off-by: Stefani Seibold <stefani@seibold.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:23 -08:00
Tejun Heo	7175c61cc6	idr: explain WARN_ON_ONCE() on negative IDs out-of-range ID Until recently, when a negative ID is specified, idr functions used to ignore the sign bit and proceeded with the operation with the rest of bits, which is bizarre and error-prone. The behavior recently got changed so that negative IDs are treated as invalid but we're triggering WARN_ON_ONCE() on negative IDs just in case somebody was depending on the sign bit being ignored, so that those can be detected and fixed easily. We only need this for a while. Explain why WARN_ON_ONCE()s are there and that they can be removed later. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:21 -08:00
Tejun Heo	0ffc2a9c80	idr: implement lookup hint While idr lookup isn't a particularly heavy operation, it still is too substantial to use in hot paths without worrying about the performance implications. With recent changes, each idr_layer covers 256 slots which should be enough to cover most use cases with single idr_layer making lookup hint very attractive. This patch adds idr->hint which points to the idr_layer which allocated an ID most recently and the fast path lookup becomes if (look up target's prefix matches that of the hinted layer) return hint->ary[ID's offset in the leaf layer]; which can be inlined. idr->hint is set to the leaf node on idr_fill_slot() and cleared from free_layer(). [andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:21 -08:00
Tejun Heo	54616283c2	idr: add idr_layer->prefix Add a field which carries the prefix of ID the idr_layer covers. This will be used to implement lookup hint. This patch doesn't make use of the new field and doesn't introduce any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:20 -08:00
Tejun Heo	1d9b2e1e66	idr: remove length restriction from idr_layer->bitmap Currently, idr->bitmap is declared as an unsigned long which restricts the number of bits an idr_layer can contain. All bitops can handle arbitrary positive integer bit number and there's no reason for this restriction. Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single unsigned long. * idr_layer->bitmap is now an array. '&' dropped from params to bitops. * Replaced "== IDR_FULL" tests with bitmap_full() and removed IDR_FULL. * Replaced find_next_bit() on ~bitmap with find_next_zero_bit(). * Replaced "bitmap = 0" with bitmap_clear(). This patch doesn't (or at least shouldn't) introduce any behavior changes. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:20 -08:00
Tejun Heo	e8c8d1bc06	idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c MAX_IDR_MASK is another weirdness in the idr interface. As idr covers whole positive integer range, it's defined as 0x7fffffff or INT_MAX. Its usage in idr_find(), idr_replace() and idr_remove() is bizarre. They basically mask off the sign bit and operate on the rest, so if the caller, by accident, passes in a negative number, the sign bit will be masked off and the remaining part will be used as if that was the input, which is worse than crashing. The constant is visible in idr.h and there are several users in the kernel. * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter() Basically used to test if adap->nr is a negative number which isn't -1 and returns -EINVAL if so. idr_alloc() already has negative @start checking (w/ WARN_ON_ONCE), so this can go away. * drivers/infiniband/core/cm.c:cm_alloc_id() drivers/infiniband/hw/mlx4/cm.c:id_map_alloc() Used to wrap cyclic @start. Can be replaced with max(next, 0). Note that this type of cyclic allocation using idr is buggy. These are prone to spurious -ENOSPC failure after the first wraparound. * fs/super.c:get_anon_bdev() The ID allocated from ida is masked off before being tested whether it's inside valid range. ida allocated ID can never be a negative number and the masking is unnecessary. Update idr_*() functions to fail with -EINVAL when negative @id is specified and update other MAX_IDR_MASK users as described above. This leaves MAX_IDR_MASK without any user, remove it and relocate other MAX_IDR_* constants to lib/idr.c. 
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Wolfram Sang <wolfram@the-dreams.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:20 -08:00
Tejun Heo	326cf0f0f3	idr: fix top layer handling Most functions in idr fail to deal with the high bits when the idr tree grows to the maximum height. * idr_get_empty_slot() stops growing idr tree once the depth reaches MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to cover the whole range. The function doesn't even notice that it didn't grow the tree enough and ends up allocating the wrong ID given sufficiently high @starting_id. For example, on 64 bit, if the starting id is 0x7fffff01, idr_get_empty_slot() will grow the tree 5 layer deep, which only covers the 30 bits and then proceed to allocate as if the bit 30 wasn't specified. It ends up allocating 0x3fffff01 without the bit 30 but still returns 0x7fffff01. * __idr_remove_all() will not remove anything if the tree is fully grown. * idr_find() can't find anything if the tree is fully grown. * idr_for_each() and idr_get_next() can't iterate anything if the tree is fully grown. Fix it by introducing idr_max() which returns the maximum possible ID given the depth of tree and replacing the id limit checks in all affected places. As the idr_layer pointer array pa[] needs to be 1 larger than the maximum depth, enlarge pa[] arrays by one. While this plugs the discovered issues, the whole code base is horrible and in desperate need of rewrite. It's fragile like hell. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:20 -08:00
Tejun Heo	d5c7409f79	idr: implement idr_preload[_end]() and idr_alloc() The current idr interface is very cumbersome. * For all allocations, two function calls - idr_pre_get() and idr_get_new() - should be made. idr_pre_get() doesn't guarantee that the following idr_get_new() will not fail from memory shortage. If idr_get_new() returns -EAGAIN, the caller is expected to retry pre_get and allocation. * idr_get_new() can't enforce upper limit. Upper limit can only be enforced by allocating and then freeing if above limit. idr_layer buffer is unnecessarily per-idr. Each idr ends up keeping around MAX_IDR_FREE idr_layers. The memory consumed per idr is under two pages but it makes it difficult to make idr_layer larger. This patch implements the following new set of allocation functions. * idr_preload[_end]() - Similar to radix preload but doesn't fail. The first idr_alloc() inside preload section can be treated as if it were called with @gfp_mask used for idr_preload(). * idr_alloc() - Allocate an ID w/ lower and upper limits. Takes @gfp_flags and can be used w/o preloading. When used inside preloaded section, the allocation mask of preloading can be assumed. If idr_alloc() can be called from a context which allows sufficiently relaxed @gfp_mask, it can be used by itself. If, for example, idr_alloc() is called inside spinlock protected region, preloading can be used like the following. idr_preload(GFP_KERNEL); spin_lock(lock); id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT); spin_unlock(lock); idr_preload_end(); if (id < 0) error; which is much simpler and less error-prone than idr_pre_get and idr_get_new*() loop. The new interface uses per-cpu idr_layer buffer and thus the number of idr's in the system doesn't affect the amount of memory used for preloading. idr_layer_alloc() is introduced to handle idr_layer allocations for both old and new ID allocation paths. 
This is a bit hairy now but the new interface is expected to replace the old and the internal implementation eventually will become simpler. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:14 -08:00
Tejun Heo	3594eb2894	idr: refactor idr_get_new_above() Move slot filling to idr_fill_slot() from idr_get_new_above_int() and make idr_get_new_above() directly call it. idr_get_new_above_int() is no longer needed and removed. This will be used to implement a new ID allocation interface. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:14 -08:00
Tejun Heo	12d1b4393e	idr: remove _idr_rc_to_errno() hack idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate exception conditions internally. The return value is later translated to errno values using _idr_rc_to_errno(). This is confusing. Drop the custom ones and consistently use -EAGAIN for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for "ran out of ID space". Due to the weird memory preloading mechanism, [ra]_get_new*() return -EAGAIN on memory shortage, so we need to substitute -ENOMEM w/ -EAGAIN on those interface functions. They'll eventually be cleaned up and the translations will go away. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:14 -08:00
Tejun Heo	49038ef4fb	idr: relocate idr_for_each_entry() and reorganize id[r\|a]_get_new() * Move idr_for_each_entry() definition next to other idr related definitions. * Make id[r\|a]_get_new() inline wrappers of id[r\|a]_get_new_above(). This changes the implementation of idr_get_new() but the new implementation is trivial. This patch doesn't introduce any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:14 -08:00
Tejun Heo	fe6e24ec90	idr: deprecate idr_remove_all() There was only one legitimate use of idr_remove_all() and a lot more of incorrect uses (or lack of it). Now that idr_destroy() implies idr_remove_all() and all the in-kernel users updated not to use it, there's no reason to keep it around. Mark it deprecated so that we can later unexport it. idr_remove_all() is made an inline function calling __idr_remove_all() to avoid triggering deprecated warning on EXPORT_SYMBOL(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:14 -08:00
Tejun Heo	9bb26bc1ff	idr: make idr_destroy() imply idr_remove_all() idr is silly in quite a few ways, one of which is how it's supposed to be destroyed - idr_destroy() doesn't release IDs and doesn't even whine if the idr isn't empty. If the caller forgets idr_remove_all(), it simply leaks memory. Even ida gets this wrong and leaks memory on destruction. There is absolutely no reason not to call idr_remove_all() from idr_destroy(). Nobody is abusing idr_destroy() for shrinking free layer buffer and continues to use idr after idr_destroy(), so it's safe to do remove_all from destroy. In the whole kernel, there is only one place where idr_remove_all() is legitimately used without following idr_destroy() while there are quite a few places where the caller forgets either idr_remove_all() or idr_destroy() leaking memory. This patch makes idr_destroy() call idr_remove_all() and updates the function description accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:13 -08:00
Tejun Heo	6cdae7416a	idr: fix a subtle bug in idr_get_next() The iteration logic of idr_get_next() is borrowed mostly verbatim from idr_for_each(). It walks down the tree looking for the slot matching the current ID. If the matching slot is not found, the ID is incremented by the distance of single slot at the given level and repeats. The implementation assumes that during the whole iteration id is aligned to the layer boundaries of the level closest to the leaf, which is true for all iterations starting from zero or an existing element and thus is fine for idr_for_each(). However, idr_get_next() may be given any point and if the starting id hits in the middle of a non-existent layer, increment to the next layer will end up skipping the same offset into it. For example, an IDR with IDs filled between [64, 127] would look like the following. [ 0 64 ... ] /----/ \| \| \| NULL [ 64 ... 127 ] If idr_get_next() is called with 63 as the starting point, it will try to follow down the pointer from 0. As it is NULL, it will then try to proceed to the next slot in the same level by adding the slot distance at that level which is 64 - making the next try 127. It goes around the loop and finds and returns 127 skipping [64, 126]. Note that this bug also triggers in idr_for_each_entry() loop which deletes during iteration as deletions can make layers go away leaving the iteration with unaligned ID into missing layers. Fix it by ensuring proceeding to the next slot doesn't carry over the unaligned offset - ie. use round_up(id + 1, slot_distance) instead of id += slot_distance. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Teigland <teigland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:12 -08:00
Imre Deak	4225fc8555	lib/scatterlist: use page iterator in the mapping iterator For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo. Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: James Hogan <james.hogan@imgtec.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:10 -08:00
Imre Deak	a321e91b6d	lib/scatterlist: add simple page iterator Add an iterator to walk through a scatter list a page at a time starting at a specific page offset. As opposed to the mapping iterator this is meant to be small, performing well even in simple loops like collecting all pages on the scatterlist into an array or setting up an iommu table based on the pages' DMA address. Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:10 -08:00
Jingoo Han	9ed8a30f34	lib/devres.c: fix misplaced #endif A misplaced #endif causes link errors related to pcim_*() functions. This is because pcim_*() functions are related to CONFIG_PCI option, however these are not related to CONFIG_HAS_IOPORT option. Therefore, when CONFIG_PCI is enabled and CONFIG_HAS_IOPORT is not enabled, it makes link errors related to pcim_*() functions as below: drivers/ata/libata-sff.c:3233: undefined reference to `pcim_iomap_regions' drivers/ata/libata-sff.c:3238: undefined reference to `pcim_iomap_table' drivers/built-in.o: In function `ata_pci_sff_init_host': drivers/ata/libata-sff.c:2318: undefined reference to `pcim_iomap_regions' drivers/ata/libata-sff.c:2329: undefined reference to `pcim_iomap_table' Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Greg KH <greg@kroah.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:09 -08:00
Linus Torvalds	9043a2650c	The sweeping change is to make add_taint() explicitly indicate whether to disable lockdep, but it's a mechanical change. Cheers, Rusty. Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "The sweeping change is to make add_taint() explicitly indicate whether to disable lockdep, but it's a mechanical change." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Add option to not sign modules during modules_install MODSIGN: Add -s <signature> option to sign-file MODSIGN: Specify the hash algorithm on sign-file command line MODSIGN: Simplify Makefile with a Kconfig helper module: clean up load_module a little more. modpost: Ignore ARC specific non-alloc sections module: constify within_module_* taint: add explicit flag to show whether lock dep is still OK. module: printk message when module signature fail taints kernel.	2013-02-25 15:41:43 -08:00
Linus Torvalds	3b5d8510b9	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking changes from Ingo Molnar: "The biggest change is the rwsem lock-steal improvements, both to the assembly optimized and the spinlock based variants. The other notable change is the clean up of the seqlock implementation to be based on the seqcount infrastructure. The rest is assorted smaller debuggability, cleanup and continued -rt locking changes." * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rwsem-spinlock: Implement writer lock-stealing for better scalability futex: Revert "futex: Mark get_robust_list as deprecated" generic: Use raw local irq variant for generic cmpxchg lockdep: Selftest: convert spinlock to raw spinlock seqlock: Use seqcount infrastructure seqlock: Remove unused functions ntp: Make ntp_lock raw intel_idle: Convert i7300_idle_lock to raw_spinlock locking: Various static lock initializer fixes lockdep: Print more info when MAX_LOCK_DEPTH is exceeded rwsem: Implement writer lock-stealing for better scalability lockdep: Silence warning if CONFIG_LOCKDEP isn't set watchdog: Use local_clock for get_timestamp() lockdep: Rename print_unlock_inbalance_bug() to print_unlock_imbalance_bug() locking/stat: Fix a typo	2013-02-22 19:25:09 -08:00
Linus Torvalds	2ef14f465b	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Peter Anvin: "This is a huge set of several partly interrelated (and concurrently developed) changes, which is why the branch history is messier than one would like. The really big items are two humongous patchsets mostly developed by Yinghai Lu at my request, which completely revamps the way we create initial page tables. In particular, rather than estimating how much memory we will need for page tables and then build them into that memory -- a calculation that has shown to be incredibly fragile -- we now build them (on 64 bits) with the aid of a "pseudo-linear mode" -- a #PF handler which creates temporary page tables on demand. This has several advantages: 1. It makes it much easier to support things that need access to data very early (a followon patchset uses this to load microcode way early in the kernel startup). 2. It allows the kernel and all the kernel data objects to be invoked from above the 4 GB limit. This allows kdump to work on very large systems. 3. It greatly reduces the difference between Xen and native (Xen's equivalent of the #PF handler are the temporary page tables created by the domain builder), eliminating a bunch of fragile hooks. The patch series also gets us a bit closer to W^X. Additional work in this pull is the 64-bit get_user() work which you were also involved with, and a bunch of cleanups/speedups to __phys_addr()/__pa()." 
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits) x86, mm: Move reserving low memory later in initialization x86, doc: Clarify the use of asm("%edx") in uaccess.h x86, mm: Redesign get_user with a __builtin_choose_expr hack x86: Be consistent with data size in getuser.S x86, mm: Use a bitfield to mask nuisance get_user() warnings x86/kvm: Fix compile warning in kvm_register_steal_time() x86-32: Add support for 64bit get_user() x86-32, mm: Remove reference to alloc_remap() x86-32, mm: Remove reference to resume_map_numa_kva() x86-32, mm: Rip out x86_32 NUMA remapping code x86/numa: Use __pa_nodebug() instead x86: Don't panic if can not alloc buffer for swiotlb mm: Add alloc_bootmem_low_pages_nopanic() x86, 64bit, mm: hibernate use generic mapping_init x86, 64bit, mm: Mark data/bss/brk to nx x86: Merge early kernel reserve for 32bit and 64bit x86: Add Crash kernel low reservation x86, kdump: Remove crashkernel range find limit for 64bit memblock: Add memblock_mem_size() x86, boot: Not need to check setup_header version for setup_data ...	2013-02-21 18:06:55 -08:00
Linus Torvalds	7c2db36e73	Merge branch 'akpm' (incoming from Andrew) Merge misc patches from Andrew Morton: - Florian has vanished so I appear to have become fbdev maintainer again :( - Joel and Mark are distracted to welcome to the new OCFS2 maintainer - The backlight queue - Small core kernel changes - lib/ updates - The rtc queue - Various random bits * akpm: (164 commits) rtc: rtc-davinci: use devm_*() functions rtc: rtc-max8997: use devm_request_threaded_irq() rtc: rtc-max8907: use devm_request_threaded_irq() rtc: rtc-da9052: use devm_request_threaded_irq() rtc: rtc-wm831x: use devm_request_threaded_irq() rtc: rtc-tps80031: use devm_request_threaded_irq() rtc: rtc-lp8788: use devm_request_threaded_irq() rtc: rtc-coh901331: use devm_clk_get() rtc: rtc-vt8500: use devm_*() functions rtc: rtc-tps6586x: use devm_request_threaded_irq() rtc: rtc-imxdi: use devm_clk_get() rtc: rtc-cmos: use dev_warn()/dev_dbg() instead of printk()/pr_debug() rtc: rtc-pcf8583: use dev_warn() instead of printk() rtc: rtc-sun4v: use pr_warn() instead of printk() rtc: rtc-vr41xx: use dev_info() instead of printk() rtc: rtc-rs5c313: use pr_err() instead of printk() rtc: rtc-at91rm9200: use dev_dbg()/dev_err() instead of printk()/pr_debug() rtc: rtc-rs5c372: use dev_dbg()/dev_warn() instead of printk()/pr_debug() rtc: rtc-ds2404: use dev_err() instead of printk() rtc: rtc-efi: use dev_err()/dev_warn()/pr_err() instead of printk() ...	2013-02-21 17:38:49 -08:00
Florian Fainelli	5dc49c75a2	decompressors: make the default XZ_DEC_* config match the selected architecture Change the default XZ_DEC_* config symbol to match the configured architecture. It is perfectly legitimate to support multiple XZ BCJ filters for different architectures (e.g.: to mount foreign squashfs/xz compressed filesystems), it is however more natural not to select them all by default, but only the one matching the configured architecture. Signed-off-by: Florian Fainelli <florian@openwrt.org> Acked-by: Lasse Collin <lasse.collin@tukaani.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:26 -08:00
Florian Fainelli	64dbfb444c	decompressors: drop dependency on CONFIG_EXPERT Remove the XZ_DEC_* dependency on CONFIG_EXPERT as recommended by Lasse Collin. Signed-off-by: Florian Fainelli <florian@openwrt.org> Acked-by: Lasse Collin <lasse.collin@tukaani.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:26 -08:00
Florian Fainelli	9d74962965	decompressors: group XZ_DEC_* symbols under an if XZ_BCJ / endif Group all architecture-specific BCJ filter configuration symbols under an if XZ_BCJ / endif statement. Signed-off-by: Florian Fainelli <florian@openwrt.org> Acked-by: Lasse Collin <lasse.collin@tukaani.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:26 -08:00
Namjae Jeon	53769627b9	lib/parser.c: fix up comments for valid return values from match_number match_number() has return values of -ENOMEM, -EINVAL and -ERANGE. So, for all the functions calling match_number, the return value should include these values. Fix up the comments to reflect the correct values. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:25 -08:00
Stepan Moskovchenko	7d7992108d	lib/vsprintf.c: add %pa format specifier for phys_addr_t types Add the %pa format specifier for printing a phys_addr_t type and its derivative types (such as resource_size_t), since the physical address size on some platforms can vary based on build options, regardless of the native integer type. Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org> Cc: Rob Landley <rob@landley.net> Cc: George Spelvin <linux@horizon.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:20 -08:00
Kyle McMartin	76e8402619	lib/Kconfig.debug: unhide CONFIG_PANIC_ON_OOPS CONFIG_EXPERT doesn't really make sense, and hides it unintentionally. Remove superfluous "default n" pointed out by Ingo as well. Signed-off-by: Kyle McMartin <kyle@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:20 -08:00
Linus Torvalds	21eaab6d19	tty/serial patches for 3.9-rc1 Here's the big tty/serial driver patches for 3.9-rc1. More tty port rework and fixes from Jiri here, as well as lots of individual serial driver updates and fixes. All of these have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Merge tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial patches from Greg Kroah-Hartman: "Here's the big tty/serial driver patches for 3.9-rc1. More tty port rework and fixes from Jiri here, as well as lots of individual serial driver updates and fixes. All of these have been in the linux-next tree for a while." * tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (140 commits) tty: mxser: improve error handling in mxser_probe() and mxser_module_init() serial: imx: fix uninitialized variable warning serial: tegra: assume CONFIG_OF TTY: do not update atime/mtime on read/write lguest: select CONFIG_TTY to build properly. ARM defconfigs: add missing inclusions of linux/platform_device.h fb/exynos: include platform_device.h ARM: sa1100/assabet: include platform_device.h directly serial: imx: Fix recursive locking bug pps: Fix build breakage from decoupling pps from tty tty: Remove ancient hardpps() pps: Additional cleanups in uart_handle_dcd_change pps: Move timestamp read into PPS code proper pps: Don't crash the machine when exiting will do pps: Fix a use-after free bug when unregistering a source. 
pps: Use pps_lookup_dev to reduce ldisc coupling pps: Add pps_lookup_dev() function tty: serial: uartlite: Support uartlite on big and little endian systems tty: serial: uartlite: Fix sparse and checkpatch warnings serial/arc-uart: Miscll DT related updates (Grant's review comments) ... Fix up trivial conflicts, mostly just due to the TTY config option clashing with the EXPERIMENTAL removal.	2013-02-21 13:41:04 -08:00
Linus Torvalds	06991c28f3	Driver core patches for 3.9-rc1 Here is the big driver core merge for 3.9-rc1 There are two major series here, both of which touch lots of drivers all over the kernel, and will cause you some merge conflicts: - add a new function called devm_ioremap_resource() to properly be able to check return values. - remove CONFIG_EXPERIMENTAL If you need me to provide a merged tree to handle these resolutions, please let me know. Other than those patches, there's not much here, some minor fixes and updates. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core patches from Greg Kroah-Hartman: "Here is the big driver core merge for 3.9-rc1 There are two major series here, both of which touch lots of drivers all over the kernel, and will cause you some merge conflicts: - add a new function called devm_ioremap_resource() to properly be able to check return values. 
- remove CONFIG_EXPERIMENTAL Other than those patches, there's not much here, some minor fixes and updates" Fix up trivial conflicts * tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits) base: memory: fix soft/hard_offline_page permissions drivercore: Fix ordering between deferred_probe and exiting initcalls backlight: fix class_find_device() arguments TTY: mark tty_get_device call with the proper const values driver-core: constify data for class_find_device() firmware: Ignore abort check when no user-helper is used firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER firmware: Make user-mode helper optional firmware: Refactoring for splitting user-mode helper code Driver core: treat unregistered bus_types as having no devices watchdog: Convert to devm_ioremap_resource() thermal: Convert to devm_ioremap_resource() spi: Convert to devm_ioremap_resource() power: Convert to devm_ioremap_resource() mtd: Convert to devm_ioremap_resource() mmc: Convert to devm_ioremap_resource() mfd: Convert to devm_ioremap_resource() media: Convert to devm_ioremap_resource() iommu: Convert to devm_ioremap_resource() drm: Convert to devm_ioremap_resource() ...	2013-02-21 12:05:51 -08:00
Linus Torvalds	33673dcb37	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "This is basically a maintenance update for the TPM driver and EVM/IMA" Fix up conflicts in lib/digsig.c and security/integrity/ima/ima_main.c * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (45 commits) tpm/ibmvtpm: build only when IBM pseries is configured ima: digital signature verification using asymmetric keys ima: rename hash calculation functions ima: use new crypto_shash API instead of old crypto_hash ima: add policy support for file system uuid evm: add file system uuid to EVM hmac tpm_tis: check pnp_acpi_device return code char/tpm/tpm_i2c_stm_st33: drop temporary variable for return value char/tpm/tpm_i2c_stm_st33: remove dead assignment in tpm_st33_i2c_probe char/tpm/tpm_i2c_stm_st33: Remove __devexit attribute char/tpm/tpm_i2c_stm_st33: Don't use memcpy for one byte assignment tpm_i2c_stm_st33: removed unused variables/code TPM: Wait for TPM_ACCESS tpmRegValidSts to go high at startup tpm: Fix cancellation of TPM commands (interrupt mode) tpm: Fix cancellation of TPM commands (polling mode) tpm: Store TPM vendor ID TPM: Work around buggy TPMs that block during continue self test tpm_i2c_stm_st33: fix oops when i2c client is unavailable char/tpm: Use struct dev_pm_ops for power management TPM: STMicroelectronics ST33 I2C BUILD STUFF ...	2013-02-21 08:18:12 -08:00
Markus F.X.J. Oberhumer	8b975bd3f9	lib/lzo: Update LZO compression to current upstream version This commit updates the kernel LZO code to the current upstream version which features a significant speed improvement - benchmarking the Calgary and Silesia test corpora typically shows a doubled performance in both compression and decompression on modern i386/x86_64/powerpc machines. Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>	2013-02-20 19:36:01 +01:00
Markus F.X.J. Oberhumer	b6bec26cea	lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c Rename the source file to match the function name and thereby also make room for a possible future even slightly faster "non-safe" decompressor version. Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>	2013-02-20 19:36:00 +01:00
Linus Torvalds	e84cf5d0fd	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: "SRCU changes: - These include debugging aids, updates that move towards the goal of permitting srcu_read_lock() and srcu_read_unlock() to be used from idle and offline CPUs, and a few small fixes. Changes to rcutorture and to RCU documentation: - Posted to LKML at https://lkml.org/lkml/2013/1/26/188 Enhancements to uniprocessor handling in tiny RCU: - Posted to LKML at https://lkml.org/lkml/2013/1/27/2 Tag RCU callbacks with grace-period number to simplify callback advancement: - Posted to LKML at https://lkml.org/lkml/2013/1/26/203 Miscellaneous fixes: - Posted to LKML at https://lkml.org/lkml/2013/1/26/204" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock() srcu: Update synchronize_srcu_expedited()'s comments srcu: Update synchronize_srcu()'s comments srcu: Remove checks preventing idle CPUs from calling srcu_read_lock() srcu: Remove checks preventing offline CPUs from calling srcu_read_lock() srcu: Simple cleanup for cleanup_srcu_struct() srcu: Add might_sleep() annotation to synchronize_srcu() srcu: Simplify __srcu_read_unlock() via this_cpu_dec() rcu: Allow rcutorture to be built at low optimization levels rcu: Make rcutorture's shuffler task shuffle recently added tasks rcu: Allow TREE_PREEMPT_RCU on UP systems rcu: Provide RCU CPU stall warnings for tiny RCU context_tracking: Add comments on interface and internals rcu: Remove obsolete Kconfig option from comment rcu: Remove unused code originally used for context tracking rcu: Consolidate debugging Kconfig options rcu: Correct 'optimized' to 'optimize' in header comment rcu: Trace callback acceleration rcu: Tag callback lists with corresponding grace-period number rcutorture: Don't compare ptr with 0 ...	2013-02-19 17:45:20 -08:00
Yuanhan Liu	41ef8f8266	rwsem-spinlock: Implement writer lock-stealing for better scalability We (Linux Kernel Performance project) found a regression introduced by commit: `5a505085f0` mm/rmap: Convert the struct anon_vma::mutex to an rwsem which converted all anon_vma::mutex locks to rwsem write locks. The semantics are the same, but the behavioral difference is quite huge in some cases. After investigating it we found the root cause: mutexes support lock stealing while rwsems don't. Here is the link for the detailed regression report: https://lkml.org/lkml/2013/1/29/84 Ingo suggested adding write lock stealing to rwsems: "I think we should allow lock-steal between rwsem writers - that will not hurt fairness as most rwsem fairness concerns relate to reader vs. writer fairness" And here is the rwsem-spinlock version. With this patch, we got a double performance increase in one test box with following aim7 workfile: FILESIZE: 1M POOLSIZE: 10M 10 fork_test /usr/bin/time output without the patch: Percent of CPU this job got: 369%, Voluntary context switches: 640595016. /usr/bin/time output with the patch: Percent of CPU this job got: 537%, Voluntary context switches: 157915561. We got a 45% increase in CPU usage and saved about 3/4 of the voluntary context switches. Reported-by: LKP project <lkp@linux.intel.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Cc: Alex Shi <alex.shi@intel.com> Cc: David Howells <dhowells@redhat.com> Cc: Michel Lespinasse <walken@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Anton Blanchard <anton@samba.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/1359716356-23865-1-git-send-email-yuanhan.liu@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-02-19 08:43:39 +01:00
Yong Zhang	9fb1b90ce0	lockdep: Selftest: convert spinlock to raw spinlock To make the lockdep selftest work on RT we need to convert the spinlock tests to a raw spinlock. Otherwise we cannot run the irq context checks. For mainline this is just annotational as spinlocks are mapped to raw_spinlocks anyway. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Link: http://lkml.kernel.org/r/1334559716-18447-2-git-send-email-yong.zhang0@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2013-02-19 08:43:35 +01:00
Alex Shi	ce6711f3d1	rwsem: Implement writer lock-stealing for better scalability Commit `5a505085f0` ("mm/rmap: Convert the struct anon_vma::mutex to an rwsem") changed struct anon_vma::mutex to an rwsem, which caused aim7 fork_test performance to drop by 50%. Yuanhan Liu did the following excellent analysis: https://lkml.org/lkml/2013/1/29/84 and found that the regression is caused by strict, serialized, FIFO sequential write-ownership of rwsems. Ingo suggested implementing opportunistic lock-stealing for the front writer task in the waitqueue. Yuanhan Liu implemented lock-stealing for spinlock-rwsems, which indeed recovered much of the regression - confirming the analysis that the main factor in the regression was the FIFO writer-fairness of rwsems. In this patch we allow lock-stealing to happen when the first waiter is also writer. With that change in place the aim7 fork_test performance is fully recovered on my Intel NHM EP, NHM EX, SNB EP 2S and 4S test-machines. Reported-by: lkp@linux.intel.com Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Alex Shi <alex.shi@intel.com> Cc: David Howells <dhowells@redhat.com> Cc: Michel Lespinasse <walken@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Anton Blanchard <anton@samba.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: paul.gortmaker@windriver.com Link: https://lkml.org/lkml/2013/1/29/84 Link: http://lkml.kernel.org/r/1360069915-31619-1-git-send-email-alex.shi@intel.com [ Small stylistic fixes, updated changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-02-19 08:42:43 +01:00
Vineet Gupta	64e69073c3	asm-generic headers: Allow yet more arch overrides in checksum.h arches can have more efficient implementation of these routines Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2013-02-11 20:00:33 +05:30
Ingo Molnar	9228b5f243	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: 1. Changes to rcutorture and to RCU documentation. Posted to LKML at https://lkml.org/lkml/2013/1/26/188. 2. Enhancements to uniprocessor handling in tiny RCU. Posted to LKML at https://lkml.org/lkml/2013/1/27/2. 3. Tag RCU callbacks with grace-period number to simplify callback advancement. Posted to LKML at https://lkml.org/lkml/2013/1/26/203. 4. Miscellaneous fixes. Posted to LKML at https://lkml.org/lkml/2013/1/26/204. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-02-04 19:06:34 +01:00
Andy Shevchenko	0d2a1b2d03	mpilib: use DIV_ROUND_UP and remove unused macros Remove MIN, MAX and ABS macros that duplicate the kernel's native implementation. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-02-01 16:28:32 +11:00
Dmitry Kasatkin	26d438457e	digsig: remove unnecessary memory allocation and copying In existing use case, copying of the decoded data is unnecessary in pkcs_1_v1_5_decode_emsa. It is just enough to get pointer to the message. Removing copying and extra buffer allocation. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-02-01 16:28:24 +11:00
YOSHIFUJI Hideaki	7810cc1e77	digsig: Fix memory leakage in digsig_verify_rsa() digsig_verify_rsa() does not free kmalloc'ed buffer returned by mpi_get_buffer(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-02-01 15:59:33 +11:00
Yinghai Lu	ac2cbab21f	x86: Don't panic if can not alloc buffer for swiotlb Normal boot path on system with iommu support: swiotlb buffer will be allocated early at first and then try to initialize iommu, if iommu for intel or AMD could setup properly, swiotlb buffer will be freed. The early allocating is with bootmem, and could panic when we try to use kdump with buffer above 4G only, or with memmap to limit mem under 4G. for example: memmap=4095M$1M to remove memory under 4G. According to Eric, add _nopanic version and no_iotlb_memory to fail map single later if swiotlb is still needed. -v2: don't pass nopanic, and use -ENOMEM return value according to Eric. panic early instead of using swiotlb_full to panic...according to Eric/Konrad. -v3: make swiotlb_init to be notpanic, but will affect: arm64, ia64, powerpc, tile, unicore32, x86. -v4: cleanup swiotlb_init by removing swiotlb_init_with_default_size. Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-36-git-send-email-yinghai@kernel.org Reviewed-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: linux-mips@linux-mips.org Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Cc: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-01-29 19:36:53 -08:00
Paul E. McKenney	40393f525f	Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD doctorture.2013.01.11a: Changes to rcutorture and to RCU documentation. fixes.2013.01.26a: Miscellaneous fixes. tagcb.2013.01.24a: Tag RCU callbacks with grace-period number to simplify callback advancement. tiny.2013.01.29b: Enhancements to uniprocessor handling in tiny RCU.	2013-01-28 22:25:21 -08:00
Paul E. McKenney	6bfc09e232	rcu: Provide RCU CPU stall warnings for tiny RCU Tiny RCU has historically omitted RCU CPU stall warnings in order to reduce memory requirements, however, lack of these warnings caused Thomas Gleixner some debugging pain recently. Therefore, this commit adds RCU CPU stall warnings to tiny RCU if RCU_TRACE=y. This keeps the memory footprint small, while still enabling CPU stall warnings in kernels built to enable them. Updated to include Josh Triplett's suggested use of RCU_STALL_COMMON config variable to simplify #if expressions. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-01-28 22:06:21 -08:00
Dave Hansen	2f03e3ca74	rcu: Consolidate debugging Kconfig options The RCU-related debugging Kconfig options are in two different places, and consume too much screen real estate. This commit therefore consolidates them into their own menu. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-01-26 16:34:48 -08:00
Greg Kroah-Hartman	422d26b6ec	Merge 3.8-rc5 into driver-core-next This resolves a gpio driver merge issue pointed out in linux-next. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-25 21:06:30 -08:00
Greg Kroah-Hartman	9f9cba810f	Merge 3.8-rc5 into tty-next This resolves a number of tty driver merge issues found in linux-next Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-25 13:27:36 -08:00
Thierry Reding	f4a18312f4	lib: devres: Fix build breakage The ERR_PTR() and IS_ERR() macros used by the devm_ioremap_resource() function are defined in the linux/err.h header. On ARM this seems to be pulled in by one of the other headers but the build fails at least on OpenRISC. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-22 13:31:18 -08:00
Thierry Reding	75096579c3	lib: devres: Introduce devm_ioremap_resource() The devm_request_and_ioremap() function is very useful and helps avoid a whole lot of boilerplate. However, one issue that keeps popping up is its lack of a specific error code to determine which of the steps that it performs failed. Furthermore, while the function gives an example and suggests what error code to return on failure, a wide variety of error codes are used throughout the tree. In an attempt to fix these problems, this patch adds a new function that drivers can transition to. The devm_ioremap_resource() function returns a pointer to the remapped I/O memory on success or an ERR_PTR() encoded error code on failure. Callers can check for failure using IS_ERR() and determine its cause by extracting the error code using PTR_ERR(). devm_request_and_ioremap() is implemented as a wrapper around the new API and returns NULL on failure as before. This ensures that backwards compatibility is maintained until all users have been converted to the new API, at which point the old devm_request_and_ioremap() function should be removed. A semantic patch is included which can be used to convert from the old devm_request_and_ioremap() API to the new devm_ioremap_resource() API. Some non-trivial cases may require manual intervention, though. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-22 09:41:43 -08:00
Rusty Russell	373d4d0997	taint: add explicit flag to show whether lock dep is still OK. Fix up all callers as they were before, with one change: an unsigned module taints the kernel, but doesn't turn off lockdep. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-01-21 17:17:57 +10:30
Linus Torvalds	226364766f	Various minor fixes, but a slightly more complex one to fix the per-cpu overload problem introduced recently by kvm id changes. Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module fixes and a virtio block fix from Rusty Russell: "Various minor fixes, but a slightly more complex one to fix the per-cpu overload problem introduced recently by kvm id changes." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: module: put modules in list much earlier. module: add new state MODULE_STATE_UNFORMED. module: prevent warning when finit_module a 0 sized file virtio-blk: Don't free ida when disk is in use	2013-01-20 16:44:28 -08:00
Joe Millenbach	4f73bc4dd3	tty: Added a CONFIG_TTY option to allow removal of TTY The option allows you to remove TTY and compile without errors. This saves space on systems that won't support TTY interfaces anyway. bloat-o-meter output is below. The bulk of this patch consists of Kconfig changes adding "depends on TTY" to various serial devices and similar drivers that require the TTY layer. Ideally, these dependencies would occur on a common intermediate symbol such as SERIO, but most drivers "select SERIO" rather than "depends on SERIO", and "select" does not respect dependencies. bloat-o-meter output comparing our previous minimal to new minimal by removing TTY. The list is filtered to not show removed entries with awk '$3 != "-"' as the list was very long. add/remove: 0/226 grow/shrink: 2/14 up/down: 6/-35356 (-35350) function old new delta chr_dev_init 166 170 +4 allow_signal 80 82 +2 static.__warned 143 142 -1 disallow_signal 63 62 -1 __set_special_pids 95 94 -1 unregister_console 126 121 -5 start_kernel 546 541 -5 register_console 593 588 -5 copy_from_user 45 40 -5 sys_setsid 128 120 -8 sys_vhangup 32 19 -13 do_exit 1543 1526 -17 bitmap_zero 60 40 -20 arch_local_irq_save 137 117 -20 release_task 674 652 -22 static.spin_unlock_irqrestore 308 260 -48 Signed-off-by: Joe Millenbach <jmillenbach@gmail.com> Reviewed-by: Jamey Sharp <jamey@minilop.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-18 16:15:27 -08:00
Greg Kroah-Hartman	ed408f7c0f	Merge 3.8-rc4 into driver-core-next This is to fix up a build problem with a wireless driver due to the dynamic-debug patches in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-17 19:48:18 -08:00
Jim Cromie	18c216c53b	dynamic_debug: add pr_errs before -EINVALs Ma noted that dynamic-debug is silent about many query errors, so add pr_err()s to explain those errors, and tweak a few others. Also parse flags 1st, so that match-spec errs are slightly clearer. CC: Jianpeng Ma <majianpeng@gmail.com> CC: Joe Perches <joe@perches.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-17 12:19:09 -08:00
Vladimir Kondratiev	7a555613eb	dynamic_debug: dynamic hex dump Introduce print_hex_dump_debug() that can be dynamically controlled, similar to pr_debug. Also, make print_hex_dump_bytes() dynamically controlled. Implement only 'p' flag (_DPRINTK_FLAGS_PRINT) to keep it simple since hex dump prints multiple lines and long prefix would impact readability. To provide line/file etc. information, use pr_debug or similar before/after print_hex_dump_debug() Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-17 12:19:09 -08:00
Joe Perches	f657fd21e1	dynamic_debug: Fix vpr_<foo> logging styles vpr_info_dq should be a function and vpr_info should have a do {} while (0) Add missing newlines to pr_<level>s. Miscellaneous neatening too. braces, coalescing formats, alignments, etc... Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-17 12:18:07 -08:00
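
Commit 75096579c3 above describes an error-pointer convention: devm_ioremap_resource() returns either a valid pointer or an ERR_PTR()-encoded error, and callers check with IS_ERR() and extract the code with PTR_ERR(). The following is only a userspace sketch of that convention, not kernel code. The three helpers are simplified stand-ins for the kernel's linux/err.h versions, and `demo_resource`, `demo_io_window`, `demo_ioremap_resource()` and `demo_probe()` are hypothetical names invented for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's linux/err.h helpers.
 * Assumption: error codes occupy the top MAX_ERRNO values of the
 * pointer range, so no valid pointer can collide with them. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical resource descriptor; NOT the kernel's struct resource. */
struct demo_resource {
    size_t start;   /* offset into the fake I/O window */
    size_t size;    /* length of the requested region */
    int busy;       /* non-zero if someone else already claimed it */
};

/* Fake "I/O memory" so the sketch has something to hand out. */
static unsigned char demo_io_window[64];

/* Hypothetical analogue of devm_ioremap_resource(): each failing step
 * reports its own error code instead of a bare NULL. */
static void *demo_ioremap_resource(const struct demo_resource *res)
{
    if (!res || res->size == 0)
        return ERR_PTR(-EINVAL);
    if (res->busy)
        return ERR_PTR(-EBUSY);
    if (res->start > sizeof(demo_io_window) ||
        res->size > sizeof(demo_io_window) - res->start)
        return ERR_PTR(-ENOMEM);
    return demo_io_window + res->start;
}

/* Caller pattern from the commit message: test with IS_ERR(), then
 * propagate the specific cause with PTR_ERR(). */
static int demo_probe(const struct demo_resource *res)
{
    void *base = demo_ioremap_resource(res);

    if (IS_ERR(base))
        return (int)PTR_ERR(base);
    return 0;
}
```

In this sketch, `demo_probe()` returns 0 for a free, in-range region and the specific negative errno otherwise, which is the property the commit message says the older NULL-returning devm_request_and_ioremap() lacked.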

2501 Commits