OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Al Viro	7076aada10	x86: split ret_from_fork Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:53:31 -04:00
Al Viro	44f4b56b54	alpha: introduce ret_from_kernel_execve(), switch to generic kernel_execve() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:53:31 -04:00
Al Viro	cba1ec7e88	alpha: switch to generic kernel_thread() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:53:18 -04:00
Al Viro	756144f8ea	alpha: switch to generic sys_execve() get rid of sys_execve() wrapper, while we are at it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:21:37 -04:00
Al Viro	a63c97a000	arm: get rid of execve wrapper, switch to generic execve() implementation Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:21:37 -04:00
Al Viro	bfd170d565	arm: optimized current_pt_regs() ... no need to read current_thread_info()->task only to feed it to task_thread_page() immediately afterwards. Moreover, not using current_thread_info() at all ends up with better assembler - we need a location very close to the top of kernel stack page and it's actually better to do or with 0x1fff, followed be subtracting a small constant than and with ~0x1fff, followed by adding a large one. Both & and \| would be a couple of insns (mvn lsr/mvn lsl for \|, a pair of bic for &), but the following addition would cost a pair of add while the subtraction ends up as a single sub. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:21:37 -04:00
Al Viro	583d632fb3	arm: introduce ret_from_kernel_execve(), switch to generic kernel_execve() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:21:36 -04:00
Al Viro	9e14f828ee	arm: split ret_from_fork, simplify kernel_thread() [based on patch by rmk] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:21:36 -04:00
Al Viro	38b983b346	generic sys_execve() Selected by __ARCH_WANT_SYS_EXECVE in unistd.h. Requires * working current_pt_regs() * NOT doing a syscall-in-kernel kind of kernel_execve() implementation. Using generic kernel_execve() is fine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 22:20:51 -04:00
Al Viro	282124d186	generic kernel_execve() based mostly on arm and alpha versions. Architectures can define __ARCH_WANT_KERNEL_EXECVE and use it, provided that * they have working current_pt_regs(), even for kernel threads. * kernel_thread-spawned threads do have space for pt_regs in the normal location. Normally that's as simple as switching to generic kernel_thread() and making sure that kernel threads do not go through return from syscall path; call the payload from equivalent of ret_from_fork if we are in a kernel thread (or just have separate ret_from_kernel_thread and make copy_thread() use it instead of ret_from_fork in kernel thread case). * they have ret_from_kernel_execve(); it is called after successful do_execve() done by kernel_execve() and gets normal pt_regs location passed to it as argument. It's essentially a longjmp() analog - it should set sp, etc. to the situation expected at the return for syscall and go there. Eventually the need for that sucker will disappear, but that'll take some surgery on kernel_thread() payloads. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 13:36:39 -04:00
Al Viro	a3460a5974	new helper: current_pt_regs() Normally (and that's the default) it's just task_pt_regs(current). However, if an architecture can optimize that, it can do so by making a macro of its own available from asm/ptrace.h. More importantly, some architectures have task_pt_regs() working only for traced tasks blocked on signal delivery. current_pt_regs() needs to work for all processes, so before those architectures start using stuff relying on current_pt_regs() they'll need a properly working variant. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 13:36:39 -04:00
Al Viro	2aa3a7f866	preparation for generic kernel_thread() Let architectures select GENERIC_KERNEL_THREAD and have their copy_thread() treat NULL regs as "it came from kernel_thread(), sp argument contains the function new thread will be calling and stack_size - the argument for that function". Switching the architectures begins shortly... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-30 13:35:55 -04:00
Al Viro	a4d94ff8aa	um: kill thread->forking we only use that to tell copy_thread() done by syscall from that done by kernel_thread(). However, it's easier to do simply by checking PF_KTHREAD in thread flags. Merge sys_clone() guts for 32bit and 64bit, while we are at it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 10:49:09 -04:00
Al Viro	8e2c85aa6c	um: let signal_delivered() do SIGTRAP on singlestepping into handler ... rather than duplicating that in sigframe setup code (and doing that inconsistently, at that) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 09:53:01 -04:00
Al Viro	344569aef3	um: don't leak floating point state and segment registers on execve() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 09:53:01 -04:00
Al Viro	ab286b21aa	um: take cleaning singlestep to start_thread() ... assuming it's needed to be done at all Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 09:53:00 -04:00
Al Viro	1cedd6925a	don't bother exporting kernel_execve() most of the architectures don't and there's not a single caller outside of core kernel. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 09:51:28 -04:00
Al Viro	826eba4db0	the only place that needs to include asm/exec.h is linux/binfmts.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 09:51:13 -04:00
Al Viro	ddd03a1f75	get rid of generic instances of asm/exec.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 09:51:02 -04:00
Al Viro	e76623d694	x86: get rid of TIF_IRET hackery TIF_NOTIFY_RESUME will work in precisely the same way; all that is achieved by TIF_IRET is appearing that there's some work to be done, so we end up on the iret exit path. Just use NOTIFY_RESUME. And for execve() do that in 32bit start_thread(), not sys_execve() itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-20 09:50:17 -04:00
Linus Torvalds	c46de2263f	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A small collection of driver fixes/updates and a core fix for 3.6. It contains: - Bug fixes for mtip32xx, and support for new hardware (just addition of IDs). They have been queued up for 3.7 for a few weeks as well. - rate-limit a failing command error message in block core. - A fix for an old cciss bug from Stephen. - Prevent overflow of partition count from Alan." * 'for-linus' of git://git.kernel.dk/linux-block: cciss: fix handling of protocol error blk: add an upper sanity check on partition adding mtip32xx: fix user_buffer check in exec_drive_command mtip32xx: Remove dead code mtip32xx: Change printk to pr_xxxx mtip32xx: Proper reporting of write protect status on big-endian mtip32xx: Increase timeout for standby command mtip32xx: Handle NCQ commands during the security locked state mtip32xx: Add support for new devices block: rate-limit the error message from failing commands	2012-09-19 11:04:34 -07:00
Linus Torvalds	077fee0036	SuperH fixes for 3.6-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlBYK+cACgkQGkmNcg7/o7jthwCfemhnr590s3hwWXjA88ZZMFDl U8kAoJA7hNCtAqdoj+LHXJlKLK1UalkD =aCxD -----END PGP SIGNATURE----- Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh Pull SuperH fixes from Paul Mundt. * tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling. sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path sh: intc: Fix up multi-evt irq association.	2012-09-19 11:03:55 -07:00
Linus Torvalds	cf42d543e5	A quick rpmsg fix from Fernando, fixing two buggy invocations of dma_free_coherent. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJQV/pNAAoJELLolMlTRIoMkh8QAK56+PRhhhhtGKR9n9Mf8NX1 ZBbNYfHqm3AcToHgCIAY0ciaS2H6ZoOQj9bg4G2+JJWlwnphIcK7Vq9RBXi2/+jk zvPvxz/mHkWeiCXARd9HtKxHSr4QRbO8xGmD2sxQSojlsr8RQTuTBcRbPF3e4RFH 7QYC7YwowS5JZXS4m/szLSTlWyzi1D8HzFZkKf7FMg3RklpbQm3v3wg4iJIPn8C+ CO6jV35WB436M5vuu4nk6YnKfMaE5D//Aj/1Eeq1aZOIquRQ7vVtWMnDGo+ZpOT1 i2paY7h7ra7Yh6f2wD6GAtpRhd+xV5dp6g0N9pntQ03/3Xyg7qQJ0rTyiLUNSZVQ OPD69ud/xKr+VEda7rKcZ63TiJ3e3gZypgm5/xkZMw58X5Tt4ELC/7YTXjK7zrWN S1jjUEl+38UN11iIiTdhRKCdiEZWpA6xiUrzE1jxG2AyiS2EgnkByhedb3QwIR8V VbRbpcQkDW/Dn8dP6+JtW9PQyBFkuEHofmMtLGXUmn52ijHX90dXsYPnOaPtf+2e oV7JYQXQZ1X+3K3evo7FmaeiEFCs/KL4eoCjDOaZz05pIOJ/Y9GOUapnmPo85TZ3 axyeP/82Td2C4CpnTb6TWfFtw1WTI0Vdj/kz/5IdX2AuK9SGum606jYlyz2NGSuP YT9LXJT4G30DkG6Zk3aL =Ywa8 -----END PGP SIGNATURE----- Merge tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg Pull rpmsg fix from Ohad Ben-Cohen: "A quick rpmsg fix from Fernando, fixing two buggy invocations of dma_free_coherent" * tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg: rpmsg: fix dma_free_coherent dev parameter	2012-09-19 11:03:13 -07:00
Linus Torvalds	4b92c17e1d	3 fixes for md in 3.6. One reverts a recent patch which turns out to not be such a good idea. Other two fix minor bugs with the new (since 3.3) 'replacement' code and have been tagged for -stable. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQIVAwUAUFk0Xznsnt1WYoG5AQL/wg/+PgiYHPhz88Nw7pQIDMVtxVPjsf8YLhs/ cFeIoTE13KQX+akiKORoFopIaon0hJLX48Hs+/WlrZmucJMLn/gmUhkwkcZs31As PrGLrLdz6cXue0GPTU5IP25lkbMRBsRV1U5k1pWuq9qWQv+Bjs1dXc1H1HekR3Lr WD4TdLz/Zg5fboADXVt6cSpAHL++eDdHOoqh7amMDzQfLf6Et9U1gaqTXeQMw70M /0+AubVmceYbP7uw1/haWii6/cLNtu7opE9dEvsHHkibdwcdSiOmsqMYCurjvd8p zYsaK/KcIWipfSpYsaDI1Sz4tYVc4UBQZCYgHJxv2ynnKRHHEDnrj1/hU86SRsmS YUEM5ENeLnXtmFMZH2Pro8c9x4ianv751uMCEt61HZs2572Rz5csZ0JgCSaScCVA PKldSe4AsyeGQsQ0lSjhza/zmx6uvy0mUrJFSd2lt8cMLvlcDfGihYG1ERjFf638 kuIthP7NwtE/sM0cZtLkVvXfJdyUQDL2EGvJJIO4A4m1PJ07RzJ0KRU/g0jzi1Q8 E63abhnTk7y7QpLtIH7Bv4DrDjdMvmfYFbprR/Mxz5D4RUOBxxj+HvD9EFwNG9oJ ufc/hnDQd7BYkMPFWrVuYxtByMYMdnhuiRSFUDCrMt81pxSLecxjDt1r9UOLT5Bo emAPezrzK/g= =dfcj -----END PGP SIGNATURE----- Merge tag 'md-3.6-fixes' of git://neil.brown.name/md Pull md fixes from NeilBrown: "3 fixes for md in 3.6. One reverts a recent patch which turns out to not be such a good idea. Other two fix minor bugs with the new (since 3.3) 'replacement' code and have been tagged for -stable." * tag 'md-3.6-fixes' of git://neil.brown.name/md: md: make sure metadata is updated when spares are activated or removed. md/raid5: fix calculate of 'degraded' when a replacement becomes active. Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."	2012-09-19 11:01:38 -07:00
Linus Torvalds	c5c473e29c	Merge branch 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue / powernow-k8 fix from Tejun Heo: "This is the fix for the bug where cpufreq/powernow-k8 was tripping BUG_ON() in try_to_wake_up_local() by migrating workqueue worker to a different CPU. https://bugzilla.kernel.org/show_bug.cgi?id=47301 As discussed, the fix is now two parts - one to reimplement work_on_cpu() so that it doesn't create a new kthread each time and the actual fix which makes powernow-k8 use work_on_cpu() instead of performing manual migration. While pretty late in the merge cycle, both changes are on the safer side. Jiri and I verified two existing users of work_on_cpu() and Duncan confirmed that the powernow-k8 fix survived about 18 hours of testing." * 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU workqueue: reimplement work_on_cpu() using system_wq	2012-09-19 11:00:07 -07:00
Tejun Heo	6889125b8b	cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU powernowk8_target() runs off a per-cpu work item and if the cpufreq_policy->cpu is different from the current one, it migrates the kworker to the target CPU by manipulating current->cpus_allowed. The function migrates the kworker back to the original CPU but this is still broken. Workqueue concurrency management requires the kworkers to stay on the same CPU and powernowk8_target() ends up triggerring BUG_ON(rq != this_rq()) in try_to_wake_up_local() if it contends on fidvid_mutex and sleeps. It is unclear why this bug is being reported now. Duncan says it appeared to be a regression of 3.6-rc1 and couldn't reproduce it on 3.5. Bisection seemed to point to `63d95a91` "workqueue: use @pool instead of @gcwq or @cpu where applicable" which is an non-functional change. Given that the reproduce case sometimes took upto days to trigger, it's easy to be misled while bisecting. Maybe something made contention on fidvid_mutex more likely? I don't know. This patch fixes the bug by using work_on_cpu() instead if @pol->cpu isn't the same as the current one. The code assumes that cpufreq_policy->cpu is kept online by the caller, which Rafael tells me is the case. stable: `ed48ece27c` ("workqueue: reimplement work_on_cpu() using system_wq") should be applied before this; otherwise, the behavior could be horrible. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Duncan <1i5t5.duncan@cox.net> Tested-by: Duncan <1i5t5.duncan@cox.net> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: stable@vger.kernel.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47301	2012-09-19 10:15:01 -07:00
Tejun Heo	ed48ece27c	workqueue: reimplement work_on_cpu() using system_wq The existing work_on_cpu() implementation is hugely inefficient. It creates a new kthread, execute that single function and then let the kthread die on each invocation. Now that system_wq can handle concurrent executions, there's no advantage of doing this. Reimplement work_on_cpu() using system_wq which makes it simpler and way more efficient. stable: While this isn't a fix in itself, it's needed to fix a workqueue related bug in cpufreq/powernow-k8. AFAICS, this shouldn't break other existing users. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jiri Kosina <jkosina@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Len Brown <lenb@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@vger.kernel.org	2012-09-19 10:13:12 -07:00
NeilBrown	6dafab6b13	md: make sure metadata is updated when spares are activated or removed. It isn't always necessary to update the metadata when spares are removed as the presence-or-not of a spare isn't really important to the integrity of an array. Also activating a spare doesn't always require updating the metadata as the update on 'recovery-completed' is usually sufficient. However the introduction of 'replacement' devices have made these transitions sometimes more important. For example the 'Replacement' flag isn't cleared until the original device is removed, so we need to ensure a metadata update after that 'spare' is removed. So set MD_CHANGE_DEVS whenever a spare is activated or removed, to complement the current situation where it is set when a spare is added or a device is failed (or a number of other less common situations). This is suitable for -stable as out-of-data metadata could lead to data corruption. This is only relevant for 3.3 and later 9when 'replacement' as introduced. Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.de>	2012-09-19 12:54:22 +10:00
NeilBrown	e5c86471f9	md/raid5: fix calculate of 'degraded' when a replacement becomes active. When a replacement device becomes active, we mark the device that it replaces as 'faulty' so that it can subsequently get removed. However 'calc_degraded' only pays attention to the primary device, not the replacement, so the array appears to become degraded, which is wrong. So teach 'calc_degraded' to consider any replacement if a primary device is faulty. This is suitable for -stable as an incorrect 'degraded' value can confuse md and could lead to data corruption. This is only relevant for 3.3 and later. Cc: stable@vger.kernel.org Reported-by: Robin Hill <robin@robinhill.me.uk> Reported-by: John Drescher <drescherjm@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-09-19 12:52:30 +10:00
NeilBrown	a852d7b8a0	Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE." This reverts commit `895e3c5c58`. While this patch seemed like a good idea and did help some workloads, it hurts other workloads. Large sequential O_DIRECT writes were faster, Small random O_DIRECT writes were slower. Other changes (batching RAID5 writes) have improved the sequential writes using a different mechanism, so the net result of this patch is definitely negative. So revert it. Reported-by: Shaohua Li <shli@kernel.org> Tested-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-09-19 12:48:30 +10:00
Linus Torvalds	925a6f0bf8	A single hwspinlock fix by Wei Yongjun, which prevents potential NULL dereferences. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJQWBd8AAoJELLolMlTRIoMsdQQAKGtNY9sF8FlWWLl49RTO8iZ gb9p0Frb5HGaW9FYW6tGfn2CKqKpd4K0MJ+4BhXYr099rwXEZeQ6SIjMZg6wzvO3 +l84yYv6aJZmOWFvK6ViiID2tfVm583bQ+dDRjZnqxa2miTKAkEEX4unuSZzd/MR uaR7xta4Ya4y09zkyPrsZuBysVPGged3/FqzeHqPaSlOAcQx1DfHWIu2eOlKV2lW uBHnWMXzL0gOZhnj/93aLf/I0gQiKs2a9JWZqz9BGXSS2Jh5Td9GEKdr7pBcesPi 78543/gZnE03q5RgzkrredGqepxWTrwJugHoDwOwTb3V7+yUHin8uqFwkjenF1uL xTMsqX84Fao71s1IF3Na1lCZrkAGazY7KJhjYaoLy/lwmUa1Y3VnZwJvOdbqhlTd 3DeSAldDjhCdf7vZnJA1PTsRowRR9BWKSm/vdrL5gUXxsufJwS8xAWgModSGvn4P 9I2Mr9yaU2OgRyP10NfBXBI6cykH1DwniFJ1313Yp1K8JXSFI5KmSi04fKeINP6s K1GXvxdfaLvODXOtYz097pLx7BxwPKoW05C+lglavdlVSzuQoP/DwLdX0V8hcl0L sDxW5/jg8ZR9tk3DPOBeOpmMBz2me+3e1F2V+ztXicP03WLpQN2Al5QK9SUdKKLx h4ZVlbfOR6SKwyFASNDp =wZhC -----END PGP SIGNATURE----- Merge tag 'hwspinlock-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock Pull hwspinlock fix from Ohad Ben-Cohen: "A single hwspinlock fix by Wei Yongjun, which prevents potential NULL dereferences" * tag 'hwspinlock-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock: hwspinlock/core: move the dereference below the NULL test	2012-09-18 11:58:54 -07:00
Miklos Szeredi	b161dfa693	vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill() IBM reported a soft lockup after applying the fix for the rename_lock deadlock. Commit `c83ce989cb` ("VFS: Fix the nfs sillyrename regression in kernel 2.6.38") was found to be the culprit. The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the dentry was killed. This flag can be set on non-killed dentries too, which results in infinite retries when trying to traverse the dentry tree. This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is only set in d_kill() and makes try_to_ascend() test only this flag. IBM reported successful test results with this patch. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-18 11:23:51 -07:00
Stephen M. Cameron	2453f5f992	cciss: fix handling of protocol error If a command completes with a status of CMD_PROTOCOL_ERR, this information should be conveyed to the SCSI mid layer, not dropped on the floor. Unlike a similar bug in the hpsa driver, this bug only affects tape drives and CD and DVD ROM drives in the cciss driver, and to induce it, you have to disconnect (or damage) a cable, so it is not a very likely scenario (which would explain why the bug has gone undetected for the last 10 years.) Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-09-18 11:57:08 +02:00
Alan Cox	2bd6efad25	blk: add an upper sanity check on partition adding 65536 should be ludicrous anyway but without it we overflow the memory computation doing the allocation and badness occurs. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-09-18 11:56:29 +02:00
Al Viro	5e071e2b4b	sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling. As Al notes, we missed a TIF_NOTIFY_RESUME check which caused any handlers without TIF_SIGPENDING also set to skip the notification: Looks like while it is in the relevant masks and checked in do_notify_resume() both on 32bit and 64bit variants since commit `ab99c733ae` ("sh: Make syscall tracer use tracehook notifiers, add TIF_NOTIFY_RESUME.") they are actually not reached without simulataneous SIGPENDING, since the actual glue in the callers had not been updated back then and still checks for _TIF_SIGPENDING alone when deciding whether to hit do_notify_resume() or not. Reported-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-09-18 17:04:37 +09:00
Laurent Pinchart	077664a264	sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path The sh_pfc_gpio_request_enable() function acquires a spinlock but fails to release it before returning if the requested mux type is not supported. Fix this. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-09-18 16:54:46 +09:00
Linus Torvalds	4651afbbae	Merge branch 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull another workqueue fix from Tejun Heo: "Unfortunately, yet another late fix. This too is discovered and fixed by Lai. This bug was introduced during this merge window by commit `25511a4776` ("workqueue: reimplement CPU online rebinding to handle idle workers") which started using WORKER_REBIND flag for idle rebind too. The bug is relatively easy to trigger if the CPU rapidly goes through off, on and then off (and stay off). The fix is on the safer side. This hasn't been on linux-next yet but I'm pushing early so that it can get more exposure before v3.6 release." * 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()	2012-09-17 16:05:23 -07:00
Lai Jiangshan	960bd11bf2	workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn() busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed (CPU is down again). This used to be okay because the flag wasn't used for anything else. However, after `25511a477` "workqueue: reimplement CPU online rebinding to handle idle workers", WORKER_REBIND is also used to command idle workers to rebind. If not cleared, the worker may confuse the next CPU_UP cycle by having REBIND spuriously set or oops / get stuck by prematurely calling idle_worker_rebind(). WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x5 00() Hardware name: Bochs Modules linked in: test_wq(O-) Pid: 33, comm: kworker/1:1 Tainted: G O 3.6.0-rc1-work+ #3 Call Trace: [<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20 [<ffffffff810b3f1d>] worker_thread+0x4cd/0x500 [<ffffffff810bc16e>] kthread+0xbe/0xd0 [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10 ---[ end trace e977cf20f4661968 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500 PGD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: test_wq(O-) CPU 0 Pid: 33, comm: kworker/1:1 Tainted: G W O 3.6.0-rc1-work+ #3 Bochs Bochs RIP: 0010:[<ffffffff810b3db0>] [<ffffffff810b3db0>] worker_thread+0x360/0x500 RSP: 0018:ffff88001e1c9de0 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580 R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900 FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900) Stack: ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010 ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900 ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340 Call Trace: [<ffffffff810bc16e>] kthread+0xbe/0xd0 [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10 Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b RIP [<ffffffff810b3db0>] worker_thread+0x360/0x500 RSP <ffff88001e1c9de0> CR2: 0000000000000000 There was no reason to keep WORKER_REBIND on failure in the first place - WORKER_UNBOUND is guaranteed to be set in such cases preventing incorrectly activating concurrency management. Always clear WORKER_REBIND. tj: Updated comment and description. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2012-09-17 15:42:31 -07:00
Linus Torvalds	08077ca849	Merge branch 'akpm' (Andrew's patch-bomb) Merge fixes from Andrew Morton: "13 patches. 12 are fixes and one is a little preparatory thing for Andi." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (13 commits) memory hotplug: fix section info double registration bug mm/page_alloc: fix the page address of higher page's buddy calculation drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe compiler.h: add __visible pid-namespace: limit value of ns_last_pid to (0, max_pid) include/net/sock.h: squelch compiler warning in sk_rmem_schedule() slub: consider pfmemalloc_match() in get_partial_node() slab: fix starting index for finding another object slab: do ClearSlabPfmemalloc() for all pages of slab nbd: clear waiting_queue on shutdown MAINTAINERS: fix TXT maintainer list and source repo path mm/ia64: fix a memory block size bug memory hotplug: reset pgdat->kswapd to NULL if creating kernel thread fails	2012-09-17 15:01:14 -07:00
qiuxishi	f14851af0e	memory hotplug: fix section info double registration bug There may be a bug when registering section info. For example, on my Itanium platform, the pfn range of node0 includes the other nodes, so other nodes' section info will be double registered, and memmap's page count will equal to 3. node0: start_pfn=0x100, spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00 node1: start_pfn=0x80000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x080000-0x100000 node2: start_pfn=0x100000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x100000-0x180000 node3: start_pfn=0x180000, spanned_pfn=0x80000, present_pfn=0x80000, => 0x180000-0x200000 free_all_bootmem_node() register_page_bootmem_info_node() register_page_bootmem_info_section() When hot remove memory, we can't free the memmap's page because page_count() is 2 after put_page_bootmem(). sparse_remove_one_section() free_section_usemap() free_map_bootmem() put_page_bootmem() [akpm@linux-foundation.org: add code comment] Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Li Haifeng	0ba8f2d593	mm/page_alloc: fix the page address of higher page's buddy calculation The heuristic method for buddy has been introduced since commit `43506fad21` ("mm/page_alloc.c: simplify calculation of combined index of adjacent buddy lists"). But the page address of higher page's buddy was wrongly calculated, which will lead page_is_buddy to fail for ever. IOW, the heuristic method would be disabled with the wrong page address of higher page's buddy. Calculating the page address of higher page's buddy should be based higher_page with the offset between index of higher page and index of higher page's buddy. Signed-off-by: Haifeng Li <omycle@gmail.com> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: KyongHo Cho <pullip.cho@samsung.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: <stable@vger.kernel.org> [2.6.38+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Kevin Hilman	8dcebaa9a0	drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe On some platforms, bootloaders are known to do some interesting RTC programming. Without going into the obscurities as to why this may be the case, suffice it to say the the driver should not make any assumptions about the state of the RTC when the driver loads. In particular, the driver probe should be sure that all interrupts are disabled until otherwise programmed. This was discovered when finding bursty I2C traffic every second on Overo platforms. This I2C overhead was keeping the SoC from hitting deep power states. The cause was found to be the RTC firing every second on the I2C-connected TWL PMIC. Special thanks to Felipe Balbi for suggesting to look for a rogue driver as the source of the I2C traffic rather than the I2C driver itself. Special thanks to Steve Sakoman for helping track down the source of the continuous RTC interrups on the Overo boards. Signed-off-by: Kevin Hilman <khilman@ti.com> Cc: Felipe Balbi <balbi@ti.com> Tested-by: Steve Sakoman <steve@sakoman.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Tested-by: Shubhrajyoti Datta <omaplinuxkernel@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Andi Kleen	9a858dc7ce	compiler.h: add __visible gcc 4.6+ has support for a externally_visible attribute that prevents the optimizer from optimizing unused symbols away. Add a __visible macro to use it with that compiler version or later. This is used (at least) by the "Link Time Optimization" patchset. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Andrew Vagin	579035dc5d	pid-namespace: limit value of ns_last_pid to (0, max_pid) The kernel doesn't check the pid for negative values, so if you try to write -2 to /proc/sys/kernel/ns_last_pid, you will get a kernel panic. The crash happens because the next pid is -1, and alloc_pidmap() will try to access to a nonexistent pidmap. map = &pid_ns->pidmap[pid/BITS_PER_PAGE]; Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Chuck Lever	35c448a8a3	include/net/sock.h: squelch compiler warning in sk_rmem_schedule() This warning: In file included from linux/include/linux/tcp.h:227:0, from linux/include/linux/ipv6.h:221, from linux/include/net/ipv6.h:16, from linux/include/linux/sunrpc/clnt.h:26, from linux/net/sunrpc/stats.c:22: linux/include/net/sock.h: In function `sk_rmem_schedule': linux/nfs-2.6/include/net/sock.h:1339:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] is seen with gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2) using the -Wextra option. Commit `c76562b670` ("netvm: prevent a stream-specific deadlock") accidentally replaced the "size" parameter of sk_rmem_schedule() with an unsigned int. This changes the semantics of the comparison in the return statement. In sk_wmem_schedule we have syntactically the same comparison, but "size" is a signed integer. In addition, __sk_mem_schedule() takes a signed integer for its "size" parameter, so there is an implicit type conversion in sk_rmem_schedule() anyway. Revert the "size" parameter back to a signed integer so that the semantics of the expressions in both sk_[rw]mem_schedule() are exactly the same. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Joonsoo Kim <js1304@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Joonsoo Kim	8ba00bb68a	slub: consider pfmemalloc_match() in get_partial_node() get_partial() is currently not checking pfmemalloc_match() meaning that it is possible for pfmemalloc pages to leak to non-pfmemalloc users. This is a problem in the following situation. Assume that there is a request from normal allocation and there are no objects in the per-cpu cache and no node-partial slab. In this case, slab_alloc enters the slow path and new_slab_objects() is called which may return a PFMEMALLOC page. As the current user is not allowed to access PFMEMALLOC page, deactivate_slab() is called ([5091b74a: mm: slub: optimise the SLUB fast path to avoid pfmemalloc checks]) and returns an object from PFMEMALLOC page. Next time, when we get another request from normal allocation, slab_alloc() enters the slow-path and calls new_slab_objects(). In new_slab_objects(), we call get_partial() and get a partial slab which was just deactivated but is a pfmemalloc page. We extract one object from it and re-deactivate. "deactivate -> re-get in get_partial -> re-deactivate" occures repeatedly. As a result, access to PFMEMALLOC page is not properly restricted and it can cause a performance degradation due to frequent deactivation. deactivation frequently. This patch changes get_partial_node() to take pfmemalloc_match() into account and prevents the "deactivate -> re-get in get_partial() scenario. Instead, new_slab() is called. Signed-off-by: Joonsoo Kim <js1304@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Joonsoo Kim	d014dc2ed4	slab: fix starting index for finding another object In array cache, there is a object at index 0, check it. Signed-off-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Mel Gorman	30c29bea6a	slab: do ClearSlabPfmemalloc() for all pages of slab Right now, we call ClearSlabPfmemalloc() for first page of slab when we clear SlabPfmemalloc flag. This is fine for most swap-over-network use cases as it is expected that order-0 pages are in use. Unfortunately it is possible that that __ac_put_obj() checks SlabPfmemalloc on a tail page and while this is harmless, it is sloppy. This patch ensures that the head page is always used. This problem was originally identified by Joonsoo Kim. [js1304@gmail.com: Original implementation and problem identification] Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
Paul Clements	fded4e090c	nbd: clear waiting_queue on shutdown Fix a serious but uncommon bug in nbd which occurs when there is heavy I/O going to the nbd device while, at the same time, a failure (server, network) or manual disconnect of the nbd connection occurs. There is a small window between the time that the nbd_thread is stopped and the socket is shutdown where requests can continue to be queued to nbd's internal waiting_queue. When this happens, those requests are never completed or freed. The fix is to clear the waiting_queue on shutdown of the nbd device, in the same way that the nbd request queue (queue_head) is already being cleared. Signed-off-by: Paul Clements <paul.clements@steeleye.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:37 -07:00
Gang Wei	e9b7d7c81d	MAINTAINERS: fix TXT maintainer list and source repo path Signed-off-by: Gang Wei <gang.wei@intel.com> Cc: Richard L Maliszewski <richard.l.maliszewski@intel.com> Cc: Gang Wei <gang.wei@intel.com> Cc: Shane Wang <shane.wang@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:37 -07:00

1 2 3 4 5 ...

323003 Commits All Branches Search

323003 Commits

All Branches