Commit Graph

526 Commits

Author SHA1 Message Date
Thomas Gleixner c365c292d0 sched: Consider pi boosting in setscheduler()
If a PI boosted task policy/priority is modified by a setscheduler()
call we unconditionally dequeue and requeue the task if it is on the
runqueue even if the new priority is lower than the current effective
boosted priority. This can result in undesired reordering of the
priority bucket list.

If the new priority is less or equal than the current effective we
just store the new parameters in the task struct and leave the
scheduler class and the runqueue untouched. This is handled when the
task deboosts itself. Only if the new priority is higher than the
effective boosted priority we apply the change immediately.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ Rebase ontop of v3.14-rc1. ]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Dario Faggioli <raistlin@linux.it>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1391803122-4425-7-git-send-email-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-22 18:10:04 +01:00
Andi Kleen 3ebae4f3a2 asmlinkage: Mark rwsem functions that can be called from assembler asmlinkage
Mark the rwsem functions that can be called from assembler asmlinkage.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391845930-28580-9-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13 18:13:37 -08:00
Andi Kleen 22d9fd3411 asmlinkage, mutex: Mark __visible
Various kernel/mutex.c functions can be called from
inline assembler, so they should be all global and
__visible.

Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391845930-28580-7-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13 18:13:19 -08:00
Andi Kleen b35f830533 asmlinkage: Make trace_hardirq visible
Can be called from assembler code.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391845930-28580-6-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13 18:13:07 -08:00
Andi Kleen 63f9a7fde7 asmlinkage: Make lockdep_sys_exit asmlinkage
lockdep_sys_exit can be called from assembler code, so make it
asmlinkage.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391845930-28580-5-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13 18:12:54 -08:00
Oleg Nesterov 34d0ed5ea7 lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate
The __lockdep_no_validate check in mark_held_locks() adds the subtle
and (afaics) unnecessary difference between no-validate and check==0.
And this looks even more inconsistent because __lock_acquire() skips
mark_irqflags()->mark_lock() if !check.

Change mark_held_locks() to check hlock->check instead.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140120182013.GA26505@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09 21:18:59 +01:00
Oleg Nesterov 1b5ff816ca lockdep: Don't create the wrong dependency on hlock->check == 0
Test-case:

	DEFINE_MUTEX(m1);
	DEFINE_MUTEX(m2);
	DEFINE_MUTEX(mx);

	void lockdep_should_complain(void)
	{
		lockdep_set_novalidate_class(&mx);

		// m1 -> mx -> m2
		mutex_lock(&m1);
		mutex_lock(&mx);
		mutex_lock(&m2);
		mutex_unlock(&m2);
		mutex_unlock(&mx);
		mutex_unlock(&m1);

		// m2 -> m1 ; should trigger the warning
		mutex_lock(&m2);
		mutex_lock(&m1);
		mutex_unlock(&m1);
		mutex_unlock(&m2);
	}

this doesn't trigger any warning, lockdep can't detect the trivial
deadlock.

This is because lock(&mx) correctly avoids m1 -> mx dependency, it
skips validate_chain() due to mx->check == 0. But lock(&m2) wrongly
adds mx -> m2 and thus m1 -> m2 is not created.

rcu_lock_acquire()->lock_acquire(check => 0) is fine due to read == 2,
so currently only __lockdep_no_validate__ can trigger this problem.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140120182010.GA26498@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09 21:18:57 +01:00
Oleg Nesterov fb9edbe984 lockdep: Make held_lock->check and "int check" argument bool
The "int check" argument of lock_acquire() and held_lock->check are
misleading. This is actually a boolean: 2 means "true", everything
else is "false".

And there is no need to pass 1 or 0 to lock_acquire() depending on
CONFIG_PROVE_LOCKING, __lock_acquire() checks prove_locking at the
start and clears "check" if !CONFIG_PROVE_LOCKING.

Note: probably we can simply kill this member/arg. The only explicit
user of check => 0 is rcu_lock_acquire(), perhaps we can change it to
use lock_acquire(trylock =>, read => 2). __lockdep_no_validate means
check => 0 implicitly, but we can change validate_chain() to check
hlock->instance->key instead. Not to mention it would be nice to get
rid of lockdep_set_novalidate_class().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140120182006.GA26495@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09 21:18:54 +01:00
Tim Chen e72246748f locking/mutexes/mcs: Restructure the MCS lock defines and locking code into its own file
We will need the MCS lock code for doing optimistic spinning for rwsem
and queued rwlock.  Extracting the MCS code from mutex.c and put into
its own file allow us to reuse this code easily.

We also inline mcs_spin_lock and mcs_spin_unlock functions
for better efficiency.

Note that using the smp_load_acquire/smp_store_release pair used in
mcs_lock and mcs_unlock is not sufficient to form a full memory barrier
across cpus for many architectures (except x86).  For applications that
absolutely need a full barrier across multiple cpus with mcs_unlock and
mcs_lock pair, smp_mb__after_unlock_lock() should be used after mcs_lock.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1390347360.3138.63.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-28 13:13:27 +01:00
Waiman Long aff7385b5a locking/mutexes/mcs: Correct barrier usage
This patch corrects the way memory barriers are used in the MCS lock
with smp_load_acquire and smp_store_release fucnctions.  The previous
barriers could leak critical sections if mcs lock is used by itself.
It is not a problem when mcs lock is embedded in mutex but will be an
issue when the mcs_lock is used elsewhere.

The patch removes the incorrect barriers and put in correct
barriers with the pair of functions smp_load_acquire and smp_store_release.

Suggested-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1390347353.3138.62.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-28 13:13:26 +01:00
Linus Torvalds a0fa1dd3cd Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:

 - Add the initial implementation of SCHED_DEADLINE support: a real-time
   scheduling policy where tasks that meet their deadlines and
   periodically execute their instances in less than their runtime quota
   see real-time scheduling and won't miss any of their deadlines.
   Tasks that go over their quota get delayed (Available to privileged
   users for now)

 - Clean up and fix preempt_enable_no_resched() abuse all around the
   tree

 - Do sched_clock() performance optimizations on x86 and elsewhere

 - Fix and improve auto-NUMA balancing

 - Fix and clean up the idle loop

 - Apply various cleanups and fixes

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  sched: Fix __sched_setscheduler() nice test
  sched: Move SCHED_RESET_ON_FORK into attr::sched_flags
  sched: Fix up attr::sched_priority warning
  sched: Fix up scheduler syscall LTP fails
  sched: Preserve the nice level over sched_setscheduler() and sched_setparam() calls
  sched/core: Fix htmldocs warnings
  sched/deadline: No need to check p if dl_se is valid
  sched/deadline: Remove unused variables
  sched/deadline: Fix sparse static warnings
  m68k: Fix build warning in mac_via.h
  sched, thermal: Clean up preempt_enable_no_resched() abuse
  sched, net: Fixup busy_loop_us_clock()
  sched, net: Clean up preempt_enable_no_resched() abuse
  sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding
  sched/preempt, locking: Rework local_bh_{dis,en}able()
  sched/clock, x86: Avoid a runtime condition in native_sched_clock()
  sched/clock: Fix up clear_sched_clock_stable()
  sched/clock, x86: Use a static_key for sched_clock_stable
  sched/clock: Remove local_irq_disable() from the clocks
  sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs
  ...
2014-01-20 10:42:08 -08:00
Dario Faggioli 2d3d891d33 sched/deadline: Add SCHED_DEADLINE inheritance logic
Some method to deal with rt-mutexes and make sched_dl interact with
the current PI-coded is needed, raising all but trivial issues, that
needs (according to us) to be solved with some restructuring of
the pi-code (i.e., going toward a proxy execution-ish implementation).

This is under development, in the meanwhile, as a temporary solution,
what this commits does is:

 - ensure a pi-lock owner with waiters is never throttled down. Instead,
   when it runs out of runtime, it immediately gets replenished and it's
   deadline is postponed;

 - the scheduling parameters (relative deadline and default runtime)
   used for that replenishments --during the whole period it holds the
   pi-lock-- are the ones of the waiting task with earliest deadline.

Acting this way, we provide some kind of boosting to the lock-owner,
still by using the existing (actually, slightly modified by the previous
commit) pi-architecture.

We would stress the fact that this is only a surely needed, all but
clean solution to the problem. In the end it's only a way to re-start
discussion within the community. So, as always, comments, ideas, rants,
etc.. are welcome! :-)

Signed-off-by: Dario Faggioli <raistlin@linux.it>
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
[ Added !RT_MUTEXES build fix. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 13:42:56 +01:00
Peter Zijlstra fb00aca474 rtmutex: Turn the plist into an rb-tree
Turn the pi-chains from plist to rb-tree, in the rt_mutex code,
and provide a proper comparison function for -deadline and
-priority tasks.

This is done mainly because:
 - classical prio field of the plist is just an int, which might
   not be enough for representing a deadline;
 - manipulating such a list would become O(nr_deadline_tasks),
   which might be to much, as the number of -deadline task increases.

Therefore, an rb-tree is used, and tasks are queued in it according
to the following logic:
 - among two -priority (i.e., SCHED_BATCH/OTHER/RR/FIFO) tasks, the
   one with the higher (lower, actually!) prio wins;
 - among a -priority and a -deadline task, the latter always wins;
 - among two -deadline tasks, the one with the earliest deadline
   wins.

Queueing and dequeueing functions are changed accordingly, for both
the list of a task's pi-waiters and the list of tasks blocked on
a pi-lock.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dario Faggioli <raistlin@linux.it>
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
Signed-off-again-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1383831828-15501-10-git-send-email-juri.lelli@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 13:41:50 +01:00
Chuansheng Liu 91f30a1702 mutexes: Give more informative mutex warning in the !lock->owner case
When mutex debugging is enabled and an imbalanced mutex_unlock()
is called, we get the following, slightly confusing warning:

  [  364.208284] DEBUG_LOCKS_WARN_ON(lock->owner != current)

But in that case the warning is due to an imbalanced mutex_unlock() call,
and the lock->owner is NULL - so the message is misleading.

So improve the message by testing for this case specifically:

   DEBUG_LOCKS_WARN_ON(!lock->owner)

Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1386136693.3650.48.camel@cliu38-desktop-build
[ Improved the changelog, changed the patch to use !lock->owner consistently. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-17 15:35:10 +01:00
Sasha Levin 8dce7a9a6f lockdep: Be nice about building from userspace
Lockdep is an awesome piece of code which detects locking issues
which are relevant both to userspace and kernelspace. We can
easily make lockdep work in userspace since there is really no
kernel spacific magic going on in the code.

All we need is to wrap two functions which are used by lockdep
and are very kernel specific.

Doing that will allow tools located in tools/ to easily utilize
lockdep's code for their own use.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: penberg@kernel.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/1352753446-24109-1-git-send-email-sasha.levin@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-27 11:55:20 +01:00
Fengguang Wu 5216d530bb locking/lockdep: Mark __lockdep_count_forward_deps() as static
There are new Sparse warnings:

  >> kernel/locking/lockdep.c:1235:15: sparse: symbol '__lockdep_count_forward_deps' was not declared. Should it be static?
  >> kernel/locking/lockdep.c:1261:15: sparse: symbol '__lockdep_count_backward_deps' was not declared. Should it be static?

Please consider folding the attached diff :-)

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/527d1787.ThzXGoUspZWehFDl\%fengguang.wu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-13 13:50:17 +01:00
Peter Zijlstra 838cc7b488 lockdep/proc: Fix lock-time avg computation
>    kernel/locking/lockdep_proc.c: In function 'seq_lock_time':
> >> kernel/locking/lockdep_proc.c:424:23: warning: comparison of distinct pointer types lacks a cast [enabled by default]
>
>    418	static void seq_lock_time(struct seq_file *m, struct lock_time *lt)
>    419	{
>    420		seq_printf(m, "%14lu", lt->nr);
>    421		seq_time(m, lt->min);
>    422		seq_time(m, lt->max);
>    423		seq_time(m, lt->total);
>  > 424		seq_time(m, lt->nr ? do_div(lt->total, lt->nr) : 0);
>    425	}

My compiler refuses to actually say that; but it looks wrong in that
do_div() returns the remainder, not the divisor.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Link: http://lkml.kernel.org/r/20131106164230.GE16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-11 12:41:34 +01:00
Peter Zijlstra 67a6de49bf locking/doc: Update references to kernel/mutex.c
Fix this docbook error:

  >> docproc: kernel/mutex.c: No such file or directory

by updating the stale references to kernel/mutex.c.

Reported-by: fengguang.wu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-34pikw1tlsskj65rrt5iusrq@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-11 12:41:33 +01:00
Peter Zijlstra 32cf7c3c94 locking: Move the percpu-rwsem code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-52bjmtty46we26hbfd9sc9iy@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 09:24:22 +01:00
Peter Zijlstra cd4d241d57 locking: Move the lglocks code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-amd6pg1mif6tikbyktfvby3y@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 09:24:20 +01:00
Peter Zijlstra ed428bfc3c locking: Move the rwsem code to kernel/locking/
Notably: changed lib/rwsem* targets from lib- to obj-, no idea about
the ramifications of that.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-g0kynfh5feriwc6p3h6kpbw6@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 09:24:18 +01:00
Peter Zijlstra 1696a8bee3 locking: Move the rtmutex code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-p9ijt8div0hwldexwfm4nlhj@git.kernel.org
[ Fixed build failure in kernel/rcu/tree_plugin.h. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 09:23:59 +01:00
Peter Zijlstra e25a64c401 locking: Move the semaphore core to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-vmw5sf6vzmua1z6nx1cg69h2@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:55:22 +01:00
Peter Zijlstra 60fc28746a locking: Move the spinlock code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-b81ol0z3mon45m51o131yc9j@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:55:21 +01:00
Peter Zijlstra 8eddac3f10 locking: Move the lockdep code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-wl7s3tta5isufzfguc23et06@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:55:08 +01:00
Peter Zijlstra 01768b42dc locking: Move the mutex code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-1ditvncg30dgbpvrz2bxfmke@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:55:07 +01:00