Commit Graph

521620 Commits

Author SHA1 Message Date
Nikolay Aleksandrov 1a040eaca1 bridge: fix multicast router rlist endless loop
Since the addition of sysfs multicast router support if one set
multicast_router to "2" more than once, then the port would be added to
the hlist every time and could end up linking to itself and thus causing an
endless loop for rlist walkers.
So to reproduce just do:
echo 2 > multicast_router; echo 2 > multicast_router;
in a bridge port and let some igmp traffic flow, for me it hangs up
in br_multicast_flood().
Fix this by adding a check in br_multicast_add_router() if the port is
already linked.
The reason this didn't happen before the addition of multicast_router
sysfs entries is because there's a !hlist_unhashed check that prevents
it.

Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Fixes: 0909e11758 ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10 22:07:50 -07:00
Erik Hugne b3be5e3e72 tipc: disconnect socket directly after probe failure
If the TIPC connection timer expires in a probing state, a
self abort message is supposed to be generated and delivered
to the local socket. This is currently broken, and the abort
message is actually sent out to the peer node with invalid
addressing information. This will cause the link to enter
a constant retransmission state and eventually reset.
We fix this by removing the self-abort message creation and
tear down connection immediately instead.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10 22:05:20 -07:00
Takashi Iwai bf06848bdb ALSA: hda - Continue probing even if i915 binding fails
Currently snd-hda-intel driver aborts the probing of Intel HD-audio
controller with i915 power well management when binding with i915
driver via hda_i915_init() fails.  This is no big problem for Haswell
and Broadwell where the HD-audio controllers are dedicated to
HDMI/DP, thus i915 link is mandatory.  However, Skylake, Baytrail and
Braswell have only one controller and both HDMI/DP and analog codecs
share the same bus.  Thus, even if HDMI/DP isn't usable, we should
keep the controller working for other codecs.

For fixing this, this patch simply allows continuing the probing even
if hda_i915_init() call fails.  This may leave stale sound components
for HDMI/DP devices that are unbound with graphics.  We could abort
the probing selectively, but from the code simplicity POV, it's better
to continue in all cases.

Reported-by: Libin Yang <libin.yang@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-06-11 06:51:19 +02:00
Linus Torvalds 2bfc60dd28 Fix build errors affecting blackfin and score architectures.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVeHo5AAoJEMsfJm/On5mBU8oQAJIAMPvFs0VqrzkgzILjDpmO
 sDhNwrEMbRSyTHZMvcuvAbMpeSYAeGu4KZ1oQlxxr18Sw0QJ9Asw79drhWI9+5ic
 pt6DBNaPbLlfAdbOgZaIdcETIUPE9SGPcmIMJLpVtgJ+9LmJnmSOWI358d1WFEJQ
 u45OQfWM5tLxMFwFeMj9dFrAGP9f6Z+lSbdp5RoOqFZ4S6Ad/5mwy3i09Hr+n5pY
 HUoSFd+Z/4WT2pFW3//x/Mwejy4FqMIp6OEk8XsrGz3S0wqKRyeuEuypLqBuRAjj
 IgfVtUwlIlmRs1on3/20qOb7O/O9BYceRtFYZ7OraIf/ljaGn97tUlkcBq0BW2IK
 MGLWppTngnpW5EJ6+h4qtF/KnWRPVNZKDC6hA4KMwBm8PVAdKr1bFxkG2dS3Ycz/
 Tn24s8EtaSszQLNWacUhP8JblVh7OQk7QXEKTN8QT31qJHcHpS+ZY+Fk7KHHTDud
 kta42WPau5GHQC4sVjmcqxdth1BpP2IWZAEUNdANTK8T1g1HOAcZqWGbYCBBiM2Q
 ClP1pJ54+W8uhw3W44hm/bIs20YQ6rKYZColK8xOL1ykNUjV6HVxqN9q8+clrlo0
 SQMWUtsCQn33SM48P+zE2V9NYLHaO4SSdSo6Li+KxUdU0u2IZG8IigJ+OzIWi43n
 G6Lub3GQQTZg1qix7PmQ
 =PGHg
 -----END PGP SIGNATURE-----

Merge tag 'misc-for-linus-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull misc fixes from Guenter Roeck:
 "There are two patches here.  One fixes a build error affecting the
  blackfin architecture, the other fixes a build error affecting the
  score architecture.

  The score maintainer (Lennox Wu) has a hard time sending you the score
  patch, and the blackfin maintainer (Steven Miao) has been silent since
  -rc1.  Since 4.1 is about to be released, I figured it would be useful
  to get the patches upstream to avoid the related build failures in the
  final release"

* tag 'misc-for-linus-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  score: Fix exception handler label
  blackfin: Fix build error
2015-06-10 17:16:32 -07:00
Linus Torvalds f5278565be Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "The gcc-4.4.4 workaround has actually been merged into a KVM tree by
  Paolo but it is stuck in linux-next and mainline needs it"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  arch/x86/kvm/mmu.c: work around gcc-4.4.4 bug
  sched, numa: do not hint for NUMA balancing on VM_MIXEDMAP mappings
  zsmalloc: fix a null pointer dereference in destroy_handle_cache()
  mm: memcontrol: fix false-positive VM_BUG_ON() on -rt
  checkpatch: fix "GLOBAL_INITIALISERS" test
  zram: clear disk io accounting when reset zram device
  memcg: do not call reclaim if !__GFP_WAIT
  mm/memory_hotplug.c: set zone->wait_table to null after freeing it
2015-06-10 16:43:53 -07:00
Andrew Morton 5ec45a192f arch/x86/kvm/mmu.c: work around gcc-4.4.4 bug
Fix this compile issue with gcc-4.4.4:

   arch/x86/kvm/mmu.c: In function 'kvm_mmu_pte_write':
   arch/x86/kvm/mmu.c:4256: error: unknown field 'cr0_wp' specified in initializer
   arch/x86/kvm/mmu.c:4257: error: unknown field 'cr4_pae' specified in initializer
   arch/x86/kvm/mmu.c:4257: warning: excess elements in union initializer
   ...

gcc-4.4.4 (at least) has issues when using anonymous unions in
initializers.

Fixes: edc90b7dc4 ("KVM: MMU: fix SMAP virtualization")
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
Mel Gorman 8e76d4eecf sched, numa: do not hint for NUMA balancing on VM_MIXEDMAP mappings
Jovi Zhangwei reported the following problem

  Below kernel vm bug can be triggered by tcpdump which mmaped a lot of pages
  with GFP_COMP flag.

  [Mon May 25 05:29:33 2015] page:ffffea0015414000 count:66 mapcount:1 mapping:          (null) index:0x0
  [Mon May 25 05:29:33 2015] flags: 0x20047580004000(head)
  [Mon May 25 05:29:33 2015] page dumped because: VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page))
  [Mon May 25 05:29:33 2015] ------------[ cut here ]------------
  [Mon May 25 05:29:33 2015] kernel BUG at mm/migrate.c:1661!
  [Mon May 25 05:29:33 2015] invalid opcode: 0000 [#1] SMP

In this case it was triggered by running tcpdump but it's not necessary
reproducible on all systems.

  sudo tcpdump -i bond0.100 'tcp port 4242' -c 100000000000 -w 4242.pcap

Compound pages cannot be migrated and it was not expected that such pages
be marked for NUMA balancing.  This did not take into account that drivers
such as net/packet/af_packet.c may insert compound pages into userspace
with vm_insert_page.  This patch tells the NUMA balancing protection
scanner to skip all VM_MIXEDMAP mappings which avoids the possibility that
compound pages are marked for migration.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Jovi Zhangwei <jovi@cloudflare.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
Sergey Senozhatsky 02f7b4145d zsmalloc: fix a null pointer dereference in destroy_handle_cache()
If zs_create_pool()->create_handle_cache()->kmem_cache_create() or
pool->name allocation fails, zs_create_pool()->destroy_handle_cache()
will dereference the NULL pool->handle_cachep.

Modify destroy_handle_cache() to avoid this.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
Johannes Weiner f371763a79 mm: memcontrol: fix false-positive VM_BUG_ON() on -rt
On -rt, the VM_BUG_ON(!irqs_disabled()) triggers inside the memcg
swapout path because the spin_lock_irq(&mapping->tree_lock) in the
caller doesn't actually disable the hardware interrupts - which is fine,
because on -rt the tophalves run in process context and so we are still
safe from preemption while updating the statistics.

Remove the VM_BUG_ON() but keep the comment of what we rely on.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Clark Williams <williams@redhat.com>
Cc: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
Joe Perches 5129e87cb8 checkpatch: fix "GLOBAL_INITIALISERS" test
Commit d5e616fc1c ("checkpatch: add a few more --fix corrections")
broke the GLOBAL_INITIALISERS test with bad parentheses and optional
leading spaces.

Fix it.

Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Bandan Das <bsd@makefile.in>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
Weijie Yang d7ad41a1c4 zram: clear disk io accounting when reset zram device
Clear zram disk io accounting when resetting the zram device.  Otherwise
the residual io accounting stat will affect the diskstat in the next
zram active cycle.

Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
Vladimir Davydov 7d638093d4 memcg: do not call reclaim if !__GFP_WAIT
When trimming memcg consumption excess (see memory.high), we call
try_to_free_mem_cgroup_pages without checking if we are allowed to sleep
in the current context, which can result in a deadlock.  Fix this.

Fixes: 241994ed86 ("mm: memcontrol: default hierarchy interface for memory")
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
Gu Zheng 85bd839983 mm/memory_hotplug.c: set zone->wait_table to null after freeing it
Izumi found the following oops when hot re-adding a node:

    BUG: unable to handle kernel paging request at ffffc90008963690
    IP: __wake_up_bit+0x20/0x70
    Oops: 0000 [#1] SMP
    CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
    Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
    task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
    RIP: 0010:[<ffffffff810dff80>]  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
    RSP: 0018:ffff880017b97be8  EFLAGS: 00010246
    RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
    RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
    RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
    R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
    FS:  00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
    Call Trace:
      unlock_page+0x6d/0x70
      generic_write_end+0x53/0xb0
      xfs_vm_write_end+0x29/0x80 [xfs]
      generic_perform_write+0x10a/0x1e0
      xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
      xfs_file_write_iter+0x79/0x120 [xfs]
      __vfs_write+0xd4/0x110
      vfs_write+0xac/0x1c0
      SyS_write+0x58/0xd0
      system_call_fastpath+0x12/0x76
    Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
    RIP  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
     RSP <ffff880017b97be8>
    CR2: ffffc90008963690

Reproduce method (re-add a node)::
  Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)

This seems an use-after-free problem, and the root cause is
zone->wait_table was not set to *NULL* after free it in
try_offline_node.

When hot re-add a node, we will reuse the pgdat of it, so does the zone
struct, and when add pages to the target zone, it will init the zone
first (including the wait_table) if the zone is not initialized.  The
judgement of zone initialized is based on zone->wait_table:

	static inline bool zone_is_initialized(struct zone *zone)
	{
		return !!zone->wait_table;
	}

so if we do not set the zone->wait_table to *NULL* after free it, the
memory hotplug routine will skip the init of new zone when hot re-add
the node, and the wait_table still points to the freed memory, then we
will access the invalid address when trying to wake up the waiting
people after the i/o operation with the page is done, such as mentioned
above.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-10 16:43:43 -07:00
David S. Miller 1b0ccfe54a Revert "ipv6: Fix protocol resubmission"
This reverts commit 0243508edd.

It introduces new regressions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10 15:29:31 -07:00
Guenter Roeck 80524e9336 score: Fix exception handler label
The latest version of modinfo fails to compile score architecture
targets with the following error.

FATAL: The relocation at __ex_table+0x634 references
section "__ex_table" which is not executable, IOW
the kernel will fault if it ever tries to
jump to it.  Something is seriously wrong
and should be fixed.

The probem is caused by a bad label in an __ex_table entry.

Acked-by: Lennox Wu <lennox.wu@gmail.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2015-06-10 10:19:47 -07:00
Guenter Roeck 5eea900389 blackfin: Fix build error
Fix

include/asm-generic/io.h: In function 'readb':
include/asm-generic/io.h:113:2: error:
	implicit declaration of function 'bfin_read8'
include/asm-generic/io.h: In function 'readw':
include/asm-generic/io.h:121:2: error:
	implicit declaration of function 'bfin_read16'
include/asm-generic/io.h: In function 'readl':
include/asm-generic/io.h:129:2: error:
	implicit declaration of function 'bfin_read32'
include/asm-generic/io.h: In function 'writeb':
include/asm-generic/io.h:147:2: error:
	implicit declaration of function 'bfin_write8'
include/asm-generic/io.h: In function 'writew':
include/asm-generic/io.h:155:2: error:
	implicit declaration of function 'bfin_write16'
include/asm-generic/io.h: In function 'writel':
include/asm-generic/io.h:163:2: error:
	implicit declaration of function 'bfin_write32'

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 1a3372bc52 ("blackfin: io: define __raw_readx/writex with
	bfin_readx/writex")
Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2015-06-10 10:19:24 -07:00
Peter Zijlstra 5610032135 perf record: Amend option summaries
Because there's too many options and I cannot read, I frequently get
confused between -c and -P, and try to do things like:

  perf record -P 50000 -- foo

Which does not work; try and make the option description slightly longer
and hopefully less confusing.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150610144850.GP19282@twins.programming.kicks-ass.net
[ Do those changes on the man page as well ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-10 12:00:27 -03:00
Milos Vyletel d7c72606d9 perf tools: Avoid possible race condition in copyfile()
Use unique temporary files when copying to buildid dir to prevent races
in case multiple instances are trying to copy same file. This is done by

- creating template in form <path>/.<filename>.XXXXXX where the suffix is
  used by mkstemp() to create unique file
- change file mode
- copy content
- if successful link temp file to target file
- unlink temp file

At this point the only file left at target path should be the desired
one either created by us or other instance if we raced. This should also
prevent not yet fully copied files to be visible to to other perf
instances that could try to parse them.

On top of that slow_copyfile no longer needs to deal with file mode when
creating file since temporary file is already created and mode is set.

Succesfully tested by myself by running perf record, archive and reading
the data on other system and by running perf buildid-cache on perf
binary itself. I also did revert fix from 0635b0f that to exposes
previously fixed race with EEXIST and recreator test passed sucessfully.

Signed-off-by: Milos Vyletel <milos@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1433775018-19868-1-git-send-email-milos@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-10 11:51:24 -03:00
Joe Perches 45bbfe64ea clocksource: Use current logging style
clocksource messages aren't prefixed in dmesg so it's a bit unclear
what subsystem emits the messages.

Use pr_fmt and pr_<level> to auto-prefix the messages appropriately.

Miscellanea:

o Remove "Warning" from KERN_WARNING level messages
o Align "timekeeping watchdog: " messages
o Coalesce formats
o Align multiline arguments

Signed-off-by: Joe Perches <joe@perches.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/1432579795.2846.75.camel@perches.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-10 11:31:14 +02:00
Nicholas Mc Guire c569a23d65 time: Allow gcc to fold usecs_to_jiffies(constant)
To allow constant folding in usecs_to_jiffies() conditionally calls
the HZ dependent _usecs_to_jiffies() helpers or, when gcc can not
figure out constant folding, __usecs_to_jiffies, which is the renamed
original usecs_to_jiffies() function.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Cc: Masahiro Yamada <yamada.m@jp.panasonic.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Joe Perches <joe@perches.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Andrew Hunter <ahh@google.com>
Cc: Paul Turner <pjt@google.com>
Cc: Michal Marek <mmarek@suse.cz>
Link: http://lkml.kernel.org/r/1432832996-12129-2-git-send-email-hofrat@osadl.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-10 11:31:14 +02:00
Nicholas Mc Guire ae60d6a0e3 time: Refactor usecs_to_jiffies
Refactor the usecs_to_jiffies conditional code part in time.c and
jiffies.h putting it into conditional functions rather than #ifdefs
to improve readability. This is analogous to the msecs_to_jiffies()
cleanup in commit ca42aaf0c8 ("time: Refactor msecs_to_jiffies")

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Cc: Masahiro Yamada <yamada.m@jp.panasonic.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Joe Perches <joe@perches.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Andrew Hunter <ahh@google.com>
Cc: Paul Turner <pjt@google.com>
Cc: Michal Marek <mmarek@suse.cz>
Link: http://lkml.kernel.org/r/1432832996-12129-1-git-send-email-hofrat@osadl.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-10 11:31:13 +02:00
Ralf Baechle 9cc719ab3f MIPS: MSA: bugfix - disable MSA correctly for new threads/processes.
Due to the slightly odd way that new threads and processes start execution
when scheduled for the very first time they were bypassing the required
disable_msa call.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-06-10 11:24:53 +02:00
Ralf Baechle d9fb566045 MIPS: Loongson: Do not register 8250 platform device from module.
If CONFIG_SERIAL_8250 is set to m, the Loongson seria.ko module might get
unloaded while the serial driver modules are still loaded resulting in
stale references to the destroyed platform_device instance.

Anyway, platform devices should always be registered indicated what
devices are present, _not_ what drivers have been configured.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Patchwork: https://patchwork.linux-mips.org/patch/10538/
2015-06-10 10:43:07 +02:00
Ralf Baechle 34b1252bd9 MIPS: Cobalt: Do not build MTD platform device registration code as module.
If CONFIG_MTD_PHYSMAP is set to m, the Cobalt mtd.ko module might get
unloaded while the drivers/mtd modules are still loaded resulting in
stale references to the destroyed platform_device instance.

Anyway, platform devices should always be registered indicated what
devices are present, _not_ what drivers have been configured.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-06-10 10:39:30 +02:00
Takashi Iwai 98a226ed21 ALSA: hda - Don't actually write registers for caps overwrites
Along with the transition to regmap for managing the cached parameter
reads, the caps overwrite was also moved to regmap cache.  The cache
change itself works, but it still tries to write the non-existing verb
(the HDA parameter is read-only) wrongly.  It's harmless in most
cases, but some chips are picky and may result in the codec
communication stall.

This patch avoids it just by adding the missing flag check in
reg_write ops.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-06-10 10:31:10 +02:00
Andy Lutomirski 539f511365 x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
The error_entry/error_exit code to handle gsbase and whether we
return to user mdoe was a mess:

 - error_sti was misnamed.  In particular, it did not enable interrupts.

 - Error handling for gs_change was hopelessly tangled the normal
   usermode path.  Separate it out.  This saves a branch in normal
   entries from kernel mode.

 - The comments were bad.

Fix it up.  As a nice side effect, there's now a code path that
happens on error entries from user mode.  We'll use it soon.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f1be898ab93360169fb845ab85185948832209ee.1433878454.git.luto@kernel.org
[ Prettified it, clarified comments some more. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-10 08:47:57 +02:00
Denys Vlasenko a92fde2523 x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
We use three MOVs to swap edx and ecx. We can use one XCHG
instead.

Expand the comments. It's difficult to keep track which arg#
every register corresponds to, so spell it out.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433876051-26604-3-git-send-email-dvlasenk@redhat.com
[ Expanded the comments some more. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-10 08:42:13 +02:00
Denys Vlasenko 1536bb46fa x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
Here it is not obvious why we load pt_regs->cx to %esi etc.
Lets improve comments.

Explain that here we combine two things: first, we reload
registers since some of them are clobbered by the C function we
just called; and we also convert 32-bit syscall params to 64-bit
C ABI, because we are going to jump back to syscall dispatch
code.

Move reloading of 6th argument into the macro instead of having
it after each of two macro invocations.

No actual code changes here.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433876051-26604-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-10 08:42:13 +02:00
Denys Vlasenko aee4b013a7 x86/asm/entry/32: Fix fallout from the R9 trick removal in the SYSCALL code
I put %ebp restoration code too late. Under strace, it is not
reached and %ebp is not restored upon return to userspace.

This is the fix. Run-tested.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433876051-26604-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-10 08:42:12 +02:00
Linus Torvalds e64f638483 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input layer fix from Dmitry Torokhov:
 "A small tweak for the Synaptics PS/2 touchpad driver"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics - add min/max quirk for Lenovo S540
2015-06-09 15:05:27 -07:00
Ming Lei c3b4afca70 blk-mq: free hctx->ctxs in queue's release handler
Now blk_cleanup_queue() can be called before calling
del_gendisk()[1], inside which hctx->ctxs is touched
from blk_mq_unregister_hctx(), but the variable has
been freed by blk_cleanup_queue() at that time.

So this patch moves freeing of hctx->ctxs into queue's
release handler for fixing the oops reported by Stefan.

[1], 6cd18e711d (block: destroy bdi before blockdev is
unregistered)

Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org (v4.0)
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-09 15:32:38 -06:00
Johannes Berg 9c5a18a31b cfg80211: wext: clear sinfo struct before calling driver
Until recently, mac80211 overwrote all the statistics it could
provide when getting called, but it now relies on the struct
having been zeroed by the caller. This was always the case in
nl80211, but wext used a static struct which could even cause
values from one device leak to another.

Using a static struct is OK (as even documented in a comment)
since the whole usage of this function and its return value is
always locked under RTNL. Not clearing the struct for calling
the driver has always been wrong though, since drivers were
free to only fill values they could report, so calling this
for one device and then for another would always have leaked
values from one to the other.

Fix this by initializing the structure in question before the
driver method call.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=99691

Cc: stable@vger.kernel.org
Reported-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Reported-by: Alexander Kaltsas <alexkaltsas@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-09 13:54:58 -07:00
Hauke Mehrtens 36f6ea2b21 SSB: Fix handling of ssb_pmu_get_alp_clock()
Dan Carpenter reported missing brackets which resulted in reading a
wrong crystalfreq value. I also noticed that the result of this
function is ignored.

Reported-By: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Michael Buesch <m@bues.ch>
Cc: davem@davemloft.net
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10536/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-06-09 16:38:06 +02:00
David Woodhouse bd00c606a6 iommu/vt-d: Change PASID support to bit 40 of Extended Capability Register
The existing hardware implementations with PASID support advertised in
bit 28? Forget them. They do not exist. Bit 28 means nothing. When we
have something that works, it'll use bit 40. Do not attempt to infer
anything meaningful from bit 28.

This will be reflected in an updated VT-d spec in the extremely near
future.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2015-06-09 15:06:55 +01:00
Dave Hansen 97ac46a508 x86/mpx: Allow 32-bit binaries on 64-bit kernels again
Now that the bugs in mixed mode MPX handling are fixed, re-allow
32-bit binaries on 64-bit kernels again.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183706.70277DAD@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:34 +02:00
Dave Hansen bea03c50b8 x86/mpx: Do not count MPX VMAs as neighbors when unmapping
The comment pretty much says it all.

I wrote a test program that does lots of random allocations
and forces bounds tables to be created.  It came up with a
layout like this:

  ....   | BOUNDS DIRECTORY ENTRY COVERS |  ....
         |    BOUNDS TABLE COVERS        |
|  BOUNDS TABLE |  REAL ALLOC | BOUNDS TABLE |

Unmapping "REAL ALLOC" should have been able to free the
bounds table "covering" the "REAL ALLOC" because it was the
last real user.  But, the neighboring VMA bounds tables were
found, considered as real neighbors, and we declined to free
the bounds table covering the area.

Doing this over and over left a small but significant number
of these orphans.  Handling them is fairly straighforward.
All we have to do is walk the VMAs and skip all of the MPX
ones when looking for neighbors.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183706.A6BD90BF@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:34 +02:00
Dave Hansen 3ceaccdf92 x86/mpx: Rewrite the unmap code
The MPX code needs to clear out bounds tables for memory which
is no longer in use.  We do this when a userspace mapping is
torn down (unmapped).

There are two modes:

  1. An entire bounds table becomes unused, and can be freed
     and its pointer removed from the bounds directory.  This
     happens either when a large mapping is torn down, or when
     a small mapping is torn down and it is the last mapping
     "covered" by a bounds table.

  2. Only part of a bounds table becomes unused, in which case
     we free the backing memory as if MADV_DONTNEED was called.

The old code was a spaghetti mess of "edge" bounds tables
where the edges were handled specially, even if we were
unmapping an entire one.  Non-edge bounds tables are always
fully unmapped, but share a different code path from the edge
ones.  The old code had a bug where it was unmapping too much
memory.  I worked on fixing it for two days and gave up.

I didn't write the original code.  I didn't particularly like
it, but it worked, so I left it.  After my debug session, I
realized it was undebuggagle *and* buggy, so out it went.

I also wrote a new unmapping test program which uncovers bugs
pretty nicely.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183706.DCAEC67D@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:34 +02:00
Dave Hansen 613fcb7d3c x86/mpx: Support 32-bit binaries on 64-bit kernels
Right now, the kernel can only switch between 64-bit and 32-bit
binaries at compile time. This patch adds support for 32-bit
binaries on 64-bit kernels when we support ia32 emulation.

We essentially choose which set of table sizes to use when doing
arithmetic for the bounds table calculations.

This also uses a different approach for calculating the table
indexes than before.  I think the new one makes it much more
clear what is going on, and allows us to share more code between
the 32-bit and 64-bit cases.

Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183705.E01F21E2@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:34 +02:00
Dave Hansen 6ac52bb491 x86/mpx: Use 32-bit-only cmpxchg() for 32-bit apps
user_atomic_cmpxchg_inatomic() actually looks at sizeof(*ptr) to
figure out how many bytes to copy.  If we run it on a 64-bit
kernel with a 64-bit pointer, it will copy a 64-bit bounds
directory entry.  That's fine, except when we have 32-bit
programs with 32-bit bounds directory entries and we only *want*
32-bits.

This patch breaks the cmpxchg() operation out in to its own
function and performs the 32-bit type swizzling in there.

Note, the "64-bit" version of this code _would_ work on a
32-bit-only kernel.  The issue this patch addresses is only for
when the kernel's 'long' is mismatched from the size of the
bounds directory entry of the process we are working on.

The new helper modifies 'actual_old_val' or returns an error.
But gcc doesn't know this, so it warns about 'actual_old_val'
being unused.  Shut it up with an uninitialized_var().

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183705.672B115E@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:33 +02:00
Dave Hansen 5458765390 x86/mpx: Introduce new 'directory entry' to 'addr' helper function
Currently, to get from a bounds directory entry to the virtual
address of a bounds table, we simply mask off a few low bits.
However, the set of bits we mask off is different for 32-bit and
64-bit binaries.

This breaks the operation out in to a helper function and also
adds a temporary variable to store the result until we are
sure we are returning one.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183704.007686CE@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:33 +02:00
Dave Hansen a1149fc83a x86/mpx: Add temporary variable to reduce masking
When we allocate a bounds table, we call mmap(), then add a
"valid" bit to the value before storing it in to the bounds
directory.

If we fail along the way, we go and mask that valid bit
_back_ out.  That seems a little silly, and this makes it
much more clear when we have a plain address versus an
actual table _entry_.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183704.3D69D5F4@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:33 +02:00
Dave Hansen b0e9b09b3b x86: Make is_64bit_mm() widely available
The uprobes code has a nice helper, is_64bit_mm(), that consults
both the runtime and compile-time flags for 32-bit support.
Instead of reinventing the wheel, pull it in to an x86 header so
we can use it for MPX.

I prefer passing the 'mm' around to test_thread_flag(TIF_IA32)
because it makes it explicit where the context is coming from.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183704.F0209999@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:32 +02:00
Dave Hansen cd4996dce1 x86/mpx: Trace allocation of new bounds tables
Bounds tables are a significant consumer of memory.  It is
important to know when they are being allocated.  Add a trace
point to trace whenever an allocation occurs and also its
virtual address.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183704.EC23A93E@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:32 +02:00
Dave Hansen 2a1dcb1f79 x86/mpx: Trace the attempts to find bounds tables
There are two different events being traced here.  They are
doing similar things so share a trace "EVENT_CLASS" and are
presented together.

1. Trace when MPX is zapping pages "mpx_unmap_zap":

	When MPX can not free an entire bounds table, it will
	instead try to zap unused parts of a bounds table to free
	the backing memory.  This decreases RSS (resident set
	size) without decreasing the virtual space allocated
	for bounds tables.

2. Trace attempts to find bounds tables "mpx_unmap_search":

	This event traces any time we go looking to unmap a
	bounds table for a given virtual address range.  This is
	useful to ensure that the kernel actually "tried" to free
	a bounds table versus times it succeeded in finding one.

	It might try and fail if it realized that a table was
	shared with an adjacent VMA which is not being unmapped.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183703.B9D2468B@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:32 +02:00
Dave Hansen 97efebf1bc x86/mpx: Trace entry to bounds exception paths
There are two basic things that can happen as the result of
a bounds exception (#BR):

	1. We allocate a new bounds table
	2. We pass up a bounds exception to userspace.

This patch adds a trace point for the case where we are
passing the exception up to userspace with a signal.

We are also explicit that we're printing out the inverse of
the 'upper' that we encounter.  If you want to filter, for
instance, you need to ~ the value first.  The reason we do
this is because of how 'upper' is stored in the bounds table.

If a pointer's range is:

	0x1000 -> 0x2000

it is stored in the bounds table as (32-bits here for brevity):

	lower: 0x00001000
	upper: 0xffffdfff

That is so that an all 0's entry:

	lower: 0x00000000
	upper: 0x00000000

corresponds to the "init" bounds which store a *range* of:

	0x00000000 -> 0xffffffff

That is, by far, the common case, and that lets us use the
zero page, or deduplicate the memory, etc... The 'upper'
stored in the table is gibberish to print by itself, so we
print ~upper to get the *actual*, logical, human-readable
value printed out.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183703.027BB9B0@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:32 +02:00
Dave Hansen e7126cf5f1 x86/mpx: Trace #BR exceptions
This is the first in a series of MPX tracing patches.
I've found these extremely useful in the process of
debugging applications and the kernel code itself.

This exception hooks in to the bounds (#BR) exception
very early and allows capturing the key registers which
would influence how the exception is handled.

Note that bndcfgu/bndstatus are technically still
64-bit registers even in 32-bit mode.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183703.5FE2619A@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:31 +02:00
Dave Hansen 8c3641e957 x86/mpx: Introduce a boot-time disable flag
MPX has the _potential_ to cause some issues.  Say part of your
init system tried to protect one of its components from buffer
overflows with MPX.  If there were a false positive, it's
possible that MPX could keep a system from booting.

MPX could also potentially cause performance issues since it is
present in hot paths like the unmap path.

Allow it to be disabled at boot time.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150607183702.2E8B77AB@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:31 +02:00
Dave Hansen eb099e5bc5 x86/mpx: Restrict the mmap() size check to bounds tables
The comment and code here are confusing.  We do not currently
allocate the bounds directory in the kernel.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183702.222CEC2A@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:31 +02:00
Qiaowei Ren 3c1d323009 x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK
MPX_BNDCFG_ADDR_MASK is defined two times, so this patch removes
redundant one.

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183702.5F129376@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:30 +02:00
Dave Hansen 46a6e0cf1c x86/mpx: Clean up the code by not passing a task pointer around when unnecessary
The MPX code can only work on the current task.  You can not,
for instance, enable MPX management in another process or
thread. You can also not handle a fault for another process or
thread.

Despite this, we pass a task_struct around prolifically.  This
patch removes all of the task struct passing for code paths
where the code can not deal with another task (which turns out
to be all of them).

This has no functional changes.  It's just a cleanup.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/20150607183702.6A81DA2C@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09 12:24:30 +02:00