Commit Graph

693161 Commits

Author SHA1 Message Date
Pawel Baldysiak 675dc2ccc2 raid5-ppl: Recovery support for multiple partial parity logs
Search PPL buffer in order to find out the latest PPL header (the one
with largest generation number) and use it for recovery. The PPL entry
format and recovery algorithm are the same as for single PPL approach.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-28 07:45:49 -07:00
Pawel Baldysiak ddc088238c md: Runtime support for multiple ppls
Increase PPL area to 1MB and use it as circular buffer to store PPL. The
entry with highest generation number is the latest one. If PPL to be
written is larger then space left in a buffer, rewind the buffer to the
start (don't wrap it).

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-28 07:45:48 -07:00
Shaohua Li 8a8e6f84ad md/raid0: attach correct cgroup info in bio
The discard bio doesn't attach the original bio cgroup info. Normal bio
is cloned, so is fine.

Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-25 10:21:48 -07:00
Denys Vlasenko b5e0fff19b lib/raid6: align AVX512 constants to 512 bits, not bytes
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: mingo@redhat.com
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Megha Dey <megha.dey@linux.intel.com>
Cc: Gayatri Kammela <gayatri.kammela@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-25 10:21:47 -07:00
Guoqing Jiang 27a4ff8f49 raid5: remove raid5_build_block
Now raid5_build_block is just called to set the
sector of r5dev, raid5_compute_blocknr can be
used directly for the purpose.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-25 10:21:47 -07:00
Song Liu a72cbf83b0 md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_show
In r5c_journal_mode_show(), it is necessary to call mddev_lock()
before accessing conf and conf->log. Otherwise, the conf->log
may change (and become NULL).

Signed-off-by: Song Liu <songliubraving@fb.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-25 10:21:46 -07:00
Cihangir Akturk 26e13043b7 md: replace seq_release_private with seq_release
Since commit f15146380d ("fs: seq_file - add event counter to simplify
poll() support"), md.c code has been no longer used the private field of
the struct seq_file, but seq_release_private() has been continued to be
used to release the allocated seq_file. This seems to have been
forgotten. So convert it to use seq_release() instead of
seq_release_private().

Signed-off-by: Cihangir Akturk <cakturk@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-25 10:21:45 -07:00
Alexey Obitotskiy 5492c46e94 md: notify about new spare disk in the container
In case of external metadata arrays spare disks are added to containers
first. mdadm keeps monitoring /proc/mdstat output and when spare disk is
available, it moves it from the container to the array. The problem is
there is no notification of new spare disk in the container and mdadm
waits a long time (until timeout) before it takes the action.

Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-25 10:21:45 -07:00
Shaohua Li 208410b546 md/raid1/10: reset bio allocated from mempool
Data allocated from mempool doesn't always get initialized, this happens when
the data is reused instead of fresh allocation. In the raid1/10 case, we must
reinitialize the bios.

Reported-by: Jonathan G. Underwood <jonathan.underwood@gmail.com>
Fixes: f0250618361d(md: raid10: don't use bio's vec table to manage resync pages)
Fixes: 98d30c5812c3(md: raid1: don't use bio's vec table to manage resync pages)
Cc: stable@vger.kernel.org (4.12+)
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-25 10:21:44 -07:00
Song Liu 9c72a18e46 md/raid5: release/flush io in raid5_do_work()
In raid5, there are scenarios where some ios are deferred to a later
time, and some IO need a flush to complete. To make sure we make
progress with these IOs, we need to call the following functions:

    flush_deferred_bios(conf);
    r5l_flush_stripe_to_raid(conf->log);

Both of these functions are called in raid5d(), but missing in
raid5_do_work(). As a result, these functions are not called
when multi-threading (group_thread_cnt > 0) is enabled. This patch
adds calls to these function to raid5_do_work().

Note for stable branches:

  r5l_flush_stripe_to_raid(conf->log) is need for 4.4+
  flush_deferred_bios(conf) is only needed for 4.11+

Cc: stable@vger.kernel.org (4.4+)
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-24 10:06:19 -07:00
Shaohua Li 8031c3ddc7 md/bitmap: copy correct data for bitmap super
raid5 cache could write bitmap superblock before bitmap superblock is
initialized. The bitmap superblock is less than 512B. The current code will
only copy the superblock to a new page and write the whole 512B, which will
zero the the data after the superblock. Unfortunately the data could include
bitmap, which we should preserve. The patch will make superblock read do 4k
chunk and we always copy the 4k data to new page, so the superblock write will
old data to disk and we don't change the bitmap.

Reported-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Cc: stable@vger.kernel.org (4.10+)
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-24 10:04:54 -07:00
Linus Torvalds 143c97cc65 Revert "pty: fix the cached path of the pty slave file descriptor in the master"
This reverts commit c8c03f1858.

It turns out that while fixing the ptmx file descriptor to have the
correct 'struct path' to the associated slave pty is a really good
thing, it breaks some user space tools for a very annoying reason.

The problem is that /dev/ptmx and its associated slave pty (/dev/pts/X)
are on different mounts.  That was what caused us to have the wrong path
in the first place (we would mix up the vfsmount of the 'ptmx' node,
with the dentry of the pty slave node), but it also means that now while
we use the right vfsmount, having the pty master open also keeps the pts
mount busy.

And it turn sout that that makes 'pbuilder' very unhappy, as noted by
Stefan Lippers-Hollmann:

 "This patch introduces a regression for me when using pbuilder
  0.228.7[2] (a helper to build Debian packages in a chroot and to
  create and update its chroots) when trying to umount /dev/ptmx (inside
  the chroot) on Debian/ unstable (full log and pbuilder configuration
  file[3] attached).

  [...]
  Setting up build-essential (12.3) ...
  Processing triggers for libc-bin (2.24-15) ...
  I: unmounting dev/ptmx filesystem
  W: Could not unmount dev/ptmx: umount: /var/cache/pbuilder/build/1340/dev/ptmx: target is busy
          (In some cases useful info about processes that
           use the device is found by lsof(8) or fuser(1).)"

apparently pbuilder tries to unmount the /dev/pts filesystem while still
holding at least one master node open, which is arguably not very nice,
but we don't break user space even when fixing other bugs.

So this commit has to be reverted.

I'll try to figure out a way to avoid caching the path to the slave pty
in the master pty.  The only thing that actually wants that slave pty
path is the "TIOCGPTPEER" ioctl, and I think we could just recreate the
path at that time.

Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Eric W Biederman <ebiederm@xmission.com>
Cc: Christian Brauner <christian.brauner@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-23 18:16:11 -07:00
Linus Torvalds 2acf097f16 Late arm64 fixes:
- Fix very early boot failures with KASLR enabled
 
 - Fix fatal signal handling on userspace access from kernel
 
 - Fix leakage of floating point register state across exec()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJZnUeIAAoJELescNyEwWM0hH8IALpwELGRIKFkHYCSnBBjyHUl
 SfoWSJJ8Q9X8filHk5DakfM8wTcsbwlk6XpCwqx+hbETGDq8Zz8eKlzJvg0ARpND
 /Z6H3nhp3Z1MIV0nkn10XLgbKNwl7/512lTaO+TfqiIXG7fLZh5+zWBlHMcvDuNb
 RAy8AVNnYOfiqB4tRupZ8MoRerVi8PHPUpPY/FB1NeGoD0nNIl/lopKRwaD+XXiS
 KDfnZd4jAs8y71iaOSidybyNFQ7T++MvZsGx4eLB86MY4IBihxBWQojvtNp7Pptp
 H50IFvSYKG4LXTYphZUbWriW600PGHO4oVjeY1KaZsgAhtIsegqi1SH75ulXe70=
 =ES28
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Late arm64 fixes.

  They fix very early boot failures with KASLR where the early mapping
  of the kernel is incorrect, so the failure mode looks like a hang with
  no output. There's also a signal-handling fix when a uaccess routine
  faults with a fatal signal pending, which could be used to create
  unkillable user tasks using userfaultfd and finally a state leak fix
  for the floating pointer registers across a call to exec().

  We're still seeing some random issues crop up (inode memory corruption
  and spinlock recursion) but we've not managed to reproduce things
  reliably enough to debug or bisect them yet.

  Summary:

   - Fix very early boot failures with KASLR enabled

   - Fix fatal signal handling on userspace access from kernel

   - Fix leakage of floating point register state across exec()"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kaslr: Adjust the offset to avoid Image across alignment boundary
  arm64: kaslr: ignore modulo offset when validating virtual displacement
  arm64: mm: abort uaccess retries upon fatal signal
  arm64: fpsimd: Prevent registers leaking across exec
2017-08-23 12:05:46 -07:00
Linus Torvalds a67ca1e9bd GPIO fixes for the v4.13 series:
- An important core fix to reject invalid GPIOs *before* trying
   to obtain a GPIO descriptor for it.
 
 - A driver fix for the mvebu driver IRQ handling.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZnT/4AAoJEEEQszewGV1zt4oP/14p4YCqooukiR8oZpG30Wj3
 34EOPQ2k2/BrKegO3IN+4TkZScfnH4hqeJ6z95mrJtI0MMXBM3Zu+967L1lVOueB
 TBaxb034iNGpjztB3gQ1cCBuR7409AH89irzUFeNXfSZtelAu73lfiEp1W02+uUt
 80Jlzuov/ANJi2PrNS1arH8QJmurWwhmbCsqQ8xB8RXkNE0uu5809ahntqlgfSpf
 bXGXPrtmNgM14BheGhYMTR+jZRprlblRYrz1Cjy9x9iJeJWOzADVpjeIjBZxI4OG
 M+MPWfyXZldiVXlhR5QQgeJnQgmi4AkhL7VSkez5Dpdirtrnh0hgVc+oPw01w2g4
 bsbaXGzwTuT5ok20+E92po5ALMCLfOwj3Jw91b7UquLbM0QqgDVz/u7VpAk2gPuT
 fWTIBxyQMfUSvNqKTF6/gJIkfA40H/x2ydVJvDdFT9I3B4KP4yxPjHzMTd99MmuR
 +2n4nULjzjocuaYAdZVAd8gxh+dzsaVBBEx+uSu/jgjoObMKyt93JnWZvak8sBvI
 Gehe2zvfvLtGN3C9cBAGzlZIP6vIhB8emCLR7DMLsarGybgwhoxU45cvAfuKepge
 FojltQhVXMmiwdNQI4CxNzfRhXOpUhY7/yim/2w+evT9UM15QHKI7CS/3QUqe8uk
 qnAFYNXQ9v6PVrh+egu9
 =eV+r
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Here are the (hopefully) last GPIO fixes for v4.13:

   - an important core fix to reject invalid GPIOs *before* trying to
     obtain a GPIO descriptor for it.

   - a driver fix for the mvebu driver IRQ handling"

* tag 'gpio-v4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: mvebu: Fix cause computation in irq handler
  gpio: reject invalid gpio before getting gpio_desc
2017-08-23 11:43:38 -07:00
Linus Torvalds 55652400fd SCSI fixes on 20170823
Six minor and error leg fixes, plus one major change: the reversion of
 scsi-mq as the default.  We're doing the latter temporarily (with a
 backport to stable) to give us time to fix all the issues that turned
 up with this default before trying again.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZnSOUAAoJEAVr7HOZEZN4W/4QAJ4YCDZaS1QH2Yud7IpPA/tb
 1A9r5YY0KDqqONiha4u2NbKiCQDr+RupA+r5ZdlO4upFk7ilV4d90EwCbl4L10HL
 4wXAhedO8LXcz4bmAx9xWrBD6JXfG92H4UnM3ciWOhNV5eW4e4t3IikxWeYnZBuM
 uLNwSIMVKMvz5VXZmItDny0izFjcWbfYIld/7wXSX+naOx5Z1ianeURj7S3kpapb
 54olbUjSQfzW325gFRbyvXa78uWVfRmY7wF2KyMBv5DrrJz0mEAEGN84NzGPcOwR
 bcHhYRDUNzF1eODRWjtw5lHu+mngtEWILbw3uIjLEuJXnox9cSGgrd+AK96sEuKr
 teahjszEsN93MORDafllZylreGdWn+G/DJnF+b6CGKY0h4XsEX/rFIAmvF06hMc/
 pBemzjIUJQgXXzMBtPTzciNvYxOAFe4JVrY8AB6FTuvJY6F90HPA71bnmDUnvjhl
 wrysqtuSTJoFlCgnO0e0i2koPnxNmuImKoln3VjpDNeW3rBBMJlcMtWueegbOHUh
 r5Xc/HQZ2L/8ihB0oqPpnwYmD2czpkldMmEiRMvhJ/f89RxaPub4tmFndbiW8gxs
 5wZxFDkN9Fi/+MCjDUZQ2dMfubWmwRpPXm0MWyUQ4rG8+bM+17yci7Be1TGPwQMA
 Ito9txXbsdMSE6fTgtk7
 =aB/i
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Six minor and error leg fixes, plus one major change: the reversion of
  scsi-mq as the default.

  We're doing the latter temporarily (with a backport to stable) to give
  us time to fix all the issues that turned up with this default before
  trying again"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: cxgb4i: call neigh_event_send() to update MAC address
  Revert "scsi: default to scsi-mq"
  scsi: sd_zbc: Write unlock zone from sd_uninit_cmnd()
  scsi: aacraid: Fix out of bounds in aac_get_name_resp
  scsi: csiostor: fail probe if fw does not support FCoE
  scsi: megaraid_sas: fix error handle in megasas_probe_one
2017-08-23 11:34:40 -07:00
Linus Torvalds 98b9f8a454 Fix a clang build regression and an potential xattr corruption bug.
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlmdAHoACgkQ8vlZVpUN
 gaNVsgf/SRn6HaOpX7BdrtkXqjV8VvLZsDmsZPkhchdmTxMpIFJNf16/sg0hqdyJ
 wcTx3y+BkBSjBXLtqK+hslVyg4pUjSBWWZyZ9Dtyi5+B92CJJJBdaHIpcdvd3Ek1
 J/HPQjqcPXL43Cg5SQ0/KgVMhCze9I4bEbNm2evC18bC15hZAVP0FK1hT3FNpyIB
 fhOu9FZdnzlcBlnLdfTqgIEPaHzc6zcJnqpSbkT0InjiJf5cxDionhoaBzUh9Jzg
 bKvkFRDTDWDrBcYStuHwgpELmVVYJGbwjzMVOAcmeCiSJqNbU1/Ym5t3e3rflKmi
 6YEyDhK43iZGiR4/QUffrCxEIzfqrA==
 =dOeQ
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Fix a clang build regression and an potential xattr corruption bug"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: add missing xattr hash update
  ext4: fix clang build regression
2017-08-22 21:30:52 -07:00
Linus Torvalds eec3b80fb1 - Bugfixes
- Revert duplicate commit in da9062-core
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZm+V+AAoJEFGvii+H/HdhE5EP/RRTDorkvc9JNWJYhC5Zdaoy
 bo6Bh9T/Gvhmfr16rYrsxdVR67uBydiOL/U7B3zdzP94C1e+I7B9KFcYMKx3rgeu
 29glB6u+pdpGonoiBgdF6B76FBpe6Xt9Yz+VgzCbNNPb/AzNadl7HtoSxgXG96dQ
 RLHwjJ7tmV6LLSCZPRwhQMFeivVfKMevMHb670x21/GrKuWUSqtBqlouKwKA2Mto
 b+eIEDAg4QaSQEGpsDNORUt65DBm2c2mKwUlAG5w1NeCYOLDKtwYMZ6MUxZMcHL2
 Upu8Lu4zlua0VvFhFzsz69mmBvBn+u6dvLu4vd5apNaHlYHoTwWJ9xEOd7fi1mm+
 twI2wrU02yL53+A5L58g7rc3yhdFEuddwnkQcFdv0LMDXiE2g+Os+HwI2azQiFr3
 lCkgLGBI3/oKhaAP29+PWy9b3071Cljdm917z1Yt0ImAJpiSPQDzsPSct4C8Wm07
 Z7WvX1wH5WbB36TXYsRT/phQdpxUJev+jnwRHui/lMYkR9uvYMLvE4ndpawBXRY8
 JdiTMzBsF6HWMUl3HAogN+hv70n03okt4rpKdaVzwF6gruDI2pE1/jYGj4OWFPmF
 Pw3jIwi1ZfjF3eBvLWdKXisfJzvDQn78eZzPfs4uF5NwSSrlgvvxY3GwppWNerNl
 7fmjyP5JmtbOt+YhU9+R
 =7O30
 -----END PGP SIGNATURE-----

Merge tag 'mfd-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD fix from Lee Jones:
 "Revert duplicate commit in da9062-core"

* tag 'mfd-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  Revert "mfd: da9061: Fix to remove BBAT_CONT register from chip model"
2017-08-22 10:21:05 -07:00
Catalin Marinas a067d94d37 arm64: kaslr: Adjust the offset to avoid Image across alignment boundary
With 16KB pages and a kernel Image larger than 16MB, the current
kaslr_early_init() logic for avoiding mappings across swapper table
boundaries fails since increasing the offset by kimg_sz just moves the
problem to the next boundary.

This patch rounds the offset down to (1 << SWAPPER_TABLE_SHIFT) if the
Image crosses a PMD_SIZE boundary.

Fixes: afd0e5a876 ("arm64: kaslr: Fix up the kernel image alignment")
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Ard Biesheuvel 4a23e56ad6 arm64: kaslr: ignore modulo offset when validating virtual displacement
In the KASLR setup routine, we ensure that the early virtual mapping
of the kernel image does not cover more than a single table entry at
the level above the swapper block level, so that the assembler routines
involved in setting up this mapping can remain simple.

In this calculation we add the proposed KASLR offset to the values of
the _text and _end markers, and reject it if they would end up falling
in different swapper table sized windows.

However, when taking the addresses of _text and _end, the modulo offset
(the physical displacement modulo 2 MB) is already accounted for, and
so adding it again results in incorrect results. So disregard the modulo
offset from the calculation.

Fixes: 08cdac619c ("arm64: relocatable: deal with physically misaligned ...")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Mark Rutland 289d07a2dc arm64: mm: abort uaccess retries upon fatal signal
When there's a fatal signal pending, arm64's do_page_fault()
implementation returns 0. The intent is that we'll return to the
faulting userspace instruction, delivering the signal on the way.

However, if we take a fatal signal during fixing up a uaccess, this
results in a return to the faulting kernel instruction, which will be
instantly retried, resulting in the same fault being taken forever. As
the task never reaches userspace, the signal is not delivered, and the
task is left unkillable. While the task is stuck in this state, it can
inhibit the forward progress of the system.

To avoid this, we must ensure that when a fatal signal is pending, we
apply any necessary fixup for a faulting kernel instruction. Thus we
will return to an error path, and it is up to that code to make forward
progress towards delivering the fatal signal.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Dave Martin 096622104e arm64: fpsimd: Prevent registers leaking across exec
There are some tricky dependencies between the different stages of
flushing the FPSIMD register state during exec, and these can race
with context switch in ways that can cause the old task's regs to
leak across.  In particular, a context switch during the memset() can
cause some of the task's old FPSIMD registers to reappear.

Disabling preemption for this small window would be no big deal for
performance: preemption is already disabled for similar scenarios
like updating the FPSIMD registers in sigreturn.

So, instead of rearranging things in ways that might swap existing
subtle bugs for new ones, this patch just disables preemption
around the FPSIMD state flushing so that races of this type can't
occur here.  This brings fpsimd_flush_thread() into line with other
code paths.

Cc: stable@vger.kernel.org
Fixes: 674c242c93 ("arm64: flush FP/SIMD state correctly after execve()")
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Lee Jones 0f0fc5c090 Revert "mfd: da9061: Fix to remove BBAT_CONT register from chip model"
This patch was applied to the MFD twice, causing unwanted behavour.

This reverts commit b77eb79acc.

Fixes: b77eb79acc ("mfd: da9061: Fix to remove BBAT_CONT register from chip model")
Reported-by: Steve Twiss <stwiss.opensource@diasemi.com>
Reviewed-by: Steve Twiss <stwiss.opensource@diasemi.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2017-08-22 09:03:00 +01:00
Linus Torvalds 6470812e22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Just a couple small fixes, two of which have to do with gcc-7:

   1) Don't clobber kernel fixed registers in __multi4 libgcc helper.

   2) Fix a new uninitialized variable warning on sparc32 with gcc-7,
      from Thomas Petazzoni.

   3) Adjust pmd_t initializer on sparc32 to make gcc happy.

   4) If ATU isn't available, don't bark in the logs. From Tushar Dave"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
  sparc64: remove unnecessary log message
  sparc64: Don't clibber fixed registers in __multi4.
  mm: add pmd_t initializer __pmd() to work around a GCC bug.
2017-08-21 14:07:48 -07:00
Thomas Petazzoni 2dc77533f1 sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
When building the kernel for Sparc using gcc 7.x, the build fails
with:

arch/sparc/kernel/pcic.c: In function ‘pcibios_fixup_bus’:
arch/sparc/kernel/pcic.c:647:8: error: ‘cmd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    cmd |= PCI_COMMAND_IO;
        ^~

The simplified code looks like this:

unsigned int cmd;
[...]
pcic_read_config(dev->bus, dev->devfn, PCI_COMMAND, 2, &cmd);
[...]
cmd |= PCI_COMMAND_IO;

I.e, the code assumes that pcic_read_config() will always initialize
cmd. But it's not the case. Looking at pcic_read_config(), if
bus->number is != 0 or if the size is not one of 1, 2 or 4, *val will
not be initialized.

As a simple fix, we initialize cmd to zero at the beginning of
pcibios_fixup_bus.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-21 13:57:22 -07:00
Linus Torvalds 05ab303b4f ARC fixes for 4.13-rc7
- PAE40 related updates
 
  - SLC errata for region ops
 
  - intc line masking by default
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZmw0DAAoJEGnX8d3iisJeoogP/R78jWBmVIRr7YBvOMEbNZAz
 r5dJHnd2jCUtqQ/rmCndwzOqLv2j77RRdH3nlwfM7YaS8LZmK0Dkz9zUGReCalQU
 z1sZfBSZuOfodWbDzfXdJVNEGGu8eSG7/s87E6K9UxOuaZJIPNMyU9qGxjDb3BTo
 dzgNmT0xgiqZYtv9Y3uciPgkddJLOxE+eMuEpxzbzejLerUUc/jRV8m5qAL8ja9w
 NanzLjo7Ec0FiczyYf1DtiONXBVl556IPQoFJtXIbsfZww8kJxFSZ+qemvmFXJOF
 cxSfZeRBOCWV9mRW36kGAeKVE9EWqFHFn/UiCfvhTCpoFXPX63Hz+nVRLEE1lhrQ
 ZaiQSuu0QgUkWP39ZpAPQjmdBlVxzv9Jsz5Dh72l3C00Hf9yw3jVCsKX/nZLpLM2
 pS8pFVnJqttOLX/6wU1JjIQDhPvzqn0V21SwCXBt2DwyXd1zuce82ioPY7K2Uefc
 4Unrso+YpIJNh8NlIe7Pvn8kEitNF7MViybofjhKPXlFXT4FqSJIV0Q/iq/L6lh8
 RAfJO3GCQQymkB03aVmfRWq+xgCS0v3K2vP50T3+XEyix98ZwH+D4ViaXs9egmLk
 EO323ebCKp8AJxICTme5qtmXs+k/CH+KeCzwSc90Mtf1SD3ohvyUeJvA5BGCCyP5
 NP5sxiH9cNKwrMIBdvuE
 =mOLm
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - PAE40 related updates

 - SLC errata for region ops

 - intc line masking by default

* tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  arc: Mask individual IRQ lines during core INTC init
  ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoC
  ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
  ARC: dma: implement dma_unmap_page and sg variant
  ARCv2: SLC: Make sure busy bit is set properly for region ops
  ARC: [plat-sim] Include this platform unconditionally
  ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103
  ARC: defconfig: Cleanup from old Kconfig options
2017-08-21 13:30:36 -07:00
Linus Torvalds 0b3baec853 RTC fixes for 4.13:
- ds1307: fix regmap configuration
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEXx9Viay1+e7J/aM4AyWl4gNJNJIFAlma57AACgkQAyWl4gNJ
 NJJDMw/9F3x6xub2F5nkW/uWgp4c6qL6cY0yC+fSHEhBrDl1T5gr+Cl+prQJotjU
 HNJAFRC78ERlr16o6ck5wL9Mo9rP3iOgpYuLCMFGEelSWF+DMv1fO1mrE2GWgcLZ
 phILkzzdkqqveq6f6wHuxAegltw8Wgh/DCF+tJCZI6UcEwsfWsH7yzadNSdYhiAO
 DFN7lry0mvt432ezz/ko/Evg7GOCT9SK9C1jZ6P2uB/gSoXwbhYXBWubdtKuYYVy
 5OCyyv9FWb3WX8FXFfb/oyAMI5xZ4cXqrJseoFsxZr5GamprK0hKQutaGtV9WCg8
 Tp8syz3MCH5w402VqiUARnW89o5F5bEUs+Ymh6x9+C6d2//3MQ4Bne1tEHSMhXI6
 w6R5tm/S+ybF7zKC/WjIxS+xGvqzizJoxUhUqDb7oQmz8aZQt6qhtNzwJO7zmDx6
 3c2nADmfjV28CAXYOWaIYWLp6CRwIOSGykijMvnhwvHFS+XdzXrq1uadaaMvfxDR
 iv5lRcb9nA66vsjYvU0/6cfn3U+el1cgn7g62gzj86GmElUvwW1XR2DIWs/RqDfZ
 e92B+ZwORE6u5YZtQAIvaNep2aZnwqYOQn7zwf5sNZlURDPwtHfDYpDP8xntZIFt
 twLH+DpYqoQvUtCP8u15xYsLIFiqpGHA12oBERhK5A+2nYYOJck=
 =CuI8
 -----END PGP SIGNATURE-----

Merge tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC fix from Alexandre Belloni:
 "Fix regmap configuration for ds1307"

* tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: ds1307: fix regmap config
2017-08-21 13:27:51 -07:00
Linus Torvalds e3181f2c0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix IGMP handling wrt VRF, from David Ahern.

 2) Fix timer access to freed object in dccp, from Eric Dumazet.

 3) Use kmalloc_array() in ptr_ring to avoid overflow cases which are
    triggerable by userspace. Also from Eric Dumazet.

 4) Fix infinite loop in unmapping cleanup of nfp driver, from Colin Ian
    King.

 5) Correct datagram peek handling of empty SKBs, from Matthew Dawson.

 6) Fix use after free in TIPC, from Eric Dumazet.

 7) When replacing a route in ipv6 we need to reset the round robin
    pointer, from Wei Wang.

 8) Fix bug in pci_find_pcie_root_port() which was unearthed by the
    relaxed ordering changes, from Thierry Redding. I made sure to get
    an explicit ACK from Bjorn this time around :-)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
  ipv6: repair fib6 tree in failure case
  net_sched: fix order of queue length updates in qdisc_replace()
  tools lib bpf: improve warning
  switchdev: documentation: minor typo fixes
  bpf, doc: also add s390x as arch to sysctl description
  net: sched: fix NULL pointer dereference when action calls some targets
  rxrpc: Fix oops when discarding a preallocated service call
  irda: do not leak initialized list.dev to userspace
  net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
  PCI: Allow PCI express root ports to find themselves
  tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
  net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set
  ipv6: reset fn->rr_ptr when replacing route
  sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
  tipc: fix use-after-free
  tun: handle register_netdevice() failures properly
  datagram: When peeking datagrams with offset < 0 don't skip empty skbs
  bpf, doc: improve sysctl knob description
  netxen: fix incorrect loop counter decrement
  nfp: fix infinite loop on umapping cleanup
  ...
2017-08-21 13:16:27 -07:00
Oleg Nesterov dd1c1f2f20 pids: make task_tgid_nr_ns() safe
This was reported many times, and this was even mentioned in commit
52ee2dfdd4 ("pids: refactor vnr/nr_ns helpers to make them safe") but
somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is
not safe because task->group_leader points to nowhere after the exiting
task passes exit_notify(), rcu_read_lock() can not help.

We really need to change __unhash_process() to nullify group_leader,
parent, and real_parent, but this needs some cleanups.  Until then we
can turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and
fix the problem.

Reported-by: Troy Kensinger <tkensinger@google.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-21 12:47:31 -07:00
Heiner Kallweit 03619844d8 rtc: ds1307: fix regmap config
Current max_register setting breaks reading nvram on certain chips and
also reading the standard registers on RX8130 where register map starts
at 0x10.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: 11e5890b53 "rtc: ds1307: convert driver to regmap"
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-08-21 11:08:03 +02:00
Wei Wang 348a400272 ipv6: repair fib6 tree in failure case
In fib6_add(), it is possible that fib6_add_1() picks an intermediate
node and sets the node's fn->leaf to NULL in order to add this new
route. However, if fib6_add_rt2node() fails to add the new
route for some reason, fn->leaf will be left as NULL and could
potentially cause crash when fn->leaf is accessed in fib6_locate().
This patch makes sure fib6_repair_tree() is called to properly repair
fn->leaf in the above failure case.

Here is the syzkaller reported general protection fault in fib6_locate:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 40937 Comm: syz-executor3 Not tainted
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d7d64100 ti: ffff8801d01a0000 task.ti: ffff8801d01a0000
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] __ipv6_prefix_equal64_half include/net/ipv6.h:475 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] ipv6_prefix_equal include/net/ipv6.h:492 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate_1 net/ipv6/ip6_fib.c:1210 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate+0x281/0x3c0 net/ipv6/ip6_fib.c:1233
RSP: 0018:ffff8801d01a36a8  EFLAGS: 00010202
RAX: 0000000000000020 RBX: ffff8801bc790e00 RCX: ffffc90002983000
RDX: 0000000000001219 RSI: ffff8801d01a37a0 RDI: 0000000000000100
RBP: ffff8801d01a36f0 R08: 00000000000000ff R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
R13: dffffc0000000000 R14: ffff8801d01a37a0 R15: 0000000000000000
FS:  00007f6afd68c700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004c6340 CR3: 00000000ba41f000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffff8801d01a37a8 ffff8801d01a3780 ffffed003a0346f5 0000000c82a23ea0
 ffff8800b7bd7700 ffff8801d01a3780 ffff8800b6a1c940 ffffffff82a23ea0
 ffff8801d01a3920 ffff8801d01a3748 ffffffff82a223d6 ffff8801d7d64988
Call Trace:
 [<ffffffff82a223d6>] ip6_route_del+0x106/0x570 net/ipv6/route.c:2109
 [<ffffffff82a23f9d>] inet6_rtm_delroute+0xfd/0x100 net/ipv6/route.c:3075
 [<ffffffff82621359>] rtnetlink_rcv_msg+0x549/0x7a0 net/core/rtnetlink.c:3450
 [<ffffffff8274c1d1>] netlink_rcv_skb+0x141/0x370 net/netlink/af_netlink.c:2281
 [<ffffffff82613ddf>] rtnetlink_rcv+0x2f/0x40 net/core/rtnetlink.c:3456
 [<ffffffff8274ad38>] netlink_unicast_kernel net/netlink/af_netlink.c:1206 [inline]
 [<ffffffff8274ad38>] netlink_unicast+0x518/0x750 net/netlink/af_netlink.c:1232
 [<ffffffff8274b83e>] netlink_sendmsg+0x8ce/0xc30 net/netlink/af_netlink.c:1778
 [<ffffffff82564aff>] sock_sendmsg_nosec net/socket.c:609 [inline]
 [<ffffffff82564aff>] sock_sendmsg+0xcf/0x110 net/socket.c:619
 [<ffffffff82564d62>] sock_write_iter+0x222/0x3a0 net/socket.c:834
 [<ffffffff8178523d>] new_sync_write+0x1dd/0x2b0 fs/read_write.c:478
 [<ffffffff817853f4>] __vfs_write+0xe4/0x110 fs/read_write.c:491
 [<ffffffff81786c38>] vfs_write+0x178/0x4b0 fs/read_write.c:538
 [<ffffffff817892a9>] SYSC_write fs/read_write.c:585 [inline]
 [<ffffffff817892a9>] SyS_write+0xd9/0x1b0 fs/read_write.c:577
 [<ffffffff82c71e32>] entry_SYSCALL_64_fastpath+0x12/0x17

Note: there is no "Fixes" tag as this seems to be a bug introduced
very early.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 20:06:56 -07:00
Konstantin Khlebnikov 68a66d149a net_sched: fix order of queue length updates in qdisc_replace()
This important to call qdisc_tree_reduce_backlog() after changing queue
length. Parent qdisc should deactivate class in ->qlen_notify() called from
qdisc_tree_reduce_backlog() but this happens only if qdisc->q.qlen in zero.

Missed class deactivations leads to crashes/warnings at picking packets
from empty qdisc and corrupting state at reactivating this class in future.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: 86a7996cc8 ("net_sched: introduce qdisc_replace() helper")
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 20:02:00 -07:00
Eric Leblond 49bf4b36fd tools lib bpf: improve warning
Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:49:51 -07:00
Chris Packham 5a78449810 switchdev: documentation: minor typo fixes
Two typos in switchdev.txt

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:49:10 -07:00
Daniel Borkmann d4dd2d75a2 bpf, doc: also add s390x as arch to sysctl description
Looks like this was accidentally missed, so still add s390x
as supported eBPF JIT arch to bpf_jit_enable.

Fixes: 014cd0a368 ("bpf: Update sysctl documentation to list all supported architectures")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:45:09 -07:00
Linus Torvalds 14ccee78fc Linux 4.13-rc6 2017-08-20 14:13:52 -07:00
Linus Torvalds 197e7e5213 Sanitize 'move_pages()' permission checks
The 'move_paghes()' system call was introduced long long ago with the
same permission checks as for sending a signal (except using
CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability).

That turns out to not be a great choice - while the system call really
only moves physical page allocations around (and you need other
capabilities to do a lot of it), you can check the return value to map
out some the virtual address choices and defeat ASLR of a binary that
still shares your uid.

So change the access checks to the more common 'ptrace_may_access()'
model instead.

This tightens the access checks for the uid, and also effectively
changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that
anybody really _uses_ this legacy system call any more (we hav ebetter
NUMA placement models these days), so I expect nobody to notice.

Famous last words.

Reported-by: Otto Ebeling <otto.ebeling@iki.fi>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-20 13:26:27 -07:00
Linus Torvalds 7f680d7ec3 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Another pile of small fixes and updates for x86:

   - Plug a hole in the SMAP implementation which misses to clear AC on
     NMI entry

   - Fix the norandmaps/ADDR_NO_RANDOMIZE logic so the command line
     parameter works correctly again

   - Use the proper accessor in the startup64 code for next_early_pgt to
     prevent accessing of invalid addresses and faulting in the early
     boot code.

   - Prevent CPU hotplug lock recursion in the MTRR code

   - Unbreak CPU0 hotplugging

   - Rename overly long CPUID bits which got introduced in this cycle

   - Two commits which mark data 'const' and restrict the scope of data
     and functions to file scope by making them 'static'"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Constify attribute_group structures
  x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'
  x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks
  x86: Fix norandmaps/ADDR_NO_RANDOMIZE
  x86/mtrr: Prevent CPU hotplug lock recursion
  x86: Mark various structures and functions as 'static'
  x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag
  x86/smpboot: Unbreak CPU0 hotplug
  x86/asm/64: Clear AC on NMI entries
2017-08-20 09:36:52 -07:00
Linus Torvalds 2615a38f14 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "A few small fixes for timer drivers:

   - Prevent infinite recursion in the arm architected timer driver with
     ftrace

   - Propagate error codes to the caller in case of failure in EM STI
     driver

   - Adjust a bogus loop iteration in the arm architected timer driver

   - Add a missing Kconfig dependency to the pistachio clocksource to
     prevent build failures

   - Correctly check for IS_ERR() instead of NULL in the shared timer-of
     code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
  clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies
  clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL
  clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
  clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
2017-08-20 09:34:24 -07:00
Linus Torvalds e46db8d2ef Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "Two fixes for the perf subsystem:

   - Fix an inconsistency of RDPMC mm struct tagging across exec() which
     causes RDPMC to fault.

   - Correct the timestamp mechanics across IOC_DISABLE/ENABLE which
     causes incorrect timestamps and total time calculations"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix time on IOC_ENABLE
  perf/x86: Fix RDPMC vs. mm_struct tracking
2017-08-20 09:20:57 -07:00
Linus Torvalds 9dae41a238 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A pile of smallish changes all over the place:

   - Add a missing ISB in the GIC V1 driver

   - Remove an ACPI version check in the GIC V3 ITS driver

   - Add the missing irq_pm_shutdown function for BRCMSTB-L2 to avoid
     spurious wakeups

   - Remove the artifical limitation of ITS instances to the number of
     NUMA nodes which prevents utilizing the ITS hardware correctly

   - Prevent a infinite parsing loop in the GIC-V3 ITS/MSI code

   - Honour the force affinity argument in the GIC-V3 driver which is
     required to make perf work correctly

   - Correctly report allocation failures in GIC-V2/V3 to avoid using
     half allocated and initialized interrupts.

   - Fixup checks against nr_cpu_ids in the generic IPI code"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/ipi: Fixup checks against nr_cpu_ids
  genirq: Restore trigger settings in irq_modify_status()
  MAINTAINERS: Remove Jason Cooper's irqchip git tree
  irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop
  irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES
  irqchip: brcmstb-l2: Define an irq_pm_shutdown function
  irqchip/gic: Ensure we have an ISB between ack and ->handle_irq
  irqchip/gic-v3-its: Remove ACPICA version check for ACPI NUMA
  irqchip/gic-v3: Honor forced affinity setting
  irqchip/gic-v3: Report failures in gic_irq_domain_alloc
  irqchip/gic-v2: Report failures in gic_irq_domain_alloc
  irqchip/atmel-aic: Remove root argument from ->fixup() prototype
  irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
  irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
2017-08-20 09:07:56 -07:00
Linus Torvalds e18a5ebc2d Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull watchdog fix from Thomas Gleixner:
 "A fix for the hardlockup watchdog to prevent false positives with
  extreme Turbo-Modes which make the perf/NMI watchdog fire faster than
  the hrtimer which is used to verify.

  Slightly larger than the minimal fix, which just would increase the
  hrtimer frequency, but comes with extra overhead of more watchdog
  timer interrupts and thread wakeups for all users.

  With this change we restrict the overhead to the extreme Turbo-Mode
  systems"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kernel/watchdog: Prevent false positives with turbo modes
2017-08-20 08:54:30 -07:00
Alexey Dobriyan 8fbbe2d7cc genirq/ipi: Fixup checks against nr_cpu_ids
Valid CPU ids are [0, nr_cpu_ids-1] inclusive.

Fixes: 3b8e29a82d ("genirq: Implement ipi_send_mask/single()")
Fixes: f9bce791ae ("genirq: Add a new function to get IPI reverse mapping")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2
2017-08-20 10:49:05 +02:00
Xin Long 4f8a881acc net: sched: fix NULL pointer dereference when action calls some targets
As we know in some target's checkentry it may dereference par.entryinfo
to check entry stuff inside. But when sched action calls xt_check_target,
par.entryinfo is set with NULL. It would cause kernel panic when calling
some targets.

It can be reproduce with:
  # tc qd add dev eth1 ingress handle ffff:
  # tc filter add dev eth1 parent ffff: u32 match u32 0 0 action xt \
    -j ECN --ecn-tcp-remove

It could also crash kernel when using target CLUSTERIP or TPROXY.

By now there's no proper value for par.entryinfo in ipt_init_target,
but it can not be set with NULL. This patch is to void all these
panics by setting it with an ipt_entry obj with all members = 0.

Note that this issue has been there since the very beginning.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:25:49 -07:00
David Howells 9a19bad70c rxrpc: Fix oops when discarding a preallocated service call
rxrpc_service_prealloc_one() doesn't set the socket pointer on any new call
it preallocates, but does add it to the rxrpc net namespace call list.
This, however, causes rxrpc_put_call() to oops when the call is discarded
when the socket is closed.  rxrpc_put_call() needs the socket to be able to
reach the namespace so that it can use a lock held therein.

Fix this by setting a call's socket pointer immediately before discarding
it.

This can be triggered by unloading the kafs module, resulting in an oops
like the following:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: rxrpc_put_call+0x1e2/0x32d
PGD 0
P4D 0
Oops: 0000 [#1] SMP
Modules linked in: kafs(E-)
CPU: 3 PID: 3037 Comm: rmmod Tainted: G            E   4.12.0-fscache+ #213
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8803fc92e2c0 task.stack: ffff8803fef74000
RIP: 0010:rxrpc_put_call+0x1e2/0x32d
RSP: 0018:ffff8803fef77e08 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8803fab99ac0 RCX: 000000000000000f
RDX: ffffffff81c50a40 RSI: 000000000000000c RDI: ffff8803fc92ea88
RBP: ffff8803fef77e30 R08: ffff8803fc87b941 R09: ffffffff82946d20
R10: ffff8803fef77d10 R11: 00000000000076fc R12: 0000000000000005
R13: ffff8803fab99c20 R14: 0000000000000001 R15: ffffffff816c6aee
FS:  00007f915a059700(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 00000003fef39000 CR4: 00000000001406e0
Call Trace:
 rxrpc_discard_prealloc+0x325/0x341
 rxrpc_listen+0xf9/0x146
 kernel_listen+0xb/0xd
 afs_close_socket+0x3e/0x173 [kafs]
 afs_exit+0x1f/0x57 [kafs]
 SyS_delete_module+0x10f/0x19a
 do_syscall_64+0x8a/0x149
 entry_SYSCALL64_slow_path+0x25/0x25

Fixes: 2baec2c3f8 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:23:23 -07:00
Colin Ian King b024d949a3 irda: do not leak initialized list.dev to userspace
list.dev has not been initialized and so the copy_to_user is copying
data from the stack back to user space which is a potential
information leak. Fix this ensuring all of list is initialized to
zero.

Detected by CoverityScan, CID#1357894 ("Uninitialized scalar variable")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:21:51 -07:00
Huy Nguyen ca3d89a3eb net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
enable_4k_uar module parameter was added in patch cited below to
address the backward compatibility issue in SRIOV when the VM has
system's PAGE_SIZE uar implementation and the Hypervisor has 4k uar
implementation.

The above compatibility issue does not exist in the non SRIOV case.
In this patch, we always enable 4k uar implementation if SRIOV
is not enabled on mlx4's supported cards.

Fixes: 76e39ccf9c ("net/mlx4_core: Fix backward compatibility on VFs")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:15:37 -07:00
Thierry Reding b6f6d56c91 PCI: Allow PCI express root ports to find themselves
If the pci_find_pcie_root_port() function is called on a root port
itself, return the root port rather than NULL.

This effectively reverts commit 0e40523287 ("PCI: fix oops when
try to find Root Port for a PCI device") which added an extra check
that would now be redundant.

Fixes: a99b646afa ("PCI: Disable PCIe Relaxed Ordering if unsupported")
Fixes: c56d4450eb ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:14:37 -07:00
Neal Cardwell cdbeb633ca tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
In some situations tcp_send_loss_probe() can realize that it's unable
to send a loss probe (TLP), and falls back to calling tcp_rearm_rto()
to schedule an RTO timer. In such cases, sometimes tcp_rearm_rto()
realizes that the RTO was eligible to fire immediately or at some
point in the past (delta_us <= 0). Previously in such cases
tcp_rearm_rto() was scheduling such "overdue" RTOs to happen at now +
icsk_rto, which caused needless delays of hundreds of milliseconds
(and non-linear behavior that made reproducible testing
difficult). This commit changes the logic to schedule "overdue" RTOs
ASAP, rather than at now + icsk_rto.

Fixes: 6ba8a3b19e ("tcp: Tail loss probe (TLP)")
Suggested-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:07:43 -07:00
Linus Torvalds 58d4e450a4 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "14 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
  mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM
  mm/mempolicy: fix use after free when calling get_mempolicy
  mm/cma_debug.c: fix stack corruption due to sprintf usage
  signal: don't remove SIGNAL_UNKILLABLE for traced tasks.
  mm, oom: fix potential data corruption when oom_reaper races with writer
  mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
  slub: fix per memcg cache leak on css offline
  mm: discard memblock data later
  test_kmod: fix description for -s -and -c parameters
  kmod: fix wait on recursive loop
  wait: add wait_event_killable_timeout()
  kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog
  mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback()
2017-08-18 16:06:33 -07:00
Roopa Prabhu bc3aae2bba net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set
Syzkaller hit 'general protection fault in fib_dump_info' bug on
commit 4.13-rc5..

Guilty file: net/ipv4/fib_semantics.c

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 2808 Comm: syz-executor0 Not tainted 4.13.0-rc5 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff880078562700 task.stack: ffff880078110000
RIP: 0010:fib_dump_info+0x388/0x1170 net/ipv4/fib_semantics.c:1314
RSP: 0018:ffff880078117010 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 00000000000000fe RCX: 0000000000000002
RDX: 0000000000000006 RSI: ffff880078117084 RDI: 0000000000000030
RBP: ffff880078117268 R08: 000000000000000c R09: ffff8800780d80c8
R10: 0000000058d629b4 R11: 0000000067fce681 R12: 0000000000000000
R13: ffff8800784bd540 R14: ffff8800780d80b5 R15: ffff8800780d80a4
FS:  00000000022fa940(0000) GS:ffff88007fc00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004387d0 CR3: 0000000079135000 CR4: 00000000000006f0
Call Trace:
  inet_rtm_getroute+0xc89/0x1f50 net/ipv4/route.c:2766
  rtnetlink_rcv_msg+0x288/0x680 net/core/rtnetlink.c:4217
  netlink_rcv_skb+0x340/0x470 net/netlink/af_netlink.c:2397
  rtnetlink_rcv+0x28/0x30 net/core/rtnetlink.c:4223
  netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
  netlink_unicast+0x4c4/0x6e0 net/netlink/af_netlink.c:1291
  netlink_sendmsg+0x8c4/0xca0 net/netlink/af_netlink.c:1854
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg+0xca/0x110 net/socket.c:643
  ___sys_sendmsg+0x779/0x8d0 net/socket.c:2035
  __sys_sendmsg+0xd1/0x170 net/socket.c:2069
  SYSC_sendmsg net/socket.c:2080 [inline]
  SyS_sendmsg+0x2d/0x50 net/socket.c:2076
  entry_SYSCALL_64_fastpath+0x1a/0xa5
  RIP: 0033:0x4512e9
  RSP: 002b:00007ffc75584cc8 EFLAGS: 00000216 ORIG_RAX:
  000000000000002e
  RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00000000004512e9
  RDX: 0000000000000000 RSI: 0000000020f2cfc8 RDI: 0000000000000003
  RBP: 000000000000000e R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000216 R12: fffffffffffffffe
  R13: 0000000000718000 R14: 0000000020c44ff0 R15: 0000000000000000
  Code: 00 0f b6 8d ec fd ff ff 48 8b 85 f0 fd ff ff 88 48 17 48 8b 45
  28 48 8d 78 30 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03
  <0f>
  b6 04 02 84 c0 74 08 3c 03 0f 8e cb 0c 00 00 48 8b 45 28 44
  RIP: fib_dump_info+0x388/0x1170 net/ipv4/fib_semantics.c:1314 RSP:
  ffff880078117010
---[ end trace 254a7af28348f88b ]---

This patch adds a res->fi NULL check.

example run:
$ip route get 0.0.0.0 iif virt1-0
broadcast 0.0.0.0 dev lo
    cache <local,brd> iif virt1-0

$ip route get 0.0.0.0 iif virt1-0 fibmatch
RTNETLINK answers: No route to host

Reported-by: idaifish <idaifish@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: b61798130f ("net: ipv4: RTM_GETROUTE: return matched fib result when requested")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:05:46 -07:00