Commit Graph

692218 Commits

Author SHA1 Message Date
Josef Bacik 7a362ea96d nbd: clear disconnected on reconnect
If our device loses its connection for longer than the dead timeout we
will set NBD_DISCONNECTED in order to quickly fail any pending IO's that
flood in after the IO's that were waiting during the dead timer.
However if we re-connect at some point in the future we'll still see
this DISCONNECTED flag set if we then lose our connection again after
that, which means we won't get notifications for our newly lost
connections.  Fix this by just clearing the DISCONNECTED flag on
reconnect in order to make sure everything works as it's supposed to.

Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-25 13:58:34 -06:00
Helge Deller 56188832a5 parisc: Suspend lockup detectors before system halt
Some machines can't power off the machine, so disable the lockup detectors to
avoid this watchdog BUG to show up every few seconds:
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-shutdow:1]

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.9+
2017-07-25 21:43:38 +02:00
Helge Deller c46bafc4d2 parisc: Show DIMM slot number which holds broken memory module
The Page Deallocation Table (PDT) holds the physical addresses of all broken
memory addresses. With the physical address we now are able to show which DIMM
slot (e.g. 1a, 3c) actually holds the broken memory module so that users are
able to replace it.

Signed-off-by: Helge Deller <deller@gmx.de>
2017-07-25 21:43:10 +02:00
Mikulas Patocka edbe9597ac dm zoned: remove test for impossible REQ_OP_FLUSH conditions
The value REQ_OP_FLUSH is only used by the block code for
request-based devices.

Remove the tests for REQ_OP_FLUSH from the bio-based dm-zoned-target.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 15:12:17 -04:00
Heinz Mauelshagen ac6a318888 dm raid: bump target version
Bumo dm-raid target version to 1.12.1 to reflect that commit cc27b0c78c
("md: fix deadlock between mddev_suspend() and md_write_start()") is
available.

This version change allows userspace to detect that MD fix is available.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 14:54:20 -04:00
Heinz Mauelshagen 0cf352e5a0 dm raid: avoid mddev->suspended access
Use runtime flag to ensure that an mddev gets suspended/resumed just once.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 14:54:19 -04:00
Heinz Mauelshagen f4af3f82da dm raid: fix activation check in validate_raid_redundancy()
During growing reshapes (i.e. stripes being added to a raid set), the
new stripe images are not in-sync and not part of the raid set until
the reshape is started.

LVM2 has to request multiple table reloads involving superblock updates
in order to reflect proper size of SubLVs in the cluster.  Before a stripe
adding reshape starts, validate_raid_redundancy() fails as a result of that
because it checks the total number of devices against the number of rebuild
ones rather than the actual ones in the raid set (as retrieved from the
superblock) thus resulting in failed raid4/5/6/10 redundancy checks.

E.g. convert 3 stripes -> 7 stripes raid5 (which only allows for maximum
1 device to fail) requesting +4 delta disks causing 4 devices to rebuild
during reshaping thus failing activation.

To fix this, move validate_raid_redundancy() to get access to the
current raid_set members.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 14:54:19 -04:00
Heinz Mauelshagen bbac1e06a4 dm raid: remove WARN_ON() in raid10_md_layout_to_format()
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 14:54:18 -04:00
Helge Deller 25a9b76597 parisc: Add function to return DIMM slot of physical address
Add a firmware wrapper function, which asks PDC firmware for the DIMM slot of a
physical address. This is needed to show users which DIMM module needs
replacement in case a broken DIMM was encountered.

Signed-off-by: Helge Deller <deller@gmx.de>
2017-07-25 19:28:37 +02:00
Helge Deller f520e55241 parisc: Fix crash when calling PDC_PAT_MEM PDT firmware function
Commit c9c2877d08 ("parisc: Add Page Deallocation Table (PDT) support")
introduced the pdc_pat_mem_read_pd_pdt() firmware helper function, which
crashed the system because it trashed the stack if the
pdc_pat_mem_read_pd_retinfo struct was located on the stack (and which is
in size less than the required 32 64-bit values).

Fix it by using the pdc_result struct instead when calling firmware and copy
the return values back into the result struct when finished sucessfully.

While debugging this code I noticed that the pdc_type wasn't set correctly
either, so let's fix that too.

Fixes: c9c2877d08 ("parisc: Add Page Deallocation Table (PDT) support")
Signed-off-by: Helge Deller <deller@gmx.de>
2017-07-25 18:24:39 +02:00
Christoph Hellwig 50cdb7c61b nvme-pci: fix HMB size calculation
It's possible the preferred HMB size may not be a multiple of the
chunk_size. This patch moves len to function scope and uses that in
the for loop increment so the last iteration doesn't cause the total
size to exceed the allocated HMB size.

Based on an earlier patch from Keith Busch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Fixes: 87ad72a59a ("nvme-pci: implement host memory buffer support")
2017-07-25 18:05:33 +02:00
James Smart 9c5358e15c nvme-fc: revise TRADDR parsing
The FC-NVME spec hasn't locked down on the format string for TRADDR.
Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>"
where the wwn's are hex values but not prefixed by 0x.

Most implementations so far expect a string format of
"nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport
uses the match_u64 parser which requires a leading 0x prefix to set
the base properly. If it's not there, a match will either fail or return
a base 10 value.

The resolution in T11 is pushing out. Therefore, to fix things now and
to cover any eventuality and any implementations already in the field,
this patch adds support for both formats.

The change consists of replacing the token matching routine with a
routine that validates the fixed string format, and then builds
a local copy of the hex name with a 0x prefix before calling
the system parser.

Note: the same parser routine exists in both the initiator and target
transports. Given this is about the only "shared" item, we chose to
replicate rather than create an interdendency on some shared code.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-25 18:05:25 +02:00
James Smart 8b25f35192 nvme-fc: address target disconnect race conditions in fcp io submit
There are cases where threads are in the process of submitting new
io when the LLDD calls in to remove the remote port. In some cases,
the next io actually goes to the LLDD, who knows the remoteport isn't
present and rejects it. To properly recovery/restart these i/o's we
don't want to hard fail them, we want to treat them as temporary
resource errors in which a delayed retry will work.

Add a couple more checks on remoteport connectivity and commonize the
busy response handling when it's seen.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-25 17:58:47 +02:00
Jon Derrick 2fd4167fad nvme: fabrics commands should use the fctype field for data direction
Fabrics commands with opcode 0x7F use the fctype field to indicate data
direction.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Sagi Grimberg <sai@grmberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: eb793e2c ("nvme.h: add NVMe over Fabrics definitions")
2017-07-25 17:58:32 +02:00
Johannes Thumshirn 6484f5d16f nvme: also provide a UUID in the WWID sysfs attribute
The WWID sysfs attribute can provide multiple means of a World Wide ID
for a NVMe device. It can either be a NGUID, a EUI-64 or a concatenation
of VID, Serial Number, Model and the Namespace ID in this order of
preference.

If the target also sends us a UUID use the UUID for identification and
give it the highest priority.

This eases generation of /dev/disk/by-* symlinks.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-25 17:58:22 +02:00
Linus Torvalds 25f6a53799 JFS fixes for 4.13
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIodevzQLVs53l6BhNqiEXrVAjGQFAll3YIwACgkQNqiEXrVA
 jGQE9w//b/MyL9MtvGcKI7u3V7RrqbodzoL42KxV98TI3y7rpmUcmRsqDT045ufD
 S+AOqnIhvH2XbsF7jvZ0UROUtErBgh+pIRqCctVqQM+GKE7p/KR0rY/4eMPlbqQL
 Q2L0ZbokHU4mgvo7SqSJkYFRd0PPBaaw4aJaf8gg1g0pCb29jcer4ycKKHCjz0Wz
 YS2/v7NVWysehihiz/JF6ga5/n6VZeXq3fa48Mmt4TDzKt4IaqgLtAEbPa+eddwW
 z/t4YIoiQ4JUYPnLNmG1Kd8lACfeqKkr2WIh3143ipof4Tj3ZaNc315wVQUl58LP
 6ZUg12tLDVH9OOPdI9VsVLostRbyKH3yUazULBXWIQQt6ZWcFiKFbturMlZevSJB
 OuGBsfpDDvkd+2n2iNcwane/+Ouw4LChzUvmp9/MuIRxinRVdXAVRjaUb2M54xSW
 qR/Yfw8qyky9tac1c1Md/bVo7oMEw0Xliv3HxmGKHNPLMOLzwfVTAw0gGXSrGFn8
 veUKCl1J1+9NWoNExDkyUjsmD1CdkhQk1gpmU70KWQlCgKCtBhWiX6rEy0l8pw1t
 G5UmFpgqG8g61ODuW0dbnSdImmS8mcbY4I1lSvQFb3qVtCeBLz9q7OKSCuavbs4M
 egtaUNVMJ22dVv12loj2OOyVdneUXbUB0WVga3kfEq7QRCtSjF4=
 =fVR9
 -----END PGP SIGNATURE-----

Merge tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggy

Pull JFS fixes from David Kleikamp.

* tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggy:
  jfs: preserve i_mode if __jfs_set_acl() fails
  jfs: Don't clear SGID when inheriting ACLs
  jfs: atomically read inode size
2017-07-25 08:51:57 -07:00
Linus Torvalds a9d0683e0b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:

 - regression fix (missing IRQs) for devices that require 'always poll'
   quirk, from Dmitry Torokhov

 - new device ID addition to Ortek driver, from Benjamin Tissoires

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: ortek: add one more buggy device
  HID: usbhid: fix "always poll" quirk
2017-07-25 08:49:00 -07:00
Linus Torvalds eeb7c41d9d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Three bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: set change and reference bit on lazy key enablement
  s390: chp: handle CRW_ERC_INIT for channel-path status change
  s390/perf: fix problem state detection
2017-07-25 08:44:27 -07:00
Darrick J. Wong 6215894e11 xfs: check that dir block entries don't off the end of the buffer
When we're checking the entries in a directory buffer, make sure that
the entry length doesn't push us off the end of the buffer.  Found via
xfs/388 writing ones to the length fields.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-07-25 08:36:35 -07:00
Dongli Zhang bd912ef3e4 xen/blkfront: always allocate grants first from per-queue persistent grants
This patch partially reverts 3df0e50 ("xen/blkfront: pseudo support for
multi hardware queues/rings"). The xen-blkfront queue/ring might hang due
to grants allocation failure in the situation when gnttab_free_head is
almost empty while many persistent grants are reserved for this queue/ring.

As persistent grants management was per-queue since 73716df ("xen/blkfront:
make persistent grants pool per-queue"), we should always allocate from
persistent grants first.

Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-07-25 11:31:09 -04:00
Junxiao Bi 4b422cb998 xen-blkfront: fix mq start/stop race
When ring buf full, hw queue will be stopped. While blkif interrupt consume
request and make free space in ring buf, hw queue will be started again.
But since start queue is protected by spin lock while stop not, that will
cause a race.

interrupt:                                      process:
blkif_interrupt()                               blkif_queue_rq()
 kick_pending_request_queues_locked()
   blk_mq_start_stopped_hw_queues()
      clear_bit(BLK_MQ_S_STOPPED, &hctx->state)
	                                             blk_mq_stop_hw_queue(hctx)
	  blk_mq_run_hw_queue(hctx, async)

If ring buf is made empty in this case, interrupt will never come, then the
hw queue will be stopped forever, all processes waiting for the pending io
in the queue will hung.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-07-25 11:30:59 -04:00
Dan Carpenter edc11d49f8 dm bufio: fix error code in dm_bufio_write_dirty_buffers()
We should be returning normal negative error codes here.  The "a"
variables comes from &c->async_write_error which is a blk_status_t
converted to a regular error code.

In the current code, the blk_status_t gets propogated back to
pool_create() and eventually results in an Oops.

Fixes: 4e4cbee93d ("block: switch bios to blk_status_t")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 10:11:15 -04:00
Mikulas Patocka bc86a41e96 dm integrity: test for corrupted disk format during table load
If the dm-integrity superblock was corrupted in such a way that the
journal_sections field was zero, the integrity target would deadlock
because it would wait forever for free space in the journal.

Detect this situation and refuse to activate the device.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 7eada909bf ("dm: add integrity target")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 10:11:14 -04:00
Mikulas Patocka aa03a91ffa dm integrity: WARN_ON if variables representing journal usage get out of sync
If this WARN_ON triggers it speaks to programmer error, and likely
implies corruption, but no released kernel should trigger it.  This
WARN_ON serves to assist DM integrity developers as changes are
made/tested in the future.

BUG_ON is excessive for catching programmer error, if a user or
developer would like warnings to trigger a panic, they can enable that
via /proc/sys/kernel/panic_on_warn

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25 10:11:13 -04:00
Andrew Jones cfa0ebc9d6 virtio-net: fix module unloading
Unregister the driver before removing multi-instance hotplug
callbacks. This order avoids the warning issued from
__cpuhp_remove_state_cpuslocked when the number of remaining
instances isn't yet zero.

Fixes: 8017c27919 ("net/virtio-net: Convert to hotplug state machine")
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-25 16:37:36 +03:00
Wei Wang f9aada5fff virtio-balloon: coding format cleanup
Clean up the comment format.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-25 16:37:35 +03:00
Liang Li 195a8c43e9 virtio-balloon: deflate via a page list
This patch saves the deflated pages to a list, instead of the PFN array.
Accordingly, the balloon_pfn_to_page() function is removed.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-25 16:37:35 +03:00
Andy Shevchenko f53d5aa050 virtio_blk: Use sysfs_match_string() helper
Use sysfs_match_string() helper instead of open coded variant.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-07-25 16:37:34 +03:00
Christian Borntraeger 4f89914742 KVM: s390: take srcu lock when getting/setting storage keys
The following warning was triggered by missing srcu locks around
the storage key handling functions.

=============================
WARNING: suspicious RCU usage
4.12.0+ #56 Not tainted
-----------------------------
./include/linux/kvm_host.h:572 suspicious rcu_dereference_check() usage!
rcu_scheduler_active = 2, debug_locks = 1
 1 lock held by live_migration/4936:
  #0:  (&mm->mmap_sem){++++++}, at: [<0000000000141be0>]
kvm_arch_vm_ioctl+0x6b8/0x22d0

 CPU: 8 PID: 4936 Comm: live_migration Not tainted 4.12.0+ #56
 Hardware name: IBM 2964 NC9 704 (LPAR)
 Call Trace:
 ([<000000000011378a>] show_stack+0xea/0xf0)
  [<000000000055cc4c>] dump_stack+0x94/0xd8
  [<000000000012ee70>] gfn_to_memslot+0x1a0/0x1b8
  [<0000000000130b76>] gfn_to_hva+0x2e/0x48
  [<0000000000141c3c>] kvm_arch_vm_ioctl+0x714/0x22d0
  [<000000000013306c>] kvm_vm_ioctl+0x11c/0x7b8
  [<000000000037e2c0>] do_vfs_ioctl+0xa8/0x6c8
  [<000000000037e984>] SyS_ioctl+0xa4/0xb8
  [<00000000008b20a4>] system_call+0xc4/0x27c
 1 lock held by live_migration/4936:
  #0:  (&mm->mmap_sem){++++++}, at: [<0000000000141be0>]
kvm_arch_vm_ioctl+0x6b8/0x22d0

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Pierre Morel<pmorel@linux.vnet.ibm.com>
2017-07-25 14:59:52 +02:00
Ard Biesheuvel 288be97cc7 arm64/lib: copy_page: use consistent prefetch stride
The optional prefetch instructions in the copy_page() routine are
inconsistent: at the start of the function, two cachelines are
prefetched beyond the one being loaded in the first iteration, but
in the loop, the prefetch is one more line ahead. This appears to
be unintentional, so let's fix it.

While at it, fix the comment style and white space.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-25 10:04:42 +01:00
Dave Airlie cfd1081108 Merge branch 'linux-4.13' of git://github.com/skeggsb/linux into drm-fixes
two more fixes for issues nouveau found in fedora 26.

* 'linux-4.13' of git://github.com/skeggsb/linux:
  drm/nouveau/bar/gf100: fix access to upper half of BAR2
  drm/nouveau/disp/nv50-: bump max chans to 21
2017-07-25 15:41:39 +10:00
Ben Skeggs 38bcb208f6 drm/nouveau/bar/gf100: fix access to upper half of BAR2
Bit 30 being set causes the upper half of BAR2 to stay in physical mode,
mapped over the end of VRAM, even when the rest of the BAR has been set
to virtual mode.

We inherited our initial value from RM, but I'm not aware of any reason
we need to keep it that way.

This fixes severe GPU hang/lockup issues revealed by Wayland on F26.

Shout-out to NVIDIA for the quick response with the potential cause!

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org # 4.3+
2017-07-25 15:30:27 +10:00
Ilia Mirkin a90e049cac drm/nouveau/disp/nv50-: bump max chans to 21
GP102's cursors go from chan 17..20. Increase the array size to hold
their data properly.

Fixes: e50fcff15f ("drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices")
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-07-25 15:30:10 +10:00
Jian Jun Chen 26a201a2ba drm/i915/gvt: Extend KBL platform support in GVT-g
Extend KBL platform support in GVT-g. Validation tests
are done on KBL server and KBL NUC. Both show the same
quality.

Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-25 12:39:41 +08:00
Ross Zwisler 02cb489be7 ACPI: NUMA: Fix typo in the full name of SRAT
To save someone the time of searching the ACPI spec for
"Static Resource Affinity Table".

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24 22:27:44 +02:00
Ross Zwisler a0c2d9c1de ACPI: NUMA: add missing include in acpi_numa.h
Right now if a file includes acpi_numa.h and they don't happen to include
linux/numa.h before it, they get the following warning:

./include/acpi/acpi_numa.h:9:5: warning: "MAX_NUMNODES" is not defined [-Wundef]
 #if MAX_NUMNODES > 256
     ^~~~~~~~~~~~

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24 22:27:43 +02:00
Christoph Hellwig 76451d79bd blk-mq: map queues to all present CPUs
We already do this for PCI mappings, and the higher level code now
expects that CPU on/offlining doesn't have an affect on the queue
mappings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-24 10:01:31 -06:00
Christoph Hellwig 832e4c83ab uuid: remove uuid_be
Everything uses uuid_t now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-07-24 17:50:37 +02:00
Christoph Hellwig 7c39ffe7a8 thunderbolt: use uuid_t instead of uuid_be
Switch thunderbolt to the new uuid type.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2017-07-24 17:50:18 +02:00
Benjamin Tissoires c228352dc6 HID: ortek: add one more buggy device
The iHome keypad also requires the same tweak we are doing for other
Ortek devices.

Reported-by: Mairin Duffy <duffy@redhat.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-07-24 17:38:21 +02:00
Brian Foster cfaf2d0343 xfs: fix quotacheck dquot id overflow infinite loop
If a dquot has an id of U32_MAX, the next lookup index increment
overflows the uint32_t back to 0. This starts the lookup sequence
over from the beginning, repeats indefinitely and results in a
livelock.

Update xfs_qm_dquot_walk() to explicitly check for the lookup
overflow and exit the loop.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-07-24 08:33:25 -07:00
Ofer Heifetz 7e96d55963 md/raid5: add thread_group worker async_tx_issue_pending_all
Since thread_group worker and raid5d kthread are not in sync, if
worker writes stripe before raid5d then requests will be waiting
for issue_pendig.

Issue observed when building raid5 with ext4, in some build runs
jbd2 would get hung and requests were waiting in the HW engine
waiting to be issued.

Fix this by adding a call to async_tx_issue_pending_all in the
raid5_do_work.

Signed-off-by: Ofer Heifetz <oferh@marvell.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shli@fb.com>
2017-07-24 07:49:15 -07:00
Christoph Hellwig 765e40b675 block: disable runtime-pm for blk-mq
The blk-mq code lacks support for looking at the rpm_status field, tracking
active requests and the RQF_PM flag.

Due to the default switch to blk-mq for scsi people start to run into
suspend / resume issue due to this fact, so make sure we disable the runtime
PM functionality until it is properly implemented.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-24 08:46:40 -06:00
Bart Van Assche 31c4ccc3ec xen-blkfront: Fix handling of non-supported operations
This patch fixes the following sparse warnings:

drivers/block/xen-blkfront.c:916:45: warning: incorrect type in argument 2 (different base types)
drivers/block/xen-blkfront.c:916:45:    expected restricted blk_status_t [usertype] error
drivers/block/xen-blkfront.c:916:45:    got int [signed] error
drivers/block/xen-blkfront.c:1599:47: warning: incorrect type in assignment (different base types)
drivers/block/xen-blkfront.c:1599:47:    expected int [signed] error
drivers/block/xen-blkfront.c:1599:47:    got restricted blk_status_t [usertype] <noident>
drivers/block/xen-blkfront.c:1607:55: warning: incorrect type in assignment (different base types)
drivers/block/xen-blkfront.c:1607:55:    expected int [signed] error
drivers/block/xen-blkfront.c:1607:55:    got restricted blk_status_t [usertype] <noident>
drivers/block/xen-blkfront.c:1625:55: warning: incorrect type in assignment (different base types)
drivers/block/xen-blkfront.c:1625:55:    expected int [signed] error
drivers/block/xen-blkfront.c:1625:55:    got restricted blk_status_t [usertype] <noident>
drivers/block/xen-blkfront.c:1628:62: warning: restricted blk_status_t degrades to integer

Compile-tested only.

Fixes: commit 2a842acab1 ("block: introduce new block status code type")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: <xen-devel@lists.xenproject.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-24 08:45:48 -06:00
Nikolay Borisov 0e4324a4c3 btrfs: round down size diff when shrinking/growing device
Further testing showed that the fix introduced in 7dfb8be11b ("btrfs:
Round down values which are written for total_bytes_size") was
insufficient and it could still lead to discrepancies between the
total_bytes in the super block and the device total bytes. So this patch
also ensures that the difference between old/new sizes when
shrinking/growing is also rounded down. This ensure that we won't be
subtracting/adding a non-sectorsize multiples to the superblock/device
total sizees.

Fixes: 7dfb8be11b ("btrfs: Round down values which are written for total_bytes_size")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-07-24 16:05:00 +02:00
Omar Sandoval 17024ad0a0 Btrfs: fix early ENOSPC due to delalloc
If a lot of metadata is reserved for outstanding delayed allocations, we
rely on shrink_delalloc() to reclaim metadata space in order to fulfill
reservation tickets. However, shrink_delalloc() has a shortcut where if
it determines that space can be overcommitted, it will stop early. This
made sense before the ticketed enospc system, but now it means that
shrink_delalloc() will often not reclaim enough space to fulfill any
tickets, leading to an early ENOSPC. (Reservation tickets don't care
about being able to overcommit, they need every byte accounted for.)

Fix it by getting rid of the shortcut so that shrink_delalloc() reclaims
all of the metadata it is supposed to. This fixes early ENOSPCs we were
seeing when doing a btrfs receive to populate a new filesystem, as well
as early ENOSPCs Christoph saw when doing a big cp -r onto Btrfs.

Fixes: 957780eb27 ("Btrfs: introduce ticketed enospc infrastructure")
Tested-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Cc: stable@vger.kernel.org
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-07-24 16:04:26 +02:00
Jeff Mahoney 144439376b btrfs: fix lockup in find_free_extent with read-only block groups
If we have a block group that is all of the following:
1) uncached in memory
2) is read-only
3) has a disk cache state that indicates we need to recreate the cache

AND the file system has enough free space fragmentation such that the
request for an extent of a given size can't be honored;

AND have a single CPU core;

AND it's the block group with the highest starting offset such that
there are no opportunities (like reading from disk) for the loop to
yield the CPU;

We can end up with a lockup.

The root cause is simple.  Once we're in the position that we've read in
all of the other block groups directly and none of those block groups
can honor the request, there are no more opportunities to sleep.  We end
up trying to start a caching thread which never gets run if we only have
one core.  This *should* present as a hung task waiting on the caching
thread to make some progress, but it doesn't.  Instead, it degrades into
a busy loop because of the placement of the read-only check.

During the first pass through the loop, block_group->cached will be set
to BTRFS_CACHE_STARTED and have_caching_bg will be set.  Then we hit the
read-only check and short circuit the loop.  We're not yet in
LOOP_CACHING_WAIT, so we skip that loop back before going through the
loop again for other raid groups.

Then we move to LOOP_CACHING_WAIT state.

During the this pass through the loop, ->cached will still be
BTRFS_CACHE_STARTED, which means it's not cached, so we'll enter
cache_block_group, do a lot of nothing, and return, and also set
have_caching_bg again.  Then we hit the read-only check and short circuit
the loop.  The same thing happens as before except now we DO trigger
the LOOP_CACHING_WAIT && have_caching_bg check and loop back up to the
top.  We do this forever.

There are two fixes in this patch since they address the same underlying
bug.

The first is to add a cond_resched to the end of the loop to ensure
that the caching thread always has an opportunity to run.  This will
fix the soft lockup issue, but find_free_extent will still loop doing
nothing until the thread has completed.

The second is to move the read-only check to the top of the loop.  We're
never going to return an allocation within a read-only block group so
we may as well skip it early.  The check for ->cached == BTRFS_CACHE_ERROR
would cause the same problem except that BTRFS_CACHE_ERROR is considered
a "done" state and we won't re-set have_caching_bg again.

Many thanks to Stephan Kulow <coolo@suse.de> for his excellent help in
the testing process.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-07-24 16:04:02 +02:00
Dave Martin ce184a0dee ARM: 8687/1: signal: Fix unparseable iwmmxt_sigframe in uc_regspace[]
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
signal frame can be left partially uninitialised in such a way
that userspace cannot parse uc_regspace[] safely.  In particular,
this means that the VFP registers cannot be located reliably in the
signal frame when a multi_v7_defconfig kernel is run on the
majority of platforms.

The cause is that the uc_regspace[] is laid out statically based on
the kernel config, but the decision of whether to save/restore the
iWMMXt registers must be a runtime decision.

To minimise breakage of software that may assume a fixed layout,
this patch emits a dummy block of the same size as iwmmxt_sigframe,
for non-iWMMXt threads.  However, the magic and size of this block
are now filled in to help parsers skip over it.  A new DUMMY_MAGIC
is defined for this purpose.

It is probably legitimate (if non-portable) for userspace to
manufacture its own sigframe for sigreturn, and there is no obvious
reason why userspace should be required to insert a DUMMY_MAGIC
block when running on non-iWMMXt hardware, when omitting it has
worked just fine forever in other configurations.  So in this case,
sigreturn does not require this block to be present.

Reported-by: Edmund Grimley-Evans <Edmund.Grimley-Evans@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-07-24 14:26:55 +01:00
Dave Martin 269583559c ARM: 8686/1: iwmmxt: Add missing __user annotations to sigframe accessors
preserve_iwmmxt_context() and restore_iwmmxt_context() lack __user
accessors on their arguments pointing to the user signal frame.

There does not be appear to be a bug here, but this omission is
inconsistent with the crunch and vfp sigframe access functions.

This patch adds the annotations, for consistency.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-07-24 14:26:54 +01:00
Paolo Bonzini fa19871a16 KVM: VMX: remove unused field
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-24 10:55:22 +02:00