Commit Graph

678396 Commits

Author SHA1 Message Date
Honggang Li 8c490669de RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings
ipoib_get_link_ksettings accesses the wrong private data for the IPoIB device.
Commit cd565b4b51 (IB/IPoIB: Support acceleration options callbacks)
changed ipoib_priv from being identical to netdev_priv to being an
area inside of, but not the same pointer as, the netdev_priv pointer.
As such, the struct we want is the ipoib_priv area, not the netdev_priv
area, so use the right accessor, otherwise we kernel panic.

[   27.271938] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready
[   28.156790] BUG: unable to handle kernel NULL pointer dereference at 000000000000067c
[   28.166309] IP: ib_query_port+0x30/0x180 [ib_core]
...
[   28.306282] RIP: 0010:ib_query_port+0x30/0x180 [ib_core]
...
[   28.393337] Call Trace:
[   28.397594]  ipoib_get_link_ksettings+0x66/0xe0 [ib_ipoib]
[   28.405274]  __ethtool_get_link_ksettings+0xa0/0x1c0
[   28.412353]  speed_show+0x74/0xa0
[   28.417503]  dev_attr_show+0x20/0x50
[   28.422922]  ? mutex_lock+0x12/0x40
[   28.428179]  sysfs_kf_seq_show+0xbf/0x1a0
[   28.434002]  kernfs_seq_show+0x21/0x30
[   28.439470]  seq_read+0x116/0x3b0
[   28.444445]  ? do_filp_open+0xa5/0x100
[   28.449774]  kernfs_fop_read+0xff/0x180
[   28.455220]  __vfs_read+0x37/0x150
[   28.460167]  ? security_file_permission+0x9d/0xc0
[   28.466560]  vfs_read+0x8c/0x130
[   28.471318]  SyS_read+0x55/0xc0
[   28.475950]  do_syscall_64+0x67/0x150
[   28.481163]  entry_SYSCALL64_slow_path+0x25/0x25
...
[   28.584493] ---[ end trace 3549968a4bf0aa5d ]---

Fixes: cd565b4b51 (IB/IPoIB: Support acceleration options callbacks)
Fixes: 0d7e2d2166 (IB/ipoib: add get_link_ksettings in ethtool)
Signed-off-by: Honggang Li <honli@redhat.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:09 -04:00
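
The accessor mix-up described in the commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every model_* name and the struct layout are hypothetical and are not the kernel's definitions. It shows only that a private area nested inside another one needs its own accessor, because the two pointers differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel structures. */
struct model_ipoib_priv {
    int port;                       /* field the ethtool callback wants */
};

struct model_netdev_private {
    int accel_state;                /* stand-in for acceleration-layer data */
    struct model_ipoib_priv ipoib;  /* IPoIB area nested inside the private area */
};

struct model_net_device {
    struct model_netdev_private priv;
};

/* Model of netdev_priv(): start of the whole private area. */
void *model_netdev_priv(struct model_net_device *dev)
{
    assert(dev != NULL);
    return &dev->priv;
}

/* Model of ipoib_priv(): the nested area, a different pointer. */
struct model_ipoib_priv *model_ipoib_priv(struct model_net_device *dev)
{
    struct model_netdev_private *p = model_netdev_priv(dev);

    return &p->ipoib;
}
```

Casting the result of model_netdev_priv() to struct model_ipoib_priv * would read accel_state as if it were port, which is the class of bug the commit describes.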
Gustavo A. R. Silva d38d7fdafa RDMA/qedr: add null check before pointer dereference
Add null check before dereferencing pointer sgid_attr.ndev
inside function rdma_vlan_dev_vlan_id().

Addresses-Coverity-ID: 1373979
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:08 -04:00
Max Gurtovoy 6e8484c5cf RDMA/mlx5: set UMR wqe fence according to HCA cap
Cache the needed umr_fence and set the wqe ctrl segment
accordingly.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:19:57 -04:00
Max Gurtovoy 1410a90ae4 net/mlx5: Define interface bits for fencing UMR wqe
HW can implement UMR wqe re-transmission in various ways.
Thus, add HCA cap to distinguish the needed fence for UMR to make
sure that the wqe wouldn't fail on mkey checks.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:05:04 -04:00
Jack Morgenstein eed7624552 RDMA/mlx4: Fix MAD tunneling when SRIOV is enabled
The cited patch added a type field to structures ib_ah and rdma_ah_attr.

Function mlx4_ib_query_ah() builds an rdma_ah_attr structure from the
data in an mlx4_ib_ah structure (which contains both an ib_ah structure
and an address vector).

For mlx4_ib_query_ah() to work properly, the type field in the contained
ib_ah structure must be set correctly.

In the outgoing MAD tunneling flow, procedure mlx4_ib_multiplex_mad()
paravirtualizes a MAD received from a slave and sends the processed
mad out over the wire. During this processing, it populates an
mlx4_ib_ah structure and calls mlx4_ib_query_ah().

The cited commit overlooked setting the type field in the contained
ib_ah structure before invoking mlx4_ib_query_ah(). As a result, the
type field remained uninitialized, and the rdma_ah_attr structure was
incorrectly built. This resulted in improperly built MADs being sent out
over the wire.

This patch properly initializes the type field in the contained ib_ah
structure before calling mlx4_ib_query_ah(). The rdma_ah_attr structure
is then generated correctly.

Fixes: 44c58487d5 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:04:51 -04:00
Mike Marciniszyn 1feb40067c RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory
reference when a buffer cannot be allocated for returning the immediate
data.

The issue is that the rkey validation has already occurred and the RNR
nak fails to release the reference that was fruitlessly gotten.  The
peer will then send the identical single packet request when its RNR
timer pops.

The fix is to release the held reference prior to the rnr nak exit.
This is the only sequence that requires both rkey validation and the
buffer allocation on the same packet.

Cc: Stable <stable@vger.kernel.org> # 4.7+
Tested-by: Tadeusz Struk <tadeusz.struk@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:04:33 -04:00
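
The reference handling described in the commit above can be sketched in userspace as follows. This is a simplified model, not the qib/hfi1 code: all model_* names are hypothetical, and the success path drops the reference immediately, whereas a real driver may hold it longer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a memory region with a reference count. */
struct model_mr {
    int refcount;
};

enum model_result { MODEL_OK, MODEL_RNR_NAK };

void model_mr_get(struct model_mr *mr)
{
    mr->refcount++;
}

void model_mr_put(struct model_mr *mr)
{
    assert(mr->refcount > 0);
    mr->refcount--;
}

/*
 * Model of handling one "write with immediate" packet.
 * have_buffer says whether a receive buffer is available for the
 * immediate data.
 */
enum model_result model_write_with_imm(struct model_mr *mr, bool have_buffer)
{
    model_mr_get(mr);           /* reference taken by rkey validation */

    if (!have_buffer) {
        model_mr_put(mr);       /* the fix: release before the RNR NAK exit */
        return MODEL_RNR_NAK;
    }

    /* ... data placement would happen here ... */
    model_mr_put(mr);           /* simplification: release right away */
    return MODEL_OK;
}
```

Without the put on the RNR NAK path, each retry by the peer would raise the count by one and the region could never be released.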
Byczkowski, Jakub b3e6b4bdbb RDMA/hfi1: Defer setting VL15 credits to link-up interrupt
Keep VL15 credits at 0 during LNI, before link-up. Store
VL15 credits value during verify cap interrupt and set
it after link-up. This addresses an issue where VL15 MAD
packets could be sent by one side of the link before
the other side is ready to receive them.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:04:20 -04:00
Steven L. Roberts e4785b0633 RDMA/hfi1: change PCI bar addr assignments to Linux API functions
The Omni-Path adapter driver fails to load on the ppc64le platform
due to invalid PCI setup.

This patch makes the PCI configuration more robust and will
fix 64 bit addressing for ppc64le.

Signed-off-by: Steven L Roberts <robers97@gmail.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:03:40 -04:00
Steven L. Roberts c4dd4b69f5 RDMA/hfi1: fix array termination by appending NULL to attr array
This fixes a kernel panic when loading the hfi driver as a dynamic module.

Signed-off-by: Steven L Roberts <robers97@gmail.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:03:19 -04:00
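
Why a missing terminator matters can be seen in a small userspace sketch of a NULL-terminated pointer array. The model_* names are hypothetical and the code is not taken from the hfi1 driver; it only shows that a consumer walking the array stops at the NULL sentinel and would otherwise run past the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute type for illustration only. */
struct model_attr {
    const char *name;
};

struct model_attr model_attr_a = { "a" };
struct model_attr model_attr_b = { "b" };

/* The trailing NULL is the sentinel that consumers rely on. */
struct model_attr *model_attrs[] = {
    &model_attr_a,
    &model_attr_b,
    NULL,
};

/* Walk the array until the sentinel; there is no separate length. */
size_t model_count_attrs(struct model_attr **attrs)
{
    size_t n = 0;

    assert(attrs != NULL);
    while (attrs[n] != NULL)
        n++;
    return n;
}
```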
Raju Rangoju 98b80a2a73 RDMA/iw_cxgb4: fix the calculation of ipv6 header size
Take care of ipv6 checks while computing header length for deducing mtu
size of ipv6 servers. Due to the incorrect header length computation for
ipv6 servers, wrong mss is reported to the peer (client).

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:03:02 -04:00
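
The effect of the address family on the advertised MSS in the commit above can be shown with a short arithmetic sketch. It assumes headers without options (20 bytes for IPv4, 40 for IPv6, 20 for TCP); the model_* names are hypothetical and this is not the iw_cxgb4 computation itself.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    MODEL_IPV4_HDR_LEN = 20,  /* IPv4 header without options */
    MODEL_IPV6_HDR_LEN = 40,  /* fixed IPv6 header */
    MODEL_TCP_HDR_LEN  = 20,  /* TCP header without options */
};

/* Combined network + transport header length for the address family. */
unsigned int model_hdr_len(bool is_ipv6)
{
    return (is_ipv6 ? MODEL_IPV6_HDR_LEN : MODEL_IPV4_HDR_LEN) +
           MODEL_TCP_HDR_LEN;
}

/* MSS that fits in one MTU-sized packet. */
unsigned int model_mss(unsigned int mtu, bool is_ipv6)
{
    unsigned int hdr = model_hdr_len(is_ipv6);

    assert(mtu > hdr);
    return mtu - hdr;
}
```

With a 1500-byte MTU, using the IPv4 header length for an IPv6 listener would overstate the MSS by 20 bytes.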
Ganesh Goudar 4bbfabede5 RDMA/iw_cxgb4: calculate t4_eq_status_entries properly
Use egrstatuspagesize to calculate t4_eq_status_entries.

Fixes: bb58d07964 ("cxgb4: Update IngPad and IngPack values")
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:02:50 -04:00
Raju Rangoju 1dad0ebeea RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers
The patch 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure()) from May 6, 2016
leads to the following static checker warning:
	drivers/infiniband/hw/cxgb4/cm.c:575 abort_arp_failure()
	warn: passing freed memory 'skb'

Also fixes skb leak when l2t resolution fails

Fixes: 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure())
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:01:28 -04:00
Tatyana Nikolova f863de7de3 RDMA/nes: ACK MPA Reply frame
Explicitly ACK the MPA Reply frame so the peer
does not retransmit the frame.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:01:24 -04:00
Tatyana Nikolova 0e5fc90351 RDMA/nes: Don't set 0-length FULPDU RTR indication control flag
Don't set control flag for 0-length FULPDU (Send) RTR indication
in the enhanced MPA Request/Reply frames, because it isn't supported.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:01:14 -04:00
Gustavo A. R. Silva e80bd98d1f RDMA/i40iw: fix duplicated code for different branches
Refactor code to avoid identical code for different branches.

Addresses-Coverity-ID: 1357356
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:00:29 -04:00
Shiraz Saleem f300ba2d1e RDMA/i40iw: Remove MSS change support
MSS change on active QPs is not supported. Store new MSS
value for new QPs only. Remove code to modify MSS on the fly.
This also resolves a crash on QP modify to QP 0.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: i40iw_sc_qp_modify+0x22/0x280 [i40iw]
Oops: 0000 [#1] SMP KASAN
CPU: 2 PID: 1236 Comm: kworker/u16:4 Not tainted 4.12.0-rc1 #5
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H,
BIOS F7 01/17/2014
Workqueue: l2params i40iw_l2params_worker [i40iw]
task: ffff88070f5a9b40 task.stack: ffff88070f5a0000
RIP: 0010:i40iw_sc_qp_modify+0x22/0x280 [i40iw]
...
Call Trace:
i40iw_exec_cqp_cmd+0x2ce/0x410 [i40iw]
? _raw_spin_lock_irqsave+0x6f/0x80
? i40iw_process_cqp_cmd+0x1d/0x80 [i40iw]
i40iw_process_cqp_cmd+0x7c/0x80 [i40iw]
i40iw_handle_cqp_op+0x2f/0x200 [i40iw]
? trace_hardirqs_off+0xd/0x10
? _raw_spin_unlock_irqrestore+0x46/0x50
i40iw_hw_modify_qp+0x5e/0x90 [i40iw]
i40iw_qp_mss_modify+0x52/0x60 [i40iw]
i40iw_change_l2params+0x145/0x160 [i40iw]
i40iw_l2params_worker+0x1f/0x40 [i40iw]
process_one_work+0x1f5/0x650
? process_one_work+0x161/0x650
worker_thread+0x48/0x3b0
kthread+0x112/0x150
? process_one_work+0x650/0x650
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x2e/0x40
Code: 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 41 89 cd 41 54 49 89 fc
53 48 89 f3 48 89 d6 48 83 ec 08 48 8b 87 10 01 00 00 <48> 8b 40 08 4c 8b b0 40 04
00 00 4c 89 f7 e8 1b e5 ff ff 48 85
RIP: i40iw_sc_qp_modify+0x22/0x280 [i40iw] RSP: ffff88070f5a7c28
CR2: 0000000000000008
---[ end trace 77a405931e296060 ]---

Reported-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 16:59:52 -04:00
Mustafa Ismail c0c643e16f RDMA/i40iw: Fix device initialization error path
Some error paths in i40iw_initialize_dev are doing
additional and unnecessary work before exiting.
Correctly free resources allocated prior to error
and return with correct status code.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 16:59:46 -04:00
Tatyana Nikolova b117f47963 RDMA/i40iw: ACK MPA Reject frame
Explicitly ACK the MPA Reject frame so the peer does
not retransmit the frame.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 16:59:40 -04:00
Tatyana Nikolova 5a27fec21b RDMA/i40iw: Don't set 0-length FULPDU RTR indication control flag
Don't set control flag for 0-length FULPDU (Send)
RTR indication in the enhanced MPA Request/Reply
frames, because it isn't supported.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 16:59:22 -04:00
Bart Van Assche b425e50492 block: Avoid that blk_exit_rl() triggers a use-after-free
Since the introduction of .init_rq_fn() and .exit_rq_fn() it is
essential that the memory allocated for struct request_queue
stays around until all blk_exit_rl() calls have finished. Hence
make blk_init_rl() take a reference on struct request_queue.

This patch fixes the following crash:

general protection fault: 0000 [#2] SMP
CPU: 3 PID: 28 Comm: ksoftirqd/3 Tainted: G      D         4.12.0-rc2-dbg+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
task: ffff88013a108040 task.stack: ffffc9000071c000
RIP: 0010:free_request_size+0x1a/0x30
RSP: 0018:ffffc9000071fd38 EFLAGS: 00010202
RAX: 6b6b6b6b6b6b6b6b RBX: ffff880067362a88 RCX: 0000000000000003
RDX: ffff880067464178 RSI: ffff880067362a88 RDI: ffff880135ea4418
RBP: ffffc9000071fd40 R08: 0000000000000000 R09: 0000000100180009
R10: ffffc9000071fd38 R11: ffffffff81110800 R12: ffff88006752d3d8
R13: ffff88006752d3d8 R14: ffff88013a108040 R15: 000000000000000a
FS:  0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa8ec1edb00 CR3: 0000000138ee8000 CR4: 00000000001406e0
Call Trace:
 mempool_destroy.part.10+0x21/0x40
 mempool_destroy+0xe/0x10
 blk_exit_rl+0x12/0x20
 blkg_free+0x4d/0xa0
 __blkg_release_rcu+0x59/0x170
 rcu_process_callbacks+0x260/0x4e0
 __do_softirq+0x116/0x250
 smpboot_thread_fn+0x123/0x1e0
 kthread+0x109/0x140
 ret_from_fork+0x31/0x40

Fixes: commit e9c787e65c ("scsi: allocate scsi_cmnd structures as part of struct request")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-01 13:07:55 -06:00
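
The lifetime rule in the commit above (the queue must outlive every request list that points at it) can be modeled with a plain reference count. This is a userspace sketch with hypothetical model_* names, not the block layer code; a "freed" flag stands in for actually releasing memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a refcounted queue. */
struct model_queue {
    int refcount;
    bool freed;     /* set when the last reference is dropped */
};

/* Hypothetical model of a request list that points back at its queue. */
struct model_rl {
    struct model_queue *q;
};

void model_queue_get(struct model_queue *q)
{
    q->refcount++;
}

void model_queue_put(struct model_queue *q)
{
    assert(q->refcount > 0);
    if (--q->refcount == 0)
        q->freed = true;
}

/* Taking a reference here keeps the queue alive until model_exit_rl(). */
void model_init_rl(struct model_rl *rl, struct model_queue *q)
{
    rl->q = q;
    model_queue_get(q);
}

void model_exit_rl(struct model_rl *rl)
{
    assert(!rl->q->freed);  /* the queue must still be valid here */
    model_queue_put(rl->q);
    rl->q = NULL;
}
```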
Brian Norris 05e97a9eda This pull request contains several fixes to the core and the tango
driver.
 
 tango fixes:
  * Add missing MODULE_DEVICE_TABLE() in tango_nand.c
  * Update the number of corrected bitflips
 
 core fixes:
  * Fix a long standing memory leak in nand_scan_tail()
  * Fix several bugs introduced by the per-vendor init/detection
    infrastructure (introduced in 4.12)
  * Add a static specifier to nand_ooblayout_lp_hamming_ops definition

Merge tag 'nand/fixes-for-4.12-rc3' of git://git.infradead.org/linux-mtd into MTD

From Boris:
"""
This pull request contains several fixes to the core and the tango
driver.

tango fixes:
 * Add missing MODULE_DEVICE_TABLE() in tango_nand.c
 * Update the number of corrected bitflips

core fixes:
 * Fix a long standing memory leak in nand_scan_tail()
 * Fix several bugs introduced by the per-vendor init/detection
   infrastructure (introduced in 4.12)
 * Add a static specifier to nand_ooblayout_lp_hamming_ops definition
"""
2017-06-01 10:53:55 -07:00
Linus Torvalds 9ea15a59c3 Many small x86 bug fixes: SVM segment registers access rights, nested VMX,
preempt notifiers, LAPIC virtual wire mode, NMI injection.

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Many small x86 bug fixes: SVM segment registers access rights, nested
  VMX, preempt notifiers, LAPIC virtual wire mode, NMI injection"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix nmi injection failure when vcpu got blocked
  KVM: SVM: do not zero out segment attributes if segment is unusable or not present
  KVM: SVM: ignore type when setting segment registers
  KVM: nVMX: fix nested_vmx_check_vmptr failure paths under debugging
  KVM: x86: Fix virtual wire mode
  KVM: nVMX: Fix handling of lmsw instruction
  KVM: X86: Fix preempt the preemption timer cancel
2017-06-01 10:48:09 -07:00
Linus Torvalds 0bb230399f Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull Reiserfs and GFS2 fixes from Jan Kara:
 "Fixes to GFS2 & Reiserfs for the fallout of the recent WRITE_FUA
  cleanup from Christoph.

  Fixes for other filesystems were already merged by respective
  maintainers."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  reiserfs: Make flush bios explicitely sync
  gfs2: Make flush bios explicitely sync
2017-06-01 10:45:27 -07:00
Linus Torvalds 393bcfaeb8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "Here are the target-pending fixes for v4.12-rc4:

   - ibmviscsis ABORT_TASK handling fixes that missed the v4.12 merge
     window. (Bryant Ly and Michael Cyr)

   - Re-add a target-core check enforcing WRITE overflow reject that was
     relaxed in v4.3, to avoid unsupported iscsi-target immediate data
     overflow. (nab)

   - Fix a target-core-user OOPs during device removal. (MNC + Bryant
     Ly)

   - Fix a long standing iscsi-target potential issue where kthread exit
     did not wait for kthread_should_stop(). (Jiang Yi)

   - Fix an iscsi-target v3.12.y regression OOPs involving initial login
     PDU processing during asynchronous TCP connection close. (MNC +
     nab)

  This is a little larger than usual for an -rc4, primarily due to the
  iscsi-target v3.12.y regression OOPs bug-fix.

  However, it's an important patch as MNC + Hannes were both able to
  trigger it using a reduced iscsi initiator login timeout combined with
  a backend taking a long time to complete I/Os during iscsi login
  driven session reinstatement"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Always wait for kthread_should_stop() before kthread exit
  iscsi-target: Fix initial login PDU asynchronous socket close OOPs
  tcmu: fix crash during device removal
  target: Re-add check to reject control WRITEs with overflow data
  ibmvscsis: Fix the incorrect req_lim_delta
  ibmvscsis: Clear left-over abort_cmd pointers
2017-06-01 10:40:41 -07:00
Ingo Molnar c08d517480 Revert "x86/PAT: Fix Xorg regression on CPUs that don't support PAT"
This reverts commit cbed27cdf0.

As Andy Lutomirski observed:

 "I think this patch is bogus. pat_enabled() sure looks like it's
  supposed to return true if PAT is *enabled*, and these days PAT is
  'enabled' even if there's no HW PAT support."

Reported-by: Bernhard Held <berny156@gmx.de>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-01 15:52:23 +02:00
ZhuangYanying 47a66eed99 KVM: x86: Fix nmi injection failure when vcpu got blocked
When a spin_lock_irqsave() deadlock occurs inside the guest, vcpu threads
other than the lock-holding one enter the S state because of
pvspinlock. If an NMI is then injected via the libvirt API "inject-nmi",
the NMI cannot be injected into the vm.

The reason is:
1 It sets nmi_queued to 1 when calling ioctl KVM_NMI in qemu, and sets
cpu->kvm_vcpu_dirty to true in do_inject_external_nmi() meanwhile.
2 It sets nmi_queued to 0 in process_nmi(), before entering guest, because
cpu->kvm_vcpu_dirty is true.

It's not enough just to check nmi_queued to decide whether to stay in
vcpu_block() or not. The NMI should be injected immediately in any situation.
Add a check of nmi_pending, and replace the nmi_queued check in
kvm_vcpu_has_events() with a test of KVM_REQ_NMI.

Do the same change for SMIs.

Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-01 11:23:10 +02:00
Roman Pen d9c1b5431d KVM: SVM: do not zero out segment attributes if segment is unusable or not present
This is a fix for the problem [1], where VMCB.CPL was set to 0 and interrupt
was taken on userspace stack.  The root cause lies in the specific AMD CPU
behaviour which manifests itself as unusable segment attributes on SYSRET.
The corresponding work around for the kernel is the following:

61f01dd941 ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue")

In turn, the virtualization side treated an unusable segment incorrectly and
restored CPL from SS attributes, which were zeroed out a few lines above.

The current patch only ensures that the P bit is cleared in VMCB.save state;
segment attributes are not zeroed out if the segment is not present or is
unusable, so CPL can be safely restored from the DPL field.

This is only one part of the fix, since QEMU side should be fixed accordingly
not to zero out attributes on its side.  Corresponding patch will follow.

[1] Message id: CAJrWOzD6Xq==b-zYCDdFLgSRMPM-NkNuTSDFEtX=7MreT45i7Q@mail.gmail.com

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-01 11:21:17 +02:00
Takashi Iwai d2c3b14e1f ALSA: hda - Fix applying MSI dual-codec mobo quirk
The previous commit [63691587f7b0: ALSA: hda - Apply dual-codec quirk
for MSI Z270-Gaming mobo] attempted to apply the existing dual-codec
quirk for an MSI mobo.  But it turned out that this isn't applied
properly due to the MSI-vendor quirk before this entry.  I overlooked
two such MSI entries just because they were put in the wrong position,
although we have a list ordered by PCI SSID numbers.

This patch fixes it by rearranging the unordered entries.

Fixes: 63691587f7 ("ALSA: hda - Apply dual-codec quirk for MSI Z270-Gaming mobo")
Reported-by: Rudolf Schmidt <info@rudolfschmidt.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-06-01 09:46:47 +02:00
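
The ordering problem in the commit above can be reproduced with a small first-match lookup table. This is a hedged sketch: the table layout, the model_* names, and the IDs 0x1234/0xabcd are made up for illustration and are not the ALSA quirk definitions. It shows how a vendor-wide entry placed before a board-specific entry shadows it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical first-match quirk table entry. */
struct model_quirk {
    unsigned short vendor;       /* 0 terminates the table */
    unsigned short device;       /* device ID to match */
    unsigned short device_mask;  /* 0 matches every device of the vendor */
    int fixup;
};

enum {
    MODEL_FIXUP_NONE = -1,
    MODEL_FIXUP_VENDOR = 1,      /* generic vendor-wide fixup */
    MODEL_FIXUP_DUAL_CODEC = 2,  /* board-specific fixup */
};

/* Return the fixup of the first matching entry. */
int model_quirk_lookup(const struct model_quirk *table,
                       unsigned short vendor, unsigned short device)
{
    const struct model_quirk *q;

    assert(table != NULL);
    for (q = table; q->vendor != 0; q++) {
        if (q->vendor == vendor &&
            (device & q->device_mask) == (q->device & q->device_mask))
            return q->fixup;
    }
    return MODEL_FIXUP_NONE;
}

/* Vendor-wide entry first: the specific board is never reached. */
const struct model_quirk model_misordered[] = {
    { 0x1234, 0x0000, 0x0000, MODEL_FIXUP_VENDOR },
    { 0x1234, 0xabcd, 0xffff, MODEL_FIXUP_DUAL_CODEC },
    { 0, 0, 0, 0 },
};

/* Specific entry first: the board gets its own fixup. */
const struct model_quirk model_ordered[] = {
    { 0x1234, 0xabcd, 0xffff, MODEL_FIXUP_DUAL_CODEC },
    { 0x1234, 0x0000, 0x0000, MODEL_FIXUP_VENDOR },
    { 0, 0, 0, 0 },
};
```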
Linus Torvalds a37484638c msm/exynos/i915/amdgpu fixes

Merge tag 'drm-fixes-for-v4.12-rc4' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "This is the main set of fixes for rc4, one amdgpu fix, some exynos
  regression fixes, some msm fixes and some i915 and GVT fixes.

  I've got a second regression fix for some DP chips that might be a
  bit large, but I think we'd like to land it now, I'll send it along
  tomorrow, once you are happy with this set"

* tag 'drm-fixes-for-v4.12-rc4' of git://people.freedesktop.org/~airlied/linux: (24 commits)
  drm/amdgpu: Program ring for vce instance 1 at its register space
  drm/exynos: clean up description of exynos_drm_crtc
  drm/exynos: dsi: Remove bridge node reference in removal
  drm/exynos: dsi: Fix the parse_dt function
  drm/exynos: Merge pre/postclose hooks
  drm/msm: Fix the check for the command size
  drm/msm: Take the mutex before calling msm_gem_new_impl
  drm/msm: for array in-fences, check if all backing fences are from our own context before waiting
  drm/msm: constify irq_domain_ops
  drm/msm/mdp5: release hwpipe(s) for unused planes
  drm/msm: Reuse dma_fence_release.
  drm/msm: Expose our reservation object when exporting a dmabuf.
  drm/msm/gpu: check legacy clk names in get_clocks()
  drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state()
  drm/msm: select PM_OPP
  drm/i915: Stop pretending to mask/unmask LPE audio interrupts
  drm/i915/selftests: Silence compiler warning in igt_ctx_exec
  Revert "drm/i915: Restore lost "Initialized i915" welcome message"
  drm/i915/gvt: clean up unsubmited workloads before destroying kmem cache
  drm/i915/gvt: Disable compression workaround for Gen9
  ...
2017-05-31 21:53:49 -07:00
Dave Airlie 400129f0a3 - Fix a regression to description of exynos_drm_crtc
- Remove preclose hook of Exynos
   . This was an Exynos change of the patch series[1] merged already.
 - Fix one dt broken issue
 - Make sure to release bridge_node of Exynos MIPI-DSI driver.
 
 [1] https://lists.freedesktop.org/archives/dri-devel/2017-March/135111.html

Merge tag 'exynos-drm-fixes-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes

- Fix a regression to description of exynos_drm_crtc
- Remove preclose hook of Exynos
  . This was an Exynos change of the patch series[1] merged already.
- Fix one dt broken issue
- Make sure to release bridge_node of Exynos MIPI-DSI driver.

[1] https://lists.freedesktop.org/archives/dri-devel/2017-March/135111.html

* tag 'exynos-drm-fixes-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos: clean up description of exynos_drm_crtc
  drm/exynos: dsi: Remove bridge node reference in removal
  drm/exynos: dsi: Fix the parse_dt function
  drm/exynos: Merge pre/postclose hooks
2017-06-01 12:07:48 +10:00
Dave Airlie 8ef6fcc8ee Merge branch 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: Program ring for vce instance 1 at its register space
2017-06-01 12:07:18 +10:00
Dave Airlie 58b58f6ef5 Merge branch 'msm-fixes-4.12-rc4' of git://people.freedesktop.org/~robclark/linux into drm-fixes
a few fixes for 4.12..

* 'msm-fixes-4.12-rc4' of git://people.freedesktop.org/~robclark/linux:
  drm/msm: Fix the check for the command size
  drm/msm: Take the mutex before calling msm_gem_new_impl
  drm/msm: for array in-fences, check if all backing fences are from our own context before waiting
  drm/msm: constify irq_domain_ops
  drm/msm/mdp5: release hwpipe(s) for unused planes
  drm/msm: Reuse dma_fence_release.
  drm/msm: Expose our reservation object when exporting a dmabuf.
  drm/msm/gpu: check legacy clk names in get_clocks()
  drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state()
  drm/msm: select PM_OPP
2017-06-01 12:06:34 +10:00
Dave Airlie 25f480e89a Merge tag 'drm-intel-fixes-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
drm/i915 fixes for v4.12-rc4

* tag 'drm-intel-fixes-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: Stop pretending to mask/unmask LPE audio interrupts
  drm/i915/selftests: Silence compiler warning in igt_ctx_exec
  Revert "drm/i915: Restore lost "Initialized i915" welcome message"
  drm/i915/gvt: clean up unsubmited workloads before destroying kmem cache
  drm/i915/gvt: Disable compression workaround for Gen9
  drm/i915: set initialised only when init_context callback is NULL
  drm/i915: Fix new -Wint-in-bool-context gcc compiler warning
  drm/i915: use vma->size for appgtt allocate_va_range
  drm/i915: Do not sync RCU during shrinking
2017-06-01 11:53:34 +10:00
Jiang Yi 5e0cf5e6c4 iscsi-target: Always wait for kthread_should_stop() before kthread exit
There are three timing problems in the kthread usages of iscsi_target_mod:

 - np_thread of struct iscsi_np
 - rx_thread and tx_thread of struct iscsi_conn

In iscsit_close_connection(), it calls

 send_sig(SIGINT, conn->tx_thread, 1);
 kthread_stop(conn->tx_thread);

In conn->tx_thread, which is iscsi_target_tx_thread(), when it receives
SIGINT the kthread will exit without checking the return value of
kthread_should_stop().

So if iscsi_target_tx_thread() exits right between send_sig(SIGINT...)
and kthread_stop(...), the kthread_stop() will try to stop an already
stopped kthread.

This is invalid according to the documentation of kthread_stop().

(Fix -ECONNRESET logout handling in iscsi_target_tx_thread and
 early iscsi_target_rx_thread failure case - nab)

Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-31 15:12:57 -07:00
Nicholas Bellinger 25cdda95fd iscsi-target: Fix initial login PDU asynchronous socket close OOPs
This patch fixes an OOPs originally introduced by:

   commit bb048357da
   Author: Nicholas Bellinger <nab@linux-iscsi.org>
   Date:   Thu Sep 5 14:54:04 2013 -0700

   iscsi-target: Add sk->sk_state_change to cleanup after TCP failure

which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.

To address this issue, this patch makes the following changes.

First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.

Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running.  For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().

The original NULL pointer dereference case reported by MNC is now handled
by iscsi_target_do_login() doing an iscsi_target_sk_check_close() before
transitioning to FFP to determine when the socket has already closed,
or iscsi_target_start_negotiation() if the login needs to exchange
more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
closed.  For both of these cases, the cleanup of remaining connection
resources will occur in iscsi_target_start_negotiation() from iscsi_np
process context once the failure is detected.

Finally, to handle the case where iscsi_target_sk_state_change() is
called after the initial PDU processing is complete, it now invokes
conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.

Reported-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Reported-by: Hannes Reinecke <hare@suse.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Varun Prakash <varun@chelsio.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-31 15:12:31 -07:00
Leo Liu 45cc6586b7 drm/amdgpu: Program ring for vce instance 1 at its register space
We need to program the ring buffer in the instance 1 register space
domain only when instance 1 alone is available. With two instances, or
with only instance 0, we need to program only the instance 0 register
space domain for the ring.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 13:09:15 -04:00
NeilBrown 6ea44adce9 SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()
If you attempt a TCP mount from a host that is unreachable in a way
that triggers an immediate error from kernel_connect(), that error
does not propagate up, instead EAGAIN is reported.

This results in call_connect_status receiving the wrong error.

A case that is easy to demonstrate is to attempt to mount from an
address that results in ENETUNREACH, after first deleting any default
route.
Without this patch, the mount.nfs process is persistently runnable
and is hard to kill.  With this patch it exits as it should.

The problem is caused by the fact that xs_tcp_force_close() eventually
calls
      xprt_wake_pending_tasks(xprt, -EAGAIN);
which causes an error return of -EAGAIN.  So when xs_tcp_setup_socket()
calls
      xprt_wake_pending_tasks(xprt, status);
the status is ignored.
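
A minimal user-space sketch of the ordering problem described above,
assuming that only the first wake-up of a pending task records a status.
The names task_sketch and wake_pending() are hypothetical stand-ins, not
the SUNRPC API, and the second helper shows just one ordering that would
let the real error through; it is not claimed to be what the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task waiting on the transport. */
struct task_sketch {
	bool pending;
	int status;
};

/* Only a task that is still pending records the status it is woken with. */
static inline void wake_pending(struct task_sketch *t, int status)
{
	assert(t != NULL);
	if (!t->pending)
		return;		/* already woken: this status is dropped */
	t->status = status;
	t->pending = false;
}

/* Ordering described in the message: the force-close path wakes first. */
static inline int status_force_close_first(int connect_err)
{
	struct task_sketch t = { .pending = true, .status = 0 };

	wake_pending(&t, -EAGAIN);	/* from the force-close path */
	wake_pending(&t, connect_err);	/* real error arrives too late */
	return t.status;
}

/* Alternative ordering: the real error is delivered before the close. */
static inline int status_real_error_first(int connect_err)
{
	struct task_sketch t = { .pending = true, .status = 0 };

	wake_pending(&t, connect_err);
	wake_pending(&t, -EAGAIN);
	return t.status;
}
```

With the first ordering, an ENETUNREACH failure surfaces as -EAGAIN,
which matches the retry behaviour seen in mount.nfs.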

Fixes: 4efdd92c92 ("SUNRPC: Remove TCP client connection reset hack")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-31 12:26:44 -04:00
Jan Kara 5a8948f8a3 md: Make flush bios explicitely sync
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions.  generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions.

Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
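
The effect can be sketched in user space as follows. The SK_* bit values
and helper names are illustrative assumptions, not the kernel's
definitions; the sketch only assumes that a bio counts as synchronous
when any of the three flags is set and that FUA/PREFLUSH are stripped
when there is no volatile write cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits only; the real kernel values differ. */
#define SK_REQ_SYNC	(1u << 0)
#define SK_REQ_FUA	(1u << 1)
#define SK_REQ_PREFLUSH	(1u << 2)

/* Mimics the stripping step: without a volatile cache, drop FUA/PREFLUSH. */
static inline uint32_t strip_flush_flags(uint32_t opf, bool volatile_cache)
{
	if (!volatile_cache)
		opf &= ~(SK_REQ_FUA | SK_REQ_PREFLUSH);
	return opf;
}

/* A bio is treated as synchronous if any of the three flags remains. */
static inline bool treated_as_sync(uint32_t opf)
{
	return (opf & (SK_REQ_SYNC | SK_REQ_FUA | SK_REQ_PREFLUSH)) != 0;
}
```

A flush bio submitted with only SK_REQ_PREFLUSH loses its synchronous
status after stripping, while one that also carries SK_REQ_SYNC keeps it.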

CC: linux-raid@vger.kernel.org
CC: Shaohua Li <shli@kernel.org>
Fixes: b685d3d65a
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-05-31 09:25:53 -07:00
Linus Torvalds d602fb6844 Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
 "Fix regressions:

   - missing CONFIG_EXPORTFS dependency

   - failure if upper fs doesn't support xattr

   - bad error cleanup

  This also adds the concept of "impure" directories complementing the
  "origin" marking introduced in -rc1. Together they enable getting
  consistent st_ino and d_ino for directory listings.

  And there's a bug fix and a cleanup as well"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: filter trusted xattr for non-admin
  ovl: mark upper merge dir with type origin entries "impure"
  ovl: mark upper dir with type origin entries "impure"
  ovl: remove unused arg from ovl_lookup_temp()
  ovl: handle rename when upper doesn't support xattr
  ovl: don't fail copy-up if upper doesn't support xattr
  ovl: check on mount time if upper fs supports setting xattr
  ovl: fix creds leak in copy up error path
  ovl: select EXPORTFS
2017-05-31 08:29:02 -07:00
Hou Tao 5be6b75610 cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode
When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY
as the delay of cfq_group's vdisktime if there are already other
cfq_groups.

When cfq is under iops mode, commit 9a7f38c42c ("cfq-iosched: Convert
from jiffies to nanoseconds") could result in a large iops delay and
lead to an abnormal io schedule delay for the added cfq_group. To fix
it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5
when iops mode is enabled.

Despite having the same value, the delay of a cfq_queue in idle class
and the delay of cfq_group are different things, so I define two new
macros for the delay of a cfq_group under time-slice mode and iops mode.
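
A rough sketch of the two-macro idea, under stated assumptions: the macro
names, the tick rate passed in as hz, and the nanosecond value used for
time-slice mode are all hypothetical. Only the "HZ / 5" value for iops
mode comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_NSEC_PER_SEC		1000000000ULL

/* Assumed: time-slice mode keeps a nanosecond-based delay (200 ms). */
#define SK_GROUP_DELAY_SLICE_MODE	(SK_NSEC_PER_SEC / 5)
/* From the message: iops mode goes back to the old HZ / 5 value. */
#define SK_GROUP_DELAY_IOPS_MODE(hz)	((uint64_t)(hz) / 5)

/* Pick the vdisktime delay for a newly added group, depending on mode. */
static inline uint64_t group_delay(bool iops_mode, unsigned int hz)
{
	assert(hz > 0);
	if (iops_mode)
		return SK_GROUP_DELAY_IOPS_MODE(hz);
	return SK_GROUP_DELAY_SLICE_MODE;
}
```

With hz = 250 the iops-mode delay is 50, several orders of magnitude
smaller than the nanosecond value, which illustrates why reusing the
nanosecond constant in iops mode produces an abnormally large delay.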

Fixes: 9a7f38c42c ("cfq-iosched: Convert from jiffies to nanoseconds")
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-31 09:25:21 -06:00
Brian Foster 63db7c815b xfs: use ->b_state to fix buffer I/O accounting release race
We've had user reports of unmount hangs in xfs_wait_buftarg() that
analysis shows are due to btp->bt_io_count == -1. bt_io_count
represents the count of in-flight asynchronous buffers and thus
should always be >= 0. xfs_wait_buftarg() waits for this value to
stabilize to zero in order to ensure that all untracked (with
respect to the lru) buffers have completed I/O processing before
unmount proceeds to tear down in-core data structures.

The value of -1 implies an I/O accounting decrement race. Indeed,
the fact that xfs_buf_ioacct_dec() is called from xfs_buf_rele()
(where the buffer lock is no longer held) means that bp->b_flags can
be updated from an unsafe context. While a user-level reproducer is
currently not available, some intrusive hacks to run racing buffer
lookups/ioacct/releases from multiple threads were used to
successfully manufacture this problem.

Existing callers do not expect to acquire the buffer lock from
xfs_buf_rele(). Therefore, we cannot safely update ->b_flags from
this context. It turns out that we already have separate buffer
state bits and associated serialization for dealing with buffer LRU
state in the form of ->b_state and ->b_lock. Therefore, replace the
_XBF_IN_FLIGHT flag with a ->b_state variant, update the I/O
accounting wrappers appropriately and make sure they are used with
the correct locking. This ensures that buffer in-flight state can be
modified at buffer release time without racing with modifications
from a buffer lock holder.

Fixes: 9c7504aa72 ("xfs: track and serialize in-flight async buffers against unmount")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-05-31 08:22:52 -07:00
Jan Kara ff0361b34a dm: make flush bios explicitly sync
Commit b685d3d65a ("block: treat REQ_FUA and REQ_PREFLUSH as
synchronous") removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions.  generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions.

Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.

Fixes: b685d3d65a ("block: treat REQ_FUA and REQ_PREFLUSH as synchronous")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-31 10:50:23 -04:00
Takashi Iwai e49a14fa36 ALSA: usb: Avoid VLA in mixer_us16x08.c
This is another attempt to work around the VLA used in
mixer_us16x08.c.  Basically the temporary array is used individually
for two cases, and we can declare it locally in each block instead of
relying on the hackish max() usage.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-05-31 08:46:19 +02:00
Takashi Iwai 617163fc25 ALSA: usb: Fix a typo in Tascam US-16x08 mixer element
A mixer element created in a quirk for Tascam US-16x08 contains a
typo: it should be "EQ MidLow Q" instead of "EQ MidQLow Q".

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875
Fixes: d2bb390a20 ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-05-31 08:46:16 +02:00
Takashi Iwai 64188cfbe5 Revert "ALSA: usb-audio: purge needless variable length array"
This reverts commit 89b593c30e ("ALSA: usb-audio: purge needless
variable length array").  The patch turned out to cause a severe
regression, triggering an Oops at snd_usb_ctl_msg().  It was overlooked
that snd_usb_ctl_msg() writes back the response to the given buffer,
while the patch changed it to a read-only const buffer.  (One should
always double-check when an extra pointer cast is present...)

As a simple fix, just revert the affected commit.  It was merely a
cleanup.  Although it brings the VLA back, it's clearer as a fix.  We'll
address the VLA later in another patch.

Fixes: 89b593c30e ("ALSA: usb-audio: purge needless variable length array")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-05-31 08:46:14 +02:00
Patrick Venture 7ed1c5e5dd hwmon: (aspeed-pwm-tacho) On read failure return -ETIMEDOUT
When the controller fails to provide an RPM reading within the allotted
time, the driver returns -ETIMEDOUT and no file contents.

Signed-off-by: Patrick Venture <venture@google.com>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-05-30 15:15:40 -07:00
Guenter Roeck 08fd5e76c2 hwmon: (aspeed-pwm-tacho) Select REGMAP
The driver uses regmap and thus has to select it to avoid build
errors such as the following.

drivers/hwmon/aspeed-pwm-tacho.c:337:21: error: variable
	'aspeed_pwm_tacho_regmap_config' has initializer but incomplete type

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Joel Stanley <joel@jms.id.au>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-05-30 15:15:19 -07:00
Linus Torvalds f511c0b17b "Yes, people use FOLL_FORCE ;)"
This effectively reverts commit 8ee74a91ac ("proc: try to remove use
of FOLL_FORCE entirely")

It turns out that people do depend on FOLL_FORCE for the /proc/<pid>/mem
case, and we're talking not just debuggers. Talking to the affected
people, the use-cases are:

Keno Fischer:
 "We used these semantics as a hardening mechanism in the julia JIT. By
  opening /proc/self/mem and using these semantics, we could avoid
  needing RWX pages, or a dual mapping approach. We do have fallbacks to
  these other methods (though getting EIO here actually causes an assert
  in released versions - we'll updated that to make sure to take the
  fall back in that case).

  Nevertheless the /proc/self/mem approach was our favored approach
  because it a) Required an attacker to be able to execute syscalls
  which is a taller order than getting memory write and b) didn't double
  the virtual address space requirements (as a dual mapping approach
  would).

  I think in general this feature is very useful for anybody who needs
  to precisely control the execution of some other process. Various
  debuggers (gdb/lldb/rr) certainly fall into that category, but there's
  another class of such processes (wine, various emulators) which may
  want to do that kind of thing.

  Now, I suspect most of these will have the other process under ptrace
  control, so maybe allowing (same_mm || ptraced) would be ok, but at
  least for the sandbox/remote-jit use case, it would be perfectly
  reasonable to not have the jit server be a ptracer"

Robert O'Callahan:
 "We write to readonly code and data mappings via /proc/.../mem in lots
  of different situations, particularly when we're adjusting program
  state during replay to match the recorded execution.

  Like Julia, we can add workarounds, but they could be expensive."

so not only do people use FOLL_FORCE for both reads and writes, but they
use it for both the local mm and remote mm.

With these comments in mind, we likely cannot add the "are we
actively ptracing" check either, so this keeps the new code organization
and does not do a real revert that would add back the original comment
about "Maybe we should limit FOLL_FORCE to actual ptrace users?"

Reported-by: Keno Fischer <keno@juliacomputing.com>
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-30 12:38:59 -07:00
Keith Busch e4dc2b32df blk-mq: Take tagset lock when updating hw queues
The tagset lock needs to be held when iterating the tag_list, so a
lockdep assert was added when updating number of hardware queues. The
drivers calling this API, however, were unaware of the new requirement,
so are failing the assertion.

This patch takes the lock within the blk-mq function so the drivers do
not have to be modified in order to be safe.

Fixes: 705cda97e ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
Reported-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-30 12:32:37 -06:00
Gioh Kim 8eae9570d1 KVM: SVM: ignore type when setting segment registers
Commit 19bca6ab75 ("KVM: SVM: Fix cross vendor migration issue with
unusable bit") added a check of the type field when setting unusable.
So unusable can be set if present is 0 OR type is 0.
According to the AMD processor manual, long mode ignores the type value
in the segment descriptor. And type can be 0 for a read-only data
segment. Therefore the type value is not related to the unusable flag.
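
The difference can be sketched as below. struct seg_sketch and both
helper names are hypothetical simplifications, not the kernel's segment
structure or the code of either commit; the sketch assumes the new
behaviour derives unusable from the present bit alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a segment register's attributes. */
struct seg_sketch {
	uint8_t type;
	bool present;
};

/* Rule described for the earlier commit: !present OR type == 0. */
static inline bool unusable_with_type_check(const struct seg_sketch *s)
{
	assert(s != NULL);
	return !s->present || s->type == 0;
}

/* Assumed rule after this change: only the present bit decides. */
static inline bool unusable_ignoring_type(const struct seg_sketch *s)
{
	assert(s != NULL);
	return !s->present;
}
```

A present, read-only data segment with type 0 is reported unusable by the
first rule but stays usable under the second.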

This patch is based on linux-next v4.12.0-rc3.

Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-30 17:17:22 +02:00