Commit Graph

617334 Commits

Author SHA1 Message Date
Alex Lyakas eecba891d3 btrfs: flush_space: treat return value of do_chunk_alloc properly
do_chunk_alloc returns 1 when it succeeds to allocate a new chunk.
But flush_space will not convert this to 0, and will also return 1.
As a result, reserve_metadata_bytes will think that flush_space failed,
and may potentially return this value "1" to the caller (depends how
reserve_metadata_bytes was called). The caller will also treat this as an error.
For example, btrfs_block_rsv_refill does:

int ret = -ENOSPC;
...
ret = reserve_metadata_bytes(root, block_rsv, num_bytes, flush);
if (!ret) {
        block_rsv_add_bytes(block_rsv, num_bytes, 0);
        return 0;
}

return ret;

So it will return -ENOSPC.

Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25 03:58:18 -07:00
Liu Bo f3bca8028b Btrfs: add ASSERT for block group's memory leak
This adds several ASSERT()' s to report memory leak of block group cache.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25 03:58:17 -07:00
Qu Wenruo d8422ba334 btrfs: backref: Fix soft lockup in __merge_refs function
When over 1000 file extents refers to one extent, find_parent_nodes()
will be obviously slow, due to the O(n^2)~O(n^3) loops inside
__merge_refs().

The following ftrace shows the cubic growth of execution time:

256 refs
 5) + 91.768 us   |  __add_keyed_refs.isra.12 [btrfs]();
 5)   1.447 us    |  __add_missing_keys.isra.13 [btrfs]();
 5) ! 114.544 us  |  __merge_refs [btrfs]();
 5) ! 136.399 us  |  __merge_refs [btrfs]();

512 refs
 6) ! 279.859 us  |  __add_keyed_refs.isra.12 [btrfs]();
 6)   3.164 us    |  __add_missing_keys.isra.13 [btrfs]();
 6) ! 442.498 us  |  __merge_refs [btrfs]();
 6) # 2091.073 us |  __merge_refs [btrfs]();

and 1024 refs
 7) ! 368.683 us  |  __add_keyed_refs.isra.12 [btrfs]();
 7)   4.810 us    |  __add_missing_keys.isra.13 [btrfs]();
 7) # 2043.428 us |  __merge_refs [btrfs]();
 7) * 18964.23 us |  __merge_refs [btrfs]();

And sort them into the following char:
(Unit: us)
------------------------------------------------------------------------
 Trace function        | 256 ref        | 512 refs      | 1024 refs    |
------------------------------------------------------------------------
 __add_keyed_refs      | 91             | 249           | 368          |
 __add_missing_keys    | 1              | 3             | 4            |
 __merge_refs 1st call | 114            | 442           | 2043         |
 __merge_refs 2nd call | 136            | 2091          | 18964        |
------------------------------------------------------------------------

We can see the that __add_keyed_refs() grows almost in linear behavior.
And __add_missing_keys() in this case doesn't change much or takes much
time.

While for the 1st __merge_refs() it's square growth
for the 2nd __merge_refs() call it's cubic growth.

It's no doubt that merge_refs() will take a long long time to execute if
the number of refs continues its grows.

So add a cond_resced() into the loop of __merge_refs().

Although this will solve the problem of soft lockup, we need to use the
new rb_tree based structure introduced by Lu Fengqi to really solve the
long execution time.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25 03:58:16 -07:00
Liu Bo 1c1ea4f781 Btrfs: fix memory leak of reloc_root
When some critical errors occur and FS would be flipped into RO,
if we have an on-going balance, we can end up with a memory leak
of root->reloc_root since btrfs_drop_snapshots() bails out
without freeing reloc_root at the very early start.

However, we're not able to free reloc_root in btrfs_drop_snapshots()
because its caller, merge_reloc_roots(), still needs to access it to
cleanup reloc_root's rbtree.

This makes us free reloc_root when we're going to free fs/file roots.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25 03:58:15 -07:00
Mark Rutland fd363bd417 arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE
When CONFIG_RANDOMIZE_BASE is selected, we modify the page tables to remap the
kernel at a newly-chosen VA range. We do this with the MMU disabled, but do not
invalidate TLBs prior to re-enabling the MMU with the new tables. Thus the old
mappings entries may still live in TLBs, and we risk violating
Break-Before-Make requirements, leading to TLB conflicts and/or other issues.

We invalidate TLBs when we uninsall the idmap in early setup code, but prior to
this we are subject to issues relating to the Break-Before-Make violation.

Avoid these issues by invalidating the TLBs before the new mappings can be
used by the hardware.

Fixes: f80fb3a3d5 ("arm64: add support for kernel ASLR")
Cc: <stable@vger.kernel.org> # 4.6+
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-25 11:11:32 +01:00
Linus Torvalds 61c04572de Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal fixes from Zhang Rui:

 - Fix cpu_cooling to have separate thermal_cooling_device_ops
   structures for cpus with and without power model, to avoid NULL
   dereference in cpufreq_state2power.  From Brendan Jackman.

 - Fix a possible NULL dereference in imx_thermal driver.  From Corentin
   LABBE.

 - Another two trivial fixes, one typo fix and one deleting module
   owner.  From Caesar Wang and Markus Elfring.

* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal: imx: fix a possible NULL dereference
  thermal: trivial: fix the typo
  Thermal-INT3406: Delete owner assignment
  thermal: cpu_cooling: Fix NULL dereference in cpufreq_state2power
2016-08-25 05:49:38 -04:00
Felipe Balbi 6f8245b4e3 usb: dwc3: gadget: always decrement by 1
We need to decrement in both cases (enq > deq and
enq < deq)

Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25 12:30:27 +03:00
Felipe Balbi 696fe69d7e usb: dwc3: debug: fix ep name on trace output
There was a typo when generating endpoint name which
would be very confusing when debugging. Fix it.

Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25 10:36:25 +03:00
Felipe Balbi 23fd537c95 usb: gadget: udc: core: don't starve DMA resources
Always unmap all SG entries as required by DMA API

Fixes: a698908d3b ("usb: gadget: add generic map/unmap request utilities")
Cc: <stable@vger.kernel.org> # v3.4+
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25 10:36:24 +03:00
Dave Airlie 179ca3bb2c Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
radeon and amdgpu fixes for 4.8.  Nothing major:
- fix a performance regression due to the LRU changes in 4.7
- 32 bit fixes
- fix a PLL regression
- misc bug fixes

* 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: skip TV/CV in display parsing
  drm/amdgpu: avoid a possible array overflow
  drm/amdgpu: fix lru size grouping v2
  drm/amdgpu: fix timeout value check in amd_sched_job_recovery
  drm/amdgpu: fix sdma_v2_4_ring_test_ib
  drm/amdgpu: fix amdgpu_move_blit on 32bit systems
  drm/radeon: fix radeon_move_blit on 32bit systems
  drm/radeon: only apply the SS fractional workaround to RS[78]80
2016-08-25 12:50:30 +10:00
Dave Airlie 30bffd1b44 drm/tegra: Fixes for v4.8-rc4
This contains one fix for DSI runtime power management support that was
 introduced in v4.8-rc1. This is slightly more elaborate than I would've
 wished, but there are a few corner cases that needed fixing.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIwBAABCAAaBQJXvaiSExx0cmVkaW5nQG52aWRpYS5jb20ACgkQ3SOs138+s6Hv
 jg/6AnYkczQYraAb8Oy2YO3dIfz3r0TLGEEnY3AKudur867mNG0V59JcUXHQYKin
 FL7z4iIEx21RAtA9gUEc7xz9gitXP2W/GqRG6v1u0wLn+GS72TuNZCsqUVPQ6Glr
 yg0uLPoxsFyLa3rsTgif5jXxpsbSd6OV2NdkpImqt1EQLaOWYSNqLLq/YNvMsvCK
 QivRvvJEYznIyJayIN/PDekfYY3L5esMQNdDEI9jJKdJrTykQThx2eOIbsgY4MyC
 sdp/h9R23O1ZFvfGONrQHOOzDGumhDJtsRfew+q9JoBaOOgztk2LmPpRZFQsIrmV
 1rv1ahQh3boPWw3cSoQMCE8E3Eu2wD/zNrWYbKyhjChx/HTaSPyQGwk0v2WycizA
 KnNo3eQOTT/5Wb/KvhCaef4Q+yc4LbcAC8T3pkYyGVy+yhUOXr4SGOQLh4Za0Sag
 uaM20wk3TPYAiyUKZFE7RZ1WHtOp56gOIHbS5eV5ppuR2lmoCh4vK5yCojaz4gww
 DAHzldegReurS3du613KrV63lue1r3nwX1f0kKDsWd4yN0FH/74g/QC1jXKqJmmk
 ZevqKCQ1eoaPyOMQI2G0rvRL4B63s800oHO2L760e5SdbM1/Oraxzc+FobGZz9KW
 Y55SyNnn3QT5HcH6dntmisnuSbZVXVNonEJaPJWFLH/ChgI=
 =lyq1
 -----END PGP SIGNATURE-----

Merge tag 'drm/tegra/for-4.8-rc4' of git://anongit.freedesktop.org/tegra/linux into drm-fixes

drm/tegra: Fixes for v4.8-rc4

This contains one fix for DSI runtime power management support that was
introduced in v4.8-rc1. This is slightly more elaborate than I would've
wished, but there are a few corner cases that needed fixing.

* tag 'drm/tegra/for-4.8-rc4' of git://anongit.freedesktop.org/tegra/linux:
  drm/tegra: dsi: Enhance runtime power management
2016-08-25 12:49:22 +10:00
Chuck Lever 16590a2281 SUNRPC: Silence WARN_ON when NFSv4.1 over RDMA is in use
Using NFSv4.1 on RDMA should be safe, so broaden the new checks in
rpc_create().

WARN_ON_ONCE is used, matching most other WARN call sites in clnt.c.

Fixes: 39a9beab5a ("rpc: share one xps between all backchannels")
Fixes: d50039ea5e ("nfsd4/rpc: move backchannel create logic...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-08-24 22:32:55 -04:00
Heinz Mauelshagen 9c5a559d94 dm log: fix unitialized bio operation flags
Commit e6047149db ("dm: use bio op accessors") switched DM over to
using bio_set_op_attrs() but didn't take care to initialize
lc->io_req.bi_op_flags in dm-log.c:rw_header().  This caused
rw_header()'s call to dm_io() to make bio->bi_op_flags be uninitialized
in dm-io.c:do_region(), which ultimately resulted in a SCSI BUG() in
sd_init_command().

Also, adjust rw_header() and its callers to use REQ_OP_{READ|WRITE}.

Fixes: e6047149db ("dm: use bio op accessors")
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-08-24 21:55:05 -04:00
Mike Snitzer 299f6230bc dm flakey: fix reads to be issued if drop_writes configured
v4.8-rc3 commit 99f3c90d0d ("dm flakey: error READ bios during the
down_interval") overlooked the 'drop_writes' feature, which is meant to
allow reads to be issued rather than errored, during the down_interval.

Fixes: 99f3c90d0d ("dm flakey: error READ bios during the down_interval")
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
2016-08-24 21:55:05 -04:00
Xing Zheng a45f9d41c9 clk: rockchip: mark aclk_emmc_noc as a critical clock on rk3399
We don't have code to handle any of the noc clocks in rk3399 and they're
all just listed as critical clocks.  Let's do the same for
aclk_emmc_noc.

Without this clock being marked as critical we have problems around
suspend/resume after commit 20c389e656 ("clk: rockchip: fix incorrect
aclk_emmc source gate bits on rk3399").  Before that change we were
presumably not actually gating any of these clocks because we were
setting the wrong gate.

Fixes: 20c389e656 ("clk: rockchip: fix incorrect aclk_emmc source gate bits on rk3399")
Signed-off-by: Xing Zheng <zhengxing@rock-chips.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2016-08-24 23:44:49 +02:00
Jens Axboe 0e87e58bf6 blk-mq: improve warning for running a queue on the wrong CPU
__blk_mq_run_hw_queue() currently warns if we are running the queue on a
CPU that isn't set in its mask. However, this can happen if a CPU is
being offlined, and the workqueue handling will place the work on CPU0
instead. Improve the warning so that it only triggers if the batch cpu
in the hardware queue is currently online.  If it triggers for that
case, then it's indicative of a flow problem in blk-mq, so we want to
retain it for that case.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-24 15:38:01 -06:00
Jens Axboe e57690fe00 blk-mq: don't overwrite rq->mq_ctx
We do this in a few places, if the CPU is offline. This isn't allowed,
though, since on multi queue hardware, we can't just move a request
from one software queue to another, if they map to different hardware
queues. The request and tag isn't valid on another hardware queue.

This can happen if plugging races with CPU offlining. But it does
no harm, since it can only happen in the window where we are
currently busy freezing the queue and flushing IO, in preparation
for redoing the software <-> hardware queue mappings.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-24 15:34:35 -06:00
Doug Ledford 716b076ba4 IB/srpt: Update sport->port_guid with each port refresh
If port_guid is set with the default subnet_prefix, then we get a change
event and run a port refresh, we don't update the port_guid.  As a
result, attempts to create a target device that uses the new
subnet_prefix in the wwn will fail to find a match and be rejected by
the ib_srpt driver.  This makes it impossible to configure a port if it
was initialized with a default subnet_prefix and later changed to any
non-default subnet-prefix.  Updating the port refresh task to always
update the wwn based upon the current subnext_prefix solves this
problem.

Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: nab@linux-iscsi.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24 16:51:16 -04:00
Linus Torvalds 4935e04ef4 Merge branch 'for-linus-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fix from Richard Weinberger:
 "This contains a fix for a build regression introduced during the merge
  window"

* 'for-linus-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Don't discard .text.exit section
2016-08-24 16:04:59 -04:00
Linus Torvalds 94ef71a99a This pull requests contains fixes for two issues in UBI and UBIFS:
1. Wrong UBIFS assertion.
 2. A UBIFS xattr regression.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJXvflzAAoJEEtJtSqsAOnWs8QP/jEgGY5QcuvdIA+ymFFFeZ8o
 /8YwzbLula+M5T7trMSaRmT5AW5iwY/Xu/VVpVipKMVkAP7079jLjJljOckwriut
 FY60BoQ8VcxwFPRn5xMDJ6KdDMAzVFX10j/+h71VrlE6Ej4nu8XVYGYiRdnjTYiF
 JdxvuWgIDmycRT6bH69c5ZSNpMuOPpCydX0CbWEFk9P07BKL2+9inpPBGRJxy8y8
 abT4ByCJmZWjruzjeBrR4o9A2hrDTrlHPH2RzQgJXCDKntM8AjsCCCReHbhrCKLo
 QmZh+8l8N8HN4GQczcrbTSL0EZn3IsbAS6Ut03NOPcSb+kWjaH5Hcr2kEUEFA7R4
 myKfFe6/BorgHX4QTqNiX7r0y63YepcIFAIUnfzv4wya8p+IGAunouZ+D6e+3BPy
 ICUv3oGqDZkI/fqc5h3cU9RLF7fOvdAtqO+M/lInwFPbqv3jJoVxuC4oD7PEh/eY
 n7VNeVyYr+8JZ5MKBU3zYHRNyHDYME+wTpNGu/4fJR1Rsym523L7hka3zQDkDqs3
 xqFoln1BcRKT/kMTKubK5dLAWaRv68RuMZTPRDeSBrY5vY7jYTYg/42eXnmcOzeP
 vi8L3p8o2CSLHTjeX+Q0lHAw/Ppy/FFowUEx03huj0C+5LNtY6RvIvBtC3YABR8U
 nC92PnhRhl9gUvkDuii9
 =8pdh
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.8-rc4' of git://git.infradead.org/linux-ubifs

Pull UBIFS fixes from Richard Weinberger:
 "This pull requests contains fixes for two issues in UBI and UBIFS:

   - wrong UBIFS assertion.
   - a UBIFS xattr regression"

* tag 'upstream-4.8-rc4' of git://git.infradead.org/linux-ubifs:
  ubifs: Fix xattr generic handler usage
  ubifs: Fix assertion in layout_in_gaps()
2016-08-24 15:54:41 -04:00
Mark Brown cfb89f2e75 Merge remote-tracking branches 'asoc/fix/max98371', 'asoc/fix/nau8825', 'asoc/fix/omap', 'asoc/fix/samsung', 'asoc/fix/simple' and 'asoc/fix/wm2000' into asoc-linus 2016-08-24 19:05:25 +01:00
Mark Brown e8f0f8aa4e Merge remote-tracking branches 'asoc/fix/atmel', 'asoc/fix/compress', 'asoc/fix/da7213' and 'asoc/fix/debugfs' into asoc-linus 2016-08-24 19:05:22 +01:00
Mark Brown d520519518 Merge remote-tracking branch 'asoc/fix/rcar' into asoc-linus 2016-08-24 19:05:21 +01:00
Mark Brown a74306fe94 Merge remote-tracking branch 'asoc/fix/intel' into asoc-linus 2016-08-24 19:05:20 +01:00
Mark Brown b5db6c57c9 Merge remote-tracking branch 'asoc/fix/dapm' into asoc-linus 2016-08-24 19:05:18 +01:00
Mark Brown ae16842306 Merge remote-tracking branch 'asoc/fix/core' into asoc-linus 2016-08-24 19:05:17 +01:00
Alex Deucher 611a1507fe drm/amdgpu: skip TV/CV in display parsing
No asics supported by amdgpu support analog TV.

Workaround for bug:
https://bugs.freedesktop.org/show_bug.cgi?id=97460

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2016-08-24 14:04:38 -04:00
Linus Torvalds fe2dd21282 xen: regression fix for 4.8-rc3
- Fix a regression in the xenbus device preventing userspace tools
   from working.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXvdugAAoJEFxbo/MsZsTRAwEH/AiKLV4T0OiARv/df827WVnL
 obUmEAh/wVSWZh2xdUNurDOH64lEfeBDSBIpGPQMLGmXLzNEQO9u8ZJYWJ7R1Ryp
 JU37lu3DP7HqQqTXsy8ltgcBkwVaQZAo0GRtDeua80ZPdjulnZirwHWS48TuNIFF
 pVtW4Eoy1BNAVri55o5hOIub4HUKMRoNB/J+o+SKLyJEvOon+qD4pOfIhR3sqeja
 oYVX7QpY/4Miymd5uI9v8LUefS4PW/U58a7tjr414Ng4mzQbZOHDmNyWF0CH27lj
 INAmgMXDG7RtiSQMWPKtDQUvuefApKoeRmFr6mQ/xHyCX3cAzOw07+p0rKacCig=
 =PTX1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen regression fix from David Vrabel:
 "Fix a regression in the xenbus device preventing userspace tools from
  working"

* tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: change the type of xen_vcpu_id to uint32_t
  xenbus: don't look up transaction IDs for ordinary writes
2016-08-24 14:04:30 -04:00
Alex Deucher e1718d97aa drm/amdgpu: avoid a possible array overflow
When looking up the connector type make sure the index
is valid.  Avoids a later crash if we read past the end
of the array.

Workaround for bug:
https://bugs.freedesktop.org/show_bug.cgi?id=97460

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2016-08-24 14:04:24 -04:00
Vince Hsu af7c388a9c clk: tegra: remove TEGRA_PLL_USE_LOCK for PLLD/PLLD2
Tegra114 has a HW bug that the PLLD/PLLD2 lock bit cannot be asserted when
the DIS power domain is during up-powergating process but the clamp to this
domain is not removed yet. That causes a timeout and aborts the power
sequence, although the PLLD/PLLD2 has already locked. To remove the false
alarm, we don't use the lock for PLLD/PLLD2. Just wait 1ms and treat the
clocks as locked.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-08-24 10:54:17 -07:00
Shaohua Li 45c91d808f raid5: avoid unnecessary bio data set
bio_reset doesn't change bi_io_vec and bi_max_vecs, so we don't need to
set them every time. bi_private will be set before the bio is
dispatched.

Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24 10:21:53 -07:00
Shaohua Li 5f9d1fde7d raid5: fix memory leak of bio integrity data
Yi reported a memory leak of raid5 with DIF/DIX enabled disks. raid5
doesn't alloc/free bio, instead it reuses bios. There are two issues in
current code:
1. the code calls bio_init (from
init_stripe->raid5_build_block->bio_init) then bio_reset (ops_run_io).
The bio is reused, so likely there is integrity data attached. bio_init
will clear a pointer to integrity data and makes bio_reset can't release
the data
2. bio_reset is called before dispatching bio. After bio is finished,
it's possible we don't free bio's integrity data (eg, we don't call
bio_reset again)
Both issues will cause memory leak. The patch moves bio_init to stripe
creation and bio_reset to bio end io. This will fix the two issues.

Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24 10:21:52 -07:00
Tomasz Majchrzak 27028626b4 raid10: record correct address of bad block
For failed write request record block address on a device, not block
address in an array.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24 10:21:51 -07:00
Wei Yongjun 0f6187dbe5 md-cluster: fix error return code in join()
Fix to return error code -ENOMEM from the lockres_init() error
handling case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24 10:21:51 -07:00
Song Liu 486b0f7bcd r5cache: set MD_JOURNAL_CLEAN correctly
Currently, the code sets MD_JOURNAL_CLEAN when the array has
MD_FEATURE_JOURNAL and the recovery_cp is MaxSector. The array
will be MD_JOURNAL_CLEAN even if the journal device is missing.

With this patch, the MD_JOURNAL_CLEAN is only set when the journal
device presents.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24 10:21:50 -07:00
Vitaly Kuznetsov 55467dea29 xen: change the type of xen_vcpu_id to uint32_t
We pass xen_vcpu_id mapping information to hypercalls which require
uint32_t type so it would be cleaner to have it as uint32_t. The
initializer to -1 can be dropped as we always do the mapping before using
it and we never check the 'not set' value anyway.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24 18:17:27 +01:00
Jan Beulich 9a035a40f7 xenbus: don't look up transaction IDs for ordinary writes
This should really only be done for XS_TRANSACTION_END messages, or
else at least some of the xenstore-* tools don't work anymore.

Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition")
Reported-by: Richard Schütz <rschuetz@uni-koblenz.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Richard Schütz <rschuetz@uni-koblenz.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24 18:16:18 +01:00
Yotam Gigi 51af96b534 mlxsw: router: Enable neighbors to be created on stacked devices
Make the function mlxsw_router_neigh_construct search the rif according
to the neighbour dev other than the dev that was passed to the ndo, thus
allowing creating neigbhours upon stacked devices.

Fixes: 6cf3c971dc ("mlxsw: spectrum_router: Add private neigh table")
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-24 09:39:04 -07:00
Ido Schimmel f888f58795 mlxsw: spectrum: Add missing flood to router port
In case we have a layer 3 interface on top of a bridge (VLAN / FID RIF),
then we should flood the following packet types to the router:

* Broadcast: If DIP is the broadcast address of the interface, then we
need to be able to get it to CPU by trapping it following route lookup.

* Reserved IP multicast (224.0.0.X): Some control packets (e.g. OSPF)
use this range and are trapped in the router block.

Fixes: 99f44bb352 ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-24 09:39:03 -07:00
Selvin Xavier 3c199b4523 RDMA/ocrdma: Fix the max_sge reported from FW
Current driver is reporting wrong values for max_sge and
max_sge_rd in query_device. This breaks the nfs rdma and iser
in some device profiles. Fixing the driver to report
correct values from FW.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24 11:31:40 -04:00
Mustafa Ismail 433c58139f i40iw: Avoid writing to freed memory
iwpbl->iwmr points to the structure that contains iwpbl,
which is iwmr. Setting this to NULL would result in
writing to freed memory. So just free iwmr, and return.

Fixes: d374984179 ("i40iw: add files for iwarp interface")

Reported-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24 11:31:40 -04:00
Mustafa Ismail d41d0910d9 i40iw: Fix double free of allocated_buffer
Memory allocated for iwqp; iwqp->allocated_buffer is freed twice in
the create_qp error path. Correct this by having it freed only once in
i40iw_free_qp_resources().

Fixes: d374984179 ("i40iw: add files for iwarp interface")

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24 11:31:39 -04:00
Chris Wilson 82d200cc6f IB/mlx5: Remove superfluous include of io-mapping.h
This file does not use any structs or functions defined by io-mapping.h
(nor does it directly use iomap, ioremap, iounamp or friends). Remove it
to simplify verification of changes to io-mapping.h

The include existed since its inception in

commit e126ba97db
Author: Eli Cohen <eli@mellanox.com>
Date:   Sun Jul 7 17:25:49 2013 +0300

    mlx5: Add driver for Mellanox Connect-IB adapters

which looks like a copy across from the Mellanox ethernet driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Eli Cohen <eli@mellanox.com>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Matan Barak <matanb@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24 11:30:39 -04:00
Mustafa Ismail 7eaf8313b1 i40iw: Do not set self-referencing pointer to NULL after kfree
In i40iw_free_virt_mem(), do not set mem->va to NULL
after freeing it as mem->va is a self-referencing pointer
to mem.

Fixes: 4e9042e647 ("i40iw: add hw and utils files")

Reported-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24 11:25:34 -04:00
Shiraz Saleem 5dfd5e5e3b i40iw: Add missing NULL check for MPA private data
Add NULL check for pdata and pdata->addr before the memcpy in
i40iw_form_cm_frame(). This fixes a NULL pointer de-reference
which occurs when the MPA private data pointer is NULL. Also
only copy pdata->size bytes in the memcpy to prevent reading
past the length of the private data buffer provided by upper layer.

Fixes: f27b4746f3 ("i40iw: add connection management code")

Reported-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24 11:21:51 -04:00
Daniel Borkmann dbb50887c8 Bluetooth: split sk_filter in l2cap_sock_recv_cb
During an audit for sk_filter(), we found that rx_busy_skb handling
in l2cap_sock_recv_cb() and l2cap_sock_recvmsg() looks not quite as
intended.

The assumption from commit e328140fda ("Bluetooth: Use event-driven
approach for handling ERTM receive buffer") is that errors returned
from sock_queue_rcv_skb() are due to receive buffer shortage. However,
nothing should prevent doing a setsockopt() with SO_ATTACH_FILTER on
the socket, that could drop some of the incoming skbs when handled in
sock_queue_rcv_skb().

In that case sock_queue_rcv_skb() will return with -EPERM, propagated
from sk_filter() and if in L2CAP_MODE_ERTM mode, wrong assumption was
that we failed due to receive buffer being full. From that point onwards,
due to the to-be-dropped skb being held in rx_busy_skb, we cannot make
any forward progress as rx_busy_skb is never cleared from l2cap_sock_recvmsg(),
due to the filter drop verdict over and over coming from sk_filter().
Meanwhile, in l2cap_sock_recv_cb() all new incoming skbs are being
dropped due to rx_busy_skb being occupied.

Instead, just use __sock_queue_rcv_skb() where an error really tells that
there's a receive buffer issue. Split the sk_filter() and enable it for
non-segmented modes at queuing time since at this point in time the skb has
already been through the ERTM state machine and it has been acked, so dropping
is not allowed. Instead, for ERTM and streaming mode, call sk_filter() in
l2cap_data_rcv() so the packet can be dropped before the state machine sees it.

Fixes: e328140fda ("Bluetooth: Use event-driven approach for handling ERTM receive buffer")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-08-24 16:55:04 +02:00
Frederic Dalleau 9afee94939 Bluetooth: Fix memory leak at end of hci requests
In hci_req_sync_complete the event skb is referenced in hdev->req_skb.
It is used (via hci_req_run_skb) from either __hci_cmd_sync_ev which will
pass the skb to the caller, or __hci_req_sync which leaks.

unreferenced object 0xffff880005339a00 (size 256):
  comm "kworker/u3:1", pid 1011, jiffies 4294671976 (age 107.389s)
  backtrace:
    [<ffffffff818d89d9>] kmemleak_alloc+0x49/0xa0
    [<ffffffff8116bba8>] kmem_cache_alloc+0x128/0x180
    [<ffffffff8167c1df>] skb_clone+0x4f/0xa0
    [<ffffffff817aa351>] hci_event_packet+0xc1/0x3290
    [<ffffffff8179a57b>] hci_rx_work+0x18b/0x360
    [<ffffffff810692ea>] process_one_work+0x14a/0x440
    [<ffffffff81069623>] worker_thread+0x43/0x4d0
    [<ffffffff8106ead4>] kthread+0xc4/0xe0
    [<ffffffff818dd38f>] ret_from_fork+0x1f/0x40
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Frédéric Dalleau <frederic.dalleau@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-08-24 16:49:29 +02:00
Christian König 5661538749 drm/amdgpu: fix lru size grouping v2
Adding a BO can make it the insertion point for larger sizes as well.

v2: add a comment about the guard structure.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2016-08-24 10:27:49 -04:00
Ming Lei 4d70dca4ea block: make sure a big bio is split into at most 256 bvecs
After arbitrary bio size was introduced, the incoming bio may
be very big. We have to split the bio into small bios so that
each holds at most BIO_MAX_PAGES bvecs for safety reason, such
as bio_clone().

This patch fixes the following kernel crash:

> [  172.660142] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
> [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
> [  172.660399] Oops: 0000 [#1] SMP
> [...]
> [  172.664780] Call Trace:
> [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
> [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
> [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
> [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
> [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
> [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]

The issue can be reproduced by the following steps:
	- create one raid1 over two virtio-blk
	- build bcache device over the above raid1 and another cache device
	and bucket size is set as 2Mbytes
	- set cache mode as writeback
	- run random write over ext4 on the bcache device

Fixes: 54efd50(block: make generic_make_request handle arbitrarily sized bios)
Reported-by: Sebastian Roesner <sroesner-kernelorg@roesner-online.de>
Reported-by: Eric Wheeler <bcache@lists.ewheeler.net>
Cc: stable@vger.kernel.org (4.3+)
Cc: Shaohua Li <shli@fb.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-24 08:17:24 -06:00
Andy Lutomirski 9b47f77a68 nvme: Fix nvme_get/set_features() with a NULL result pointer
nvme_set_features() callers seem to expect that passing NULL as the
result pointer is acceptable.  Teach nvme_set_features() not to try to
write to the NULL address.

For symmetry, make the same change to nvme_get_features(), despite the
fact that all current callers pass a valid result pointer.

I assume that this bug hasn't been reported in practice because
the callers that pass NULL are all in the SCSI translation layer
and no one uses the relevant operations.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-24 08:11:10 -06:00