Commit Graph

918021 Commits

Author SHA1 Message Date
Hugh Dickins 2f33a70602 mm,thp: stop leaking unreleased file pages
When collapse_file() calls try_to_release_page(), it has already isolated
the page: so if releasing buffers happens to fail (as it sometimes does),
remember to putback_lru_page(): otherwise that page is left unreclaimable
and unfreeable, and the file extent uncollapsible.

Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: <stable@vger.kernel.org>	[5.4+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005231837500.1766@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28 11:35:40 -07:00
Qian Cai af4798a5bb mm/z3fold: silence kmemleak false positives of slots
Kmemleak reported many leaks while under memory pressue in,

    slots = alloc_slots(pool, gfp);

which is referenced by "zhdr" in init_z3fold_page(),

    zhdr->slots = slots;

However, "zhdr" could be gone without freeing slots as the later will be
freed separately when the last "handle" off of "handles" array is freed.
It will be within "slots" which is always aligned.

  unreferenced object 0xc000000fdadc1040 (size 104):
  comm "oom04", pid 140476, jiffies 4295359280 (age 3454.970s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    z3fold_zpool_malloc+0x7b0/0xe10
    alloc_slots at mm/z3fold.c:214
    (inlined by) init_z3fold_page at mm/z3fold.c:412
    (inlined by) z3fold_alloc at mm/z3fold.c:1161
    (inlined by) z3fold_zpool_malloc at mm/z3fold.c:1735
    zpool_malloc+0x34/0x50
    zswap_frontswap_store+0x60c/0xda0
    zswap_frontswap_store at mm/zswap.c:1093
    __frontswap_store+0x128/0x330
    swap_writepage+0x58/0x110
    pageout+0x16c/0xa40
    shrink_page_list+0x1ac8/0x25c0
    shrink_inactive_list+0x270/0x730
    shrink_lruvec+0x444/0xf30
    shrink_node+0x2a4/0x9c0
    do_try_to_free_pages+0x158/0x640
    try_to_free_pages+0x1bc/0x5f0
    __alloc_pages_slowpath.constprop.60+0x4dc/0x15a0
    __alloc_pages_nodemask+0x520/0x650
    alloc_pages_vma+0xc0/0x420
    handle_mm_fault+0x1174/0x1bf0

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vitaly Wool <vitaly.wool@konsulko.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/20200522220052.2225-1-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28 11:35:40 -07:00
Alexander Dahl 8874347066 x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is
4 294 967 296 or 0x100000000 which is no problem on 64 bit systems.
The patch does not change the later overall result of 0x100000 for
MAX_DMA32_PFN (after it has been shifted by PAGE_SHIFT). The new
calculation yields the same result, but does not require 64 bit
arithmetic.

On 32 bit systems the old calculation suffers from an arithmetic
overflow in that intermediate term in braces: 4UL aka unsigned long int
is 4 byte wide and an arithmetic overflow happens (the 0x100000000 does
not fit in 4 bytes), the in braces result is truncated to zero, the
following right shift does not alter that, so MAX_DMA32_PFN evaluates to
0 on 32 bit systems.

That wrong value is a problem in a comparision against MAX_DMA32_PFN in
the init code for swiotlb in pci_swiotlb_detect_4gb() to decide if
swiotlb should be active.  That comparison yields the opposite result,
when compiling on 32 bit systems.

This was not possible before

  1b7e03ef75 ("x86, NUMA: Enable emulation on 32bit too")

when that MAX_DMA32_PFN was first made visible to x86_32 (and which
landed in v3.0).

In practice this wasn't a problem, unless CONFIG_SWIOTLB is active on
x86-32.

However if one has set CONFIG_IOMMU_INTEL, since

  c5a5dc4cbb ("iommu/vt-d: Don't switch off swiotlb if bounce page is used")

there's a dependency on CONFIG_SWIOTLB, which was not necessarily
active before. That landed in v5.4, where we noticed it in the fli4l
Linux distribution. We have CONFIG_IOMMU_INTEL active on both 32 and 64
bit kernel configs there (I could not find out why, so let's just say
historical reasons).

The effect is at boot time 64 MiB (default size) were allocated for
bounce buffers now, which is a noticeable amount of memory on small
systems like pcengines ALIX 2D3 with 256 MiB memory, which are still
frequently used as home routers.

We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4
(LTS) in fli4l and got that kernel messages for example:

  Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
  …
  Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem)
  …
  PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
  software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)

The initial analysis and the suggested fix was done by user 'sourcejedi'
at stackoverflow and explicitly marked as GPLv2 for inclusion in the
Linux kernel:

  https://unix.stackexchange.com/a/520525/50007

The new calculation, which does not suffer from that overflow, is the
same as for arch/mips now as suggested by Robin Murphy.

The fix was tested by fli4l users on round about two dozen different
systems, including both 32 and 64 bit archs, bare metal and virtualized
machines.

 [ bp: Massage commit message. ]

Fixes: 1b7e03ef75 ("x86, NUMA: Enable emulation on 32bit too")
Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Alexander Dahl <post@lespocky.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://unix.stackexchange.com/q/520065/50007
Link: https://web.nettworks.org/bugs/browse/FFL-2560
Link: https://lkml.kernel.org/r/20200526175749.20742-1-post@lespocky.de
2020-05-28 20:21:32 +02:00
Qiushi Wu a068aab422 bonding: Fix reference count leak in bond_sysfs_slave_add.
kobject_init_and_add() takes reference even when it fails.
If this function returns an error, kobject_put() must be called to
properly clean up the memory associated with the object. Previous
commit "b8eb718348b8" fixed a similar problem.

Fixes: 07699f9a7c ("bonding: add sysfs /slave dir for bond slave devices.")
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 11:08:16 -07:00
David S. Miller 2200313aa0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Uninitialized when used in __nf_conntrack_update(), from
   Nathan Chancellor.

2) Comparison of unsigned expression in nf_confirm_cthelper().

3) Remove 'const' type qualifier with no effect.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 10:54:18 -07:00
Petr Mladek d195b1d1d1 powerpc/bpf: Enable bpf_probe_read{, str}() on powerpc again
The commit 0ebeea8ca8 ("bpf: Restrict bpf_probe_read{, str}() only
to archs where they work") caused that bpf_probe_read{, str}() functions
were not longer available on architectures where the same logical address
might have different content in kernel and user memory mapping. These
architectures should use probe_read_{user,kernel}_str helpers.

For backward compatibility, the problematic functions are still available
on architectures where the user and kernel address spaces are not
overlapping. This is defined CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.

At the moment, these backward compatible functions are enabled only on x86_64,
arm, and arm64. Let's do it also on powerpc that has the non overlapping
address space as well.

Fixes: 0ebeea8ca8 ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/20200527122844.19524-1-pmladek@suse.com
2020-05-28 17:12:18 +02:00
Serge Semin 87976ce282 hwmon: Add Baikal-T1 PVT sensor driver
Baikal-T1 SoC provides an embedded process, voltage and temperature
sensor to monitor an internal SoC environment (chip temperature, supply
voltage and process monitor) and on time detect critical situations,
which may cause the system instability and even damages. The IP-block
is based on the Analog Bits PVT sensor, but is equipped with a
dedicated control wrapper, which provides a MMIO registers-based access
to the sensor core functionality (APB3-bus based) and exposes an
additional functions like thresholds/data ready interrupts, its status
and masks, measurements timeout. All of these is used to create a hwmon
driver being added to the kernel by this commit.

The driver implements support for the hardware monitoring capabilities
of Baikal-T1 process, voltage and temperature sensors. PVT IP-core
consists of one temperature and four voltage sensors, each of which is
implemented as a dedicated hwmon channel config.

The driver can optionally provide the hwmon alarms for each sensor the
PVT controller supports. The alarms functionality is made compile-time
configurable due to the hardware interface implementation peculiarity,
which is connected with an ability to convert data from only one sensor
at a time. Additional limitation is that the controller performs the
thresholds checking synchronously with the data conversion procedure.
Due to these limitations in order to have the hwmon alarms
automatically detected the driver code must switch from one sensor to
another, read converted data and manually check the threshold status
bits. Depending on the measurements timeout settings this design may
cause additional burden on the system performance. By default if the
alarms kernel config is disabled the data conversion is performed by
the driver on demand when read operation is requested via corresponding
_input-file.

Co-developed-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru>
Signed-off-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-28 07:59:45 -07:00
Guenter Roeck 1597b374af hwmon: Add notification support
For hwmon drivers using the hwmon_device_register_with_info() API, it
is desirable to have a generic notification mechanism available. This
mechanism can be used to notify userspace as well as the thermal
subsystem if the driver experiences any events, such as warning or
critical alarms.

Implement hwmon_notify_event() to provide this mechanism. The function
generates a sysfs event and a udev event. If the device is registered
with the thermal subsystem and the event is associated with a temperature
sensor, also notify the thermal subsystem that a thermal event occurred.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Maxim Kaurkin <Maxim.Kaurkin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-28 07:59:45 -07:00
Serge Semin ddc65caa56 dt-bindings: hwmon: Add Baikal-T1 PVT sensor binding
Baikal-T1 SoC is equipped with an embedded process, voltage and
temperature sensor to monitor the chip internal environment like
temperature, supply voltage and transistors performance.

This bindings describes the external Baikal-T1 PVT control interfaces
like MMIO registers space, interrupt request number and clocks source.
These are then used by the corresponding hwmon device driver to
implement the sysfs files-based access to the sensors functionality.

Co-developed-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru>
Signed-off-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Rob Herring <robh@kernel.org>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-mips@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-28 07:59:45 -07:00
Jens Axboe 15fede122b Merge branch 'nvme-5.7' of git://git.infradead.org/nvme into block-5.7
Pull NVMe poll fix from Christoph.

* 'nvme-5.7' of git://git.infradead.org/nvme:
  nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
2020-05-28 08:48:12 -06:00
Mark Brown b7d73cb63c
Merge series "add ecspi ERR009165 for i.mx6/7 soc family" from Robin Gong <yibin.gong@nxp.com>:
There is ecspi ERR009165 on i.mx6/7 soc family, which cause FIFO
transfer to be send twice in DMA mode. Please get more information from:
https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf. The workaround is adding
new sdma ram script which works in XCH  mode as PIO inside sdma instead
of SMC mode, meanwhile, 'TX_THRESHOLD' should be 0. The issue should be
exist on all legacy i.mx6/7 soc family before i.mx6ul.
NXP fix this design issue from i.mx6ul, so newer chips including i.mx6ul/
6ull/6sll do not need this workaroud anymore. All other i.mx6/7/8 chips
still need this workaroud. This patch set add new 'fsl,imx6ul-ecspi'
for ecspi driver and 'ecspi_fixed' in sdma driver to choose if need errata
or not.
The first two reverted patches should be the same issue, though, it
seems 'fixed' by changing to other shp script. Hope Sean or Sascha could
have the chance to test this patch set if could fix their issues.
Besides, enable sdma support for i.mx8mm/8mq and fix ecspi1 not work
on i.mx8mm because the event id is zero.

PS:
   Please get sdma firmware from below linux-firmware and copy it to your
local rootfs /lib/firmware/imx/sdma.
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/imx/sdma

v2:
  1.Add commit log for reverted patches.
  2.Add comment for 'ecspi_fixed' in sdma driver.
  3.Add 'fsl,imx6sll-ecspi' compatible instead of 'fsl,imx6ul-ecspi'
    rather than remove.
v3:
  1.Confirm with design team make sure ERR009165 fixed on i.mx6ul/i.mx6ull
    /i.mx6sll, not fixed on i.mx8m/8mm and other i.mx6/7 legacy chips.
    Correct dts related dts patch in v2.
  2.Clean eratta information in binding doc and new 'tx_glitch_fixed' flag
    in spi-imx driver to state ERR009165 fixed or not.
  3.Enlarge burst size to fifo size for tx since tx_wml set to 0 in the
    errata workaroud, thus improve performance as possible.
v4:
  1.Add Ack tag from Mark and Vinod
  2.Remove checking 'event_id1' zero as 'event_id0'.
v5:
  1.Add the last patch for compatible with the current uart driver which
    using rom script, so both uart ram script and rom script supported
    in latest firmware, by default uart rom script used. UART driver
    will be broken without this patch.
v6:
  1.Resend after rebase the latest next branch.
  2.Remove below No.13~No.15 patches of v5 because they were mergered.
  	ARM: dts: imx6ul: add dma support on ecspi
  	ARM: dts: imx6sll: correct sdma compatible
  	arm64: defconfig: Enable SDMA on i.mx8mq/8mm
  3.Revert "dmaengine: imx-sdma: fix context cache" since
    'context_loaded' removed.
v7:
  1.Put the last patch 13/13 'Revert "dmaengine: imx-sdma: fix context
    cache"' to the ahead of 03/13 'Revert "dmaengine: imx-sdma: refine
    to load context only once" so that no building waring during comes out
    during bisect.
  2.Address Sascha's comments, including eliminating any i.mx6sx in this
    series, adding new 'is_imx6ul_ecspi()' instead imx in imx51 and taking
    care SMC bit for PIO.
  3.Add back missing 'Reviewed-by' tag on 08/15(v5):09/13(v7)
   'spi: imx: add new i.mx6ul compatible name in binding doc'
v8:
  1.remove 0003-Revert-dmaengine-imx-sdma-fix-context-cache.patch and merge
    it into 04/13 of v7
  2.add 0005-spi-imx-fallback-to-PIO-if-dma-setup-failure.patch for no any
    ecspi function broken even if sdma firmware not updated.
  3.merge 'tx.dst_maxburst' changes in the two continous patches into one
    patch to avoid confusion.
  4.fix typo 'duplicated'.

Robin Gong (13):
  Revert "ARM: dts: imx6q: Use correct SDMA script for SPI5 core"
  Revert "ARM: dts: imx6: Use correct SDMA script for SPI cores"
  Revert "dmaengine: imx-sdma: refine to load context only once"
  dmaengine: imx-sdma: remove duplicated sdma_load_context
  spi: imx: fallback to PIO if dma setup failure
  dmaengine: imx-sdma: add mcu_2_ecspi script
  spi: imx: fix ERR009165
  spi: imx: remove ERR009165 workaround on i.mx6ul
  spi: imx: add new i.mx6ul compatible name in binding doc
  dmaengine: imx-sdma: remove ERR009165 on i.mx6ul
  dma: imx-sdma: add i.mx6ul compatible name
  dmaengine: imx-sdma: fix ecspi1 rx dma not work on i.mx8mm
  dmaengine: imx-sdma: add uart rom script

 .../devicetree/bindings/dma/fsl-imx-sdma.txt       |  1 +
 .../devicetree/bindings/spi/fsl-imx-cspi.txt       |  1 +
 arch/arm/boot/dts/imx6q.dtsi                       |  2 +-
 arch/arm/boot/dts/imx6qdl.dtsi                     |  8 +-
 drivers/dma/imx-sdma.c                             | 67 ++++++++++------
 drivers/spi/spi-imx.c                              | 92 +++++++++++++++++++---
 include/linux/platform_data/dma-imx-sdma.h         |  8 +-
 7 files changed, 135 insertions(+), 44 deletions(-)

--
2.7.4

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
2020-05-28 14:01:17 +01:00
Dinghao Liu 117858bd63
spi: tegra20-sflash: Fix runtime PM imbalance on error
pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20200523124758.28604-1-dinghao.liu@zju.edu.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28 14:01:16 +01:00
Dinghao Liu faedcc17ad
spi: tegra20-slink: Fix runtime PM imbalance on error
pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20200523122909.25247-1-dinghao.liu@zju.edu.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28 14:01:15 +01:00
Dinghao Liu cddc36f3fd
spi: tegra114: Fix runtime PM imbalance on error
pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20200523125704.30300-1-dinghao.liu@zju.edu.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28 14:01:14 +01:00
Robin Gong bcd8e7761e
spi: imx: fallback to PIO if dma setup failure
Fallback to PIO in case dma setup failed. For example, sdma firmware not
updated but ERR009165 workaroud added in kernel.

Signed-off-by: Robin Gong <yibin.gong@nxp.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/1590006865-20900-6-git-send-email-yibin.gong@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28 13:41:48 +01:00
Nobuhiro Iwamatsu ba051f097f arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
If boot_secondary() was successful, and cpu_online() was an error in
__cpu_up(), -EIO was returned, but 0 is returned by commit d22b115cbf
("arm64/kernel: Simplify __cpu_up() by bailing out early").
Therefore, bringup_wait_for_ap() causes the primary core to wait for a
long time, which may cause boot failure.
This commit sets -EIO to return code under the same conditions.

Fixes: d22b115cbf ("arm64/kernel: Simplify __cpu_up() by bailing out early")
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Tested-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Acked-by: Will Deacon <will@kernel.org>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200527233457.2531118-1-nobuhiro1.iwamatsu@toshiba.co.jp
[catalin.marinas@arm.com: return -EIO at the end of the function]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-05-28 12:04:55 +01:00
Weili Qian 58ca0060ec crypto: hisilicon - fix driver compatibility issue with different versions of devices
In order to be compatible with devices of different versions, V1 in the
accelerator driver is now isolated, and other versions are the previous
V2 processing flow.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-05-28 17:27:52 +10:00
Iuliana Prodan d1c72f6e4c crypto: engine - do not requeue in case of fatal error
Now, in crypto-engine, if hardware queue is full (-ENOSPC),
requeue request regardless of MAY_BACKLOG flag.
If hardware throws any other error code (like -EIO, -EINVAL,
-ENOMEM, etc.) only MAY_BACKLOG requests are enqueued back into
crypto-engine's queue, since the others can be dropped.
The latter case can be fatal error, so those cannot be recovered from.
For example, in CAAM driver, -EIO is returned in case the job descriptor
is broken, so there is no possibility to fix the job descriptor.
Therefore, these errors might be fatal error, so we shouldn’t
requeue the request. This will just be pass back and forth between
crypto-engine and hardware.

Fixes: 6a89f492f8 ("crypto: engine - support for parallel requests based on retry mechanism")
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-05-28 17:27:52 +10:00
Christophe JAILLET ae4052c59c crypto: cavium/nitrox - Fix a typo in a comment
s/NITORX/NITROX/

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-05-28 17:27:51 +10:00
Dave Airlie ed52a9b556 Merge tag 'amd-drm-fixes-5.7-2020-05-27' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.7-2020-05-27:

amdgpu:
- Display atomic test fix
- Fix soft hang in display vupdate code

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527222700.4378-1-alexander.deucher@amd.com
2020-05-28 15:38:43 +10:00
Guo Ren f36e0aab6f csky: Fixup CONFIG_DEBUG_RSEQ
Put the rseq_syscall check point at the prologue of the syscall
will break the a0 ... a7. This will casue system call bug when
DEBUG_RSEQ is enabled.

So move it to the epilogue of syscall, but before syscall_trace.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-28 00:18:36 +00:00
Guo Ren 20f69538b9 csky: Coding convention in entry.S
There is no fixup or feature in the patch, we only cleanup with:

 - Remove unnecessary reg used (r11, r12), just use r9 & r10 &
   syscallid regs as temp useage.
 - Add _TIF_SYSCALL_WORK and _TIF_WORK_MASK to gather macros.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-28 00:18:36 +00:00
Guo Ren e0bbb53843 csky: Fixup abiv2 syscall_trace break a4 & a5
Current implementation could destory a4 & a5 when strace, so we need to get them
from pt_regs by SAVE_ALL.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-28 00:18:36 +00:00
Guo Ren 900897591b csky: Fixup CONFIG_PREEMPT panic
log:
[    0.13373200] Calibrating delay loop...
[    0.14077600] ------------[ cut here ]------------
[    0.14116700] WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:3790 preempt_count_add+0xc8/0x11c
[    0.14348000] DEBUG_LOCKS_WARN_ON((preempt_count() < 0))Modules linked in:
[    0.14395100] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0 #7
[    0.14410800]
[    0.14427400] Call Trace:
[    0.14450700] [<807cd226>] dump_stack+0x8a/0xe4
[    0.14473500] [<80072792>] __warn+0x10e/0x15c
[    0.14495900] [<80072852>] warn_slowpath_fmt+0x72/0xc0
[    0.14518600] [<800a5240>] preempt_count_add+0xc8/0x11c
[    0.14544900] [<807ef918>] _raw_spin_lock+0x28/0x68
[    0.14572600] [<800e0eb8>] vprintk_emit+0x84/0x2d8
[    0.14599000] [<800e113a>] vprintk_default+0x2e/0x44
[    0.14625100] [<800e2042>] vprintk_func+0x12a/0x1d0
[    0.14651300] [<800e1804>] printk+0x30/0x48
[    0.14677600] [<80008052>] lockdep_init+0x12/0xb0
[    0.14703800] [<80002080>] start_kernel+0x558/0x7f8
[    0.14730000] [<800052bc>] csky_start+0x58/0x94
[    0.14756600] irq event stamp: 34
[    0.14775100] hardirqs last  enabled at (33): [<80067370>] ret_from_exception+0x2c/0x72
[    0.14793700] hardirqs last disabled at (34): [<800e0eae>] vprintk_emit+0x7a/0x2d8
[    0.14812300] softirqs last  enabled at (32): [<800655b0>] __do_softirq+0x578/0x6d8
[    0.14830800] softirqs last disabled at (25): [<8007b3b8>] irq_exit+0xec/0x128

The preempt_count of reg could be destroyed after csky_do_IRQ without reload
from memory.

After reference to other architectures (arm64, riscv), we move preempt entry
into ret_from_exception and disable irq at the beginning of
ret_from_exception instead of RESTORE_ALL.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Reported-by: Lu Baoquan <lu.baoquan@intellif.com>
2020-05-28 00:18:36 +00:00
Valentine Fatiev 1acba6a817 IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode
When connected mode is set, and we have connected and datagram traffic in
parallel, ipoib might crash with double free of datagram skb.

The current mechanism assumes that the order in the completion queue is
the same as the order of sent packets for all QPs. Order is kept only for
specific QP, in case of mixed UD and CM traffic we have few QPs (one UD and
few CM's) in parallel.

The problem:
----------------------------------------------------------

Transmit queue:
-----------------
UD skb pointer kept in queue itself, CM skb kept in spearate queue and
uses transmit queue as a placeholder to count the number of total
transmitted packets.

0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
------------------------------------------------------------
NL ud1 UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
------------------------------------------------------------
    ^                                  ^
   tail                               head

Completion queue (problematic scenario) - the order not the same as in
the transmit queue:

  1  2  3  4  5  6  7  8  9
------------------------------------
 ud1 CM1 UD2 ud3 cm2 cm3 ud4 cm4 ud5
------------------------------------

1. CM1 'wc' processing
   - skb freed in cm separate ring.
   - tx_tail of transmit queue increased although UD2 is not freed.
     Now driver assumes UD2 index is already freed and it could be used for
     new transmitted skb.

0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
------------------------------------------------------------
NL NL  UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
------------------------------------------------------------
        ^   ^                       ^
      (Bad)tail                    head
(Bad - Could be used for new SKB)

In this case (due to heavy load) UD2 skb pointer could be replaced by new
transmitted packet UD_NEW, as the driver assumes its free.  At this point
we will have to process two 'wc' with same index but we have only one
pointer to free.

During second attempt to free the same skb we will have NULL pointer
exception.

2. UD2 'wc' processing
   - skb freed according the index we got from 'wc', but it was already
     overwritten by mistake. So actually the skb that was released is the
     skb of the new transmitted packet and not the original one.

3. UD_NEW 'wc' processing
   - attempt to free already freed skb. NUll pointer exception.

The fix:
-----------------------------------------------------------------------

The fix is to stop using the UD ring as a placeholder for CM packets, the
cyclic ring variables tx_head and tx_tail will manage the UD tx_ring, a
new cyclic variables global_tx_head and global_tx_tail are introduced for
managing and counting the overall outstanding sent packets, then the send
queue will be stopped and waken based on these variables only.

Note that no locking is needed since global_tx_head is updated in the xmit
flow and global_tx_tail is updated in the NAPI flow only.  A previous
attempt tried to use one variable to count the outstanding sent packets,
but it did not work since xmit and NAPI flows can run at the same time and
the counter will be updated wrongly. Thus, we use the same simple cyclic
head and tail scheme that we have today for the UD tx_ring.

Fixes: 2c104ea683 ("IB/ipoib: Get rid of the tx_outstanding variable in all modes")
Link: https://lore.kernel.org/r/20200527134705.480068-1-leon@kernel.org
Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 21:14:09 -03:00
Aric Cyr 4e5183200d drm/amd/display: Fix potential integer wraparound resulting in a hang
[Why]
If VUPDATE_END is before VUPDATE_START the delay calculated can become
very large, causing a soft hang.

[How]
Take the absolute value of the difference between START and END.

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-05-27 18:13:14 -04:00
Simon Ser f7d5991b92 drm/amd/display: drop cursor position check in atomic test
get_cursor_position already handles the case where the cursor has
negative off-screen coordinates by not setting
dc_cursor_position.enabled.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 626bf90fe0 ("drm/amd/display: add basic atomic check for cursor plane")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-05-27 18:12:32 -04:00
Vladimir Oltean 2b86cb8299 net: dsa: declare lockless TX feature for slave ports
Be there a platform with the following layout:

      Regular NIC
       |
       +----> DSA master for switch port
               |
               +----> DSA master for another switch port

After changing DSA back to static lockdep class keys in commit
1a33e10e4a ("net: partially revert dynamic lockdep key changes"), this
kernel splat can be seen:

[   13.361198] ============================================
[   13.366524] WARNING: possible recursive locking detected
[   13.371851] 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988 Not tainted
[   13.377874] --------------------------------------------
[   13.383201] swapper/0/0 is trying to acquire lock:
[   13.388004] ffff0000668ff298 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.397879]
[   13.397879] but task is already holding lock:
[   13.403727] ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.413593]
[   13.413593] other info that might help us debug this:
[   13.420140]  Possible unsafe locking scenario:
[   13.420140]
[   13.426075]        CPU0
[   13.428523]        ----
[   13.430969]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.435946]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.440924]
[   13.440924]  *** DEADLOCK ***
[   13.440924]
[   13.446860]  May be due to missing lock nesting notation
[   13.446860]
[   13.453668] 6 locks held by swapper/0/0:
[   13.457598]  #0: ffff800010003de0 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x0/0x400
[   13.466593]  #1: ffffd4d3fb478700 (rcu_read_lock){....}-{1:2}, at: mld_sendpack+0x0/0x560
[   13.474803]  #2: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: ip6_finish_output2+0x64/0xb10
[   13.483886]  #3: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.492793]  #4: ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.503094]  #5: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.512000]
[   13.512000] stack backtrace:
[   13.516369] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988
[   13.530421] Call trace:
[   13.532871]  dump_backtrace+0x0/0x1d8
[   13.536539]  show_stack+0x24/0x30
[   13.539862]  dump_stack+0xe8/0x150
[   13.543271]  __lock_acquire+0x1030/0x1678
[   13.547290]  lock_acquire+0xf8/0x458
[   13.550873]  _raw_spin_lock+0x44/0x58
[   13.554543]  __dev_queue_xmit+0x84c/0xbe0
[   13.558562]  dev_queue_xmit+0x24/0x30
[   13.562232]  dsa_slave_xmit+0xe0/0x128
[   13.565988]  dev_hard_start_xmit+0xf4/0x448
[   13.570182]  __dev_queue_xmit+0x808/0xbe0
[   13.574200]  dev_queue_xmit+0x24/0x30
[   13.577869]  neigh_resolve_output+0x15c/0x220
[   13.582237]  ip6_finish_output2+0x244/0xb10
[   13.586430]  __ip6_finish_output+0x1dc/0x298
[   13.590709]  ip6_output+0x84/0x358
[   13.594116]  mld_sendpack+0x2bc/0x560
[   13.597786]  mld_ifc_timer_expire+0x210/0x390
[   13.602153]  call_timer_fn+0xcc/0x400
[   13.605822]  run_timer_softirq+0x588/0x6e0
[   13.609927]  __do_softirq+0x118/0x590
[   13.613597]  irq_exit+0x13c/0x148
[   13.616918]  __handle_domain_irq+0x6c/0xc0
[   13.621023]  gic_handle_irq+0x6c/0x160
[   13.624779]  el1_irq+0xbc/0x180
[   13.627927]  cpuidle_enter_state+0xb4/0x4d0
[   13.632120]  cpuidle_enter+0x3c/0x50
[   13.635703]  call_cpuidle+0x44/0x78
[   13.639199]  do_idle+0x228/0x2c8
[   13.642433]  cpu_startup_entry+0x2c/0x48
[   13.646363]  rest_init+0x1ac/0x280
[   13.649773]  arch_call_rest_init+0x14/0x1c
[   13.653878]  start_kernel+0x490/0x4bc

Lockdep keys themselves were added in commit ab92d68fc2 ("net: core:
add generic lockdep keys"), and it's very likely that this splat existed
since then, but I have no real way to check, since this stacked platform
wasn't supported by mainline back then.

>From Taehee's own words:

  This patch was considered that all stackable devices have LLTX flag.
  But the dsa doesn't have LLTX, so this splat happened.
  After this patch, dsa shares the same lockdep class key.
  On the nested dsa interface architecture, which you illustrated,
  the same lockdep class key will be used in __dev_queue_xmit() because
  dsa doesn't have LLTX.
  So that lockdep detects deadlock because the same lockdep class key is
  used recursively although actually the different locks are used.
  There are some ways to fix this problem.

  1. using NETIF_F_LLTX flag.
  If possible, using the LLTX flag is a very clear way for it.
  But I'm so sorry I don't know whether the dsa could have LLTX or not.

  2. using dynamic lockdep again.
  It means that each interface uses a separate lockdep class key.
  So, lockdep will not detect recursive locking.
  But this way has a problem that it could consume lockdep class key
  too many.
  Currently, lockdep can have 8192 lockdep class keys.
   - you can see this number with the following command.
     cat /proc/lockdep_stats
     lock-classes:                         1251 [max: 8192]
     ...
     The [max: 8192] means that the maximum number of lockdep class keys.
  If too many lockdep class keys are registered, lockdep stops to work.
  So, using a dynamic(separated) lockdep class key should be considered
  carefully.
  In addition, updating lockdep class key routine might have to be existing.
  (lockdep_register_key(), lockdep_set_class(), lockdep_unregister_key())

  3. Using lockdep subclass.
  A lockdep class key could have 8 subclasses.
  The different subclass is considered different locks by lockdep
  infrastructure.
  But "lock-classes" is not counted by subclasses.
  So, it could avoid stopping lockdep infrastructure by an overflow of
  lockdep class keys.
  This approach should also have an updating lockdep class key routine.
  (lockdep_set_subclass())

  4. Using nonvalidate lockdep class key.
  The lockdep infrastructure supports nonvalidate lockdep class key type.
  It means this lockdep is not validated by lockdep infrastructure.
  So, the splat will not happen but lockdep couldn't detect real deadlock
  case because lockdep really doesn't validate it.
  I think this should be used for really special cases.
  (lockdep_set_novalidate_class())

Further discussion here:
https://patchwork.ozlabs.org/project/netdev/patch/20200503052220.4536-2-xiyou.wangcong@gmail.com/

There appears to be no negative side-effect to declaring lockless TX for
the DSA virtual interfaces, which means they handle their own locking.
So that's what we do to make the splat go away.

Patch tested in a wide variety of cases: unicast, multicast, PTP, etc.

Fixes: ab92d68fc2 ("net: core: add generic lockdep keys")
Suggested-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 15:09:28 -07:00
Al Viro 9e46365459 copy_xstate_to_kernel(): don't leave parts of destination uninitialized
copy the corresponding pieces of init_fpstate into the gaps instead.

Cc: stable@kernel.org
Tested-by: Alexander Potapenko <glider@google.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-27 17:06:31 -04:00
Vladimir Oltean 183be6f967 net: dsa: felix: send VLANs on CPU port as egress-tagged
As explained in other commits before (b9cd75e668 and 87b0f983f6),
ocelot switches have a single egress-untagged VLAN per port, and the
driver would deny adding a second one while an egress-untagged VLAN
already exists.

But on the CPU port (where the VLAN configuration is implicit, because
there is no net device for the bridge to control), the DSA core attempts
to add a VLAN using the same flags as were used for the front-panel
port. This would make adding any untagged VLAN fail due to the CPU port
rejecting the configuration:

bridge vlan add dev swp0 vid 100 pvid untagged
[ 1865.854253] mscc_felix 0000:00:00.5: Port already has a native VLAN: 1
[ 1865.860824] mscc_felix 0000:00:00.5: Failed to add VLAN 100 to port 5: -16

(note that port 5 is the CPU port and not the front-panel swp0).

So this hardware will send all VLANs as tagged towards the CPU.

Fixes: 5605194877 ("net: dsa: ocelot: add driver for Felix switch family")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:40:38 -07:00
Arnd Bergmann b3b6a84c6a bridge: multicast: work around clang bug
Clang-10 and clang-11 run into a corner case of the register
allocator on 32-bit ARM, leading to excessive stack usage from
register spilling:

net/bridge/br_multicast.c:2422:6: error: stack frame size of 1472 bytes in function 'br_multicast_get_stats' [-Werror,-Wframe-larger-than=]

Work around this by marking one of the internal functions as
noinline_for_stack.

Link: https://bugs.llvm.org/show_bug.cgi?id=45802#c9
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:34:48 -07:00
Dongli Zhang 9210c075ce nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
There may be a race between nvme_reap_pending_cqes() and nvme_poll(), e.g.,
when doing live reset while polling the nvme device.

      CPU X                        CPU Y
                               nvme_poll()
nvme_dev_disable()
-> nvme_stop_queues()
-> nvme_suspend_io_queues()
-> nvme_suspend_queue()
                               -> spin_lock(&nvmeq->cq_poll_lock);
-> nvme_reap_pending_cqes()
   -> nvme_process_cq()        -> nvme_process_cq()

In the above scenario, the nvme_process_cq() for the same queue may be
running on both CPU X and CPU Y concurrently.

It is much more easier to reproduce the issue when CONFIG_PREEMPT is
enabled in kernel. When CONFIG_PREEMPT is disabled, it would take longer
time for nvme_stop_queues()-->blk_mq_quiesce_queue() to wait for grace
period.

This patch protects nvme_process_cq() with nvmeq->cq_poll_lock in
nvme_reap_pending_cqes().

Fixes: fa46c6fb5d ("nvme/pci: move cqe check after device shutdown")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27 20:32:56 +02:00
Stefano Garzarella 7e0afbdfd1 vsock: fix timeout in vsock_accept()
The accept(2) is an "input" socket interface, so we should use
SO_RCVTIMEO instead of SO_SNDTIMEO to set the timeout.

So this patch replace sock_sndtimeo() with sock_rcvtimeo() to
use the right timeout in the vsock_accept().

Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:20:23 -07:00
Heinrich Kuhn 5b186cd60f nfp: flower: fix used time of merge flow statistics
Prior to this change the correct value for the used counter is calculated
but not stored nor, therefore, propagated to user-space. In use-cases such
as OVS use-case at least this results in active flows being removed from
the hardware datapath. Which results in both unnecessary flow tear-down
and setup, and packet processing on the host.

This patch addresses the problem by saving the calculated used value
which allows the value to propagate to user-space.

Found by inspection.

Fixes: aa6ce2ea0c ("nfp: flower: support stats update for merge flows")
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:18:57 -07:00
Davide Caratti bb2f930d6d net/sched: fix infinite loop in sch_fq_pie
this command hangs forever:

 # tc qdisc add dev eth0 root fq_pie flows 65536

 watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [tc:1028]
 [...]
 CPU: 1 PID: 1028 Comm: tc Not tainted 5.7.0-rc6+ #167
 RIP: 0010:fq_pie_init+0x60e/0x8b7 [sch_fq_pie]
 Code: 4c 89 65 50 48 89 f8 48 c1 e8 03 42 80 3c 30 00 0f 85 2a 02 00 00 48 8d 7d 10 4c 89 65 58 48 89 f8 48 c1 e8 03 42 80 3c 30 00 <0f> 85 a7 01 00 00 48 8d 7d 18 48 c7 45 10 46 c3 23 00 48 89 f8 48
 RSP: 0018:ffff888138d67468 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
 RAX: 1ffff9200018d2b2 RBX: ffff888139c1c400 RCX: ffffffffffffffff
 RDX: 000000000000c5e8 RSI: ffffc900000e5000 RDI: ffffc90000c69590
 RBP: ffffc90000c69580 R08: fffffbfff79a9699 R09: fffffbfff79a9699
 R10: 0000000000000700 R11: fffffbfff79a9698 R12: ffffc90000c695d0
 R13: 0000000000000000 R14: dffffc0000000000 R15: 000000002347c5e8
 FS:  00007f01e1850e40(0000) GS:ffff88814c880000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000067c340 CR3: 000000013864c000 CR4: 0000000000340ee0
 Call Trace:
  qdisc_create+0x3fd/0xeb0
  tc_modify_qdisc+0x3be/0x14a0
  rtnetlink_rcv_msg+0x5f3/0x920
  netlink_rcv_skb+0x121/0x350
  netlink_unicast+0x439/0x630
  netlink_sendmsg+0x714/0xbf0
  sock_sendmsg+0xe2/0x110
  ____sys_sendmsg+0x5b4/0x890
  ___sys_sendmsg+0xe9/0x160
  __sys_sendmsg+0xd3/0x170
  do_syscall_64+0x9a/0x370
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

we can't accept 65536 as a valid number for 'nflows', because the loop on
'idx' in fq_pie_init() will never end. The extack message is correct, but
it doesn't say that 0 is not a valid number for 'flows': while at it, fix
this also. Add a tdc selftest to check correct validation of 'flows'.

CC: Ivan Vecera <ivecera@redhat.com>
Fixes: ec97ecf1eb ("net: sched: add Flow Queue PIE packet scheduler")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:11:18 -07:00
Linus Torvalds b0c3ba31be \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAl7OoFIACgkQnJ2qBz9k
 QNm4Ewf/VeATmggs4mjetbrqmnr2sIdBxWHIq7Pv1MT9Wrz1WENGwi18yy36CfJU
 5Rign2pa00SIHj1qZsiwcoxFIU7D4WNG36I//aOZelrDp/atsfSAufXN4sZk1KyG
 PO5nVmAH0FkmyIJMDap7EG4jKnK+YSkuF56DLybbZqEwdkHMS2RMwWCmP6M/UjPW
 AdseMjEOnpGzXi2xah4TtEODCKe7koi/TMIrQxBdvd3UGn5VyonTilSTMUtieZic
 qfpotjyRPKQ3RjEQAwvX11jljTUjmdJeGz08PHTHAL3kGwduvFA73TUPuWd5Tz3X
 mAEsmBZNg38WxQYGdCshAvPbSHJFQw==
 =VeY8
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fanotify FAN_DIR_MODIFY disabling from Jan Kara:
 "A single patch that disables FAN_DIR_MODIFY support that was merged in
  this merge window.

  When discussing further functionality we realized it may be more
  logical to guard it with a feature flag or to call things slightly
  differently (or maybe not) so let's not set the API in stone for now."

* tag 'fsnotify_for_v5.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: turn off support for FAN_DIR_MODIFY
2020-05-27 11:03:24 -07:00
Linus Torvalds 3301f6ae2d Merge branch 'for-5.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:

 - Reverted stricter synchronization for cgroup recursive stats which
   was prepping it for event counter usage which never got merged. The
   change was causing performation regressions in some cases.

 - Restore bpf-based device-cgroup operation even when cgroup1 device
   cgroup is disabled.

 - An out-param init fix.

* 'for-5.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  device_cgroup: Cleanup cgroup eBPF device filter code
  xattr: fix uninitialized out-param
  Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
2020-05-27 10:58:19 -07:00
Jason Gunthorpe c85f4abe66 RDMA/core: Fix double destruction of uobject
Fix use after free when user user space request uobject concurrently for
the same object, within the RCU grace period.

In that case, remove_handle_idr_uobject() is called twice and we will have
an extra put on the uobject which cause use after free.  Fix it by leaving
the uobject write locked after it was removed from the idr.

Call to rdma_lookup_put_uobject with UVERBS_LOOKUP_DESTROY instead of
UVERBS_LOOKUP_WRITE will do the work.

  refcount_t: underflow; use-after-free.
  WARNING: CPU: 0 PID: 1381 at lib/refcount.c:28 refcount_warn_saturate+0xfe/0x1a0
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 0 PID: 1381 Comm: syz-executor.0 Not tainted 5.5.0-rc3 #8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x94/0xce
   panic+0x234/0x56f
   __warn+0x1cc/0x1e1
   report_bug+0x200/0x310
   fixup_bug.part.11+0x32/0x80
   do_error_trap+0xd3/0x100
   do_invalid_op+0x31/0x40
   invalid_op+0x1e/0x30
  RIP: 0010:refcount_warn_saturate+0xfe/0x1a0
  Code: 0f 0b eb 9b e8 23 f6 6d ff 80 3d 6c d4 19 03 00 75 8d e8 15 f6 6d ff 48 c7 c7 c0 02 55 bd c6 05 57 d4 19 03 01 e8 a2 58 49 ff <0f> 0b e9 6e ff ff ff e8 f6 f5 6d ff 80 3d 42 d4 19 03 00 0f 85 5c
  RSP: 0018:ffffc90002df7b98 EFLAGS: 00010282
  RAX: 0000000000000000 RBX: ffff88810f6a193c RCX: ffffffffba649009
  RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88811b0283cc
  RBP: 0000000000000003 R08: ffffed10236060e3 R09: ffffed10236060e3
  R10: 0000000000000001 R11: ffffed10236060e2 R12: ffff88810f6a193c
  R13: ffffc90002df7d60 R14: 0000000000000000 R15: ffff888116ae6a08
   uverbs_uobject_put+0xfd/0x140
   __uobj_perform_destroy+0x3d/0x60
   ib_uverbs_close_xrcd+0x148/0x170
   ib_uverbs_write+0xaa5/0xdf0
   __vfs_write+0x7c/0x100
   vfs_write+0x168/0x4a0
   ksys_write+0xc8/0x200
   do_syscall_64+0x9c/0x390
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x465b49
  Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007f759d122c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 000000000073bfa8 RCX: 0000000000465b49
  RDX: 000000000000000c RSI: 0000000020000080 RDI: 0000000000000003
  RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 00007f759d1236bc
  R13: 00000000004ca27c R14: 000000000070de40 R15: 00000000ffffffff
  Dumping ftrace buffer:
     (ftrace buffer empty)
  Kernel Offset: 0x39400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Fixes: 7452a3c745 ("IB/uverbs: Allow RDMA_REMOVE_DESTROY to work concurrently with disassociate")
Link: https://lore.kernel.org/r/20200527135534.482279-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 14:22:57 -03:00
Amir Goldstein f17936993a fanotify: turn off support for FAN_DIR_MODIFY
FAN_DIR_MODIFY has been enabled by commit 44d705b037 ("fanotify:
report name info for FAN_DIR_MODIFY event") in 5.7-rc1. Now we are
planning further extensions to the fanotify API and during that we
realized that FAN_DIR_MODIFY may behave slightly differently to be more
consistent with extensions we plan. So until we finalize these
extensions, let's not bind our hands with exposing FAN_DIR_MODIFY to
userland.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-05-27 18:55:54 +02:00
Linus Torvalds 006f38a1c3 Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve fix from Eric Biederman:
 "While working on my exec cleanups I found a bug in exec that winds up
  miscomputing the ambient credentials during exec. Andy appears to have
  to been confused as to why credentials are computed for both the
  script and the interpreter

  From the original patch description:

   [3] Linux very confusingly processes both the script and the
       interpreter if applicable, for reasons that elude me. The results
       from thinking about a script's file capabilities and/or setuid
       bits are mostly discarded.

  The only value in struct cred that gets changed in cap_bprm_set_creds
  that I could find that might persist between the script and the
  interpreter was cap_ambient. Which is fixed with this trivial change"

* 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  exec: Always set cap_ambient in cap_bprm_set_creds
2020-05-27 09:53:25 -07:00
Arnd Bergmann fff2d0f701 hwmon: (applesmc) avoid overlong udelay()
Building this driver with "clang -O3" produces a link error
after the compiler partially unrolls the loop and 256ms
becomes a compile-time constant that triggers the check
in udelay():

ld.lld: error: undefined symbol: __bad_udelay
>>> referenced by applesmc.c
>>>               hwmon/applesmc.o:(read_smc) in archive drivers/built-in.a

I can see no reason against using a sleeping function here,
as no part of the driver runs in atomic context, so instead use
usleep_range() with a wide range and use jiffies for the
end condition.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200527135207.1118624-1-arnd@arndb.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-27 09:18:13 -07:00
Krzysztof Kozlowski ed3119e455 x86: Hide the archdata.iommu field behind generic IOMMU_API
There is a generic, kernel wide configuration symbol for enabling the
IOMMU specific bits: CONFIG_IOMMU_API.  Implementations (including
INTEL_IOMMU and AMD_IOMMU driver) select it so use it here as well.

This makes the conditional archdata.iommu field consistent with other
platforms and also fixes any compile test builds of other IOMMU drivers,
when INTEL_IOMMU or AMD_IOMMU are not selected).

For the case when INTEL_IOMMU/AMD_IOMMU and COMPILE_TEST are not
selected, this should create functionally equivalent code/choice.  With
COMPILE_TEST this field could appear if other IOMMU drivers are chosen
but neither INTEL_IOMMU nor AMD_IOMMU are not.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: e93a1695d7 ("iommu: Enable compile testing for some of drivers")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20200518120855.27822-2-krzk@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-27 16:44:05 +02:00
Krzysztof Kozlowski e18a8c104d ia64: Hide the archdata.iommu field behind generic IOMMU_API
There is a generic, kernel wide configuration symbol for enabling the
IOMMU specific bits: CONFIG_IOMMU_API.  Implementations (including
INTEL_IOMMU driver) select it so use it here as well.

This makes the conditional archdata.iommu field consistent with other
platforms and also fixes any compile test builds of other IOMMU drivers,
when INTEL_IOMMU is not selected).

For the case when INTEL_IOMMU and COMPILE_TEST are not selected, this
should create functionally equivalent code/choice.  With COMPILE_TEST
this field could appear if other IOMMU drivers are chosen but
INTEL_IOMMU not.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: e93a1695d7 ("iommu: Enable compile testing for some of drivers")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>

Link: https://lore.kernel.org/r/20200518120855.27822-1-krzk@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-27 16:41:29 +02:00
Linus Walleij ad3073bcb9 gpio fixes for v5.7
- fix mutex and spinlock ordering in gpio-mlxbf2
 - fix the return value checks on devm_platform_ioremap_resource in
   gpio-pxa and gpio-bcm-kona
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAl7ONMEACgkQEacuoBRx
 13IYBw/+OvvMyw5bhiaYJT+Ig/4lSFwtRxXjmeIb5XcmSk1z9T8U6Pon6TbcV23+
 u7Pnar+A6Dr2+erTgZyXhA4iFE3YcHmo2GO4Pi4tnIoeSAoe74CQPmcJyu8KJ39J
 8np/7vlfY0l6xMEGgqTC0kBPtBLnovny6GPYo28ziVFzaoHl7rpUpVq9/WFcR8VW
 vtvuWqFKibSdYgx1JgH1LgznKiFopA8fJOuMwdSj/nl7zXcHGvb1D6FeRV/GYDv/
 2MIMRTPC3kgLqHc2vpmV0tTx8frFBN9twrRp2Xge1PbmiplQ2yKTGMmCbGZthAcG
 umu8iMwo/9Ocy+xt23Bj0yl5ZsKUbDT966bDCK4Fningk/9Svk3CGUYX8AEH9kav
 utB2VfdHDWtUHyIgBgNUhdxbeUB5EU851J5AxufQWpOPfowPni8bizkI5x0uPHhd
 HqyYDxrY/KLk9zkV0xhE+swVrA738ojpK5G+ijaU9ESLzLiITEdHzF2EHpxKttxz
 IHMlDoeTya6nCPlZ44AchsudkJj/3vfKEZb3OhiK7oQkJ5CSVQI7t/wj1/kfsWH8
 10iEova0q+6NlDavxgaAxYbLCMSi0Vpx5Dav8xd8WWcgMwv2jcclEZ+avfb+GUC4
 gtlReWb2+LaA/Gx9JIuGsGSE4f11xr9Jk3GBSikKqj0Q0zrHZwE=
 =06hZ
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into fixes

gpio fixes for v5.7

- fix mutex and spinlock ordering in gpio-mlxbf2
- fix the return value checks on devm_platform_ioremap_resource in
  gpio-pxa and gpio-bcm-kona
2020-05-27 15:38:26 +02:00
Pablo Neira Ayuso 4946ea5c12 netfilter: nf_conntrack_pptp: fix compilation warning with W=1 build
>> include/linux/netfilter/nf_conntrack_pptp.h:13:20: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers]
extern const char *const pptp_msg_name(u_int16_t msg);
^~~~~~

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 4c559f15ef ("netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27 13:39:08 +02:00
Pablo Neira Ayuso 94945ad2b3 netfilter: conntrack: comparison of unsigned in cthelper confirmation
net/netfilter/nf_conntrack_core.c: In function nf_confirm_cthelper:
net/netfilter/nf_conntrack_core.c:2117:15: warning: comparison of unsigned expression in < 0 is always false [-Wtype-limits]
 2117 |   if (protoff < 0 || (frag_off & htons(~0x7)) != 0)
      |               ^

ipv6_skip_exthdr() returns a signed integer.

Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: 703acd70f2 ("netfilter: nfnetlink_cthelper: unbreak userspace helper support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27 13:39:06 +02:00
Nathan Chancellor 46c1e0621a netfilter: conntrack: Pass value of ctinfo to __nf_conntrack_update
Clang warns:

net/netfilter/nf_conntrack_core.c:2068:21: warning: variable 'ctinfo' is
uninitialized when used here [-Wuninitialized]
        nf_ct_set(skb, ct, ctinfo);
                           ^~~~~~
net/netfilter/nf_conntrack_core.c:2024:2: note: variable 'ctinfo' is
declared here
        enum ip_conntrack_info ctinfo;
        ^
1 warning generated.

nf_conntrack_update was split up into nf_conntrack_update and
__nf_conntrack_update, where the assignment of ctinfo is in
nf_conntrack_update but it is used in __nf_conntrack_update.

Pass the value of ctinfo from nf_conntrack_update to
__nf_conntrack_update so that uninitialized memory is not used
and everything works properly.

Fixes: ee04805ff5 ("netfilter: conntrack: make conntrack userspace helpers work again")
Link: https://github.com/ClangBuiltLinux/linux/issues/1039
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27 13:38:58 +02:00
Jeff Layton fb33c114d3 ceph: flush release queue when handling caps for unknown inode
It's possible for the VFS to completely forget about an inode, but for
it to still be sitting on the cap release queue. If the MDS sends the
client a cap message for such an inode, it just ignores it today, which
can lead to a stall of up to 5s until the cap release queue is flushed.

If we get a cap message for an inode that can't be located, then go
ahead and flush the cap release queue.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/45532
Fixes: 1e9c2eb681 ("ceph: delete stale dentry when last reference is dropped")
Reported-and-Tested-by: Andrej Filipčič <andrej.filipcic@ijs.si>
Suggested-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-27 13:03:57 +02:00
Jerry Lee 890bd0f899 libceph: ignore pool overlay and cache logic on redirects
OSD client should ignore cache/overlay flag if got redirect reply.
Otherwise, the client hangs when the cache tier is in forward mode.

[ idryomov: Redirects are effectively deprecated and no longer
  used or tested.  The original tiering modes based on redirects
  are inherently flawed because redirects can race and reorder,
  potentially resulting in data corruption.  The new proxy and
  readproxy tiering modes should be used instead of forward and
  readforward.  Still marking for stable as obviously correct,
  though. ]

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/23296
URL: https://tracker.ceph.com/issues/36406
Signed-off-by: Jerry Lee <leisurelysw24@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-27 12:43:35 +02:00
Kailang Yang 630e36126e ALSA: hda/realtek - Add new codec supported for ALC287
Enable new codec supported for ALC287.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/dcf5ce5507104d0589a917cbb71dc3c6@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-27 08:52:51 +02:00