Support is hooked up via a cpu start method specified in the device
tree, and also depends on DT nodes that describe the interfaces for
performing IPI and identifying which cpu execution is taking place on.
The currently used method is a form of spin table, where secondary
cpus are unblocked by writing to a special address.
Signed-off-by: Rich Felker <dalias@libc.org>
The SH2 version of entry.S uses global variables, which need to be
cpu-local in order to work with SMP. For ease of access from asm,
simply use arrays indexed by cpu number, and require the availability
of an address (mmio register or properly setup per-cpu memory) from
which the current cpu's index can be read.
Signed-off-by: Rich Felker <dalias@libc.org>
The version of futex.h in asm-generic should really be adapted to do
the same thing so that this hideous code does not have to be
duplicated per-arch.
Signed-off-by: Rich Felker <dalias@libc.org>
The J-Core cpu has, as an ISA extension, an atomic compare-and-swap
instruction cas.l which applications need to use (instead the imask or
gusa atomic models, which are fundamentally limited to UP) for
synchronization in order to be compatible with SMP systems. Provide a
hwcap flag so that it's possible to do runtime selection and support
both.
Signed-off-by: Rich Felker <dalias@libc.org>
At the CPU/ISA level, the J2 is compatible with SH-2, and thus the
changes to add J2 support build on existing SH-2 support. However, J2
does not duplicate the memory-mapped SH-2 features like the cache
interface. Instead, the cache interfaces is described in the device
tree, and new code is added to be able to access the flat device tree
at early boot before it is unflattened.
Support is also added for receiving interrupts on trap numbers in the
range 16 to 31, since the J-Core aic1 interrupt controller generates
these traps. This range was unused but nominally for hardware
exceptions on SH-2, and a few values in this range were used for
exceptions on SH-2A, but SH-2A has its own version of the relevant
code.
No individual cpu subtypes are added for J2 since the intent moving
forward is to represent SoCs with device tree rather than as
hard-coded subtypes in the kernel. The CPU_SUBTYPE_J2 Kconfig item
exists only to fit into the existing cpu selection mechanism until it
is overhauled.
Signed-off-by: Rich Felker <dalias@libc.org>
If a PCI bridge (or PCIe port) that is runtime-suspended gets an ACPI
hotplug notification, such as a bus check, it has to be resumed before
re-scanning the devices below it, or those devices will not be
accessible and will be treated as hot-removed.
Make that happen and let the bridge suspend again after the bus below it
has been re-scanned.
This is a replacement for commit 16468c783c ("ACPI / hotplug / PCI:
Runtime resume bridge before rescan") that has been reverted, because it
introduced a system resume regression (due to missing bridge->pci_dev
checks that are necessary in case the notification is targeted at the
host bridge) and it is necessary for the code added by commit
006d44e49a ("PCI: Add runtime PM support for PCIe ports") to work as
expected.
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mailmap entries may not use more than 2 email addresses per line.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Merge misc fixes from Andrew Morton:
"A few late-breaking fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/memblock.c: fix NULL dereference error
MAINTAINERS: update cgroup's document path
slub: drop bogus inline for fixup_red_left()
powerpc/fsl_rio: fix a missing error code
mm: initialise per_cpu_nodestats for all online pgdats at boot
mm/memblock: fix a typo in a comment
mm: disable CONFIG_MEMORY_HOTPLUG when KASAN is enabled
- hfi1 driver updates
- Fix for max SGEs allowed via RDMA R/W API
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXoqUzAAoJELgmozMOVy/dNKAP/1/Rzn/k97eda1qFqzWpqsPl
lMaxDiZZnRIAFJEqEF9Iwo1JLiFIzjpDJnqHB++CKuXZQT0NY6sHW0yrcyUwzsx7
5gui92ldkVg4vY7PTco171vyzG+79KKRZ1dFS14z7oC8XAg48zQ7yJmfb1op3dEw
mgxyoLaaMwMF5aLwPoWG4+aPkBMtKUGB/ARb4ehq6M2p71c43lb18GaarJuWLdAz
1HxakXL/uzttyvGDyJGKDrT6ktXXSyvdCTRO60OrrPFJ67P2xRYXce85TLRr8srp
Q5RNjyR5fP8uN0qtrQz+hl09mtBeBQHKomyFIOVwkB2r53OKqsR5g5roz3BlpA1X
7PF/MO0pKy4t8XQnLfohEwtNWgszupvxkyAAISI8MwzLOPra/V8smQ9CpTltx1UB
hTu3tpAMy1auAjh8TWzzzII1ZoRZz6YCTziWnTaC3bqAljufjt1mnvjrtNmQ1sNi
MCLeA3yr8HjlKWdwYr+gVfhSR1wEoOxwHZdLsvBsxmC32hFLlh6rbg2x8wceqTlR
4T8l0AERV1YPjsoSe3/pWVImKUA97qppIfeFcCZiBCBHBPlhpw3ebVt6B1mLVUCV
hTMuZeFVcV75D+qr0kR5ZuVn4jgEn9zB1VH3tCV9LJnhBfySZFcP4yhATqiELaHG
RVoVAiTBxq5RgNVOH4Zo
=cQcp
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull second round of rdma updates from Doug Ledford:
"This can be split out into just two categories:
- fixes to the RDMA R/W API in regards to SG list length limits
(about 5 patches)
- fixes/features for the Intel hfi1 driver (everything else)
The hfi1 driver is still being brought to full feature support by
Intel, and they have a lot of people working on it, so that amounts to
almost the entirety of this pull request"
* tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (84 commits)
IB/hfi1: Add cache evict LRU list
IB/hfi1: Fix memory leak during unexpected shutdown
IB/hfi1: Remove unneeded mm argument in remove function
IB/hfi1: Consistently call ops->remove outside spinlock
IB/hfi1: Use evict mmu rb operation
IB/hfi1: Add evict operation to the mmu rb handler
IB/hfi1: Fix TID caching actions
IB/hfi1: Make the cache handler own its rb tree root
IB/hfi1: Make use of mm consistent
IB/hfi1: Fix user SDMA racy user request claim
IB/hfi1: Fix error condition that needs to clean up
IB/hfi1: Release node on insert failure
IB/hfi1: Validate SDMA user iovector count
IB/hfi1: Validate SDMA user request index
IB/hfi1: Use the same capability state for all shared contexts
IB/hfi1: Prevent null pointer dereference
IB/hfi1: Rename TID mmu_rb_* functions
IB/hfi1: Remove unneeded empty check in hfi1_mmu_rb_unregister()
IB/hfi1: Restructure hfi1_file_open
IB/hfi1: Make iovec loop index easy to understand
...
- Updates/fixes for iw_cxgb4 driver
- Updates/fixes for mlx5 driver
- Add flow steering and RSS API
- Add hardware stats to mlx4 and mlx5 drivers
- Add firmware version API for RDMA driver use
- Add the rxe driver (this is a software RoCE driver that makes any
Ethernet device a RoCE device)
- Fixes for i40iw driver
- Support for send only multicast joins in the cma layer
- Other minor fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXo1vCAAoJELgmozMOVy/d0HcQAJqMi7siD9cSaMViYbu812pq
3kNkHZbLNB/947uShDPhhFAWFXU0nRxEnTNSvYxRo+nxnDE/9hEEXpx8OzzKLNU+
GXyDeHsEEriSFcaSne5Tak/QuiFm3PJv73ttXQROCtHG7KxLG9ieVbfusz42Xwiu
5R21qfp6PZEOC+j7L/fTZh/kEN3cfaDYrGnCgmU3z0ka9xG5Qe2/+uWGNkuioRA5
phFUR4MS+1n/VrnxPHrLXTrqv3sw8YfCfRImaXSBrxFVMqhno+cDDtEJQCRnmNrq
7KcJO2KqDMl/QqsjxdwqojNpUTh2t7SeOeQuzUsfXl15yyyetq2Zu7ZurkCGjNtQ
NtTt6hv5eXq3mNuBmOPKYDDgakSYyYjS0zueoi8wFFqIeSYxRJv4wx4xoeJ/Bsz8
2LplpaPMQaTM65FhzYXGhYNBKaRkqjL9ihbIl1OcLNvfXAqLElfONM17/Yc/hgVw
xfDtvNFrZcl7/exIpBBNOnxwbs4h78vvXsXoBiVoN7V/hBnMzDhkiBHNxNCfZXA0
REGs/cnyy6cpiJOnVCWs77NqL75oK/qb1mEwe1M+A2kaxe/tLixUdYXo/zclDPm8
3DLTL9lCgJIBIEiZT4q/alxLK+yUKD+SHtQT3lmF2Bfsmv/I38Uy55SXAiFO4yOq
kwy96TvYtT43SkyNmmBf
=oZOO
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull base rdma updates from Doug Ledford:
"Round one of 4.8 code: while this is mostly normal, there is a new
driver in here (the driver was hosted outside the kernel for several
years and is actually a fairly mature and well coded driver). It
amounts to 13,000 of the 16,000 lines of added code in here.
Summary:
- Updates/fixes for iw_cxgb4 driver
- Updates/fixes for mlx5 driver
- Add flow steering and RSS API
- Add hardware stats to mlx4 and mlx5 drivers
- Add firmware version API for RDMA driver use
- Add the rxe driver (this is a software RoCE driver that makes any
Ethernet device a RoCE device)
- Fixes for i40iw driver
- Support for send only multicast joins in the cma layer
- Other minor fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (72 commits)
Soft RoCE driver
IB/core: Support for CMA multicast join flags
IB/sa: Add cached attribute containing SM information to SA port
IB/uverbs: Fix race between uverbs_close and remove_one
IB/mthca: Clean up error unwind flow in mthca_reset()
IB/mthca: NULL arg to pci_dev_put is OK
IB/hfi1: NULL arg to sc_return_credits is OK
IB/mlx4: Add diagnostic hardware counters
net/mlx4: Query performance and diagnostics counters
net/mlx4: Add diagnostic counters capability bit
Use smaller 512 byte messages for portmapper messages
IB/ipoib: Report SG feature regardless of HW UD CSUM capability
IB/mlx4: Don't use GFP_ATOMIC for CQ resize struct
IB/hfi1: Disable by default
IB/rdmavt: Disable by default
IB/mlx5: Fix port counter ID association to QP offset
IB/mlx5: Fix iteration overrun in GSI qps
i40iw: Add NULL check for puda buffer
i40iw: Change dup_ack_thresh to u8
i40iw: Remove unnecessary check for moving CQ head
...
Pull SCSI target updates from Nicholas Bellinger:
"The most notable item is IBM virtual SCSI target driver, that was
originally ported to target-core back in 2010 by Tomo-san, and has
been brought forward to v4.x code by Bryant Ly, Michael Cyr and co
over the last months.
Also included are two ORDERED task related bug-fixes Bryant + Michael
found along the way using ibmvscsis with AIX guests, plus a few
miscellaneous target-core + iscsi-target bug-fixes with associated
stable tags"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: fix spelling mistake: "limitiation" -> "limitation"
target: Fix residual overflow handling in target_complete_cmd_with_length
tcm_fc: set and unset FCP_SPPF_TARG_FCN
iscsi-target: Fix panic when adding second TCP connection to iSCSI session
ibmvscsis: Initial commit of IBM VSCSI Tgt Driver
target: Fix ordered task CHECK_CONDITION early exception handling
target: Fix ordered task target_setup_cmd_from_cdb exception hang
target: Fix max_unmap_lba_count calc overflow
target: Fix race between iscsi-target connection shutdown + ABORT_TASK
target: Fix missing complete during ABORT_TASK + CMD_T_FABRIC_STOP
It causes NULL dereference error and failure to get type_a->regions[0]
info if parameter type_b of __next_mem_range_rev() == NULL
Fix this by checking before dereferring and initializing idx_b to 0
The approach is tested by dumping all types of region via
__memblock_dump_all() and __next_mem_range_rev() fixed to UART
separately the result is okay after checking the logs.
Link: http://lkml.kernel.org/r/57A0320D.6070102@zoho.com
Signed-off-by: zijun_hu <zijun_hu@htc.com>
Tested-by: zijun_hu <zijun_hu@htc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With m68k-linux-gnu-gcc-4.1:
include/linux/slub_def.h:126: warning: `fixup_red_left' declared inline after being called
include/linux/slub_def.h:126: warning: previous declaration of `fixup_red_left' was here
Commit c146a2b98e ("mm, kasan: account for object redzone in SLUB's
nearest_obj()") made fixup_red_left() global, but forgot to remove the
inline keyword.
Fixes: c146a2b98e ("mm, kasan: account for object redzone in SLUB's nearest_obj()")
Link: http://lkml.kernel.org/r/1470256262-1586-1-git-send-email-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexander Potapenko <glider@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We should set the error code here rather than incorrectly returning 0.
Otherwise static checkers complain.
Link: http://lkml.kernel.org/r/20160804053525.GM775@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Mackerras and Reza Arbab reported that machines with memoryless
nodes fail when vmstats are refreshed. Paul reported an oops as follows
Unable to handle kernel paging request for data at address 0xff7a10000
Faulting instruction address: 0xc000000000270cd0
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-kvm+ #118
task: c000000ff0680010 task.stack: c000000ff0704000
NIP: c000000000270cd0 LR: c000000000270ce8 CTR: 0000000000000000
REGS: c000000ff0707900 TRAP: 0300 Not tainted (4.7.0-kvm+)
MSR: 9000000102009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE,TM[E]> CR: 846b6824 XER: 20000000
CFAR: c000000000008768 DAR: 0000000ff7a10000 DSISR: 42000000 SOFTE: 1
NIP refresh_zone_stat_thresholds+0x80/0x240
LR refresh_zone_stat_thresholds+0x98/0x240
Call Trace:
refresh_zone_stat_thresholds+0xb8/0x240 (unreliable)
Both supplied potential fixes but one potentially misses checks and
another had redundant initialisations. This version initialises
per_cpu_nodestats on a per-pgdat basis instead of on a per-zone basis.
Link: http://lkml.kernel.org/r/20160804092404.GI2799@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Paul Mackerras <paulus@ozlabs.org>
Reported-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Tested-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
At present it is obvious that memory online and offline will fail when
KASAN is enabled. So add the condition to limit the memory_hotplug when
KASAN is enabled.
Link: http://lkml.kernel.org/r/1470063651-29519-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trond made a change to the server's tcp logic that allows a fast
client to better take advantage of high bandwidth networks, but
may increase the risk that a single client could starve other
clients; a new sunrpc.svc_rpc_per_connection_limit parameter
should help mitigate this in the (hopefully unlikely) event this
becomes a problem in practice.
Tom Haynes added a minimal flex-layout pnfs server, which is of
no use in production for now--don't build it unless you're doing
client testing or further server development.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXo7HNAAoJECebzXlCjuG+zqUP/RxO5jZjBhNI8/ayGdDW/Jnq
s0Fu6B+aNRV3GnugmIeI4tWNGnPyERNzFtjLKlnwaasz/oW4qBLqGbNUWC5xKARS
erODs0hM/1aCYWwNBEc5qXP2u23HrWVuQ+B5fg42ACyliKFGq5faDRmf6XGU/1kB
8unXGWPAiLiNZD/bWP91fYhThlLgpfHBFZ7M3G2IqmzWZTSELPzwp1bpRWt7yWQQ
z1oYtXToycbwz3yPVk3cXtaoqpjDUVZf2Guqgqi1BwEyEtYOSaYo1VHNsKDf4OId
QXQh64AqIK4uszpvtNhvsEaAECN7IiB+N4n2laFiQVmAf8Hfl3AnV/gKeD4lKmTj
TY6knnjZO/X88wn80MB7JR1H1WXvvzNIHwNR95qfub/lVKX+C+0AORRtYhi5F9ec
ixNs/z1ImLpYxAjiP/T5anD5xcX2S+LcSv7kRjhEufqNFtRAIqBZO9ZWbCdXAAyE
tcH9Cru4jeIlFO/y6O61EVrn9FFj2+0uu+7urefNRQ2Y9pmKeculJrLF6WO8WHms
4IzXMmjZK+358RVdX2Ji5Hw6rBDvfgP+LjB8Jn8CeIiNRONEjT+2/AYQcfk61aLb
INUbk6G6Vfd8iMO4aaRI9tmW+vKCOZa0IbnrNE1oHKp/AKBDr25i5YPSCsnl3r4Q
iR7rRe9FIkfqBpbfjVFv
=mo54
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Highlights:
- Trond made a change to the server's tcp logic that allows a fast
client to better take advantage of high bandwidth networks, but may
increase the risk that a single client could starve other clients;
a new sunrpc.svc_rpc_per_connection_limit parameter should help
mitigate this in the (hopefully unlikely) event this becomes a
problem in practice.
- Tom Haynes added a minimal flex-layout pnfs server, which is of no
use in production for now--don't build it unless you're doing
client testing or further server development"
* tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux: (32 commits)
nfsd: remove some dead code in nfsd_create_locked()
nfsd: drop unnecessary MAY_EXEC check from create
nfsd: clean up bad-type check in nfsd_create_locked
nfsd: remove unnecessary positive-dentry check
nfsd: reorganize nfsd_create
nfsd: check d_can_lookup in fh_verify of directories
nfsd: remove redundant zero-length check from create
nfsd: Make creates return EEXIST instead of EACCES
SUNRPC: Detect immediate closure of accepted sockets
SUNRPC: accept() may return sockets that are still in SYN_RECV
nfsd: allow nfsd to advertise multiple layout types
nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock
nfsd/blocklayout: Make sure calculate signature/designator length aligned
xfs: abstract block export operations from nfsd layouts
SUNRPC: Remove unused callback xpo_adjust_wspace()
SUNRPC: Change TCP socket space reservation
SUNRPC: Add a server side per-connection limit
SUNRPC: Micro optimisation for svc_data_ready
SUNRPC: Call the default socket callbacks instead of open coding
SUNRPC: lock the socket while detaching it
...
Pull more btrfs updates from Chris Mason:
"This is part two of my btrfs pull, which is some cleanups and a batch
of fixes.
Most of the code here is from Jeff Mahoney, making the pointers we
pass around internally more consistent and less confusing overall. I
noticed a small problem right before I sent this out yesterday, so I
fixed it up and re-tested overnight"
* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (40 commits)
Btrfs: fix __MAX_CSUM_ITEMS
btrfs: btrfs_abort_transaction, drop root parameter
btrfs: add btrfs_trans_handle->fs_info pointer
btrfs: btrfs_relocate_chunk pass extent_root to btrfs_end_transaction
btrfs: convert nodesize macros to static inlines
btrfs: introduce BTRFS_MAX_ITEM_SIZE
btrfs: cleanup, remove prototype for btrfs_find_root_ref
btrfs: copy_to_sk drop unused root parameter
btrfs: simpilify btrfs_subvol_inherit_props
btrfs: tests, use BTRFS_FS_STATE_DUMMY_FS_INFO instead of dummy root
btrfs: tests, require fs_info for root
btrfs: tests, move initialization into tests/
btrfs: btrfs_test_opt and friends should take a btrfs_fs_info
btrfs: prefix fsid to all trace events
btrfs: plumb fs_info into btrfs_work
btrfs: remove obsolete part of comment in statfs
btrfs: hide test-only member under ifdef
btrfs: Ratelimit "no csum found" info message
btrfs: Add ratelimit to btrfs printing
Btrfs: fix unexpected balance crash due to BUG_ON
...
improvements of UBI and UBIFS.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJXoiyMAAoJEEtJtSqsAOnWoXoP/1Q192UXeI18eezK//Y1kgv/
Q3gFoqtOWBnw9kcY9aTHdAtPJcgsjRzCMPVbd1TBEe071xWCyKziyGalNUFKLKOR
IZxym3uf65jhXkcch7ZtoUdMH7XcGOavPg8X47RWs5u72uTiIt6t/RRUwM1zDeaW
YZx3FnCGwyzPygrogTbVfH132o1pzO587wrxFeaZQ30sWCLqQOk3qVyROgz2J9zm
00TjNQEvUgfhBf2PiUvX0S5Lan/AX1aB3iEGg05fIDDsZqui698DRDx+isFEJEHf
NWBHDBnhOObwKgutDfCk1gsfIKxzxBCxlLQG/ZaCwG4XKke8ylRc1wNffJbKrIIQ
AYywLol3n3/WR4VvPK+4/TX/s4UOZOvSZYiaVJiSmxOCUNydNtwIewNp+aVghV/u
qMfWsWRIPy7OXOdm3fTxzRsFtUxZaqglQ/dK24i1d8kktM0rkb1mgfKq9P0uctWq
0ejnNHQmJyuGKYvemjBtTXUFmFktelolDOfsAl10MbYZ+OwPOYpI9FbGY/POYWuT
Gpn/x/r2lGtP94kGYxBzSX8xTCC4SEFaMjE2sRvhWoxA8YgIydTDhz9SxCO1wz8E
a7nPnRQ0iZfo5JW0MkLZim+YDNyBjY5ASeBXXdJH/uXlCaFjmDCDCLz5/e08DuM3
lmmkepYwimHJIClr6d+0
=hOxy
-----END PGP SIGNATURE-----
Merge tag 'upstream-4.8-rc1' of git://git.infradead.org/linux-ubifs
Pull UBI/UBIFS updates from Richard Weinberger:
"This contains mostly cleanups and minor improvements of UBI and UBIFS"
* tag 'upstream-4.8-rc1' of git://git.infradead.org/linux-ubifs:
ubi: Use bitmaps in Fastmap self-check code
ubi: Be more paranoid while seaching for the most recent Fastmap
ubi: Check whether the Fastmap anchor matches the super block
ubi: Rework Fastmap attach base code
ubi: Fix whitespace issue in count_fastmap_pebs()
ubi: Introduce vol_ignored()
ubi: Fix scan_fast() comment
ubifs: switch_gc_head: Remove redondant sync of wbuf
ubi: Make volume resize power cut aware
ubi: Fix early logging
ubi: gluebi: Fix double refcounting
ubifs: Silence early error messages if MS_SILENT is set
ubi: Fix race condition between ubi device creation and udev
ubifs: Update comment for ubifs_errc
ubi: Only read necessary size when reading the VID header
ubifs: Make xattr structures static
ubifs: Silence error output if MS_SILENT is set
Pull UML updates from Richard Weinberger:
"Beside of various fixes this also contains patches to enable features
such was Kcov, kmemleak and TRACE_IRQFLAGS_SUPPORT on UML"
* 'for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common()
um: Support kcov
um: Enable TRACE_IRQFLAGS_SUPPORT
um: Use asm-generic/irqflags.h
um: Fix possible deadlock in sig_handler_common()
um: Select HAVE_DEBUG_KMEMLEAK
um: Setup physical memory in setup_arch()
um: Eliminate null test after alloc_bootmem
Instead of a separate ignore flag, use the obvious DOCBOOKS="" to ignore
all DocBook files. This is also in line with the Sphinx build being
ignored if a non-empty DOCBOOKS make variable is specified on the make
command line.
This replaces the IGNORE_DOCBOOKS introduced in
commit 547218864a
Author: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Date: Sat Jul 9 13:12:45 2016 -0300
doc-rst: add an option to ignore DocBooks when generating docs
and aligns with
commit 6387872c86
Author: Jani Nikula <jani.nikula@intel.com>
Date: Fri Jul 1 15:24:44 2016 +0300
Documentation/sphinx: skip build if user requested specific DOCBOOKS
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pull parisc updates from Helge Deller:
- added an optimized hash implementation for parisc (George Spelvin)
- C99 style cleanups in iomap.c (Amitoj Kaur Chawla)
- added breaks to switch statement in PDC function (noticed by Dan
Carpenter)
* 'parisc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Change structure intialisation to C99 style in iomap.c
parisc: Add break statements to pdc_pat_io_pci_cfg_read()
parisc: Add <asm/hash.h>
Pull m68knommu updates from Greg Ungerer:
"This series is all about Nicolas flat format support for MMU systems.
Traditional m68k no-MMU flat format binaries can now be run on m68k
MMU enabled systems too. The series includes some nice cleanups of
the binfmt_flat code and converts it to using proper user space
accessor functions.
With all this in place you can boot and run a complete no-MMU flat
format based user space on an MMU enabled system"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: enable binfmt_flat on systems with an MMU
binfmt_flat: allow compressed flat binary format to work on MMU systems
binfmt_flat: add MMU-specific support
binfmt_flat: update libraries' data segment pointer with userspace accessors
binfmt_flat: use clear_user() rather than memset() to clear .bss
binfmt_flat: use proper user space accessors with old relocs code
binfmt_flat: use proper user space accessors with relocs processing code
binfmt_flat: clean up create_flat_tables() and stack accesses
binfmt_flat: use generic transfer_args_to_stack()
elf_fdpic_transfer_args_to_stack(): make it generic
binfmt_flat: prevent kernel dammage from corrupted executable headers
binfmt_flat: convert printk invocations to their modern form
binfmt_flat: assorted cleanups
m68k: use same start_thread() on MMU and no-MMU
m68k: fix file path comment
m68k: fix bFLT executable running on MMU enabled systems
We changed this around in f135af1041f ('nfsd: reorganize nfsd_create')
so "dchild" can't be an error pointer any more. Also, dchild can't be
NULL here (and dput would already handle this even if it was).
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We need an fh_verify to make sure we at least have a dentry, but actual
permission checks happen later.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
vfs_{create,mkdir,mknod} each begin with a call to may_create(), which
returns EEXIST if the object already exists.
This check is therefore unnecessary.
(In the NFSv2 case, nfsd_proc_create also has such a check. Contrary to
RFC 1094, our code seems to believe that a CREATE of an existing file
should succeed. I'm leaving that behavior alone.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
There's some odd logic in nfsd_create() that allows it to be called with
the parent directory either locked or unlocked. The only already-locked
caller is NFSv2's nfsd_proc_create(). It's less confusing to split out
the unlocked case into a separate function which the NFSv2 code can call
directly.
Also fix some comments while we're here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Create and other nfsd ops generally assume we can call lookup_one_len on
inodes with S_IFDIR set. Al says that this assumption isn't true in
general, though it should be for the filesystem objects nfsd sees.
Add a check just to make sure our assumption isn't violated.
Remove a couple checks for i_op->lookup in create code.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
lookup_one_len already has this check.
The only effect of this patch is to return access instead of perm in the
0-length-filename case. I actually prefer nfserr_perm (or _inval?), but
I doubt anyone cares.
The isdotent check seems redundant too, but I worry that some client
might actually care about that strange nfserr_exist error.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
When doing a create (mkdir/mknod) on a name, it's worth
checking the name exists first before returning EACCES in case
the directory is not writeable by the user.
This makes return values on the client more consistent
regardless of whenever the entry there is cached in the local
cache or not.
Another positive side effect is certain programs only expect
EEXIST in that case even despite POSIX allowing any valid
error to be returned.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The rw_page users were not converted to use bio/req ops. As a result
bdev_write_page is not passing down REQ_OP_WRITE and the IOs will
be sent down as reads.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Fixes: 4e1b2d52a8 ("block, fs, drivers: remove REQ_OP compat defs and related code")
Modified by me to:
1) Drop op_flags passing into ->rw_page(), as we don't use it.
2) Make op_is_write() and friends safe to use for !CONFIG_BLOCK
Signed-off-by: Jens Axboe <axboe@fb.com>
Fix a fat-fingered conversion to the req_op accessors, and also
use a switch statement to make it more obvious what is being checked.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dave Chinner <david@fromorbit.com>
Fixes: c2df40 ("drivers: use req op accessor");
Reviewed-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
If we fail registering any of the hardware queues, we call
into blk_mq_unregister_disk() with the hotplug mutex already
held. Since blk_mq_unregister_disk() attempts to acquire the
same mutex, we end up in a less than happy place.
Reported-by: Jinpu Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
In include/linux/blkdev.h duplicate declarations of the request
struct exist. Cleaned up by removing the second, unneeded
declaration.
Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
bi_rw should be using bio_set_op_attrs to set bi_rw.
Signed-off-by: Shaun Tancheff <shaun@tancheff.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
The name for a bdi of a gendisk is derived from the gendisk's devt.
However, since the gendisk is destroyed before the bdi it leaves a
window where a new gendisk could dynamically reuse the same devt while a
bdi with the same name is still live. Arrange for the bdi to hold a
reference against its "owner" disk device while it is registered.
Otherwise we can hit sysfs duplicate name collisions like the following:
WARNING: CPU: 10 PID: 2078 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
sysfs: cannot create duplicate filename '/devices/virtual/bdi/259:1'
Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
0000000000000286 0000000002c04ad5 ffff88006f24f970 ffffffff8134caec
ffff88006f24f9c0 0000000000000000 ffff88006f24f9b0 ffffffff8108c351
0000001f0000000c ffff88105d236000 ffff88105d1031e0 ffff8800357427f8
Call Trace:
[<ffffffff8134caec>] dump_stack+0x63/0x87
[<ffffffff8108c351>] __warn+0xd1/0xf0
[<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80
[<ffffffff812a0d34>] sysfs_warn_dup+0x64/0x80
[<ffffffff812a0e1e>] sysfs_create_dir_ns+0x7e/0x90
[<ffffffff8134faaa>] kobject_add_internal+0xaa/0x320
[<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0
[<ffffffff8134ff55>] kobject_add+0x75/0xd0
[<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
[<ffffffff8148b0a5>] device_add+0x125/0x610
[<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100
[<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20
[<ffffffff811b775c>] bdi_register+0x8c/0x180
[<ffffffff811b7877>] bdi_register_dev+0x27/0x30
[<ffffffff813317f5>] add_disk+0x175/0x4a0
Cc: <stable@vger.kernel.org>
Reported-by: Yi Zhang <yizhan@redhat.com>
Tested-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fixed up missing 0 return in bdi_register_owner().
Signed-off-by: Jens Axboe <axboe@fb.com>
In case a submitted request gets stuck for some reason, the block layer
can prevent the request starvation by starting the scheduled timeout work.
If this stuck request occurs at the same time another thread has started
a queue freeze, the blk_mq_timeout_work will not be able to acquire the
queue reference and will return silently, thus not issuing the timeout.
But since the request is already holding a q_usage_counter reference and
is unable to complete, it will never release its reference, preventing
the queue from completing the freeze started by first thread. This puts
the request_queue in a hung state, forever waiting for the freeze
completion.
This was observed while running IO to a NVMe device at the same time we
toggled the CPU hotplug code. Eventually, once a request got stuck
requiring a timeout during a queue freeze, we saw the CPU Hotplug
notification code get stuck inside blk_mq_freeze_queue_wait, as shown in
the trace below.
[c000000deaf13690] [c000000deaf13738] 0xc000000deaf13738 (unreliable)
[c000000deaf13860] [c000000000015ce8] __switch_to+0x1f8/0x350
[c000000deaf138b0] [c000000000ade0e4] __schedule+0x314/0x990
[c000000deaf13940] [c000000000ade7a8] schedule+0x48/0xc0
[c000000deaf13970] [c0000000005492a4] blk_mq_freeze_queue_wait+0x74/0x110
[c000000deaf139e0] [c00000000054b6a8] blk_mq_queue_reinit_notify+0x1a8/0x2e0
[c000000deaf13a40] [c0000000000e7878] notifier_call_chain+0x98/0x100
[c000000deaf13a90] [c0000000000b8e08] cpu_notify_nofail+0x48/0xa0
[c000000deaf13ac0] [c0000000000b92f0] _cpu_down+0x2a0/0x400
[c000000deaf13b90] [c0000000000b94a8] cpu_down+0x58/0xa0
[c000000deaf13bc0] [c0000000006d5dcc] cpu_subsys_offline+0x2c/0x50
[c000000deaf13bf0] [c0000000006cd244] device_offline+0x104/0x140
[c000000deaf13c30] [c0000000006cd40c] online_store+0x6c/0xc0
[c000000deaf13c80] [c0000000006c8c78] dev_attr_store+0x68/0xa0
[c000000deaf13cc0] [c0000000003974d0] sysfs_kf_write+0x80/0xb0
[c000000deaf13d00] [c0000000003963e8] kernfs_fop_write+0x188/0x200
[c000000deaf13d50] [c0000000002e0f6c] __vfs_write+0x6c/0xe0
[c000000deaf13d90] [c0000000002e1ca0] vfs_write+0xc0/0x230
[c000000deaf13de0] [c0000000002e2cdc] SyS_write+0x6c/0x110
[c000000deaf13e30] [c000000000009204] system_call+0x38/0xb4
The fix is to allow the timeout work to execute in the window between
dropping the initial refcount reference and the release of the last
reference, which actually marks the freeze completion. This can be
achieved with percpu_refcount_tryget, which does not require the counter
to be alive. This way the timeout work can do it's job and terminate a
stuck request even during a freeze, returning its reference and avoiding
the deadlock.
Allowing the timeout to run is just a part of the fix, since for some
devices, we might get stuck again inside the device driver's timeout
handler, should it attempt to allocate a new request in that path -
which is a quite common action for Abort commands, which need to be sent
after a timeout. In NVMe, for instance, we call blk_mq_alloc_request
from inside the timeout handler, which will fail during a freeze, since
it also tries to acquire a queue reference.
I considered a similar change to blk_mq_alloc_request as a generic
solution for further device driver hangs, but we can't do that, since it
would allow new requests to disturb the freeze process. I thought about
creating a new function in the block layer to support unfreezable
requests for these occasions, but after working on it for a while, I
feel like this should be handled in a per-driver basis. I'm now
experimenting with changes to the NVMe timeout path, but I'm open to
suggestions of ways to make this generic.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: linux-nvme@lists.infradead.org
Cc: linux-block@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Quentin ran into this bug:
WARNING: CPU: 64 PID: 10085 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x65/0x80
sysfs: cannot create duplicate filename '/devices/virtual/block/nbd3/pid'
Modules linked in: nbd
CPU: 64 PID: 10085 Comm: qemu-nbd Tainted: G D 4.6.0+ #7
0000000000000000 ffff8820330bba68 ffffffff814b8791 ffff8820330bbac8
0000000000000000 ffff8820330bbab8 ffffffff810d04ab ffff8820330bbaa8
0000001f00000296 0000000000017681 ffff8810380bf000 ffffffffa0001790
Call Trace:
[<ffffffff814b8791>] dump_stack+0x4d/0x6c
[<ffffffff810d04ab>] __warn+0xdb/0x100
[<ffffffff810d0574>] warn_slowpath_fmt+0x44/0x50
[<ffffffff81218c65>] sysfs_warn_dup+0x65/0x80
[<ffffffff81218a02>] sysfs_add_file_mode_ns+0x172/0x180
[<ffffffff81218a35>] sysfs_create_file_ns+0x25/0x30
[<ffffffff81594a76>] device_create_file+0x36/0x90
[<ffffffffa0000e8d>] __nbd_ioctl+0x32d/0x9b0 [nbd]
[<ffffffff814cc8e8>] ? find_next_bit+0x18/0x20
[<ffffffff810f7c29>] ? select_idle_sibling+0xe9/0x120
[<ffffffff810f6cd7>] ? __enqueue_entity+0x67/0x70
[<ffffffff810f9bf0>] ? enqueue_task_fair+0x630/0xe20
[<ffffffff810efa76>] ? resched_curr+0x36/0x70
[<ffffffff810f0078>] ? check_preempt_curr+0x78/0x90
[<ffffffff810f00a2>] ? ttwu_do_wakeup+0x12/0x80
[<ffffffff810f01b1>] ? ttwu_do_activate.constprop.86+0x61/0x70
[<ffffffff810f0c15>] ? try_to_wake_up+0x185/0x2d0
[<ffffffff810f0d6d>] ? default_wake_function+0xd/0x10
[<ffffffff81105471>] ? autoremove_wake_function+0x11/0x40
[<ffffffffa0001577>] nbd_ioctl+0x67/0x94 [nbd]
[<ffffffff814ac0fd>] blkdev_ioctl+0x14d/0x940
[<ffffffff811b0da2>] ? put_pipe_info+0x22/0x60
[<ffffffff811d96cc>] block_ioctl+0x3c/0x40
[<ffffffff811ba08d>] do_vfs_ioctl+0x8d/0x5e0
[<ffffffff811aa329>] ? ____fput+0x9/0x10
[<ffffffff810e9092>] ? task_work_run+0x72/0x90
[<ffffffff811ba627>] SyS_ioctl+0x47/0x80
[<ffffffff8185f5df>] entry_SYSCALL_64_fastpath+0x17/0x93
---[ end trace 7899b295e4f850c8 ]---
It seems fairly obvious that device_create_file() is not being protected
from being run concurrently on the same nbd.
Quentin found the following relevant commits:
1a2ad21 nbd: add locking to nbd_ioctl
90b8f28 [PATCH] end of methods switch: remove the old ones
d4430d6 [PATCH] beginning of methods conversion
08f8585 [PATCH] move block_device_operations to blkdev.h
It would seem that the race was introduced in the process of moving nbd
from BKL to unlocked ioctls.
By setting nbd->task_recv while the mutex is held, we can prevent other
processes from running concurrently (since nbd->task_recv is also checked
while the mutex is held).
Reported-and-tested-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Markus Pargmann <mpa@pengutronix.de>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Pavel Machek <pavel@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
I got a KASAN report of use-after-free:
==================================================================
BUG: KASAN: use-after-free in klist_iter_exit+0x61/0x70 at addr ffff8800b6581508
Read of size 8 by task trinity-c1/315
=============================================================================
BUG kmalloc-32 (Not tainted): kasan: bad access detected
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in disk_seqf_start+0x66/0x110 age=144 cpu=1 pid=315
___slab_alloc+0x4f1/0x520
__slab_alloc.isra.58+0x56/0x80
kmem_cache_alloc_trace+0x260/0x2a0
disk_seqf_start+0x66/0x110
traverse+0x176/0x860
seq_read+0x7e3/0x11a0
proc_reg_read+0xbc/0x180
do_loop_readv_writev+0x134/0x210
do_readv_writev+0x565/0x660
vfs_readv+0x67/0xa0
do_preadv+0x126/0x170
SyS_preadv+0xc/0x10
do_syscall_64+0x1a1/0x460
return_from_SYSCALL_64+0x0/0x6a
INFO: Freed in disk_seqf_stop+0x42/0x50 age=160 cpu=1 pid=315
__slab_free+0x17a/0x2c0
kfree+0x20a/0x220
disk_seqf_stop+0x42/0x50
traverse+0x3b5/0x860
seq_read+0x7e3/0x11a0
proc_reg_read+0xbc/0x180
do_loop_readv_writev+0x134/0x210
do_readv_writev+0x565/0x660
vfs_readv+0x67/0xa0
do_preadv+0x126/0x170
SyS_preadv+0xc/0x10
do_syscall_64+0x1a1/0x460
return_from_SYSCALL_64+0x0/0x6a
CPU: 1 PID: 315 Comm: trinity-c1 Tainted: G B 4.7.0+ #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
ffffea0002d96000 ffff880119b9f918 ffffffff81d6ce81 ffff88011a804480
ffff8800b6581500 ffff880119b9f948 ffffffff8146c7bd ffff88011a804480
ffffea0002d96000 ffff8800b6581500 fffffffffffffff4 ffff880119b9f970
Call Trace:
[<ffffffff81d6ce81>] dump_stack+0x65/0x84
[<ffffffff8146c7bd>] print_trailer+0x10d/0x1a0
[<ffffffff814704ff>] object_err+0x2f/0x40
[<ffffffff814754d1>] kasan_report_error+0x221/0x520
[<ffffffff8147590e>] __asan_report_load8_noabort+0x3e/0x40
[<ffffffff83888161>] klist_iter_exit+0x61/0x70
[<ffffffff82404389>] class_dev_iter_exit+0x9/0x10
[<ffffffff81d2e8ea>] disk_seqf_stop+0x3a/0x50
[<ffffffff8151f812>] seq_read+0x4b2/0x11a0
[<ffffffff815f8fdc>] proc_reg_read+0xbc/0x180
[<ffffffff814b24e4>] do_loop_readv_writev+0x134/0x210
[<ffffffff814b4c45>] do_readv_writev+0x565/0x660
[<ffffffff814b8a17>] vfs_readv+0x67/0xa0
[<ffffffff814b8de6>] do_preadv+0x126/0x170
[<ffffffff814b92ec>] SyS_preadv+0xc/0x10
This problem can occur in the following situation:
open()
- pread()
- .seq_start()
- iter = kmalloc() // succeeds
- seqf->private = iter
- .seq_stop()
- kfree(seqf->private)
- pread()
- .seq_start()
- iter = kmalloc() // fails
- .seq_stop()
- class_dev_iter_exit(seqf->private) // boom! old pointer
As the comment in disk_seqf_stop() says, stop is called even if start
failed, so we need to reinitialise the private pointer to NULL when seq
iteration stops.
An alternative would be to set the private pointer to NULL when the
kmalloc() in disk_seqf_start() fails.
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Merge 4fc29c1aa3 included this extra line, but it's not needed (or
useful) since we'll bio_set_op_attrs() right after to properly set
the op and flags for the bio.
Signed-off-by: Jens Axboe <axboe@fb.com>
When a bio is cloned, the newly created bio must be associated with
the same blkcg as the original bio (if BLK_CGROUP is enabled). If
this operation is not performed, then the new bio is not associated
with any group, and the group of the current task is returned when
the group of the bio is requested.
Depending on the cloning frequency, this may cause a large
percentage of the bios belonging to a given group to be treated
as if belonging to other groups (in most cases as if belonging to
the root group). The expected group isolation may thereby be broken.
This commit adds the missing association in bio-cloning functions.
Fixes: da2f0f74cf ("Btrfs: add support for blkio controllers")
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Nikolay Borisov <kernel@kyup.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
'nr_undestroyed_grps' in struct throtl_data was used to count
the number of throtl_grp related with throtl_data, but now
throtl_grp is tracked by blkcg_gq, so it is useless anymore.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Commit 09954bad4 ("floppy: refactor open() flags handling"), as a
side-effect, causes open(/dev/fdX, O_ACCMODE) to fail. It turns out that
this is being used setfdprm userspace for ioctl-only open().
Reintroduce back the original behavior wrt !(FMODE_READ|FMODE_WRITE)
modes, while still keeping the original O_NDELAY bug fixed.
Cc: stable@vger.kernel.org # v4.5+
Reported-by: Wim Osterholt <wim@djo.tudelft.nl>
Tested-by: Wim Osterholt <wim@djo.tudelft.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>