License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner, Kate Stewart, and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and where references to a
license had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier should be applied
to a file was done in a spreadsheet of side-by-side results of the output
of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files, created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis, with 60,537 files
assessed.  Kate Stewart did a file-by-file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to each file.  She confirmed any determination that was
not immediately clear with lawyers working with the Linux Foundation.
The criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
  lines of source.
- Files that already had some variant of a license header in them (even
  if <5 lines) were included.
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, the file was
  considered to have no license information in it, and the top-level
  COPYING file license was applied.
  For non-*/uapi/* files, that summary was:
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0                                              11139
  and resulted in the first patch in this series.
  If that file was a */uapi/* path one, it was "GPL-2.0 WITH
  Linux-syscall-note"; otherwise it was "GPL-2.0".  The results of that were:
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note                        930
  and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
  of the */uapi/* ones, it was denoted with the Linux-syscall-note if
  any GPL-family license was found in the file, or if it had no
  licensing in it (per the prior point).  Results summary:
    SPDX license identifier                              # files
    -----------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note                          270
    GPL-2.0+ WITH Linux-syscall-note                         169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
    LGPL-2.1+ WITH Linux-syscall-note                         15
    GPL-1.0+ WITH Linux-syscall-note                          14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
    LGPL-2.0+ WITH Linux-syscall-note                          4
    LGPL-2.1 WITH Linux-syscall-note                           3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
  the file was flagged for further research, to be revisited later.
In total, over 70 hours of logged manual review of the spreadsheet was
done by Kate, Philippe and Thomas to determine the SPDX license
identifiers to apply to the source files, in some cases with confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights.  The
Windriver scanner is based in part on an older version of FOSSology, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected.  For the non-uapi files, Thomas did random spot
checks in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors; they have been fixed to
reflect the correct identifier.
Additionally, Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 files patched in the initial patch
version, with:
- a full ScanCode scan run, collecting the matched texts, detected
  license ids and scores,
- a review of anything where a license was detected (about 500+ files)
  to ensure that the applied SPDX license was correct,
- a review of anything where there was no detection but the patch license
  was not GPL-2.0 WITH Linux-syscall-note, to ensure that the applied
  SPDX license was correct.
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the .csv files and add the proper SPDX tag to each file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types).  Finally, Greg ran the script using the .csv files to
generate the patches.
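For reference, the two comment styles the script has to emit are (an
illustrative sketch of the convention, not output from the script itself):
  In a header (.h) file, the tag uses a block comment:
    /* SPDX-License-Identifier: GPL-2.0 */
  In a C source (.c) file, the tag uses a line comment:
    // SPDX-License-Identifier: GPL-2.0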
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef LINUX_IOMAP_H
#define LINUX_IOMAP_H 1

#include <linux/atomic.h>
#include <linux/bitmap.h>
#include <linux/blk_types.h>
#include <linux/mm.h>
#include <linux/types.h>
#include <linux/mm_types.h>
#include <linux/blkdev.h>

struct address_space;
struct fiemap_extent_info;
struct inode;
struct iomap_dio;
struct iomap_writepage_ctx;
struct iov_iter;
struct kiocb;
struct page;
struct vm_area_struct;
struct vm_fault;

/*
 * Types of block ranges for iomap mappings:
 */
#define IOMAP_HOLE	0	/* no blocks allocated, need allocation */
#define IOMAP_DELALLOC	1	/* delayed allocation blocks */
#define IOMAP_MAPPED	2	/* blocks allocated at @addr */
#define IOMAP_UNWRITTEN	3	/* blocks allocated at @addr in unwritten state */
#define IOMAP_INLINE	4	/* data inline in the inode */

/*
 * Flags reported by the file system from iomap_begin:
 *
 * IOMAP_F_NEW indicates that the blocks have been newly allocated and need
 * zeroing for areas that no data is copied to.
 *
 * IOMAP_F_DIRTY indicates the inode has uncommitted metadata needed to access
 * written data and requires fdatasync to commit them to persistent storage.
 * This needs to take into account metadata changes that *may* be made at IO
 * completion, such as file size updates from direct IO.
 *
 * IOMAP_F_SHARED indicates that the blocks are shared, and will need to be
 * unshared as part of a write.
 *
 * IOMAP_F_MERGED indicates that the iomap contains the merge of multiple block
 * mappings.
 *
 * IOMAP_F_BUFFER_HEAD indicates that the file system requires the use of
 * buffer heads for this mapping.
 *
 * IOMAP_F_XATTR indicates that the iomap is for an extended attribute extent
 * rather than a file data extent.
 */
#define IOMAP_F_NEW		(1U << 0)
#define IOMAP_F_DIRTY		(1U << 1)
#define IOMAP_F_SHARED		(1U << 2)
#define IOMAP_F_MERGED		(1U << 3)
#define IOMAP_F_BUFFER_HEAD	(1U << 4)
#define IOMAP_F_ZONE_APPEND	(1U << 5)
#define IOMAP_F_XATTR		(1U << 6)

/*
 * Flags set by the core iomap code during operations:
 *
 * IOMAP_F_SIZE_CHANGED indicates to the iomap_end method that the file size
 * has changed as the result of this write operation.
 *
 * IOMAP_F_STALE indicates that the iomap is not valid any longer and the file
 * range it covers needs to be remapped by the high level before the operation
 * can proceed.
 */
#define IOMAP_F_SIZE_CHANGED	(1U << 8)
#define IOMAP_F_STALE		(1U << 9)

/*
 * Flags from 0x1000 up are for file system specific usage:
 */
#define IOMAP_F_PRIVATE		(1U << 12)

/*
 * Magic value for addr:
 */
#define IOMAP_NULL_ADDR -1ULL	/* addr is not valid */

struct iomap_page_ops;

struct iomap {
	u64			addr; /* disk offset of mapping, bytes */
	loff_t			offset;	/* file offset of mapping, bytes */
	u64			length;	/* length of mapping, bytes */
	u16			type;	/* type of mapping */
	u16			flags;	/* flags for mapping */
	struct block_device	*bdev;	/* block device for I/O */
	struct dax_device	*dax_dev; /* dax_dev for dax operations */
	void			*inline_data;
	void			*private; /* filesystem private */
	const struct iomap_page_ops *page_ops;
	u64			validity_cookie; /* used with .iomap_valid() */
};

static inline sector_t iomap_sector(const struct iomap *iomap, loff_t pos)
{
	return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT;
}

/*
 * Returns the inline data pointer for logical offset @pos.
 */
static inline void *iomap_inline_data(const struct iomap *iomap, loff_t pos)
{
	return iomap->inline_data + pos - iomap->offset;
}

/*
 * Check if the mapping's length is within the valid range for inline data.
 * This is used to guard against accessing data beyond the page inline_data
 * points at.
 */
static inline bool iomap_inline_data_valid(const struct iomap *iomap)
{
	return iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data);
}

/*
 * When a filesystem sets page_ops in an iomap mapping it returns, page_prepare
 * and page_done will be called for each page written to.  This only applies
 * to buffered writes as unbuffered writes will not typically have pages
 * associated with them.
 *
 * When page_prepare succeeds, page_done will always be called to do any
 * cleanup work necessary.  In that page_done call, @page will be NULL if the
 * associated page could not be obtained.
 */
struct iomap_page_ops {
	int (*page_prepare)(struct inode *inode, loff_t pos, unsigned len);
	void (*page_done)(struct inode *inode, loff_t pos, unsigned copied,
			struct page *page);

	/*
	 * Check that the cached iomap still maps correctly to the filesystem's
	 * internal extent map.  FS internal extent maps can change while iomap
	 * is iterating a cached iomap, so this hook allows iomap to detect that
	 * the iomap needs to be refreshed during a long running write
	 * operation.
	 *
	 * The filesystem can store internal state (e.g. a sequence number) in
	 * iomap->validity_cookie when the iomap is first mapped to be able to
	 * detect changes between mapping time and whenever .iomap_valid() is
	 * called.
	 *
	 * This is called with the folio over the specified file position held
	 * locked by the iomap code.
	 */
	bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
};
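
/*
 * An illustrative sketch (not part of this header) of an ->iomap_valid
 * implementation keyed off a per-inode extent-map sequence counter;
 * MYFS_I() and the extent_seq field are hypothetical:
 *
 *	static bool myfs_iomap_valid(struct inode *inode,
 *			const struct iomap *iomap)
 *	{
 *		return iomap->validity_cookie ==
 *			READ_ONCE(MYFS_I(inode)->extent_seq);
 *	}
 *
 * The same counter would be sampled into iomap->validity_cookie by the
 * filesystem's ->iomap_begin implementation.
 */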

/*
 * Flags for iomap_begin / iomap_end.  No flag implies a read.
 */
#define IOMAP_WRITE		(1 << 0) /* writing, must allocate blocks */
#define IOMAP_ZERO		(1 << 1) /* zeroing operation, may skip holes */
#define IOMAP_REPORT		(1 << 2) /* report extent status, e.g. FIEMAP */
#define IOMAP_FAULT		(1 << 3) /* mapping for page fault */
#define IOMAP_DIRECT		(1 << 4) /* direct I/O */
#define IOMAP_NOWAIT		(1 << 5) /* do not block */
#define IOMAP_OVERWRITE_ONLY	(1 << 6) /* only pure overwrites allowed */
#define IOMAP_UNSHARE		(1 << 7) /* unshare_file_range */
#ifdef CONFIG_FS_DAX
#define IOMAP_DAX		(1 << 8) /* DAX mapping */
#else
#define IOMAP_DAX		0
#endif /* CONFIG_FS_DAX */

struct iomap_ops {
	/*
	 * Return the existing mapping at pos, or reserve space starting at
	 * pos for up to length, as long as we can do it as a single mapping.
	 * The actual length is returned in iomap->length.
	 */
	int (*iomap_begin)(struct inode *inode, loff_t pos, loff_t length,
			unsigned flags, struct iomap *iomap,
			struct iomap *srcmap);

	/*
	 * Commit and/or unreserve space previously allocated using
	 * iomap_begin.  Written indicates the length of the successful
	 * write operation which needs to be committed, while the rest
	 * needs to be unreserved.  Written might be zero if no data was
	 * written.
	 */
	int (*iomap_end)(struct inode *inode, loff_t pos, loff_t length,
			ssize_t written, unsigned flags, struct iomap *iomap);
};
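
/*
 * An illustrative sketch (not part of this header) of a filesystem
 * implementing ->iomap_begin; "myfs", struct myfs_extent and
 * myfs_get_extent() are hypothetical:
 *
 *	static int myfs_iomap_begin(struct inode *inode, loff_t pos,
 *			loff_t length, unsigned flags, struct iomap *iomap,
 *			struct iomap *srcmap)
 *	{
 *		struct myfs_extent ext;
 *		int error = myfs_get_extent(inode, pos, length, &ext);
 *
 *		if (error)
 *			return error;
 *		iomap->addr = ext.mapped ? ext.disk_offset : IOMAP_NULL_ADDR;
 *		iomap->offset = ext.file_offset;
 *		iomap->length = ext.len;
 *		iomap->type = ext.mapped ? IOMAP_MAPPED : IOMAP_HOLE;
 *		iomap->bdev = inode->i_sb->s_bdev;
 *		return 0;
 *	}
 *
 *	static const struct iomap_ops myfs_iomap_ops = {
 *		.iomap_begin	= myfs_iomap_begin,
 *	};
 */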

/**
 * struct iomap_iter - Iterate through a range of a file
 * @inode: Set at the start of the iteration and should not change.
 * @pos: The current file position we are operating on.  It is updated by
 *	calls to iomap_iter().  Treat as read-only in the body.
 * @len: The remaining length of the file segment we're operating on.
 *	It is updated at the same time as @pos.
 * @processed: The number of bytes processed by the body in the most recent
 *	iteration, or a negative errno. 0 causes the iteration to stop.
 * @flags: Zero or more of the iomap_begin flags above.
 * @iomap: Map describing the I/O iteration
 * @srcmap: Source map for COW operations
 */
struct iomap_iter {
	struct inode *inode;
	loff_t pos;
	u64 len;
	s64 processed;
	unsigned flags;
	struct iomap iomap;
	struct iomap srcmap;
	void *private;
};

int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops);
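
/*
 * An illustrative sketch of the canonical iomap_iter() loop;
 * do_operation() is hypothetical and stands in for whatever work a
 * caller performs on each mapping returned by ->iomap_begin:
 *
 *	struct iomap_iter iter = {
 *		.inode	= inode,
 *		.pos	= pos,
 *		.len	= len,
 *		.flags	= IOMAP_WRITE,
 *	};
 *	int ret;
 *
 *	while ((ret = iomap_iter(&iter, ops)) > 0)
 *		iter.processed = do_operation(&iter);
 *	return ret;
 */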

/**
 * iomap_length - length of the current iomap iteration
 * @iter: iteration structure
 *
 * Returns the length that the operation applies to for the current iteration.
 */
static inline u64 iomap_length(const struct iomap_iter *iter)
{
	u64 end = iter->iomap.offset + iter->iomap.length;

	if (iter->srcmap.type != IOMAP_HOLE)
		end = min(end, iter->srcmap.offset + iter->srcmap.length);
	return min(iter->len, end - iter->pos);
}

/**
 * iomap_iter_srcmap - return the source map for the current iomap iteration
 * @i: iteration structure
 *
 * Write operations on file systems with reflink support might require a
 * source and a destination map.  This function returns the source map
 * for a given operation, which may or may not be identical to the
 * destination map in &i->iomap.
 */
static inline const struct iomap *iomap_iter_srcmap(const struct iomap_iter *i)
{
	if (i->srcmap.type != IOMAP_HOLE)
		return &i->srcmap;
	return &i->iomap;
}

ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
		const struct iomap_ops *ops);
int iomap_file_buffered_write_punch_delalloc(struct inode *inode,
		struct iomap *iomap, loff_t pos, loff_t length, ssize_t written,
		int (*punch)(struct inode *inode, loff_t pos, loff_t length));

int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops);
void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
		const struct iomap_ops *ops);
int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
		bool *did_zero, const struct iomap_ops *ops);
int iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
		const struct iomap_ops *ops);
vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf,
			const struct iomap_ops *ops);
int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
		u64 start, u64 len, const struct iomap_ops *ops);
loff_t iomap_seek_hole(struct inode *inode, loff_t offset,
		const struct iomap_ops *ops);
loff_t iomap_seek_data(struct inode *inode, loff_t offset,
		const struct iomap_ops *ops);
sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
		const struct iomap_ops *ops);

/*
 * Structure for writeback I/O completions.
 */
struct iomap_ioend {
	struct list_head	io_list;	/* next ioend in chain */
	u16			io_type;
	u16			io_flags;	/* IOMAP_F_* */
	u32			io_folios;	/* folios added to ioend */
	struct inode		*io_inode;	/* file being written to */
	size_t			io_size;	/* size of the extent */
	loff_t			io_offset;	/* offset in the file */
	sector_t		io_sector;	/* start sector of ioend */
	struct bio		*io_bio;	/* bio being built */
	struct bio		io_inline_bio;	/* MUST BE LAST! */
};

struct iomap_writeback_ops {
	/*
	 * Required, maps the blocks so that writeback can be performed on
	 * the range starting at offset.
	 */
	int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
				loff_t offset);

	/*
	 * Optional, allows the file systems to perform actions just before
	 * submitting the bio and/or override the bio end_io handler for complex
	 * operations like copy on write extent manipulation or unwritten extent
	 * conversions.
	 */
	int (*prepare_ioend)(struct iomap_ioend *ioend, int status);

	/*
	 * Optional, allows the file system to discard state on a page where
	 * we failed to submit any I/O.
	 */
	void (*discard_folio)(struct folio *folio, loff_t pos);
};

struct iomap_writepage_ctx {
	struct iomap		iomap;
	struct iomap_ioend	*ioend;
	const struct iomap_writeback_ops *ops;
};

void iomap_finish_ioends(struct iomap_ioend *ioend, int error);
void iomap_ioend_try_merge(struct iomap_ioend *ioend,
		struct list_head *more_ioends);
void iomap_sort_ioends(struct list_head *ioend_list);
int iomap_writepages(struct address_space *mapping,
		struct writeback_control *wbc, struct iomap_writepage_ctx *wpc,
		const struct iomap_writeback_ops *ops);
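
/*
 * An illustrative sketch (not part of this header) of how a filesystem
 * typically wires up writeback; the "myfs" names are hypothetical:
 *
 *	static const struct iomap_writeback_ops myfs_writeback_ops = {
 *		.map_blocks	= myfs_map_blocks,
 *	};
 *
 *	static int myfs_writepages(struct address_space *mapping,
 *			struct writeback_control *wbc)
 *	{
 *		struct iomap_writepage_ctx wpc = { };
 *
 *		return iomap_writepages(mapping, wbc, &wpc,
 *				&myfs_writeback_ops);
 *	}
 */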

/*
 * Flags for direct I/O ->end_io:
 */
#define IOMAP_DIO_UNWRITTEN	(1 << 0)	/* covers unwritten extent(s) */
#define IOMAP_DIO_COW		(1 << 1)	/* covers COW extent(s) */

struct iomap_dio_ops {
	int (*end_io)(struct kiocb *iocb, ssize_t size, int error,
			unsigned flags);
	void (*submit_io)(const struct iomap_iter *iter, struct bio *bio,
			loff_t file_offset);

	/*
	 * Filesystems wishing to attach private information to a direct io bio
	 * must provide a ->submit_io method that attaches the additional
	 * information to the bio and changes the ->bi_end_io callback to a
	 * custom function.  This function should, at a minimum, perform any
	 * relevant post-processing of the bio and end with a call to
	 * iomap_dio_bio_end_io.
	 */
	struct bio_set *bio_set;
};

/*
 * Wait for the I/O to complete in iomap_dio_rw even if the kiocb is not
 * synchronous.
 */
#define IOMAP_DIO_FORCE_WAIT	(1 << 0)

/*
 * Do not allocate blocks or zero partial blocks, but instead fall back to
 * the caller by returning -EAGAIN.  Used to optimize direct I/O writes that
 * are not aligned to the file system block size.
 */
#define IOMAP_DIO_OVERWRITE_ONLY	(1 << 1)

/*
 * When a page fault occurs, return a partial synchronous result and allow
 * the caller to retry the rest of the operation after dealing with the page
 * fault.
 */
#define IOMAP_DIO_PARTIAL		(1 << 2)

/*
 * The caller will sync the write if needed; do not sync it within
 * iomap_dio_rw.  Overrides IOMAP_DIO_FORCE_WAIT.
 */
#define IOMAP_DIO_NOSYNC		(1 << 3)

ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
		const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
		unsigned int dio_flags, void *private, size_t done_before);
struct iomap_dio *__iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
		const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
		unsigned int dio_flags, void *private, size_t done_before);
ssize_t iomap_dio_complete(struct iomap_dio *dio);
void iomap_dio_bio_end_io(struct bio *bio);
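
/*
 * An illustrative sketch (not part of this header) of a direct I/O read
 * path built on iomap_dio_rw(); the "myfs" names are hypothetical, and a
 * real implementation would also check IOCB_DIRECT and take locks:
 *
 *	static ssize_t myfs_file_dio_read(struct kiocb *iocb,
 *			struct iov_iter *to)
 *	{
 *		return iomap_dio_rw(iocb, to, &myfs_iomap_ops, NULL,
 *				0, NULL, 0);
 *	}
 */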

#ifdef CONFIG_SWAP
struct file;
struct swap_info_struct;

int iomap_swapfile_activate(struct swap_info_struct *sis,
		struct file *swap_file, sector_t *pagespan,
		const struct iomap_ops *ops);
#else
# define iomap_swapfile_activate(sis, swapfile, pagespan, ops)	(-EIO)
#endif /* CONFIG_SWAP */

#endif /* LINUX_IOMAP_H */