Make sure we initialize *bno and *len, before jumping to out_bad_rec
label, and risk calling xfs_warn() with uninitialized variables.
Coverity: 100898
Coverity: 1437081
Coverity: 1437129
Coverity: 1437191
Coverity: 1437201
Coverity: 1437212
Coverity: 1437341
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
In __xfs_ag_resv_init we incorrectly calculate the amount by which to
decrease fdblocks when reserving blocks for the rmapbt. Because rmapbt
allocations do not decrease fdblocks, we must decrease fdblocks by the
entire size of the requested reservation in order to achieve our goal of
always having enough free blocks to satisfy an rmapbt expansion.
This is in contrast to the refcountbt/finobt, which /do/ subtract from
fdblocks whenever they allocate a block. For this allocation type we
preserve the existing behavior where we decrease fdblocks only by the
requested reservation minus the size of the existing tree.
This fixes the problem where the available block counts reported by
statfs change across a remount if there had been an rmapbt size change
since mount time.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
If a user asks us to zero_range part of a file, the end of the range is
EOF, and not aligned to a page boundary, invoke writeback of the EOF
page to ensure that the post-EOF part of the page is zeroed. This
ensures that we don't expose stale memory contents via mmap, if in a
clumsy manner.
Found by running generic/127 when it runs zero_range and mapread at EOF
one after the other.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
In commit 8ad560d256 ("xfs: strengthen rtalloc query range checks")
we strengthened the input parameter checks in the rtbitmap range query
function, but introduced an off-by-one error in the process. The call
to xfs_rtfind_forw deals with the high key being rextents, but we clamp
the high key to rextents - 1. This causes the returned results to stop
one block short of the end of the rtdev, which is incorrect.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Initialize the extent count field of the high key so that when we use
the high key to synthesize an 'unknown owner' record (i.e. used space
record) at the end of the queried range we have a field with which to
compute rm_blockcount. This is not strictly necessary because the
synthesizer never uses the rm_blockcount field, but we can shut up the
static code analysis anyway.
Coverity-id: 1437358
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Zorro Lang reports that generic/485 blows an assert on a filesystem with
512 byte blocks. The test tries to fallocate a post-eof extent at the
maximum file size and calls insert range to shift the extents right by
two blocks. On a 512b block filesystem this causes startoff to overflow
the 54-bit startoff field, leading to the assert.
Therefore, always check the rightmost extent to see if it would overflow
prior to invoking the insert range machinery.
Reported-by: zlang@redhat.com
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200137
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If we somehow end up with a filesystem that has fewer free blocks than
the blocks set aside to avoid ENOSPC deadlocks, it's possible that the
free space calculation in xfs_reserve_blocks will spit out a negative
number (because percpu_counter_sum returns s64). We fail to notice
this negative number and set fdblks_delta to it. Now we increment
fdblocks(!) and the unsigned type of m_resblks means that we end up
setting a ridiculously huge m_resblks reservation.
Avoid this comedy of errors by detecting the negative free space and
returning -ENOSPC.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
In commit e89c041338 ("xfs: implement the GETFSMAP ioctl") we
created the ability to obtain empty transactions. These transactions
have no log or block reservations and therefore can't modify anything.
Since they're also NO_WRITECOUNT they can run while the fs is frozen,
so we don't need to WARN_ON about that usage.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
When a corrupt inode is detected during xfs_iflush_cluster, we can
get a shutdown ASSERT failure like this:
XFS (pmem1): Metadata corruption detected at xfs_symlink_shortform_verify+0x5c/0xa0, inode 0x86627 data fork
XFS (pmem1): Unmount and run xfs_repair
XFS (pmem1): xfs_do_force_shutdown(0x8) called from line 3372 of file fs/xfs/xfs_inode.c. Return address = ffffffff814f4116
XFS (pmem1): Corruption of in-memory data detected. Shutting down filesystem
XFS (pmem1): xfs_do_force_shutdown(0x1) called from line 222 of file fs/xfs/libxfs/xfs_defer.c. Return address = ffffffff814a8a88
XFS (pmem1): xfs_do_force_shutdown(0x1) called from line 222 of file fs/xfs/libxfs/xfs_defer.c. Return address = ffffffff814a8ef9
XFS (pmem1): Please umount the filesystem and rectify the problem(s)
XFS: Assertion failed: xfs_isiflocked(ip), file: fs/xfs/xfs_inode.h, line: 258
.....
Call Trace:
xfs_iflush_abort+0x10a/0x110
xfs_iflush+0xf3/0x390
xfs_inode_item_push+0x126/0x1e0
xfsaild+0x2c5/0x890
kthread+0x11c/0x140
ret_from_fork+0x24/0x30
Essentially, xfs_iflush_abort() has been called twice on the
original inode that that was flushed. This happens because the
inode has been flushed to teh buffer successfully via
xfs_iflush_int(), and so when another inode is detected as corrupt
in xfs_iflush_cluster, the buffer is marked stale and EIO, and
iodone callbacks are run on it.
Running the iodone callbacks walks across the original inode and
calls xfs_iflush_abort() on it. When xfs_iflush_cluster() returns
to xfs_iflush(), it runs the error path for that function, and that
calls xfs_iflush_abort() on the inode a second time, leading to the
above assert failure as the inode is not flush locked anymore.
This bug has been there a long time.
The simple fix would be to just avoid calling xfs_iflush_abort() in
xfs_iflush() if we've got a failure from xfs_iflush_cluster().
However, xfs_iflush_cluster() has magic delwri buffer handling that
means it may or may not have run IO completion on the buffer, and
hence sometimes we have to call xfs_iflush_abort() from
xfs_iflush(), and sometimes we shouldn't.
After reading through all the error paths and the delwri buffer
code, it's clear that the error handling in xfs_iflush_cluster() is
unnecessary. If the buffer is delwri, it leaves it on the delwri
list so that when the delwri list is submitted it sees a shutdown
fliesystem in xfs_buf_submit() and that marks the buffer stale, EIO
and runs IO completion. i.e. exactly what xfs+iflush_cluster() does
when it's not a delwri buffer. Further, marking a buffer stale
clears the _XBF_DELWRI_Q flag on the buffer, which means when
submission of the buffer occurs, it just skips over it and releases
it.
IOWs, the error handling in xfs_iflush_cluster doesn't need to care
if the buffer is already on a the delwri queue or not - it just
needs to mark the buffer stale, EIO and run completions. That means
we can just use the easy fix for xfs_iflush() to avoid the double
abort.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
When the inode is in extent format, it can't have more extents that
fit in the inode fork. We don't currenty check this, and so this
corruption goes unnoticed by the inode verifiers. This can lead to
crashes operating on invalid in-memory structures.
Attempts to access such a inode will now error out in the verifier
rather than allowing modification operations to proceed.
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix a typedef, add some braces and breaks to shut up compiler warnings]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Instead of using xfs_bmapi_read to find delalloc extents and then punch
them out using xfs_bunmapi, opencode the loop to iterate over the extents
and call xfs_bmap_del_extent_delay directly. This both simplifies the
code and reduces the number of extent tree lookups required.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
- can.rst: fix a footnote reference;
- crypto_engine.rst: Fix two parsing warnings;
- Fix a lot of broken references to Documentation/*;
- Improves the scripts/documentation-file-ref-check script,
in order to help detecting/fixing broken references,
preventing false-positives.
After this patch series, only 33 broken references to doc files are
detected by scripts/documentation-file-ref-check.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbJC2aAAoJEAhfPr2O5OEVPmMP/2rN5m9LZ048oRWlg4hCwo73
4FpWqDg18hbWCMHXYHIN1UACIMUkIUfgLhF7WE3D/XqRMuxHoiE5u7DUdak7+VNt
wunpksKFJbgyfFMHRvykHcZV+jQFVbM7eFvXVPIvoSaAeGH6zx4imHTyeDn3x/nL
gdtBqM4bvEhmBjotBTRR4PB8+oPrT/HIT5npHepx3UnFFFAzDQGEZ/I67/el2G5C
pVmYdBXvr7iqrvUs6FilHLTEfe1quCI4UaKNfLHKrxXrTkiJQFOwugYuobZfNmxT
GwjWzfpNy9HMlKJFYipcByALxel1Mnpqz5mIxFQaCTygBuEsORCWzW5MoKIsIUJ0
KOoG76v0rUyMvLBRvaoao3CHYHdzxhQbtVV9DjyDuDksa2G5IoCAF1t6DyIOitRw
9plMnGckk+FJ/MXJKYWXHszFS8NhI0SF2zHe3s1DmRTD8P6oxkxvxBFz6iqqADmL
W6XHd8CcqJItaS9ctPen91TFuysN1HFpdzLLY+xwWmmKOcWC/jFjhTm8pj7xLQHM
5yuuEcefsajf+Xk4w2fSQmRfXnuq+oOlPuWpwSvEy+59cHGI0ms18P1nHy/yt3II
CJywwdx6fjwDon57RFKH7kkGd7px317zMqWdIv9gUj/qZAy9gcdLdvEQLhx9u0aV
4F+hLKFDFEpf58xqRT1R
=/ozx
-----END PGP SIGNATURE-----
Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental
Pull documentation fixes from Mauro Carvalho Chehab:
"This solves a series of broken links for files under Documentation,
and improves a script meant to detect such broken links (see
scripts/documentation-file-ref-check).
The changes on this series are:
- can.rst: fix a footnote reference;
- crypto_engine.rst: Fix two parsing warnings;
- Fix a lot of broken references to Documentation/*;
- improve the scripts/documentation-file-ref-check script, in order
to help detecting/fixing broken references, preventing
false-positives.
After this patch series, only 33 broken references to doc files are
detected by scripts/documentation-file-ref-check"
* tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
fix a series of Documentation/ broken file name references
Documentation: rstFlatTable.py: fix a broken reference
ABI: sysfs-devices-system-cpu: remove a broken reference
devicetree: fix a series of wrong file references
devicetree: fix name of pinctrl-bindings.txt
devicetree: fix some bindings file names
MAINTAINERS: fix location of DT npcm files
MAINTAINERS: fix location of some display DT bindings
kernel-parameters.txt: fix pointers to sound parameters
bindings: nvmem/zii: Fix location of nvmem.txt
docs: Fix more broken references
scripts/documentation-file-ref-check: check tools/*/Documentation
scripts/documentation-file-ref-check: get rid of false-positives
scripts/documentation-file-ref-check: hint: dash or underline
scripts/documentation-file-ref-check: add a fix logic for DT
scripts/documentation-file-ref-check: accept more wildcards at filenames
scripts/documentation-file-ref-check: fix help message
media: max2175: fix location of driver's companion documentation
media: v4l: fix broken video4linux docs locations
media: dvb: point to the location of the old README.dvb-usb file
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlsielwACgkQnJ2qBz9k
QNlN0Af/Q82iP3EqrT3w+CT7w0gER2su+Df2riDpo0/XYRQLxuyW+kYtLsQwovvB
Q7Tt+WTSO5OIqoJxwGMmd6VO5ICblhP+uHVC6+JlWy17DgccjwFBE/sUopxPqJaK
9utwXZhqqOEoikNpDABcptNnWVILRl0yppkQrVV/pKkyZFp2F8vO4roUHFFYkJJt
/uXJfLDQx6pBLTwqfQBFyiz0dCSsvCHUVnlw7Hu5JfE6xPtkMlk6F/M0Y0rvyEOg
8KmH5jUX/BXKIijg+ycOzS3CCdvm0UhrtiH5YWy4qGaI8eczT31Epfl08Sk8pvkv
n2rnxNnJP5sjPPNQhXvHJqy9qRCB6g==
=bLjN
-----END PGP SIGNATURE-----
Merge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
"fsnotify cleanups unifying handling of different watch types.
This is the shortened fsnotify series from Amir with the last five
patches pulled out. Amir has modified those patches to not change
struct inode but obviously it's too late for those to go into this
merge window"
* tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: add fsnotify_add_inode_mark() wrappers
fanotify: generalize fanotify_should_send_event()
fsnotify: generalize send_to_group()
fsnotify: generalize iteration of marks by object type
fsnotify: introduce marks iteration helpers
fsnotify: remove redundant arguments to handle_event()
fsnotify: use type id to identify connector object type
Pull AFS updates from Al Viro:
"Assorted AFS stuff - ended up in vfs.git since most of that consists
of David's AFS-related followups to Christoph's procfs series"
* 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
afs: Optimise callback breaking by not repeating volume lookup
afs: Display manually added cells in dynamic root mount
afs: Enable IPv6 DNS lookups
afs: Show all of a server's addresses in /proc/fs/afs/servers
afs: Handle CONFIG_PROC_FS=n
proc: Make inline name size calculation automatic
afs: Implement network namespacing
afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
proc: Add a way to make network proc files writable
afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
afs: Rearrange fs/afs/proc.c to move the show routines up
afs: Rearrange fs/afs/proc.c by moving fops and open functions down
afs: Move /proc management functions to the end of the file
Pull compat updates from Al Viro:
"Some biarch patches - getting rid of assorted (mis)uses of
compat_alloc_user_space().
Not much in that area this cycle..."
* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
orangefs: simplify compat ioctl handling
signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart
Pull aio fixes from Al Viro:
"Assorted AIO followups and fixes"
* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
eventpoll: switch to ->poll_mask
aio: only return events requested in poll_mask() for IOCB_CMD_POLL
eventfd: only return events requested in poll_mask()
aio: mark __aio_sigset::sigmask const
As files move around, their previous links break. Fix the
references for them.
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
As we move stuff around, some doc references are broken. Fix some of
them via this script:
./scripts/documentation-file-ref-check --fix
Manually checked that produced results are valid.
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
At the moment, afs_break_callbacks calls afs_break_one_callback() for each
separate FID it was given, and the latter looks up the volume individually
for each one.
However, this is inefficient if two or more FIDs have the same vid as we
could reuse the volume. This is complicated by cell aliasing whereby we
may have multiple cells sharing a volume and can therefore have multiple
callback interests for any particular volume ID.
At the moment afs_break_one_callback() scans the entire list of volumes
we're getting from a server and breaks the appropriate callback in every
matching volume, regardless of cell. This scan is done for every FID.
Optimise callback breaking by the following means:
(1) Sort the FID list by vid so that all FIDs belonging to the same volume
are clumped together.
This is done through the use of an indirection table as we cannot do
an insertion sort on the afs_callback_break array as we decode FIDs
into it as we subsequently also have to decode callback info into it
that corresponds by array index only.
We also don't really want to bubblesort afterwards if we can avoid it.
(2) Sort the server->cb_interests array by vid so that all the matching
volumes are grouped together. This permits the scan to stop after
finding a record that has a higher vid.
(3) When breaking FIDs, we try to keep server->cb_break_lock as long as
possible, caching the start point in the array for that volume group
as long as possible.
It might make sense to add another layer in that list and have a
refcounted volume ID anchor that has the matching interests attached
to it rather than being in the list. This would allow the lock to be
dropped without losing the cursor.
Signed-off-by: David Howells <dhowells@redhat.com>
Alter the dynroot mount so that cells created by manipulation of
/proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root
cell as a module parameter will cause directories for those cells to be
created in the dynamic root superblock for the network namespace[*].
To this end:
(1) Only one dynamic root superblock is now created per network namespace
and this is shared between all attempts to mount it. This makes it
easier to find the superblock to modify.
(2) When a dynamic root superblock is created, the list of cells is walked
and directories created for each cell already defined.
(3) When a new cell is added, if a dynamic root superblock exists, a
directory is created for it.
(4) When a cell is destroyed, the directory is removed.
(5) These directories are created by calling lookup_one_len() on the root
dir which automatically creates them if they don't exist.
[*] Inasmuch as network namespaces are currently supported here.
Signed-off-by: David Howells <dhowells@redhat.com>
Show all of a server's addresses in /proc/fs/afs/servers, placing the
second plus addresses on padded lines of their own. The current address is
marked with a star.
Signed-off-by: David Howells <dhowells@redhat.com>
The AFS filesystem depends at the moment on /proc for configuration and
also presents information that way - however, this causes a compilation
failure if procfs is disabled.
Fix it so that the procfs bits aren't compiled in if procfs is disabled.
This means that you can't configure the AFS filesystem directly, but it is
still usable provided that an up-to-date keyutils is installed to look up
cells by SRV or AFSDB DNS records.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Make calculation of the size of the inline name in struct proc_dir_entry
automatic, rather than having to manually encode the numbers and failing to
allow for lockdep.
Require a minimum inline name size of 33+1 to allow for names that look
like two hex numbers with a dash between.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The ->poll_mask() operation has a mask of events that the caller
is interested in, but not all implementations might take it into
account. Mask the return value to only the requested events,
similar to what the poll and epoll code does.
Reported-by: Avi Kivity <avi@scylladb.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The ->poll_mask() operation has a mask of events that the caller
is interested in, but we're returning all events regardless.
Change to return only the events the caller is interested in. This
fixes aio IO_CMD_POLL returning immediately when called with POLLIN
on an eventfd, since an eventfd is almost always ready for a write.
Signed-off-by: Avi Kivity <avi@scylladb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Merge more updates from Andrew Morton:
- MM remainders
- various misc things
- kcov updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits)
lib/test_printf.c: call wait_for_random_bytes() before plain %p tests
hexagon: drop the unused variable zero_page_mask
hexagon: fix printk format warning in setup.c
mm: fix oom_kill event handling
treewide: use PHYS_ADDR_MAX to avoid type casting ULLONG_MAX
mm: use octal not symbolic permissions
ipc: use new return type vm_fault_t
sysvipc/sem: mitigate semnum index against spectre v1
fault-injection: reorder config entries
arm: port KCOV to arm
sched/core / kcov: avoid kcov_area during task switch
kcov: prefault the kcov_area
kcov: ensure irq code sees a valid area
kernel/relay.c: change return type to vm_fault_t
exofs: avoid VLA in structures
coredump: fix spam with zero VMA process
fat: use fat_fs_error() instead of BUG_ON() in __fat_get_block()
proc: skip branch in /proc/*/* lookup
mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns
mm/memblock: add missing include <linux/bootmem.h>
...
If file size and FAT cluster chain is not matched (corrupted image), we
can hit BUG_ON(!phys) in __fat_get_block().
So, use fat_fs_error() instead.
[hirofumi@mail.parknet.co.jp: fix printk warning]
Link: http://lkml.kernel.org/r/87po12aq5p.fsf@mail.parknet.co.jp
Link: http://lkml.kernel.org/r/874lilcu67.fsf@mail.parknet.co.jp
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Code is structured like this:
for ( ... p < last; p++) {
if (memcmp == 0)
break;
}
if (p >= last)
ERROR
OK
gcc doesn't see that if if lookup succeeds than post loop branch will
never be taken and skip it.
[akpm@linux-foundation.org: proc_pident_instantiate() no longer takes an inode*]
Link: http://lkml.kernel.org/r/20180423213954.GD9043@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a late set of changes from Deepa Dinamani doing an automated
treewide conversion of the inode and iattr structures from 'timespec'
to 'timespec64', to push the conversion from the VFS layer into the
individual file systems.
There were no conflicts between this and the contents of linux-next
until just before the merge window, when we saw multiple problems:
- A minor conflict with my own y2038 fixes, which I could address
by adding another patch on top here.
- One semantic conflict with late changes to the NFS tree. I addressed
this by merging Deepa's original branch on top of the changes that
now got merged into mainline and making sure the merge commit includes
the necessary changes as produced by coccinelle.
- A trivial conflict against the removal of staging/lustre.
- Multiple conflicts against the VFS changes in the overlayfs tree.
These are still part of linux-next, but apparently this is no longer
intended for 4.18 [1], so I am ignoring that part.
As Deepa writes:
The series aims to switch vfs timestamps to use struct timespec64.
Currently vfs uses struct timespec, which is not y2038 safe.
The series involves the following:
1. Add vfs helper functions for supporting struct timepec64 timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual
replacement becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
This is a flag day patch.
Next steps:
1. Convert APIs that can handle timespec64, instead of converting
timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions.
Thomas Gleixner adds:
I think there is no point to drag that out for the next merge window.
The whole thing needs to be done in one go for the core changes which
means that you're going to play that catchup game forever. Let's get
over with it towards the end of the merge window.
[1] https://www.spinics.net/lists/linux-fsdevel/msg128294.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJbInZAAAoJEGCrR//JCVInReoQAIlVIIMt5ZX6wmaKbrjy9Itf
MfgbFihQ/djLnuSPVQ3nztcxF0d66BKHZ9puVjz6+mIHqfDvJTRwZs9nU+sOF/T1
g78fRkM1cxq6ZCkGYAbzyjyo5aC4PnSMP/NQLmwqvi0MXqqrbDoq5ZdP9DHJw39h
L9lD8FM/P7T29Fgp9tq/pT5l9X8VU8+s5KQG1uhB5hii4VL6pD6JyLElDita7rg+
Z7/V7jkxIGEUWF7vGaiR1QTFzEtpUA/exDf9cnsf51OGtK/LJfQ0oiZPPuq3oA/E
LSbt8YQQObc+dvfnGxwgxEg1k5WP5ekj/Wdibv/+rQKgGyLOTz6Q4xK6r8F2ahxs
nyZQBdXqHhJYyKr1H1reUH3mrSgQbE5U5R1i3My0xV2dSn+vtK5vgF21v2Ku3A1G
wJratdtF/kVBzSEQUhsYTw14Un+xhBLRWzcq0cELonqxaKvRQK9r92KHLIWNE7/v
c0TmhFbkZA+zR8HdsaL3iYf1+0W/eYy8PcvepyldKNeW2pVk3CyvdTfY2Z87G2XK
tIkK+BUWbG3drEGG3hxZ3757Ln3a9qWyC5ruD3mBVkuug/wekbI8PykYJS7Mx4s/
WNXl0dAL0Eeu1M8uEJejRAe1Q3eXoMWZbvCYZc+wAm92pATfHVcKwPOh8P7NHlfy
A3HkjIBrKW5AgQDxfgvm
=CZX2
-----END PGP SIGNATURE-----
Merge tag 'vfs-timespec64' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground
Pull inode timestamps conversion to timespec64 from Arnd Bergmann:
"This is a late set of changes from Deepa Dinamani doing an automated
treewide conversion of the inode and iattr structures from 'timespec'
to 'timespec64', to push the conversion from the VFS layer into the
individual file systems.
As Deepa writes:
'The series aims to switch vfs timestamps to use struct timespec64.
Currently vfs uses struct timespec, which is not y2038 safe.
The series involves the following:
1. Add vfs helper functions for supporting struct timepec64
timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual replacement
becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
This is a flag day patch.
Next steps:
1. Convert APIs that can handle timespec64, instead of converting
timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions'
Thomas Gleixner adds:
'I think there is no point to drag that out for the next merge
window. The whole thing needs to be done in one go for the core
changes which means that you're going to play that catchup game
forever. Let's get over with it towards the end of the merge window'"
* tag 'vfs-timespec64' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
pstore: Remove bogus format string definition
vfs: change inode times to use struct timespec64
pstore: Convert internal records to timespec64
udf: Simplify calls to udf_disk_stamp_to_time
fs: nfs: get rid of memcpys for inode times
ceph: make inode time prints to be long long
lustre: Use long long type to print inode time
fs: add timespec64_truncate()
requests are aborted, improving CephFS ENOSPC handling and making
"umount -f" actually work (Zheng and myself). The rest is mostly
mount option handling cleanups from Chengguang and assorted fixes
from Zheng, Luis and Dongsheng.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJbIkigAAoJEEp/3jgCEfOL3EUH/1s7Ib3FgFzG/SPPKISxZOGr
ndZGg0rPT9mPIQ4rp6t0z/cDlMrluPmCK3sWrAPe//sZz9iZiuip+mCL0gUFXFNr
1kL2xDKkJzGxtP3UlUvr5CC6bnxLdeBXJRBDLk/swtphuqArKndlbN/iLZnCZivT
uJDk+vZTwNJ3UhQP4QdnOQLV60NYs+q4euTqbZF3+pDiRiONbxRfXC3adFsc8zL9
zlie3CHPbrQHWMsfNvbfM3rBH1WhTwEssDm+IEFlKl19q9SKP2WPZfmBcE1pmZ58
AhIMoNGdQha1FXS6N96kaPaqFgeysPnEPoyHDqLxsUMKqsvJlOEZsK1jujza4rE=
=EfXm
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-4.18-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The main piece is a set of libceph changes that revamps how OSD
requests are aborted, improving CephFS ENOSPC handling and making
"umount -f" actually work (Zheng and myself).
The rest is mostly mount option handling cleanups from Chengguang and
assorted fixes from Zheng, Luis and Dongsheng.
* tag 'ceph-for-4.18-rc1' of git://github.com/ceph/ceph-client: (31 commits)
rbd: flush rbd_dev->watch_dwork after watch is unregistered
ceph: update description of some mount options
ceph: show ino32 if the value is different with default
ceph: strengthen rsize/wsize/readdir_max_bytes validation
ceph: fix alignment of rasize
ceph: fix use-after-free in ceph_statfs()
ceph: prevent i_version from going back
ceph: fix wrong check for the case of updating link count
libceph: allocate the locator string with GFP_NOFAIL
libceph: make abort_on_full a per-osdc setting
libceph: don't abort reads in ceph_osdc_abort_on_full()
libceph: avoid a use-after-free during map check
libceph: don't warn if req->r_abort_on_full is set
libceph: use for_each_request() in ceph_osdc_abort_on_full()
libceph: defer __complete_request() to a workqueue
libceph: move more code into __complete_request()
libceph: no need to call flush_workqueue() before destruction
ceph: flush pending works before shutdown super
ceph: abort osd requests on force umount
libceph: introduce ceph_osdc_abort_requests()
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlsg598ACgkQxWXV+ddt
WDtG1w//R9/nvAk5A5cbForTXNyxwXAQnz0t2/4Lh5igbfJloqoTZtr47Gvqsvy+
DITU+3BPcyupBuUFLoeivPC+ruKOUmle27Vm62mdZRtyt96kiUiV/m1gbPhFy8lW
DKHK8tvtIoZObo5oNGvRxiuJPjefiChYZMDHokB+MY5ZALSRaIW9opj2WsM+ZAt7
g9CEjeitQcBY68CCpSEVBlQSz+BTZYEDFJGNCmGDxGhaBGZr/ganrkDJ75cG6U10
LnOZ6LDHNxGMqUm4wnhfmpHtVcIJiBF+gOyTumBPtFSoLnBverl684xizglpoq6d
fnUP8Y6XR9JA4OCZvo310yvX9nyqgb0H2h+APO0f7jRRcJo0QSKZ/qZR+XZCk3PU
91HtBopcGs8gGQUkRdAE7TMCiIEzL1eNOXHvsiILCObq1i7iNCe7Dzx6M6Gfgep8
a3IcoVmSw1DFpln2ZxTQw9viAib41iU46XHXz7W7rGPulF5QdXGo5ScORRG6HBLE
nZsXdTkrCPMJyN2bIhU6YJOK9rb9TjD4lUtnvzT8t1CfUsxQT4AsJykaYr9BwF2D
Z4rBruUAQ3OmvXJDfGG4T5YCAdPBN+xBcxCeyrDZSD0r6YPQGoF0dlvHmvP0J78D
oGkD1bb/gjcvsPJxtTQ4QWEh0oqZiDfRt4qdgO46vhba0onzQFw=
=X2zA
-----END PGP SIGNATURE-----
Merge tag 'for-4.18-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- error handling fixup for one of the new ioctls from 1st pull
- fix for device-replace that incorrectly uses inode pages and can mess
up compressed extents in some cases
- fiemap fix for reporting incorrect number of extents
- vm_fault_t type conversion
* tag 'for-4.18-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: scrub: Don't use inode pages for device replace
btrfs: change return type of btrfs_page_mkwrite to vm_fault_t
Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero
btrfs: Check error of btrfs_iget in btrfs_search_path_in_tree_user
The pstore conversion to timespec64 introduces its own method of passing
seconds into sscanf() and sprintf() type functions to work around the
timespec64 definition on 64-bit systems that redefine it to 'timespec'.
That hack is now finally getting removed, but that means we get a (harmless)
warning once both patches are merged:
fs/pstore/ram.c: In function 'ramoops_read_kmsg_hdr':
fs/pstore/ram.c:39:29: error: format '%ld' expects argument of type 'long int *', but argument 3 has type 'time64_t *' {aka 'long long int *'} [-Werror=format=]
#define RAMOOPS_KERNMSG_HDR "===="
^~~~~~
fs/pstore/ram.c:167:21: note: in expansion of macro 'RAMOOPS_KERNMSG_HDR'
This removes the pstore specific workaround and uses the same method that
we have in place for all other functions that print a timespec64.
Related to this, I found that the kasprintf() output contains an incorrect
nanosecond values for any number starting with zeroes, and I adapt the
format string accordingly.
Link: https://lkml.org/lkml/2018/5/19/115
Link: https://lkml.org/lkml/2018/5/16/1080
Fixes: 0f0d83b99ef7 ("pstore: Convert internal records to timespec64")
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull the timespec64 conversion from Deepa Dinamani:
"The series aims to switch vfs timestamps to use
struct timespec64. Currently vfs uses struct timespec,
which is not y2038 safe.
The flag patch applies cleanly. I've not seen the timestamps
update logic change often. The series applies cleanly on 4.17-rc6
and linux-next tip (top commit: next-20180517).
I'm not sure how to merge this kind of a series with a flag patch.
We are targeting 4.18 for this.
Let me know if you have other suggestions.
The series involves the following:
1. Add vfs helper functions for supporting struct timepec64 timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual
replacement becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
This is a flag day patch.
I've tried to keep the conversions with the script simple, to
aid in the reviews. I've kept all the internal filesystem data
structures and function signatures the same.
Next steps:
1. Convert APIs that can handle timespec64, instead of converting
timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions."
I've pulled it into a branch based on top of the NFS changes that
are now in mainline, so I could resolve the non-obvious conflict
between the two while merging.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This reverts commit 95cde3c599.
The commit had good intentions, but it breaks kvm-tool and qemu-kvm.
With it in place, "lkvm run" just fails with
Error: KVM_CREATE_VM ioctl
Warning: Failed init: kvm__init
which isn't a wonderful error message, but bisection pinpointed the
problematic commit.
The problem is almost certainly due to the special kvm debugfs entries
created dynamically by kvm under /sys/kernel/debug/kvm/. See
kvm_create_vm_debugfs()
Bisected-and-reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed (Kees)
-----BEGIN PGP SIGNATURE-----
Comment: Kees Cook <kees@outflux.net>
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsgVtMWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsJEACLYe2EbwLFJz7emOT1KUGK5R1b
oVxJog0893WyMqgk9XBlA2lvTBRBYzR3tzsadfYo87L3VOBzazUv0YZaweJb65sF
bAvxW3nY06brhKKwTRed1PrMa1iG9R63WISnNAuZAq7+79mN6YgW4G6YSAEF9lW7
oPJoPw93YxcI8JcG+dA8BC9w7pJFKooZH4gvLUSUNl5XKr8Ru5YnWcV8F+8M4vZI
EJtXFmdlmxAledUPxTSCIojO8m/tNOjYTreBJt9K1DXKY6UcgAdhk75TRLEsp38P
fPvMigYQpBDnYz2pi9ourTgvZLkffK1OBZ46PPt8BgUZVf70D6CBg10vK47KO6N2
zreloxkMTrz5XohyjfNjYFRkyyuwV2sSVrRJqF4dpyJ4NJQRjvyywxIP4Myifwlb
ONipCM1EjvQjaEUbdcqKgvlooMdhcyxfshqJWjHzXB6BL22uPzq5jHXXugz8/ol8
tOSM2FuJ2sBLQso+szhisxtMd11PihzIZK9BfxEG3du+/hlI+2XgN7hnmlXuA2k3
BUW6BSDhab41HNd6pp50bDJnL0uKPWyFC6hqSNZw+GOIb46jfFcQqnCB3VZGCwj3
LH53Be1XlUrttc/NrtkvVhm4bdxtfsp4F7nsPFNDuHvYNkalAVoC3An0BzOibtkh
AtfvEeaPHaOyD8/h2Q==
=zUUp
-----END PGP SIGNATURE-----
Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more overflow updates from Kees Cook:
"The rest of the overflow changes for v4.18-rc1.
This includes the explicit overflow fixes from Silvio, further
struct_size() conversions from Matthew, and a bug fix from Dan.
But the bulk of it is the treewide conversions to use either the
2-factor argument allocators (e.g. kmalloc(a * b, ...) into
kmalloc_array(a, b, ...) or the array_size() macros (e.g. vmalloc(a *
b) into vmalloc(array_size(a, b)).
Coccinelle was fighting me on several fronts, so I've done a bunch of
manual whitespace updates in the patches as well.
Summary:
- Error path bug fix for overflow tests (Dan)
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed
(Kees)"
* tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
treewide: Use array_size in f2fs_kvzalloc()
treewide: Use array_size() in f2fs_kzalloc()
treewide: Use array_size() in f2fs_kmalloc()
treewide: Use array_size() in sock_kmalloc()
treewide: Use array_size() in kvzalloc_node()
treewide: Use array_size() in vzalloc_node()
treewide: Use array_size() in vzalloc()
treewide: Use array_size() in vmalloc()
treewide: devm_kzalloc() -> devm_kcalloc()
treewide: devm_kmalloc() -> devm_kmalloc_array()
treewide: kvzalloc() -> kvcalloc()
treewide: kvmalloc() -> kvmalloc_array()
treewide: kzalloc_node() -> kcalloc_node()
treewide: kzalloc() -> kcalloc()
treewide: kmalloc() -> kmalloc_array()
mm: Introduce kvcalloc()
video: uvesafb: Fix integer overflow in allocation
UBIFS: Fix potential integer overflow in allocation
leds: Use struct_size() in allocation
Convert intel uncore to struct_size
...