/*
 * Copyright (c) 2002, 2007 Red Hat, Inc. All rights reserved.
 *
 * This software may be freely redistributed under the terms of the
 * GNU General Public License.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 *
 * Authors: David Woodhouse <dwmw2@infradead.org>
 *          David Howells <dhowells@redhat.com>
 *
 */

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/circ_buf.h>
Detach sched.h from mm.h

First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 05:22:52 +08:00
#include <linux/sched.h>
#include "internal.h"

afs: Fix mmap coherency vs 3rd-party changes

Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver. This is done by the following means:

 (1) When we break a callback on a vnode specified by the CB.CallBack call
     from the server, we queue a work item (vnode->cb_work) to go and
     clobber all the PTEs mapping to that inode.

     This causes the CPU to trip through the ->map_pages() and
     ->page_mkwrite() handlers if userspace attempts to access the page(s)
     again.

     (Ideally, this would be done in the service handler for CB.CallBack,
     but the server is waiting for our reply before considering, and we
     have a list of vnodes, all of which need breaking - and the process of
     getting the mmap_lock and stripping the PTEs on all CPUs could be
     quite slow.)

 (2) Call afs_validate() from the ->map_pages() handler to check to see if
     the file has changed and to get a new callback promise from the
     server.

Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:

 (3) Maintain a per-cell list of afs files that are currently mmap'd
     (cell->fs_open_mmaps).

 (4) Add a work item to each server that is invoked if there are any open
     mmaps when CB.InitCallBackState happens. This work item goes through
     the aforementioned list and invokes the vnode->cb_work work item for
     each one that is currently using this server.

     This causes the PTEs to be cleared, causing ->map_pages() or
     ->page_mkwrite() to be called again, thereby calling afs_validate()
     again.

I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.

This was tested using the attached test program:

	#include <stdbool.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <fcntl.h>
	#include <sys/mman.h>
	int main(int argc, char *argv[])
	{
		size_t size = getpagesize();
		unsigned char *p;
		bool mod = (argc == 3);
		int fd;
		if (argc != 2 && argc != 3) {
			fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
			exit(2);
		}
		fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
		if (fd < 0) {
			perror(argv[1]);
			exit(1);
		}
		p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
			 MAP_SHARED, fd, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			exit(1);
		}
		for (;;) {
			if (mod) {
				p[0]++;
				msync(p, size, MS_ASYNC);
				fsync(fd);
			}
			printf("%02x", p[0]);
			fflush(stdout);
			sleep(1);
		}
	}

It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.

Two instances of this program can be run on different machines, one doing
the reading and one doing the writing. The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.

Testing the InitCallBackState change is more complicated. The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run. The
client machine then has to poke the server to trigger the InitCallBackState
call.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-02 23:43:10 +08:00

/*
 * Handle invalidation of an mmap'd file. We invalidate all the PTEs referring
 * to the pages in this file's pagecache, forcing the kernel to go through
 * ->fault() or ->page_mkwrite() - at which point we can handle invalidation
 * more fully.
 */
void afs_invalidate_mmap_work(struct work_struct *work)
{
	struct afs_vnode *vnode = container_of(work, struct afs_vnode, cb_work);
netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context
While randstruct was satisfied with using an open-coded "void *" offset
cast for the netfs_i_context <-> inode casting, __builtin_object_size() as
used by FORTIFY_SOURCE was not as easily fooled. This was causing the
following complaint[1] from gcc v12:
In file included from include/linux/string.h:253,
from include/linux/ceph/ceph_debug.h:7,
from fs/ceph/inode.c:2:
In function 'fortify_memset_chk',
inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2,
inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2:
include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
242 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by embedding a struct inode into struct netfs_i_context (which
should perhaps be renamed to struct netfs_inode). The struct inode
vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode
structs and vfs_inode is then simply changed to "netfs.inode" in those
filesystems.
Further, rename netfs_i_context to netfs_inode, get rid of the
netfs_inode() function that converted a netfs_i_context pointer to an
inode pointer (that can now be done with &ctx->inode) and rename the
netfs_i_context() function to netfs_inode() (which is now a wrapper
around container_of()).
Most of the changes were done with:
perl -p -i -e 's/vfs_inode/netfs.inode/'g \
`git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`
Kees suggested doing it with a pair structure[2] and a special
declarator to insert that into the network filesystem's inode
wrapper[3], but I think it's cleaner to embed it - and then it doesn't
matter if struct randomisation reorders things.
Dave Chinner suggested using a filesystem-specific VFS_I() function in
each filesystem to convert that filesystem's own inode wrapper struct
into the VFS inode struct[4].
Version #2:
- Fix a couple of missed name changes due to a disabled cifs option.
- Rename nfs_i_context to nfs_inode
- Use "netfs" instead of "nic" as the member name in per-fs inode wrapper
structs.
[ This also undoes commit 507160f46c55 ("netfs: gcc-12: temporarily
disable '-Wattribute-warning' for now") that is no longer needed ]
Fixes: bc899ee1c898 ("netfs: Add a netfs inode context")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
cc: Jonathan Corbet <corbet@lwn.net>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Steve French <smfrench@gmail.com>
cc: William Kucharski <william.kucharski@oracle.com>
cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
cc: Dave Chinner <david@fromorbit.com>
cc: linux-doc@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: samba-technical@lists.samba.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-hardening@vger.kernel.org
Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1]
Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3]
Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4]
Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-10 04:46:04 +08:00

	unmap_mapping_pages(vnode->netfs.inode.i_mapping, 0, 0, false);
}

void afs_server_init_callback_work(struct work_struct *work)
{
	struct afs_server *server = container_of(work, struct afs_server, initcb_work);
	struct afs_vnode *vnode;
	struct afs_cell *cell = server->cell;

	down_read(&cell->fs_open_mmaps_lock);

	list_for_each_entry(vnode, &cell->fs_open_mmaps, cb_mmap_link) {
		if (vnode->cb_server == server) {
			clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
			queue_work(system_unbound_wq, &vnode->cb_work);
		}
	}

	up_read(&cell->fs_open_mmaps_lock);
}

/*
 * Allow the fileserver to request callback state (re-)initialisation.
 * Unfortunately, UUIDs are not guaranteed unique.
 */
void afs_init_callback_state(struct afs_server *server)
{
	rcu_read_lock();
	do {
		server->cb_s_break++;
		atomic_inc(&server->cell->fs_s_break);
		if (!list_empty(&server->cell->fs_open_mmaps))
			queue_work(system_unbound_wq, &server->initcb_work);

	} while ((server = rcu_dereference(server->uuid_next)));
	rcu_read_unlock();
}

/*
 * actually break a callback
 */
void __afs_break_callback(struct afs_vnode *vnode, enum afs_cb_break_reason reason)
{
	_enter("");

	clear_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
	if (test_and_clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
		vnode->cb_break++;
		vnode->cb_v_break = vnode->volume->cb_v_break;
		afs_clear_permits(vnode);

		if (vnode->lock_state == AFS_VNODE_LOCK_WAITING_FOR_CB)
			afs_lock_may_be_available(vnode);

		if (reason != afs_cb_break_for_deleted &&
		    vnode->status.type == AFS_FTYPE_FILE &&
		    atomic_read(&vnode->cb_nr_mmap))
			queue_work(system_unbound_wq, &vnode->cb_work);

		trace_afs_cb_break(&vnode->fid, vnode->cb_break, reason, true);
	} else {
		trace_afs_cb_break(&vnode->fid, vnode->cb_break, reason, false);
	}
}

void afs_break_callback(struct afs_vnode *vnode, enum afs_cb_break_reason reason)
{
	write_seqlock(&vnode->cb_lock);
	__afs_break_callback(vnode, reason);
	write_sequnlock(&vnode->cb_lock);
}

/*
 * Look up a volume by volume ID under RCU conditions.
 */
static struct afs_volume *afs_lookup_volume_rcu(struct afs_cell *cell,
						afs_volid_t vid)
{
	struct afs_volume *volume = NULL;
	struct rb_node *p;
	int seq = 0;

	do {
		/* Unfortunately, rbtree walking doesn't give reliable results
		 * under just the RCU read lock, so we have to check for
		 * changes.
		 */
		read_seqbegin_or_lock(&cell->volume_lock, &seq);

		p = rcu_dereference_raw(cell->volumes.rb_node);
		while (p) {
			volume = rb_entry(p, struct afs_volume, cell_node);

			if (volume->vid < vid)
				p = rcu_dereference_raw(p->rb_left);
			else if (volume->vid > vid)
				p = rcu_dereference_raw(p->rb_right);
			else
				break;
			volume = NULL;
		}

	} while (need_seqretry(&cell->volume_lock, seq));

	done_seqretry(&cell->volume_lock, seq);
	return volume;
}

/*
 * allow the fileserver to explicitly break one callback
 * - happens when
 *   - the backing file is changed
 *   - a lock is released
 */
static void afs_break_one_callback(struct afs_volume *volume,
				   struct afs_fid *fid)
{
	struct super_block *sb;
	struct afs_vnode *vnode;
	struct inode *inode;

	if (fid->vnode == 0 && fid->unique == 0) {
		/* The callback break applies to an entire volume. */
		write_lock(&volume->cb_v_break_lock);
		volume->cb_v_break++;
		trace_afs_cb_break(fid, volume->cb_v_break,
				   afs_cb_break_for_volume_callback, false);
		write_unlock(&volume->cb_v_break_lock);
		return;
	}

	/* See if we can find a matching inode - even an I_NEW inode needs to
	 * be marked as it can have its callback broken before we finish
	 * setting up the local inode.
	 */
	sb = rcu_dereference(volume->sb);
	if (!sb)
		return;

	inode = find_inode_rcu(sb, fid->vnode, afs_ilookup5_test_by_fid, fid);
	if (inode) {
		vnode = AFS_FS_I(inode);
		afs_break_callback(vnode, afs_cb_break_for_callback);
	} else {
		trace_afs_cb_miss(fid, afs_cb_break_for_callback);
	}
}

static void afs_break_some_callbacks(struct afs_server *server,
				     struct afs_callback_break *cbb,
				     size_t *_count)
{
	struct afs_callback_break *residue = cbb;
	struct afs_volume *volume;
	afs_volid_t vid = cbb->fid.vid;
	size_t i;

	volume = afs_lookup_volume_rcu(server->cell, vid);

	/* TODO: Find all matching volumes if we couldn't match the server and
	 * break them anyway.
	 */

	for (i = *_count; i > 0; cbb++, i--) {
		if (cbb->fid.vid == vid) {
			_debug("- Fid { vl=%08llx n=%llu u=%u }",
			       cbb->fid.vid,
			       cbb->fid.vnode,
			       cbb->fid.unique);
			--*_count;
			if (volume)
				afs_break_one_callback(volume, &cbb->fid);
		} else {
			*residue++ = *cbb;
		}
	}
}

/*
 * allow the fileserver to break callback promises
 */
void afs_break_callbacks(struct afs_server *server, size_t count,
			 struct afs_callback_break *callbacks)
{
	_enter("%p,%zu,", server, count);

	ASSERT(server != NULL);

	rcu_read_lock();

	while (count > 0)
		afs_break_some_callbacks(server, callbacks, &count);

	rcu_read_unlock();
	return;
}