From d8e464ecc17b4444e9a3e148a9748c4828c6328c Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 17 Nov 2019 11:20:48 -0800 Subject: [PATCH] vfs: mark pipes and sockets as stream-like file descriptors In commit 3975b097e577 ("convert stream-like files -> stream_open, even if they use noop_llseek") Kirill used a coccinelle script to change "nonseekable_open()" to "stream_open()", which changed the trivial cases of stream-like file descriptors to the new model with FMODE_STREAM. However, the two big cases - sockets and pipes - don't actually have that trivial pattern at all, and were thus never converted to FMODE_STREAM even though it makes lots of sense to do so. That's particularly true when looking forward to the next change: getting rid of FMODE_ATOMIC_POS entirely, and just using FMODE_STREAM to decide whether f_pos updates are needed or not. And if they are, we'll always do them atomically. This came up because KCSAN (correctly) noted that the non-locked f_pos updates are data races: they are clearly benign for the case where we don't care, but it would be good to just not have that issue exist at all. Note that the reason we used FMODE_ATOMIC_POS originally is that only doing it for the minimal required case is "safer" in that it's possible that the f_pos locking can cause unnecessary serialization across the whole write() call. And in the worst case, that kind of serialization can cause deadlock issues: think writers that need readers to empty the state using the same file descriptor. [ Note that the locking is per-file descriptor - because it protects "f_pos", which is obviously per-file descriptor - so it only affects cases where you literally use the same file descriptor to both read and write. So a regular pipe that has separate reading and writing file descriptors doesn't really have this situation even though it's the obvious case of "reader empties what a bit writer concurrently fills" But we want to make pipes as being stream-line anyway, because we don't want the unnecessary overhead of locking, and because a named pipe can be (ab-)used by reading and writing to the same file descriptor. ] There are likely a lot of other cases that might want FMODE_STREAM, and looking for ".llseek = no_llseek" users and other cases that don't have an lseek file operation at all and making them use "stream_open()" might be a good idea. But pipes and sockets are likely to be the two main cases. Cc: Kirill Smelkov Cc: Eic Dumazet Cc: Al Viro Cc: Alan Stern Cc: Marco Elver Cc: Andrea Parri Cc: Paul McKenney Signed-off-by: Linus Torvalds --- fs/pipe.c | 6 ++++-- net/socket.c | 1 + 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/pipe.c b/fs/pipe.c index 8a2ab2f974bd..a9149199e0e7 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -793,6 +793,8 @@ int create_pipe_files(struct file **res, int flags) } res[0]->private_data = inode->i_pipe; res[1] = f; + stream_open(inode, res[0]); + stream_open(inode, res[1]); return 0; } @@ -931,9 +933,9 @@ static int fifo_open(struct inode *inode, struct file *filp) __pipe_lock(pipe); /* We can only do regular read/write on fifos */ - filp->f_mode &= (FMODE_READ | FMODE_WRITE); + stream_open(inode, filp); - switch (filp->f_mode) { + switch (filp->f_mode & (FMODE_READ | FMODE_WRITE)) { case FMODE_READ: /* * O_RDONLY diff --git a/net/socket.c b/net/socket.c index 6a9ab7a8b1d2..3c6d60eadf7a 100644 --- a/net/socket.c +++ b/net/socket.c @@ -404,6 +404,7 @@ struct file *sock_alloc_file(struct socket *sock, int flags, const char *dname) sock->file = file; file->private_data = sock; + stream_open(SOCK_INODE(sock), file); return file; } EXPORT_SYMBOL(sock_alloc_file);