Commit Graph

873432 Commits

Author SHA1 Message Date
Ronnie Sahlberg 32546a9586 cifs: move cifsFileInfo_put logic into a work-queue
This patch moves the final part of the cifsFileInfo_put() logic where we
need a write lock on lock_sem to be processed in a separate thread that
holds no other locks.
This is to prevent deadlocks like the one below:

> there are 6 processes looping to while trying to down_write
> cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
> cifs_new_fileinfo
>
> and there are 5 other processes which are blocked, several of them
> waiting on either PG_writeback or PG_locked (which are both set), all
> for the same page of the file
>
> 2 inode_lock() (inode->i_rwsem) for the file
> 1 wait_on_page_writeback() for the page
> 1 down_read(inode->i_rwsem) for the inode of the directory
> 1 inode_lock()(inode->i_rwsem) for the inode of the directory
> 1 __lock_page
>
>
> so processes are blocked waiting on:
>   page flags PG_locked and PG_writeback for one specific page
>   inode->i_rwsem for the directory
>   inode->i_rwsem for the file
>   cifsInodeInflock_sem
>
>
>
> here are the more gory details (let me know if I need to provide
> anything more/better):
>
> [0 00:48:22.765] [UN]  PID: 8863   TASK: ffff8c691547c5c0  CPU: 3
> COMMAND: "reopen_file"
>  #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
>  #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
>  #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
>  #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
>  #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
>  #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
>  #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
> * (I think legitimize_path is bogus)
>
> in path_openat
>         } else {
>                 const char *s = path_init(nd, flags);
>                 while (!(error = link_path_walk(s, nd)) &&
>                         (error = do_last(nd, file, op)) > 0) {  <<<<
>
> do_last:
>         if (open_flag & O_CREAT)
>                 inode_lock(dir->d_inode);  <<<<
>         else
> so it's trying to take inode->i_rwsem for the directory
>
>      DENTRY           INODE           SUPERBLK     TYPE PATH
> ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR  /mnt/vm1_smb/
> inode.i_rwsem is ffff8c691158efc0
>
> <struct rw_semaphore 0xffff8c691158efc0>:
>         owner: <struct task_struct 0xffff8c6914275d00> (UN -   8856 -
> reopen_file), counter: 0x0000000000000003
>         waitlist: 2
>         0xffff9965007e3c90     8863   reopen_file      UN 0  1:29:22.926
>   RWSEM_WAITING_FOR_WRITE
>         0xffff996500393e00     9802   ls               UN 0  1:17:26.700
>   RWSEM_WAITING_FOR_READ
>
>
> the owner of the inode.i_rwsem of the directory is:
>
> [0 00:00:00.109] [UN]  PID: 8856   TASK: ffff8c6914275d00  CPU: 3
> COMMAND: "reopen_file"
>  #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
>  #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
>  #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff99650065b940] msleep at ffffffff9af573a9
>  #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
>  #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
>  #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
>  #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
>  #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
>  #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
> #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
> #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
> #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
> #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
> #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
>
> cifs_launder_page is for page 0xffffd1e2c07d2480
>
> crash> page.index,mapping,flags 0xffffd1e2c07d2480
>       index = 0x8
>       mapping = 0xffff8c68f3cd0db0
>   flags = 0xfffffc0008095
>
>   PAGE-FLAG       BIT  VALUE
>   PG_locked         0  0000001
>   PG_uptodate       2  0000004
>   PG_lru            4  0000010
>   PG_waiters        7  0000080
>   PG_writeback     15  0008000
>
>
> inode is ffff8c68f3cd0c40
> inode.i_rwsem is ffff8c68f3cd0ce0
>      DENTRY           INODE           SUPERBLK     TYPE PATH
> ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
> /mnt/vm1_smb/testfile.8853
>
>
> this process holds the inode->i_rwsem for the parent directory, is
> laundering a page attached to the inode of the file it's opening, and in
> _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem
> for the file itself.
>
>
> <struct rw_semaphore 0xffff8c68f3cd0ce0>:
>         owner: <struct task_struct 0xffff8c6914272e80> (UN -   8854 -
> reopen_file), counter: 0x0000000000000003
>         waitlist: 1
>         0xffff9965005dfd80     8855   reopen_file      UN 0  1:29:22.912
>   RWSEM_WAITING_FOR_WRITE
>
> this is the inode.i_rwsem for the file
>
> the owner:
>
> [0 00:48:22.739] [UN]  PID: 8854   TASK: ffff8c6914272e80  CPU: 2
> COMMAND: "reopen_file"
>  #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
>  #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
>  #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
>  #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
>  #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
>  #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
>  #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
>  #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
>  #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
>  #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
> #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
> #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
> #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
> #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
>
> the process holds the inode->i_rwsem for the file to which it's writing,
> and is trying to __lock_page for the same page as in the other processes
>
>
> the other tasks:
> [0 00:00:00.028] [UN]  PID: 8859   TASK: ffff8c6915479740  CPU: 2
> COMMAND: "reopen_file"
>  #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
>  #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
>  #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
>  #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
>  #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
>  #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
>  #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
>  #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
> #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
>
> this is opening the file, and is trying to down_write cinode->lock_sem
>
>
> [0 00:00:00.041] [UN]  PID: 8860   TASK: ffff8c691547ae80  CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.057] [UN]  PID: 8861   TASK: ffff8c6915478000  CPU: 3
> COMMAND: "reopen_file"
> [0 00:00:00.059] [UN]  PID: 8858   TASK: ffff8c6914271740  CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.109] [UN]  PID: 8862   TASK: ffff8c691547dd00  CPU: 6
> COMMAND: "reopen_file"
>  #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
>  #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
>  #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
>  #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
>  #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
>  #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
>  #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
>  #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
>
> closing the file, and trying to down_write cifsi->lock_sem
>
>
> [0 00:48:22.839] [UN]  PID: 8857   TASK: ffff8c6914270000  CPU: 7
> COMMAND: "reopen_file"
>  #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
>  #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
>  #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
>  #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
>  #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
>  #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
>  #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
>  #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
>  #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
>
> in __filemap_fdatawait_range
>                         wait_on_page_writeback(page);
> for the same page of the file
>
>
>
> [0 00:48:22.718] [UN]  PID: 8855   TASK: ffff8c69142745c0  CPU: 7
> COMMAND: "reopen_file"
>  #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
>  #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
>  #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
>  #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
>  #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
>  #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
>  #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
>  #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
>
>         inode_lock(inode);
>
>
> and one 'ls' later on, to see whether the rest of the mount is available
> (the test file is in the root, so we get blocked up on the directory
> ->i_rwsem), so the entire mount is unavailable
>
> [0 00:36:26.473] [UN]  PID: 9802   TASK: ffff8c691436ae80  CPU: 4
> COMMAND: "ls"
>  #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
>  #1 [ffff996500393db8] schedule at ffffffff9b6e64df
>  #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
>  #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
>  #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
>  #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
>  #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
>  #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
>
> in iterate_dir:
>         if (shared)
>                 res = down_read_killable(&inode->i_rwsem);  <<<<
>         else
>                 res = down_write_killable(&inode->i_rwsem);
>

Reported-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:17:12 -06:00
Aurelien Aptel d70e9fa558 cifs: try opening channels after mounting
After doing mount() successfully we call cifs_try_adding_channels()
which will open as many channels as it can.

Channels are closed when the master session is closed.

The master connection becomes the first channel.

,-------------> global cifs_tcp_ses_list <-------------------------.
|                                                                  |
'- TCP_Server_Info  <-->  TCP_Server_Info  <-->  TCP_Server_Info <-'
      (master con)           (chan#1 con)         (chan#2 con)
      |      ^                    ^                    ^
      v      '--------------------|--------------------'
   cifs_ses                       |
   - chan_count = 3               |
   - chans[] ---------------------'
   - smb3signingkey[]
      (master signing key)

Note how channel connections don't have sessions. That's because
cifs_ses can only be part of one linked list (list_head are internal
to the elements).

For signing keys, each channel has its own signing key which must be
used only after the channel has been bound. While it's binding it must
use the master session signing key.

For encryption keys, since channel connections do not have sessions
attached we must now find matching session by looping over all sessions
in smb2_get_enc_key().

Each channel is opened like a regular server connection but at the
session setup request step it must set the
SMB2_SESSION_REQ_FLAG_BINDING flag and use the session id to bind to.

Finally, while sending in compound_send_recv() for requests that
aren't negprot, ses-setup or binding related, use a channel by cycling
through the available ones (round-robin).

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Aurelien Aptel b8f7442bc4 CIFS: refactor cifs_get_inode_info()
Make logic of cifs_get_inode() much clearer by moving code to sub
functions and adding comments.

Document the steps this function does.

cifs_get_inode_info() gets and updates a file inode metadata from its
file path.

* If caller already has raw info data from server they can pass it.
* If inode already exists (just need to update) caller can pass it.

Step 1: get raw data from server if none was passed
Step 2: parse raw data into intermediate internal cifs_fattr struct
Step 3: set fattr uniqueid which is later used for inode number. This
        can sometime be done from raw data
Step 4: tweak fattr according to mount options (file_mode, acl to mode
        bits, uid, gid, etc)
Step 5: update or create inode from final fattr struct

* add is_smb1_server() helper
* add is_inode_cache_good() helper
* move SMB1-backupcreds-getinfo-retry to separate func
  cifs_backup_query_path_info().
* move set-uniqueid code to separate func cifs_set_fattr_ino()
* don't clobber uniqueid from backup cred retry
* fix some probable corner cases memleaks

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Aurelien Aptel f6a6bf7c4d cifs: switch servers depending on binding state
Currently a lot of the code to initialize a connection & session uses
the cifs_ses as input. But depending on if we are opening a new session
or a new channel we need to use different server pointers.

Add a "binding" flag in cifs_ses and a helper function that returns
the server ptr a session should use (only in the sess establishment
code path).

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Aurelien Aptel f780bd3fef cifs: add server param
As we get down to the transport layer, plenty of functions are passed
the session pointer and assume the transport to use is ses->server.

Instead we modify those functions to pass (ses, server) so that we
can decouple the session from the server.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Aurelien Aptel bcc8880115 cifs: add multichannel mount options and data structs
adds:
- [no]multichannel to enable/disable multichannel
- max_channels=N to control how many channels to create

these options are then stored in the volume struct.

- store channels and max_channels in cifs_ses

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Aurelien Aptel 35adffed07 cifs: sort interface list by speed
New channels are going to be opened by walking the list sequentially,
so by sorting it we will connect to the fastest interfaces first.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Pavel Shilovsky fa9c236249 CIFS: Fix SMB2 oplock break processing
Even when mounting modern protocol version the server may be
configured without supporting SMB2.1 leases and the client
uses SMB2 oplock to optimize IO performance through local caching.

However there is a problem in oplock break handling that leads
to missing a break notification on the client who has a file
opened. It latter causes big latencies to other clients that
are trying to open the same file.

The problem reproduces when there are multiple shares from the
same server mounted on the client. The processing code tries to
match persistent and volatile file ids from the break notification
with an open file but it skips all share besides the first one.
Fix this by looking up in all shares belonging to the server that
issued the oplock break.

Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Ronnie Sahlberg 3591bb83ee cifs: don't use 'pre:' for MODULE_SOFTDEP
It can cause
to fail with
modprobe: FATAL: Module <module> is builtin.

RHBZ: 1767094

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Long Li 4357d45f50 cifs: smbd: Return -EAGAIN when transport is reconnecting
During reconnecting, the transport may have already been destroyed and is in
the process being reconnected. In this case, return -EAGAIN to not fail and
to retry this I/O.

Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Long Li c21ce58eab cifs: smbd: Only queue work for error recovery on memory registration
It's not necessary to queue invalidated memory registration to work queue, as
all we need to do is to unmap the SG and make it usable again. This can save
CPU cycles in normal data paths as memory registration errors are rare and
normally only happens during reconnection.

Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Ronnie Sahlberg 87bc2376ff smb3: add debug messages for closing unmatched open
Helps distinguish between an interrupted close and a truly
unmatched open.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Pavel Shilovsky 7b71843fa7 CIFS: Do not miss cancelled OPEN responses
When an OPEN command is cancelled we mark a mid as
cancelled and let the demultiplex thread process it
by closing an open handle. The problem is there is
a race between a system call thread and the demultiplex
thread and there may be a situation when the mid has
been already processed before it is set as cancelled.

Fix this by processing cancelled requests when mids
are being destroyed which means that there is only
one thread referencing a particular mid. Also set
mids as cancelled unconditionally on their state.

Cc: Stable <stable@vger.kernel.org>
Tested-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Pavel Shilovsky 86a7964be7 CIFS: Fix NULL pointer dereference in mid callback
There is a race between a system call processing thread
and the demultiplex thread when mid->resp_buf becomes NULL
and later is being accessed to get credits. It happens when
the 1st thread wakes up before a mid callback is called in
the 2nd one but the mid state has already been set to
MID_RESPONSE_RECEIVED. This causes NULL pointer dereference
in mid callback.

Fix this by saving credits from the response before we
update the mid state and then use this value in the mid
callback rather then accessing a response buffer.

Cc: Stable <stable@vger.kernel.org>
Fixes: ee258d7915 ("CIFS: Move credit processing to mid callbacks for SMB3")
Tested-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Pavel Shilovsky 9150c3adbf CIFS: Close open handle after interrupted close
If Close command is interrupted before sending a request
to the server the client ends up leaking an open file
handle. This wastes server resources and can potentially
block applications that try to remove the file or any
directory containing this file.

Fix this by putting the close command into a worker queue,
so another thread retries it later.

Cc: Stable <stable@vger.kernel.org>
Tested-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Pavel Shilovsky 44805b0e62 CIFS: Respect O_SYNC and O_DIRECT flags during reconnect
Currently the client translates O_SYNC and O_DIRECT flags
into corresponding SMB create options when openning a file.
The problem is that on reconnect when the file is being
re-opened the client doesn't set those flags and it causes
a server to reject re-open requests because create options
don't match. The latter means that any subsequent system
call against that open file fail until a share is re-mounted.

Fix this by properly setting SMB create options when
re-openning files after reconnects.

Fixes: 1013e760d10e6: ("SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags")
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Steve French 037d050724 smb3: remove confusing dmesg when mounting with encryption ("seal")
The smb2/smb3 message checking code was logging to dmesg when mounting
with encryption ("seal") for compounded SMB3 requests.  When encrypted
the whole frame (including potentially multiple compounds) is read
so the length field is longer than in the case of non-encrypted
case (where length field will match the the calculated length for
the particular SMB3 request in the compound being validated).

Avoids the warning on mount (with "seal"):

   "srv rsp padded more than expected. Length 384 not ..."

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Ronnie Sahlberg 72e73c78c4 cifs: close the shared root handle on tree disconnect
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Markus Elfring 598b6c57f2 CIFS: Return directly after a failed build_path_from_dentry() in cifs_do_create()
Return directly after a call of the function "build_path_from_dentry"
failed at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Markus Elfring 2b1116bbe8 CIFS: Use common error handling code in smb2_ioctl_query_info()
Move the same error code assignments so that such exception handling
can be better reused at the end of this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Markus Elfring cfaa118109 CIFS: Use memdup_user() rather than duplicating its implementation
Reuse existing functionality from memdup_user() instead of keeping
duplicate source code.

Generated by: scripts/coccinelle/api/memdup_user.cocci

Fixes: f5b05d622a ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:53 -06:00
Long Li acd4680e2b cifs: smbd: Return -ECONNABORTED when trasnport is not in connected state
The transport should return this error so the upper layer will reconnect.

Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:15 -06:00
Long Li d63cdbae60 cifs: smbd: Add messages on RDMA session destroy and reconnection
Log these activities to help production support.

Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:15 -06:00
Long Li 37941ea17d cifs: smbd: Return -EINVAL when the number of iovs exceeds SMBDIRECT_MAX_SGE
While it's not friendly to fail user processes that issue more iovs
than we support, at least we should return the correct error code so the
user process gets a chance to retry with smaller number of iovs.

Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:15 -06:00
Long Li b7a55bbd6d cifs: smbd: Invalidate and deregister memory registration on re-send for direct I/O
On re-send, there might be a reconnect and all prevoius memory registrations
need to be invalidated and deregistered.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:14 -06:00
Long Li 14cc639c17 cifs: Don't display RDMA transport on reconnect
On reconnect, the transport data structure is NULL and its information is not
available.

Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:14 -06:00
YueHaibing f28a2e5ebc CIFS: remove set but not used variables 'cinode' and 'netfid'
Fixes gcc '-Wunused-but-set-variable' warning:

fs/cifs/file.c: In function 'cifs_flock':
fs/cifs/file.c:1704:8: warning:
 variable 'netfid' set but not used [-Wunused-but-set-variable]

fs/cifs/file.c:1702:24: warning:
 variable 'cinode' set but not used [-Wunused-but-set-variable]

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:14 -06:00
Steve French d0677992d2 cifs: add support for flock
The flock system call locks the whole file rather than a byte
range and so is currently emulated by various other file systems
by simply sending a byte range lock for the whole file.
Add flock handling for cifs.ko in similar way.

xfstest generic/504 passes with this as well

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-11-25 01:14:14 -06:00
YueHaibing be1bf978e5 cifs: remove unused variable 'sid_user'
fs/cifs/cifsacl.c:43:30: warning:
 sid_user defined but not used [-Wunused-const-variable=]

It is never used, so remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:14 -06:00
Dan Carpenter 8bd3754cff cifs: rename a variable in SendReceive()
Smatch gets confused because we sometimes refer to "server->srv_mutex" and
sometimes to "sess->server->srv_mutex".  They refer to the same lock so
let's just make this consistent.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:14:14 -06:00
Linus Torvalds 219d54332a Linux 5.4 2019-11-24 16:32:01 -08:00
Linus Torvalds b8387f6f34 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull cramfs fix from Al Viro:
 "Regression fix, fallen through the cracks"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  cramfs: fix usage on non-MTD device
2019-11-24 12:36:39 -08:00
Maxime Bizon 3e5aeec0e2 cramfs: fix usage on non-MTD device
When both CONFIG_CRAMFS_MTD and CONFIG_CRAMFS_BLOCKDEV are enabled, if
we fail to mount on MTD, we don't try on block device.

Note: this relies upon cramfs_mtd_fill_super() leaving no side
effects on fc state in case of failure; in general, failing
get_tree_...() does *not* mean "fine to try again"; e.g. parsed
options might've been consumed by fill_super callback and freed
on failure.

Fixes: 74f78fc5ef ("vfs: Convert cramfs to use the new mount API")

Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-11-23 21:44:49 -05:00
Linus Torvalds 6b8a794678 virtio: last minute bugfixes
Minor bugfixes all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl3U6E4PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpi/oIAJwjsmGhheWTRa+vL/otHu1PjOQ4hziYwu/A
 n80gp3pmPMa4yV5CDZt72qL4jH8llXPpH4gErvXvSyKfrvmelkJBILB53mAa04pq
 BZWhkbcqldhMo35y3Ac+WcRk9zlkGq0NPuUcCb959h+pRXuZWtKWxQ2miwi413e1
 1ecBRO64SvXSphflFfMMCB730aIXC/dsZAtBXqs8v+i4cIz9g8Z4fZS6c38nUA23
 wBleNWylaxHn6UDRZAGRY12JFxFe3QxCP0RKTjwo0eF3HG7IZIGQKHnztjlqD4ko
 ZOQmxPYW2+96XnS8n7y+gA517u10OIVZBocD9FX10c3wvS+Lpxg=
 =KAuZ
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull last minute virtio bugfixes from Michael Tsirkin:
 "Minor bugfixes all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_balloon: fix shrinker count
  virtio_balloon: fix shrinker scan number of pages
  virtio_console: allocate inbufs in add_port() only if it is needed
  virtio_ring: fix return code on DMA mapping fails
2019-11-23 13:02:18 -08:00
Linus Torvalds 2027cabe6a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fix from Dmitry Torokhov:
 "Just a single revert as RMI mode should not have been enabled for this
  model [yet?]"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"
2019-11-22 16:57:26 -08:00
Lyude Paul 8791663435 Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"
This reverts commit 68b9c5066e39af41d3448abfc887c77ce22dd64d.

Ugh, I really dropped the ball on this one :\. So as it turns out RMI4
works perfectly fine on the X1 Extreme Gen 2 except for one thing I
didn't notice because I usually use the trackpoint: clicking with the
touchpad. Somehow this is broken, in fact we don't even seem to indicate
BTN_LEFT as a valid event type for the RMI4 touchpad. And, I don't even
see any RMI4 events coming from the touchpad when I press down on it.
This only seems to work for PS/2 mode.

Since that means we have a regression, and PS/2 mode seems to work fine
for the time being - revert this for now. We'll have to do a more
thorough investigation on this.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://lore.kernel.org/r/20191119234534.10725-1-lyude@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-11-22 15:19:03 -08:00
Linus Torvalds 34c36f4564 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Validate tunnel options length in act_tunnel_key, from Xin Long.

 2) Fix DMA sync bug in gve driver, from Adi Suresh.

 3) TSO kills performance on some r8169 chips due to HW issues, disable
    by default in that case, from Corinna Vinschen.

 4) Fix clock disable mismatch in fec driver, from Chubong Yuan.

 5) Fix interrupt status bits define in hns3 driver, from Huazhong Tan.

 6) Fix workqueue deadlocks in qeth driver, from Julian Wiedmann.

 7) Don't napi_disable() twice in r8152 driver, from Hayes Wang.

 8) Fix SKB extension memory leak, from Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
  r8152: avoid to call napi_disable twice
  MAINTAINERS: Add myself as maintainer of virtio-vsock
  udp: drop skb extensions before marking skb stateless
  net: rtnetlink: prevent underflows in do_setvfinfo()
  can: m_can_platform: remove unnecessary m_can_class_resume() call
  can: m_can_platform: set net_device structure as driver data
  hv_netvsc: Fix send_table offset in case of a host bug
  hv_netvsc: Fix offset usage in netvsc_send_table()
  net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN
  sfc: Only cancel the PPS workqueue if it exists
  nfc: port100: handle command failure cleanly
  net-sysfs: fix netdev_queue_add_kobject() breakage
  r8152: Re-order napi_disable in rtl8152_close
  net: qca_spi: Move reset_count to struct qcaspi
  net: qca_spi: fix receive buffer size check
  net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode
  Revert "net/ibmvnic: Fix EOI when running in XIVE mode"
  net/mlxfw: Verify FSM error code translation doesn't exceed array size
  net/mlx5: Update the list of the PCI supported devices
  net/mlx5: Fix auto group size calculation
  ...
2019-11-22 14:28:14 -08:00
Marc Dionne b485275f1a afs: Fix large file support
By default s_maxbytes is set to MAX_NON_LFS, which limits the usable
file size to 2GB, enforced by the vfs.

Commit b9b1f8d593 ("AFS: write support fixes") added support for the
64-bit fetch and store server operations, but did not change this value.
As a result, attempts to write past the 2G mark result in EFBIG errors:

 $ dd if=/dev/zero of=foo bs=1M count=1 seek=2048
 dd: error writing 'foo': File too large

Set s_maxbytes to MAX_LFS_FILESIZE.

Fixes: b9b1f8d593 ("AFS: write support fixes")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22 14:19:26 -08:00
Marc Dionne cd340703c2 afs: Fix possible assert with callbacks from yfs servers
Servers sending callback breaks to the YFS_CM_SERVICE service may
send up to YFSCBMAX (1024) fids in a single RPC.  Anything over
AFSCBMAX (50) will cause the assert in afs_break_callbacks to trigger.

Remove the assert, as the count has already been checked against
the appropriate max values in afs_deliver_cb_callback and
afs_deliver_yfs_cb_callback.

Fixes: 35dbfba311 ("afs: Implement the YFS cache manager service")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22 14:19:26 -08:00
Hayes Wang 5b1d9c17a3 r8152: avoid to call napi_disable twice
Call napi_disable() twice would cause dead lock. There are three situations
may result in the issue.

1. rtl8152_pre_reset() and set_carrier() are run at the same time.
2. Call rtl8152_set_tunable() after rtl8152_close().
3. Call rtl8152_set_ringparam() after rtl8152_close().

For #1, use the same solution as commit 8481141246 ("r8152: Re-order
napi_disable in rtl8152_close"). For #2 and #3, add checking the flag
of IFF_UP and using napi_disable/napi_enable during mutex.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 10:07:44 -08:00
Linus Torvalds cc079039c9 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "Three fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
  mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
  Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
2019-11-22 09:49:08 -08:00
David S. Miller 068299374c linux-can-fixes-for-5.4-20191122
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEmvEkXzgOfc881GuFWsYho5HknSAFAl3X9E0THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRBaxiGjkeSdIIeQB/9EctdOAnM39+4vy22n9veOj/qEWW1S
 VbIgG2WG3eHRPhCVoA7d61QS3/Mdeme0HU6icUXG3AYomCEEDu/Rc0kaFL8G3Zj5
 mfL7ypIrjwjC2aEvaRzWVk4+0RW2xAchWONLycaC3ubolmr2q04ETDoY/dmWgByG
 UpkhcCuy95kxD5e10do0g8UsP8xJwTEgRD91uPk2/M/f90LTJ1EVCh4nsntoYUDj
 PmhavZNoiRBQTjajaHzE6jBQp6kJFvImcbMpn1mFQOuqSnk/b0hWSkgtP40Mp6xD
 1j9M2JuttE5ai0eWh8xk0m3/mbCuCeTvJyW4F+c2OQ/aPzl1FqNkrTqh
 =9oQw
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-5.4-20191122' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2019-11-22

this is a pull request of 2 patches for net/master, if possible for the
current release cycle. Otherwise these patches should hit v5.4 via the
stable tree.

Both patches of this pull request target the m_can driver. Pankaj Sharma
fixes the fallout in the m_can_platform part, which appeared with the
introduction of the m_can platform framework.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 09:42:11 -08:00
Stefano Garzarella efabb6c688 MAINTAINERS: Add myself as maintainer of virtio-vsock
Since I'm actively working on vsock and virtio/vhost transports,
Stefan suggested to help him to maintain it.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 09:38:52 -08:00
Florian Westphal 677bf08cfd udp: drop skb extensions before marking skb stateless
Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free
assumes all skb head state has been dropped already.

This will leak the extension memory in case the skb has extensions other
than the ipsec secpath, e.g. bridge nf data.

To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have
extensions or if the extension space can be free'd.

Fixes: 895b5c9f20 ("netfilter: drop bridge nf reset from nf_reset")
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: Byron Stanoszek <gandalf@winds.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 09:28:46 -08:00
Dan Carpenter ff08ddba3a net: rtnetlink: prevent underflows in do_setvfinfo()
The "ivm->vf" variable is a u32, but the problem is that a number of
drivers cast it to an int and then forget to check for negatives.  An
example of this is in the cxgb4 driver.

drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
  2890  static int cxgb4_mgmt_get_vf_config(struct net_device *dev,
  2891                                      int vf, struct ifla_vf_info *ivi)
                                            ^^^^^^
  2892  {
  2893          struct port_info *pi = netdev_priv(dev);
  2894          struct adapter *adap = pi->adapter;
  2895          struct vf_info *vfinfo;
  2896
  2897          if (vf >= adap->num_vfs)
                    ^^^^^^^^^^^^^^^^^^^
  2898                  return -EINVAL;
  2899          vfinfo = &adap->vfinfo[vf];
                ^^^^^^^^^^^^^^^^^^^^^^^^^^

There are 48 functions affected.

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8435 hclge_set_vf_vlan_filter() warn: can 'vfid' underflow 's32min-2147483646'
drivers/net/ethernet/freescale/enetc/enetc_pf.c:377 enetc_pf_set_vf_mac() warn: can 'vf' underflow 's32min-2147483646'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2899 cxgb4_mgmt_get_vf_config() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2960 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3019 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3038 cxgb4_mgmt_set_vf_vlan() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3086 cxgb4_mgmt_set_vf_link_state() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb/cxgb2.c:791 get_eeprom() warn: can 'i' underflow 's32min-(-4),0,4-s32max'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:82 bnxt_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:164 bnxt_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:186 bnxt_get_vf_config() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:228 bnxt_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:264 bnxt_set_vf_vlan() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:293 bnxt_set_vf_bw() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:333 bnxt_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2281 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2285 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2286 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2292 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2297 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1832 qlcnic_sriov_set_vf_mac() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1864 qlcnic_sriov_set_vf_tx_rate() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1937 qlcnic_sriov_set_vf_vlan() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2005 qlcnic_sriov_get_vf_config() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2036 qlcnic_sriov_set_vf_spoofchk() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/emulex/benet/be_main.c:1914 be_get_vf_config() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:1915 be_get_vf_config() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:1922 be_set_vf_tvt() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:1951 be_clear_vf_tvt() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:2063 be_set_vf_tx_rate() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:2091 be_set_vf_link_state() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:2609 ice_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3050 ice_get_vf_cfg() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3103 ice_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3181 ice_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3237 ice_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3286 ice_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3919 i40e_validate_vf() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3957 i40e_ndo_set_vf_mac() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4104 i40e_ndo_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4263 i40e_ndo_set_vf_bw() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4309 i40e_ndo_get_vf_config() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4371 i40e_ndo_set_vf_link_state() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4504 i40e_ndo_set_vf_trust() warn: can 'vf_id' underflow 's32min-2147483646'

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 09:28:46 -08:00
Linus Torvalds a6b0373ffc Power management regression fix for final 5.4
Fix problems with switching cpufreq drivers on some x86 systems with
 ACPI (and with changing the operation modes of the intel_pstate driver
 on those systems) introduced by recent changes related to the
 management of frequency limits in cpufreq.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl3X1LQSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxAgMQAKFu4e3IZ7vVMo3gu61r9amqVG03prMh
 Ca8lvMd0pxp1wtL6gSkAB0Kces1hwooz4bpA/Khm7igBfdilcNu9gv/d9EvGRCU1
 /iOWHrsYS4gQnWwjJ001x+PokQQEFv+uGw9a6vwlEVU4a6TImlJKF+99Beogzr7U
 tuTzEpim2ruMHFaP5EqjC/XcU+uzyW2rMnc7fH0t55W0saIT2trI6Nj+DJgZWPth
 eRbgS4aOy3Bt9foBCVOFtkXz5VvHS3EUMPI7DxOx3OTt8D469NBDYx/ui8y7frIF
 t3vME/UuWCJmEUJ7o0CqhVcB4TW5gZaIE7P2TO9pSCxPxKtPbROdfrR3snGPLjNl
 syD7yqwuYpDbfi+0v3Mi3fnUqnnsiXPg2mFCl/w0tgIKz1iAv20pOWnCqja4QreG
 Z+Ax1owzrEYlDaV5lDXYZd48dqItj4dMuUBSwXu8RBk+LE/Bjro0jnnK4zUD/M1m
 sukrvgFrIXpTgqSnm8sX4fMyZnEvmNtG2kAYl6F84k2EICbOLkXD9ypBANb/9C3h
 KcSzJFr76de18s5bnMBgoQRvfElbOotr63TyzeRo/6mF0G4+BpyAEsP7FzRSq+6u
 cd+lrCvItzwIgD/5XU1WpkoOoYM//c0Y9JUHKwtnd+FSp1psGJwlhgbbckazQ699
 A5DC7maAkkHb
 =S23p
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management regression fix from Rafael Wysocki:
 "Fix problems with switching cpufreq drivers on some x86 systems with
  ACPI (and with changing the operation modes of the intel_pstate driver
  on those systems) introduced by recent changes related to the
  management of frequency limits in cpufreq"

* tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: QoS: Invalidate frequency QoS requests after removal
2019-11-22 09:18:16 -08:00
Linus Torvalds 5d867ab037 drm fixes for 5.4
amdgpu:
 - Remove experimental flag for navi14
 - Fix confusing power message failures on older VI parts
 - Hang fix for gfxoff when using the read register interface
 - Two stability regression fixes for Raven
 
 i915:
 - Fix kernel oops on dumb_create ioctl on no crtc situation
 - Fix bad ugly colored flash on VLV/CHV related to gamma LUT update
 - Fix unity of the frequencies reported on PMU
 - Fix kernel oops on set_page_dirty using better locks around it
 - Protect the request pointer with RCU to prevent it being freed while we might need still
 - Make pool objects read-only
 - Restore physical addresses for fb_map to avoid corrupted page table
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJd1y5MAAoJEAx081l5xIa+HMUP/3G2QKUMN6hqno7EZQjJPhWs
 UFaVHhwGtWdwyntwUtSxkZEvj1+1InSw2x/vhBdBghFZ4j5yO63v1pe5EkvAZ4Ff
 Yg2z0vhl0tqQ7w4upe5C8TYztBvGqInMdbt76fIAB8jXM10lpPDZ98eVmVTotvCE
 6KHzP7/xW4l53SRnptMM+pHrRlaoc4NpLkTPexpJ7woZv07ktG5mGw1/+kEuHi5q
 qrg9PKOg3z3Q77yq4dxcH3ET+IL9I34A5WktTXCd1xIowdfzkC7rpr5DVHaVjUqf
 Io0Z+ca54enUtfLs5S0pbgmLll9SQ8GMSP1BuwLmZx5Xxqm0TM4paQ0lZV2mVa7X
 izWYni+eIIINMqeIfo8xtegZ2YUAz2unx+hy4HK5VGgnTalqidsgfQ+sFVScPsva
 3DvgA6Hrz4AiYCen1s01DMuJLi/O7BmxwLwrKgpS3XWYVZKpmLKZqdZpkwLBQC7T
 OvluGCLyFsaZj/FAht6l4AqotcWURHd/ujOmK5AxEEelTT1fMAy1KjyiwouWTEIw
 7HCRd2yGkqUt3ShKABwTEh0+m4kub+Id3CaXCcu2Ce+bmSoseq564XPXrRzK7eUX
 KuuTuj9opXvhHLBsVVGkHs72xWNgc7MoVORWxfDmSGRrD5a93tbltlNmuy+PCAKQ
 FY3YQmvWYcLenZAgJ4VX
 =0Z0x
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Two sets of fixes in here, one for amdgpu, and one for i915.

  The amdgpu ones are pretty small, i915's CI system seems to have a few
  problems in the last week or so, there is one major regression fix for
  fb_mmap, but there are a bunch of other issues fixed in there as well,
  oops, screen flashes and rcu related.

  amdgpu:
   - Remove experimental flag for navi14
   - Fix confusing power message failures on older VI parts
   - Hang fix for gfxoff when using the read register interface
   - Two stability regression fixes for Raven

  i915:
   - Fix kernel oops on dumb_create ioctl on no crtc situation
   - Fix bad ugly colored flash on VLV/CHV related to gamma LUT update
   - Fix unity of the frequencies reported on PMU
   - Fix kernel oops on set_page_dirty using better locks around it
   - Protect the request pointer with RCU to prevent it being freed
     while we might need still
   - Make pool objects read-only
   - Restore physical addresses for fb_map to avoid corrupted page
     table"

* tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/fbdev: Restore physical addresses for fb_mmap()
  Revert "drm/amd/display: enable S/G for RAVEN chip"
  drm/amdgpu: disable gfxoff on original raven
  drm/amdgpu: disable gfxoff when using register read interface
  drm/amd/powerplay: correct fine grained dpm force level setting
  drm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs
  drm/amdgpu: remove experimental flag for Navi14
  drm/i915: make pool objects read-only
  drm/i915: Protect request peeking with RCU
  drm/i915/userptr: Try to acquire the page lock around set_page_dirty()
  drm/i915/pmu: "Frequency" is reported as accumulated cycles
  drm/i915: Preload LUTs if the hw isn't currently using them
  drm/i915: Don't oops in dumb_create ioctl if we have no crtcs
2019-11-22 09:14:30 -08:00
Andrey Ryabinin 9a63236f1a mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in
remove_stable_node() when it races with __mmput() and squeezes in
between ksm_exit() and exit_mmap().

  WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150

  Call Trace:
   remove_all_stable_nodes+0x12b/0x330
   run_store+0x4ef/0x7b0
   kernfs_fop_write+0x200/0x420
   vfs_write+0x154/0x450
   ksys_write+0xf9/0x1d0
   do_syscall_64+0x99/0x510
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Remove the warning as there is nothing scary going on.

Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com
Fixes: cbf86cfe04 ("ksm: remove old stable nodes more thoroughly")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22 09:11:18 -08:00
David Hildenbrand 7ce700bf11 mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
Let's limit shrinking to !ZONE_DEVICE so we can fix the current code.
We should never try to touch the memmap of offline sections where we
could have uninitialized memmaps and could trigger BUGs when calling
page_to_nid() on poisoned pages.

There is no reliable way to distinguish an uninitialized memmap from an
initialized memmap that belongs to ZONE_DEVICE, as we don't have
anything like SECTION_IS_ONLINE we can use similar to
pfn_to_online_section() for !ZONE_DEVICE memory.

E.g., set_zone_contiguous() similarly relies on pfn_to_online_section()
and will therefore never set a ZONE_DEVICE zone consecutive.  Stopping
to shrink the ZONE_DEVICE therefore results in no observable changes,
besides /proc/zoneinfo indicating different boundaries - something we
can totally live with.

Before commit d0dc12e86b ("mm/memory_hotplug: optimize memory
hotplug"), the memmap was initialized with 0 and the node with the right
value.  So the zone might be wrong but not garbage.  After that commit,
both the zone and the node will be garbage when touching uninitialized
memmaps.

Toshiki reported a BUG (race between delayed initialization of
ZONE_DEVICE memmaps without holding the memory hotplug lock and
concurrent zone shrinking).

  https://lkml.org/lkml/2019/11/14/1040

"Iteration of create and destroy namespace causes the panic as below:

      kernel BUG at mm/page_alloc.c:535!
      CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0
      Call Trace:
       memmap_init_zone_device+0x165/0x17c
       memremap_pages+0x4c1/0x540
       devm_memremap_pages+0x1d/0x60
       pmem_attach_disk+0x16b/0x600 [nd_pmem]
       nvdimm_bus_probe+0x69/0x1c0
       really_probe+0x1c2/0x3e0
       driver_probe_device+0xb4/0x100
       device_driver_attach+0x4f/0x60
       bind_store+0xc9/0x110
       kernfs_fop_write+0x116/0x190
       vfs_write+0xa5/0x1a0
       ksys_write+0x59/0xd0
       do_syscall_64+0x5b/0x180
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

  While creating a namespace and initializing memmap, if you destroy the
  namespace and shrink the zone, it will initialize the memmap outside
  the zone and trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page),
  pfn), page) in set_pfnblock_flags_mask()."

This BUG is also mitigated by this commit, where we for now stop to
shrink the ZONE_DEVICE zone until we can do it in a safe and clean way.

Link: http://lkml.kernel.org/r/20191006085646.5768-5-david@redhat.com
Fixes: f1dd2cd13c ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e86b]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Damian Tometzki <damian.tometzki@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Rich Felker <dalias@libc.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>	[4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22 09:11:18 -08:00
Joseph Qi 94b07b6f9e Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
This reverts commit 56e94ea132.

Commit 56e94ea132 ("fs: ocfs2: fix possible null-pointer dereferences
in ocfs2_xa_prepare_entry()") introduces a regression that fail to
create directory with mount option user_xattr and acl.  Actually the
reported NULL pointer dereference case can be correctly handled by
loc->xl_ops->xlo_add_entry(), so revert it.

Link: http://lkml.kernel.org/r/1573624916-83825-1-git-send-email-joseph.qi@linux.alibaba.com
Fixes: 56e94ea132 ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Acked-by: Changwei Ge <gechangwei@live.cn>
Cc: Jia-Ju Bai <baijiaju1990@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22 09:11:18 -08:00