Commit Graph

4032 Commits

Author SHA1 Message Date
Linus Torvalds 8834147f95 fscache rewrite
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmHeBGsACgkQ+7dXa6fL
 C2tyLw/8C2Gs/XvOZvRO7KPetKI9BbQSFoCe7uvGbiPq5CEmgcjWzQxvQGklBiZD
 qYa6pMNye1iGpsHOY3Yu210b7vMQiRLnnxvVle0UrjpZR7CcxYS0gGV+6yRdbDGy
 W1X6GFiX06qiNsgBH4msYp0SmbhhfkTyAx1BeBZAEtX8iFgaPfOldPY2nLMcTDD6
 6FT1nTzRcMHx9IUQZJtpeatzc70Qg8+fOr2UAY2nOIypXh6+vAMBO80xtUjGVU+1
 pWD1E+8cXSLfwEEzquFWoWTsTX7hNfsesEN10FmBf1bVCH9ZDFE01MOl6B8+CkFl
 +xfkvDNFC3yyUwAMVAV4+A4Be+cVLSqN2R91QIKJnAj9w1OjxASrwZJ1YeZp6KP4
 h0XKuPs3sRwwbNPVL/nP0UPNexoJnOUAaHesl4uKkRrExmxz9xGOIqIri2+tUIO+
 HkGyNns1huymj1K1ja4AQbDiZZX39GgYVleyg9g3uuy1FS4k+/myJcXo/CqWn3ON
 4oeNwxwLvlcqIQnPrESvwev50lFZYB4pfwvez6T2C5dL/Wk/xdeJK9iG81RWgx7y
 5XcDeoGDE08gMCGWVPjuhOCXypeiRGHhRNlcxTtq5kLwBZGkcYg/wFFnWn+6hzc4
 kyXw2kS5WZq4Q/FPh7BdY0eHp6xv0EpAOZwceneLB9lhNINdxcQ=
 =ISJ6
 -----END PGP SIGNATURE-----

Merge tag 'fscache-rewrite-20220111' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull fscache rewrite from David Howells:
 "This is a set of patches that rewrites the fscache driver and the
  cachefiles driver, significantly simplifying the code compared to
  what's upstream, removing the complex operation scheduling and object
  state machine in favour of something much smaller and simpler.

  The series is structured such that the first few patches disable
  fscache use by the network filesystems using it, remove the cachefiles
  driver entirely and as much of the fscache driver as can be got away
  with without causing build failures in the network filesystems.

  The patches after that recreate fscache and then cachefiles,
  attempting to add the pieces in a logical order. Finally, the
  filesystems are reenabled and then the very last patch changes the
  documentation.

  [!] Note: I have dropped the cifs patch for the moment, leaving local
      caching in cifs disabled. I've been having trouble getting that
      working. I think I have it done, but it needs more testing (there
      seem to be some test failures occurring with v5.16 also from
      xfstests), so I propose deferring that patch to the end of the
      merge window.

  WHY REWRITE?
  ============

  Fscache's operation scheduling API was intended to handle sequencing
  of cache operations, which were all required (where possible) to run
  asynchronously in parallel with the operations being done by the
  network filesystem, whilst allowing the cache to be brought online and
  offline and to interrupt service for invalidation.

  With the advent of the tmpfile capacity in the VFS, however, an
  opportunity arises to do invalidation much more simply, without having
  to wait for I/O that's actually in progress: Cachefiles can simply
  create a tmpfile, cut over the file pointer for the backing object
  attached to a cookie and abandon the in-progress I/O, dismissing it
  upon completion.

  Future work here would involve using Omar Sandoval's vfs_link() with
  AT_LINK_REPLACE[1] to allow an extant file to be displaced by a new
  hard link from a tmpfile as currently I have to unlink the old file
  first.

  These patches can also simplify the object state handling as I/O
  operations to the cache don't all have to be brought to a stop in
  order to invalidate a file. To that end, and with an eye on to writing
  a new backing cache model in the future, I've taken the opportunity to
  simplify the indexing structure.

  I've separated the index cookie concept from the file cookie concept
  by C type now. The former is now called a "volume cookie" (struct
  fscache_volume) and there is a container of file cookies. There are
  then just the two levels. All the index cookie levels are collapsed
  into a single volume cookie, and this has a single printable string as
  a key. For instance, an AFS volume would have a key of something like
  "afs,example.com,1000555", combining the filesystem name, cell name
  and volume ID. This is freeform, but must not have '/' chars in it.

  I've also eliminated all pointers back from fscache into the network
  filesystem. This required the duplication of a little bit of data in
  the cookie (cookie key, coherency data and file size), but it's not
  actually that much. This gets rid of problems with making sure we keep
  netfs data structures around so that the cache can access them.

  These patches mean that most of the code that was in the drivers
  before is simply gone and those drivers are now almost entirely new
  code. That being the case, there doesn't seem any particular reason to
  try and maintain bisectability across it. Further, there has to be a
  point in the middle where things are cut over as there's a single
  point everything has to go through (ie. /dev/cachefiles) and it can't
  be in use by two drivers at once.

  ISSUES YET OUTSTANDING
  ======================

  There are some issues still outstanding, unaddressed by this patchset,
  that will need fixing in future patchsets, but that don't stop this
  series from being usable:

  (1) The cachefiles driver needs to stop using the backing filesystem's
      metadata to store information about what parts of the cache are
      populated. This is not reliable with modern extent-based
      filesystems.

      Fixing this is deferred to a separate patchset as it involves
      negotiation with the network filesystem and the VM as to how much
      data to download to fulfil a read - which brings me on to (2)...

  (2) NFS (and CIFS with the dropped patch) do not take account of how
      the cache would like I/O to be structured to meet its granularity
      requirements. Previously, the cache used page granularity, which
      was fine as the network filesystems also dealt in page
      granularity, and the backing filesystem (ext4, xfs or whatever)
      did whatever it did out of sight. However, we now have folios to
      deal with and the cache will now have to store its own metadata to
      track its contents.

      The change I'm looking at making for cachefiles is to store
      content bitmaps in one or more xattrs and making a bit in the map
      correspond to something like a 256KiB block. However, the size of
      an xattr and the fact that they have to be read/updated in one go
      means that I'm looking at covering 1GiB of data per 512-byte map
      and storing each map in an xattr. Cachefiles has the potential to
      grow into a fully fledged filesystem of its very own if I'm not
      careful.

      However, I'm also looking at changing things even more radically
      and going to a different model of how the cache is arranged and
      managed - one that's more akin to the way, say, openafs does
      things - which brings me on to (3)...

  (3) The way cachefilesd does culling is very inefficient for large
      caches and it would be better to move it into the kernel if I can
      as cachefilesd has to keep asking the kernel if it can cull a
      file. Changing the way the backend works would allow this to be
      addressed.

  BITS THAT MAY BE CONTROVERSIAL
  ==============================

  There are some bits I've added that may be controversial:

  (1) I've provided a flag, S_KERNEL_FILE, that cachefiles uses to check
      if a files is already being used by some other kernel service
      (e.g. a duplicate cachefiles cache in the same directory) and
      reject it if it is. This isn't entirely necessary, but it helps
      prevent accidental data corruption.

      I don't want to use S_SWAPFILE as that has other effects, but
      quite possibly swapon() should set S_KERNEL_FILE too.

      Note that it doesn't prevent userspace from interfering, though
      perhaps it should. (I have made it prevent a marked directory from
      being rmdir-able).

  (2) Cachefiles wants to keep the backing file for a cookie open whilst
      we might need to write to it from network filesystem writeback.
      The problem is that the network filesystem unuses its cookie when
      its file is closed, and so we have nothing pinning the cachefiles
      file open and it will get closed automatically after a short time
      to avoid EMFILE/ENFILE problems.

      Reopening the cache file, however, is a problem if this is being
      done due to writeback triggered by exit(). Some filesystems will
      oops if we try to open a file in that context because they want to
      access current->fs or suchlike.

      To get around this, I added the following:

      (A) An inode flag, I_PINNING_FSCACHE_WB, to be set on a network
          filesystem inode to indicate that we have a usage count on the
          cookie caching that inode.

      (B) A flag in struct writeback_control, unpinned_fscache_wb, that
          is set when __writeback_single_inode() clears the last dirty
          page from i_pages - at which point it clears
          I_PINNING_FSCACHE_WB and sets this flag.

          This has to be done here so that clearing I_PINNING_FSCACHE_WB
          can be done atomically with the check of PAGECACHE_TAG_DIRTY
          that clears I_DIRTY_PAGES.

      (C) A function, fscache_set_page_dirty(), which if it is not set,
          sets I_PINNING_FSCACHE_WB and calls fscache_use_cookie() to
          pin the cache resources.

      (D) A function, fscache_unpin_writeback(), to be called by
          ->write_inode() to unuse the cookie.

      (E) A function, fscache_clear_inode_writeback(), to be called when
          the inode is evicted, before clear_inode() is called. This
          cleans up any lingering I_PINNING_FSCACHE_WB.

      The network filesystem can then use these tools to make sure that
      fscache_write_to_cache() can write locally modified data to the
      cache as well as to the server.

      For the future, I'm working on write helpers for netfs lib that
      should allow this facility to be removed by keeping track of the
      dirty regions separately - but that's incomplete at the moment and
      is also going to be affected by folios, one way or another, since
      it deals with pages"

Link: https://lore.kernel.org/all/510611.1641942444@warthog.procyon.org.uk/
Tested-by: Dominique Martinet <asmadeus@codewreck.org> # 9p
Tested-by: kafs-testing@auristor.com # afs
Tested-by: Jeff Layton <jlayton@kernel.org> # ceph
Tested-by: Dave Wysochanski <dwysocha@redhat.com> # nfs
Tested-by: Daire Byrne <daire@dneg.com> # nfs

* tag 'fscache-rewrite-20220111' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (67 commits)
  9p, afs, ceph, nfs: Use current_is_kswapd() rather than gfpflags_allow_blocking()
  fscache: Add a tracepoint for cookie use/unuse
  fscache: Rewrite documentation
  ceph: add fscache writeback support
  ceph: conversion to new fscache API
  nfs: Implement cache I/O by accessing the cache directly
  nfs: Convert to new fscache volume/cookie API
  9p: Copy local writes to the cache when writing to the server
  9p: Use fscache indexing rewrite and reenable caching
  afs: Skip truncation on the server of data we haven't written yet
  afs: Copy local writes to the cache when writing to the server
  afs: Convert afs to use the new fscache API
  fscache, cachefiles: Display stat of culling events
  fscache, cachefiles: Display stats of no-space events
  cachefiles: Allow cachefiles to actually function
  fscache, cachefiles: Store the volume coherency data
  cachefiles: Implement the I/O routines
  cachefiles: Implement cookie resize for truncate
  cachefiles: Implement begin and end I/O operation
  cachefiles: Implement backing file wrangling
  ...
2022-01-12 13:45:12 -08:00
David Howells 01491a7565 fscache, cachefiles: Disable configuration
Disable fscache and cachefiles in Kconfig whilst it is rewritten.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819576672.215744.12444272479560406780.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906882835.143852.11073015983885872901.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967075113.1823006.277316290062782998.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021481179.640689.2004199594774033658.stgit@warthog.procyon.org.uk/ # v4
2022-01-07 09:21:44 +00:00
Thiago Rafael Becker a31080899d cifs: sanitize multiple delimiters in prepath
mount.cifs can pass a device with multiple delimiters in it. This will
cause rename(2) to fail with ENOENT.

V2:
  - Make sanitize_path more readable.
  - Fix multiple delimiters between UNC and prepath.
  - Avoid a memory leak if a bad user starts putting a lot of delimiters
    in the path on purpose.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2031200
Fixes: 24e0a1eff9 ("cifs: switch to new mount api")
Cc: stable@vger.kernel.org # 5.11+
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Thiago Rafael Becker <trbecker@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-17 19:16:49 -06:00
Shyam Prasad N b774302e88 cifs: ignore resource_id while getting fscache super cookie
We have a cyclic dependency between fscache super cookie
and root inode cookie. The super cookie relies on
tcon->resource_id, which gets populated from the root inode
number. However, fetching the root inode initializes inode
cookie as a child of super cookie, which is yet to be populated.

resource_id is only used as auxdata to check the validity of
super cookie. We can completely avoid setting resource_id to
remove the circular dependency. Since vol creation time and
vol serial numbers are used for auxdata, we should be fine.
Additionally, there will be auxiliary data check for each
inode cookie as well.

Fixes: 5bf91ef03d ("cifs: wait for tcon resource_id before getting fscache super")
CC: David Howells <dhowells@redhat.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-17 19:09:06 -06:00
Paulo Alcantara 9de0737d5b cifs: fix ntlmssp auth when there is no key exchange
Warn on the lack of key exchange during NTLMSSP authentication rather
than aborting it as there are some servers that do not set it in
CHALLENGE message.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-08 16:48:43 -06:00
Shyam Prasad N bbb9db5e2a cifs: avoid use of dstaddr as key for fscache client cookie
server->dstaddr can change when the DNS mapping for the
server hostname changes. But conn_id is a u64 counter
that is incremented each time a new TCP connection
is setup. So use only that as a key.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-03 12:38:25 -06:00
Shyam Prasad N 2adc82006b cifs: add server conn_id to fscache client cookie
The fscache client cookie uses the server address
(and port) as the cookie key. This is a problem when
nosharesock is used. Two different connections will
use duplicate cookies. Avoid this by adding
server->conn_id to the key, so that it's guaranteed
that cookie will not be duplicated.

Also, for secondary channels of a session, copy the
fscache pointer from the primary channel. The primary
channel is guaranteed not to go away as long as secondary
channels are in use.  Also addresses minor problem found
by kernel test robot.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-03 12:36:04 -06:00
Shyam Prasad N 5bf91ef03d cifs: wait for tcon resource_id before getting fscache super
The logic for initializing tcon->resource_id is done inside
cifs_root_iget. fscache super cookie relies on this for aux
data. So we need to push the fscache initialization to this
later point during mount.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-03 12:29:59 -06:00
Paulo Alcantara 65de262a20 cifs: fix missed refcounting of ipc tcon
Fix missed refcounting of IPC tcon used for getting domain-based DFS
root referrals.  We want to keep it alive as long as mount is active
and can be refreshed.  For standalone DFS root referrals it wouldn't
be a problem as the client ends up having an IPC tcon for both mount
and cache.

Fixes: c88f7dcd6d ("cifs: support nested dfs links over reconnect")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-03 12:29:10 -06:00
Steve French 0b03fe6d3a cifs: update internal version number
To 2.34

Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-23 10:07:15 -06:00
Steve French 350f4a562e smb2: clarify rc initialization in smb2_reconnect
It is clearer to initialize rc at the beginning of the function.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-23 10:07:00 -06:00
Shyam Prasad N 5112d80c16 cifs: populate server_hostname for extra channels
Recently, a new field got added to the smb3_fs_context struct
named server_hostname. While creating extra channels, pick up
this field from primary channel.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-23 10:05:39 -06:00
Shyam Prasad N b9ad6b5b68 cifs: nosharesock should be set on new server
Recent fix to maintain a nosharesock state on the
server struct caused a regression. It updated this
field in the old tcp session, and not the new one.

This caused the multichannel scenario to misbehave.

Fixes: c9f1c19cf7 (cifs: nosharesock should not share socket with future sessions)
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-23 10:04:49 -06:00
Paulo Alcantara 8ae87bbeb5 cifs: introduce cifs_ses_mark_for_reconnect() helper
Use new cifs_ses_mark_for_reconnect() helper to mark all session
channels for reconnect instead of duplicating it in different places.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-16 10:57:08 -06:00
Steve French 446e21482e cifs: protect srv_count with cifs_tcp_ses_lock
Updates to the srv_count field are protected elsewhere
with the cifs_tcp_ses_lock spinlock.  Add one missing place
(cifs_get_tcp_sesion).

CC: Shyam Prasad N <sprasad@microsoft.com>
Addresses-Coverity: 1494149 ("Data Race Condition")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-16 10:46:22 -06:00
Steve French 0226487ad8 cifs: move debug print out of spinlock
It is better to print debug messages outside of the chan_lock
spinlock where possible.

Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Addresses-Coverity: 1493854 ("Thread deadlock")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-16 10:46:09 -06:00
Shyam Prasad N 46bb1b9484 cifs: do not duplicate fscache cookie for secondary channels
We allocate index cookies for each connection from the client.
However, we don't need this index for each channel in case of
multichannel. So making sure that we avoid creating duplicate
cookies by instantiating only for primary channel.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 23:29:08 -06:00
Shyam Prasad N 0f2b305af9 cifs: connect individual channel servers to primary channel server
Today, we don't have any way to get the smb session for any
of the secondary channels. Introducing a pointer to the primary
server from server struct of any secondary channel. The value will
be NULL for the server of the primary channel. This will enable us
to get the smb session for any channel.

This will be needed for some of the changes that I'm planning
to make soon.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 20:27:06 -06:00
Shyam Prasad N 724244cdb3 cifs: protect session channel fields with chan_lock
Introducing a new spin lock to protect all the channel related
fields in a cifs_ses struct. This lock should be taken
whenever dealing with the channel fields, and should be held
only for very short intervals which will not sleep.

Currently, all channel related fields in cifs_ses structure
are protected by session_mutex. However, this mutex is held for
long periods (sometimes while waiting for a reply from server).
This makes the codepath quite tricky to change.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 16:22:20 -06:00
Shyam Prasad N 8e07757bec cifs: do not negotiate session if session already exists
In cifs_get_smb_ses, if we find an existing matching session,
we should not send a negotiate request for the session if a
session reconnect is not necessary.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 16:21:03 -06:00
Steve French 02102744d3 smb3: do not setup the fscache_super_cookie until fsinfo initialized
We were calling cifs_fscache_get_super_cookie after tcon but before
we queried the info (QFS_Info) we need to initialize the cookie
properly.  Also includes an additional check suggested by Paulo
to make sure we don't initialize super cookie twice.

Suggested-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 13:10:33 -06:00
Paulo Alcantara 7f28af9cf5 cifs: fix potential use-after-free bugs
Ensure that share and prefix variables are set to NULL after kfree()
when looping through DFS targets in __tree_connect_dfs_target().

Also, get rid of @ref in __tree_connect_dfs_target() and just pass a
boolean to indicate whether we're handling link targets or not.

Fixes: c88f7dcd6d ("cifs: support nested dfs links over reconnect")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:59 -06:00
Paulo Alcantara 869da64d07 cifs: fix memory leak of smb3_fs_context_dup::server_hostname
Fix memory leak of smb3_fs_context_dup::server_hostname when parsing
and duplicating fs contexts during mount(2) as reported by kmemleak:

  unreferenced object 0xffff888125715c90 (size 16):
    comm "mount.cifs", pid 3832, jiffies 4304535868 (age 190.094s)
    hex dump (first 16 bytes):
      7a 65 6c 64 61 2e 74 65 73 74 00 6b 6b 6b 6b a5  zelda.test.kkkk.
    backtrace:
      [<ffffffff8168106e>] kstrdup+0x2e/0x60
      [<ffffffffa027a362>] smb3_fs_context_dup+0x392/0x8d0 [cifs]
      [<ffffffffa0136353>] cifs_smb3_do_mount+0x143/0x1700 [cifs]
      [<ffffffffa02795e8>] smb3_get_tree+0x2e8/0x520 [cifs]
      [<ffffffff817a19aa>] vfs_get_tree+0x8a/0x2d0
      [<ffffffff8181e3e3>] path_mount+0x423/0x1a10
      [<ffffffff8181fbca>] __x64_sys_mount+0x1fa/0x270
      [<ffffffff83ae364b>] do_syscall_64+0x3b/0x90
      [<ffffffff83c0007c>] entry_SYSCALL_64_after_hwframe+0x44/0xae
  unreferenced object 0xffff888111deed20 (size 32):
    comm "mount.cifs", pid 3832, jiffies 4304536044 (age 189.918s)
    hex dump (first 32 bytes):
      44 46 53 52 4f 4f 54 31 2e 5a 45 4c 44 41 2e 54  DFSROOT1.ZELDA.T
      45 53 54 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  EST.kkkkkkkkkkk.
    backtrace:
      [<ffffffff8168118d>] kstrndup+0x2d/0x90
      [<ffffffffa027ab2e>] smb3_parse_devname+0x9e/0x360 [cifs]
      [<ffffffffa01870c8>] cifs_setup_volume_info+0xa8/0x470 [cifs]
      [<ffffffffa018c469>] connect_dfs_target+0x309/0xc80 [cifs]
      [<ffffffffa018d6cb>] cifs_mount+0x8eb/0x17f0 [cifs]
      [<ffffffffa0136475>] cifs_smb3_do_mount+0x265/0x1700 [cifs]
      [<ffffffffa02795e8>] smb3_get_tree+0x2e8/0x520 [cifs]
      [<ffffffff817a19aa>] vfs_get_tree+0x8a/0x2d0
      [<ffffffff8181e3e3>] path_mount+0x423/0x1a10
      [<ffffffff8181fbca>] __x64_sys_mount+0x1fa/0x270
      [<ffffffff83ae364b>] do_syscall_64+0x3b/0x90
      [<ffffffff83c0007c>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 7be3248f31 ("cifs: To match file servers, make sure the server hostname matches")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:56 -06:00
Steve French ca780da5fd smb3: add additional null check in SMB311_posix_mkdir
Although unlikely for it to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1437501 ("Explicit Null dereference")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:54 -06:00
Steve French 9e7ffa77b2 cifs: release lock earlier in dequeue_mid error case
In dequeue_mid we can log an error while holding a spinlock,
GlobalMid_Lock.  Coverity notes that the error logging
also grabs a lock so it is cleaner (and a bit safer) to
release the GlobalMid_Lock before logging the warning.

Addresses-Coverity: 1507573 ("Thread deadlock")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:51 -06:00
Steve French bac35395d2 smb3: add additional null check in SMB2_tcon
Although unlikely to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1420428 ("Explicit null dereferenced")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 10:22:05 -06:00
Steve French 6b7895182c smb3: add additional null check in SMB2_open
Although unlikely to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1418458 ("Explicit null dereferenced")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 10:21:51 -06:00
Steve French 4d9beec22f smb3: add additional null check in SMB2_ioctl
Although unlikely for it to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1443909 ("Explicit Null dereference")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-11 14:49:24 -06:00
Steve French 0e62904836 smb3: remove trivial dfs compile warning
Fix warning caused by recent changes to the dfs code:

symbol 'tree_connect_dfs_target' was not declared. Should it be static?

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 16:30:23 -06:00
Paulo Alcantara c88f7dcd6d cifs: support nested dfs links over reconnect
Mounting a dfs link that has nested links was already supported at
mount(2), so make it work over reconnect as well.

Make the following case work:

* mount //root/dfs/link /mnt -o ...
  - final share: /server/share

* in server settings
  - change target folder of /root/dfs/link3 to /server/share2
  - change target folder of /root/dfs/link2 to /root/dfs/link3
  - change target folder of /root/dfs/link to /root/dfs/link2

* mount -o remount,... /mnt
 - refresh all dfs referrals
 - mark current connection for failover
 - cifs_reconnect() reconnects to root server
 - tree_connect()
   * checks that /root/dfs/link2 is a link, then chase it
   * checks that root/dfs/link3 is a link, then chase it
   * finally tree connect to /server/share2

If the mounted share is no longer accessible and a reconnect had been
triggered, the client will retry it from both last referral
path (/root/dfs/link3) and original referral path (/root/dfs/link).

Any new referral paths found while chasing dfs links over reconnect,
it will be updated to TCP_Server_Info::leaf_fullpath, accordingly.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 16:30:13 -06:00
Steve French 71e6864eac smb3: do not error on fsync when readonly
Linux allows doing a flush/fsync on a file open for read-only,
but the protocol does not allow that.  If the file passed in
on the flush is read-only try to find a writeable handle for
the same inode, if that is not possible skip sending the
fsync call to the server to avoid breaking the apps.

Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol@gmail.com>
Suggested-by: Jeremy Allison <jra@samba.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 16:28:27 -06:00
Shyam Prasad N b6f2a0f89d cifs: for compound requests, use open handle if possible
For smb2_compound_op, it is possible to pass a ref to
an already open file. We should be passing it whenever possible.
i.e. if a matching handle is already kept open.

If we don't do that, we will end up breaking leases for files
kept open on the same client.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 11:05:20 -06:00
Paulo Alcantara 4ac0536f88 cifs: set a minimum of 120s for next dns resolution
With commit 506c1da44f ("cifs: use the expiry output of dns_query to
schedule next resolution") and after triggering the first reconnect,
the next async dns resolution of tcp server's hostname would be
scheduled based on dns_resolver's key expiry default, which happens to
default to 5s on most systems that use key.dns_resolver for upcall.

As per key.dns_resolver.conf(5):

       default_ttl=<number>
              The  number  of  seconds  to  set  as the expiration on a cached
              record.  This will be overridden if the program manages  to  re-
              trieve  TTL  information along with the addresses (if, for exam-
              ple, it accesses the DNS directly).  The default is  5  seconds.
              The value must be in the range 1 to INT_MAX.

Make the next async dns resolution no shorter than 120s as we do not
want to be upcalling too often.

Cc: stable@vger.kernel.org
Fixes: 506c1da44f ("cifs: use the expiry output of dns_query to schedule next resolution")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 23:03:08 -06:00
Paulo Alcantara bbcce36804 cifs: split out dfs code from cifs_reconnect()
Make two separate functions that handle dfs and non-dfs reconnect
logics since cifs_reconnect() became way too complex to handle both.
While at it, add some documentation.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 23:01:55 -06:00
Paulo Alcantara ae0abb4dac cifs: convert list_for_each to entry variant
Convert list_for_each{,_safe} to list_for_each_entry{,_safe} in
cifs_mark_tcp_ses_conns_for_reconnect() function.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 20:46:36 -06:00
Paulo Alcantara 43b459aa5e cifs: introduce new helper for cifs_reconnect()
Create cifs_mark_tcp_ses_conns_for_reconnect() helper to mark all
sessions and tcons for reconnect when reconnecting tcp server.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 20:46:08 -06:00
Paulo Alcantara efb21d7b0f cifs: fix print of hdr_flags in dfscache_proc_show()
Reorder the parameters in seq_printf() to correctly print header
flags.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 20:44:07 -06:00
Shyam Prasad N 49bd49f983 cifs: send workstation name during ntlmssp session setup
During the ntlmssp session setup (authenticate phases)
send the client workstation info. This can make debugging easier on
servers.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-08 13:07:56 -06:00
Shyam Prasad N c9f1c19cf7 cifs: nosharesock should not share socket with future sessions
Today, when a new mount is done with nosharesock, we ensure
that we don't select an existing matching session. However,
we don't mark the connection as nosharesock, which means that
those could be shared with future sessions.

Fixed it with this commit. Also printing this info in DebugData.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-07 00:09:37 -05:00
Linus Torvalds b5013d084e 7 cifs/smb3 fixes, mostly refactoring, but also a mount fix, a debugging improvement, and a reconnect fix for stable
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmGF8yAACgkQiiy9cAdy
 T1F2gAv+PLwQmJVKBx7i8xExUX1SRPFLPBEC6RXWsBBgElSbE3RJdaRLbH2OBBty
 PSlK+hGMP65JevcZO2+bF0nHGiLfh7+MumF+xKkvJxXjoPz/+zTMPnlQP9SNW8Dl
 VhgcQDTdSxQ8lzv2d9Z16b749WAPLuMncZCz1IfY+Dsd7/Zagv12QdPdi2knzAxU
 B+qx3dNPzxTFyCtasUEMATHoxpsOc+MywqDPT8p5/NLpF7h7K2w9qwKezc7hKiI6
 iruZKfjJO+g0QAldT3fp3LzfmUr2V8Z85D0VZn18mQNBxinjtk0+uacZzwoXAxqU
 5EicdIhlMEQqtRJNoDUVRMst0h3UP45AhN63Jjh8VdJRUJeJ14zMlSf3ze9KgTIJ
 Sts3WU/7LPjHk6sMg2lr73y+VRSg2jtfEPpCdoo/g0Cv5h6IsdX5NUhNI98onQQ7
 R350i/A+raiRO5lYkzLcDabXDTesiFfENm8YYLlEK6DiQtZ6PhU/L46dgHPYt+hf
 7/RsKCz5
 =ibss
 -----END PGP SIGNATURE-----

Merge tag '5.16-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:

 - reconnect fix for stable

 - minor mount option fix

 - debugging improvement for (TCP) connection issues

 - refactoring of common code to help ksmbd

* tag '5.16-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: add dynamic trace points for socket connection
  cifs: Move SMB2_Create definitions to the shared area
  cifs: Move more definitions into the shared area
  cifs: move NEGOTIATE_PROTOCOL definitions out into the common area
  cifs: Create a new shared file holding smb2 pdu definitions
  cifs: add mount parameter tcpnodelay
  cifs: To match file servers, make sure the server hostname matches
2021-11-06 16:47:53 -07:00
Steve French d7171cd1ac smb3: add dynamic trace points for socket connection
In debugging user problems with ip address/DNS issues with
smb3 mounts, we sometimes needed additional info on the hostname
and ip address.

Add two tracepoints, one to show socket connection success
and one for failures to connect to the socket.

Sample output:
      mount.cifs-14551   [005] .....  7636.547906: smb3_connect_done: conn_id=0x1 server=localhost addr=127.0.0.1:445
      mount.cifs-14558   [004] .....  7642.405413: smb3_connect_done: conn_id=0x2 server=smfrench.file.core.windows.net addr=52.239.158.232:445
      mount.cifs-14741   [005] .....  7818.490716: smb3_connect_done: conn_id=0x3 server=::1 addr=[::1]:445/0%0
      mount.cifs-14810   [000] .....  7966.380337: smb3_connect_err: rc=-101 conn_id=0x4 server=::2 addr=[::2]:445/0%0
      mount.cifs-14810   [000] .....  7966.380356: smb3_connect_err: rc=-101 conn_id=0x4 server=::2 addr=[::2]:139/0%0
      mount.cifs-14818   [003] .....  7986.771992: smb3_connect_done: conn_id=0x5 server=127.0.0.9 addr=127.0.0.9:445
      mount.cifs-14825   [008] .....  8008.178109: smb3_connect_err: rc=-115 conn_id=0x6 server=124.23.0.9 addr=124.23.0.9:445
      mount.cifs-14825   [008] .....  8013.298085: smb3_connect_err: rc=-115 conn_id=0x6 server=124.23.0.9 addr=124.23.0.9:139
           cifsd-14553   [006] .....  8036.735615: smb3_reconnect: conn_id=0x1 server=localhost current_mid=32
           cifsd-14743   [010] .....  8036.735644: smb3_reconnect: conn_id=0x3 server=::1 current_mid=29
           cifsd-14743   [010] .....  8039.921740: smb3_connect_err: rc=-111 conn_id=0x3 server=::1 addr=[::1]:445/0%0
           cifsd-14553   [008] .....  8042.993894: smb3_connect_err: rc=-111 conn_id=0x1 server=localhost addr=127.0.0.1:445
           cifsd-14743   [010] .....  8042.993894: smb3_connect_err: rc=-111 conn_id=0x3 server=::1 addr=[::1]:445/0%0
           cifsd-14553   [008] .....  8046.065824: smb3_connect_err: rc=-111 conn_id=0x1 server=localhost addr=127.0.0.1:445
           cifsd-14743   [010] .....  8046.065824: smb3_connect_err: rc=-111 conn_id=0x3 server=::1 addr=[::1]:445/0%0
           cifsd-14553   [008] .....  8049.137796: smb3_connect_done: conn_id=0x1 server=localhost addr=127.0.0.1:445
           cifsd-14743   [010] .....  8049.137796: smb3_connect_done: conn_id=0x3 server=::1 addr=[::1]:445/0%0

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 16:20:24 -05:00
Ronnie Sahlberg c462870bf8 cifs: Move SMB2_Create definitions to the shared area
Move all SMB2_Create definitions (except contexts) into the shared area.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:55:36 -05:00
Ronnie Sahlberg d8d9de532d cifs: Move more definitions into the shared area
Move SMB2_SessionSetup, SMB2_Close, SMB2_Read, SMB2_Write and
SMB2_ChangeNotify commands into smbfs_common/smb2pdu.h

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:55:21 -05:00
Ronnie Sahlberg fc0b384469 cifs: move NEGOTIATE_PROTOCOL definitions out into the common area
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:53:54 -05:00
Ronnie Sahlberg 0d35e382e4 cifs: Create a new shared file holding smb2 pdu definitions
This file will contain all the definitions we need for SMB2 packets
and will follow the naming convention of MS-SMB2.PDF as closely
as possible to make it easier to cross-reference beween the definitions
and the standard.

The content of this file will mostly consist of migration of existing
definitions in the cifs/smb2.pdu.h and ksmbd/smb2pdu.h files
with some additional tweaks as the two files have diverged.

This patch introduces the new smbfs_common/smb2pdu.h file
and migrates the SMB2 header as well as TREE_CONNECT and TREE_DISCONNECT
to the shared file.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:50:57 -05:00
Steve French 7ae5e588b0 cifs: add mount parameter tcpnodelay
Although corking and uncorking the socket (which cifs.ko already
does) should usually have the desired benefit, using the new
tcpnodelay mount option causes tcp_sock_set_nodelay() to be set
on the socket which may be useful in order to ensure that we don't
ever have cases where the network stack is waiting on sending an
SMB request until multiple SMB requests have been added to the
send queue (since this could lead to long latencies).

To enable it simply append "tcpnodelay" it to the mount options

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-02 13:13:34 -05:00
Shyam Prasad N 7be3248f31 cifs: To match file servers, make sure the server hostname matches
We generally rely on a bunch of factors to differentiate between servers.
For example, IP address, port etc.

For certain server types (like Azure), it is important to make sure
that the server hostname matches too, even if the both hostnames currently
resolve to the same IP address.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-02 13:12:55 -05:00
Jens Axboe 6b19b766e8 fs: get rid of the res2 iocb->ki_complete argument
The second argument was only used by the USB gadget code, yet everyone
pays the overhead of passing a zero to be passed into aio, where it
ends up being part of the aio res2 value.

Now that everybody is passing in zero, kill off the extra argument.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-25 10:36:24 -06:00
Steve French 9ed38fd4a1 cifs: fix incorrect check for null pointer in header_assemble
Although very unlikely that the tlink pointer would be null in this case,
get_next_mid function can in theory return null (but not an error)
so need to check for null (not for IS_ERR, which can not be returned
here).

Address warning:

        fs/smbfs_client/connect.c:2392 cifs_match_super()
        warn: 'tlink' isn't an ERR_PTR

Pointed out by Dan Carpenter via smatch code analysis tool

CC: stable@vger.kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23 21:12:53 -05:00
Steve French 1db1aa9887 smb3: correct server pointer dereferencing check to be more consistent
Address warning:

    fs/smbfs_client/misc.c:273 header_assemble()
    warn: variable dereferenced before check 'treeCon->ses->server'

Pointed out by Dan Carpenter via smatch code analysis tool

Although the check is likely unneeded, adding it makes the code
more consistent and easier to read, as the same check is
done elsewhere in the function.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23 21:12:23 -05:00