Go to file
Bob Peterson ad26967b9a gfs2: Use async glocks for rename
Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can
reverse the roles of which directories are "old" and which are "new" for
the purposes of rename. This can cause deadlocks where two nodes end up
waiting for each other.

There can be several layers of directory dependencies across many nodes.

This patch fixes the problem by acquiring all gfs2_rename's inode glocks
asychronously and waiting for all glocks to be acquired.  That way all
inodes are locked regardless of the order.

The timeout value for multiple asynchronous glocks is calculated to be
the total of the individual wait times for each glock times two.

Since gfs2_exchange is very similar to gfs2_rename, both functions are
patched in the same way.

A new async glock wait queue, sd_async_glock_wait, keeps a list of
waiters for these events. If gfs2's holder_wake function detects an
async holder, it wakes up any waiters for the event. The waiter only
tests whether any of its requests are still pending.

Since the glocks are sent to dlm asychronously, the wait function needs
to check to see which glocks, if any, were granted.

If a glock is granted by dlm (and therefore held), its minimum hold time
is checked and adjusted as necessary, as other glock grants do.

If the event times out, all glocks held thus far must be dequeued to
resolve any existing deadlocks.  Then, if there are any outstanding
locking requests, we need to loop around and wait for dlm to respond to
those requests too.  After we release all requests, we return -ESTALE to
the caller (vfs rename) which loops around and retries the request.

    Node1           Node2
    ---------       ---------
1.  Enqueue A       Enqueue B
2.  Enqueue B       Enqueue A
3.  A granted
6.                  B granted
7.  Wait for B
8.                  Wait for A
9.                  A times out (since Node 1 holds A)
10.                 Dequeue B (since it was granted)
11.                 Wait for all requests from DLM
12. B Granted (since Node2 released it in step 10)
13. Rename
14. Dequeue A
15.                 DLM Grants A
16.                 Dequeue A (due to the timeout and since we
                    no longer have B held for our task).
17. Dequeue B
18.                 Return -ESTALE to vfs
19.                 VFS retries the operation, goto step 1.

This release-all-locks / acquire-all-locks may slow rename / exchange
down as both nodes struggle in the same way and do the same thing.
However, this will only happen when there is contention for the same
inodes, which ought to be rare.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04 20:22:17 +02:00
Documentation HMM patches for 5.3-rc 2019-07-30 12:54:44 -07:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
arch powerpc fixes for 5.3 #3 2019-08-04 10:30:47 -07:00
block for-linus-20190726-2 2019-07-26 19:20:34 -07:00
certs Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
crypto USB / PHY patches for 5.3-rc1 2019-07-11 15:40:06 -07:00
drivers tpmdd fixes for Linux v5.3-rc2 2019-08-04 16:39:07 -07:00
fs gfs2: Use async glocks for rename 2019-09-04 20:22:17 +02:00
include Merge branch 'akpm' (patches from Andrew) 2019-08-03 09:20:49 -07:00
init Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
ipc Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
kernel memremap: move from kernel/ to mm/ 2019-08-03 07:02:01 -07:00
lib Kbuild fixes for v5.3 (2nd) 2019-08-04 10:16:30 -07:00
mm memremap: move from kernel/ to mm/ 2019-08-03 07:02:01 -07:00
net tcp: be more careful in tcp_fragment() 2019-07-21 20:41:24 -07:00
samples treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers again 2019-07-25 11:05:10 +02:00
scripts kconfig: Clear "written" flag to avoid data loss 2019-08-04 12:44:15 +09:00
security selinux/stable-5.3 PR 20190801 2019-08-02 18:40:49 -07:00
sound sound fixes for 5.3-rc3 2019-08-02 08:53:34 -07:00
tools Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-08-03 10:58:46 -07:00
usr kbuild: enable arch/s390/include/uapi/asm/zcrypt.h for uapi header test 2019-07-23 10:45:46 +02:00
virt Documentation: move Documentation/virtual to Documentation/virt 2019-07-24 10:52:11 +02:00
.clang-format Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-04-17 11:26:25 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore .gitignore: Add compilation database file 2019-07-27 12:18:19 +09:00
.mailmap MAINTAINERS: Update my email address 2019-07-22 14:57:50 +01:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS Remove references to dead website. 2019-07-19 12:22:04 -07:00
Kbuild Kbuild updates for v5.1 2019-03-10 17:48:21 -07:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS MAINTAINERS: Add Geert as Renesas SoC Co-Maintainer 2019-08-04 10:23:23 -07:00
Makefile Linux 5.3-rc3 2019-08-04 18:40:12 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.