ext4: update ext4 documentation

Add documentation for mount options and ioctls to
Documentation/filesystems/ext4.txt, which has not been updated for some
time.  Also add documentation for ext4 sysfs tunables to the
Documentation/ABI/testing/sysfs-fs-ext4 file, and fix a few
typographical errors in that file.

https://bugzilla.kernel.org/show_bug.cgi?id=9423

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Lukas Czerner 2011-02-21 20:16:21 -05:00 committed by Theodore Ts'o
parent 3abb17e82f
commit 6f9524e9e1
2 changed files with 216 additions and 4 deletions


@@ -48,7 +48,7 @@ Description:
will have its blocks allocated out of its own unique
preallocation pool.
-What: /sys/fs/ext4/<disk>/inode_readahead
+What: /sys/fs/ext4/<disk>/inode_readahead_blks
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
@@ -85,7 +85,14 @@ Date: June 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Tuning parameter which (if non-zero) controls the goal
-inode used by the inode allocator in p0reference to
-all other allocation hueristics. This is intended for
+inode used by the inode allocator in preference to
+all other allocation heuristics. This is intended for
debugging use only, and should be 0 on production
systems.
What: /sys/fs/ext4/<disk>/max_writeback_mb_bump
Date: September 2009
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The maximum number of megabytes the writeback code will
try to write out before moving on to another inode.


@@ -373,6 +373,41 @@ nodiscard(*) commands to the underlying block device when
                        and sparse/thinly-provisioned LUNs, but it is off
                        by default until sufficient testing has been done.
nouid32                 Disables 32-bit UIDs and GIDs. This is for
                        interoperability with older kernels which only
                        store and expect 16-bit values.
resize                  Allows the filesystem to be resized to the end of
                        the last existing block group; any further resize
                        has to be done with resize2fs, either online or
                        offline. It can be used only in conjunction with
                        remount.
block_validity          These options enable or disable the in-kernel
noblock_validity        facility for tracking filesystem metadata blocks
                        within internal data structures. This allows the
                        multi-block allocator and other routines to quickly
                        locate extents which might overlap with filesystem
                        metadata blocks. This option is intended for
                        debugging purposes and, since it negatively affects
                        performance, it is off by default.
dioread_lock            Controls whether or not ext4 should use the DIO read
dioread_nolock          locking. If the dioread_nolock option is specified,
                        ext4 will allocate an uninitialized extent before a
                        buffer write and convert the extent to initialized
                        after the IO completes. This approach allows the
                        ext4 code to avoid using the inode mutex, which
                        improves scalability on high-speed storage.
                        However, this does not work with the nobh option,
                        and the mount will fail. Nor does it work with data
                        journaling, in which case the dioread_nolock option
                        is ignored with a kernel warning. Note that the
                        dioread_nolock code path is only used for
                        extent-based files. Because of the restrictions
                        this option carries, it is off by default
                        (i.e. dioread_lock).
i_version               Enable 64-bit inode version support. This option is
                        off by default.
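The dioread_nolock restrictions described above can be summarized as a
small decision procedure. The following is only an illustrative sketch,
not kernel code: the function name and the returned labels are
hypothetical, and it assumes that when both dioread_lock and
dioread_nolock appear in the option string, the later one wins.

```python
def check_dioread_nolock(options):
    """Classify how dioread_nolock would behave for a mount option string.

    Returns one of the (hypothetical) labels:
      "off"         - dioread_nolock not requested (the default)
      "mount-fails" - combined with nobh, the mount is refused
      "ignored"     - combined with data=journal, option is ignored
      "active"      - no documented restriction applies
    """
    opts = [o.strip() for o in options.split(",") if o.strip()]

    # Assumption: the last of dioread_lock / dioread_nolock takes effect.
    nolock = False
    for opt in opts:
        if opt == "dioread_nolock":
            nolock = True
        elif opt == "dioread_lock":
            nolock = False

    if not nolock:
        return "off"
    if "nobh" in opts:
        return "mount-fails"
    if "data=journal" in opts:
        return "ignored"
    return "active"
```

Such a check only mirrors the documented rules. The kernel remains the
authority on what a given mount actually does, and the extent-based
file restriction is per file and so cannot be seen from the option
string.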
Data Mode
=========
There are 3 different data modes:
@@ -400,6 +435,176 @@ needs to be read from and written to disk at the same time where it
outperforms all other modes. Currently ext4 does not have delayed
allocation support if this data journalling mode is selected.
/proc entries
=============
Information about mounted ext4 file systems can be found in
/proc/fs/ext4. Each mounted filesystem will have a directory in
/proc/fs/ext4 based on its device name (e.g., /proc/fs/ext4/hdc or
/proc/fs/ext4/dm-0). The files in each per-device directory are shown
in the table below.
Files in /proc/fs/ext4/<devname>
..............................................................................
File                        Content
mb_groups                   details of multiblock allocator buddy cache of
                            free blocks
..............................................................................
/sys entries
============
Information about mounted ext4 file systems can be found in
/sys/fs/ext4. Each mounted filesystem will have a directory in
/sys/fs/ext4 based on its device name (e.g., /sys/fs/ext4/hdc or
/sys/fs/ext4/dm-0). The files in each per-device directory are shown
in the table below.
Files in /sys/fs/ext4/<devname>
(see also Documentation/ABI/testing/sysfs-fs-ext4)
..............................................................................
File                        Content
delayed_allocation_blocks   This file is read-only and shows the number of
                            blocks that are dirty in the page cache, but
                            which do not have their location in the
                            filesystem allocated yet.
inode_goal                  Tuning parameter which (if non-zero) controls
                            the goal inode used by the inode allocator in
                            preference to all other allocation heuristics.
                            This is intended for debugging use only, and
                            should be 0 on production systems.
inode_readahead_blks        Tuning parameter which controls the maximum
                            number of inode table blocks that ext4's inode
                            table readahead algorithm will pre-read into
                            the buffer cache.
lifetime_write_kbytes       This file is read-only and shows the number of
                            kilobytes of data that have been written to this
                            filesystem since it was created.
max_writeback_mb_bump       The maximum number of megabytes the writeback
                            code will try to write out before moving on to
                            another inode.
mb_group_prealloc           The multiblock allocator will round up allocation
                            requests to a multiple of this tuning parameter
                            if the stripe size is not set in the ext4
                            superblock.
mb_max_to_scan              The maximum number of extents the multiblock
                            allocator will search to find the best extent.
mb_min_to_scan              The minimum number of extents the multiblock
                            allocator will search to find the best extent.
mb_order2_req               Tuning parameter which controls the minimum size
                            for requests (as a power of 2) where the buddy
                            cache is used.
mb_stats                    Controls whether the multiblock allocator should
                            collect statistics, which are shown during the
                            unmount. 1 means to collect statistics, 0 means
                            not to collect statistics.
mb_stream_req               Files which have fewer blocks than this tunable
                            parameter will have their blocks allocated out
                            of a block group specific preallocation pool, so
                            that small files are packed closely together.
                            Each large file will have its blocks allocated
                            out of its own unique preallocation pool.
session_write_kbytes        This file is read-only and shows the number of
                            kilobytes of data that have been written to this
                            filesystem since it was mounted.
..............................................................................
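As a rough illustration of how the per-device files above might be
inspected from userspace, the sketch below collects every readable file
in one device directory into a dictionary. The function name and its
"base" parameter are hypothetical, introduced only so the code can be
pointed at a test directory; it assumes each file holds a single short
text value.

```python
import os


def read_ext4_sysfs(devname, base="/sys/fs/ext4"):
    """Return {file name: value} for the files in <base>/<devname>.

    Values consisting only of digits are converted to int; anything
    else is kept as a stripped string. Files that cannot be read
    are skipped.
    """
    devdir = os.path.join(base, devname)
    values = {}
    for name in sorted(os.listdir(devdir)):
        path = os.path.join(devdir, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                text = f.read().strip()
        except OSError:
            # Not readable by this user, or not readable at all.
            continue
        values[name] = int(text) if text.isdigit() else text
    return values
```

For example, read_ext4_sysfs("dm-0") would return the tunables of the
filesystem on dm-0, provided such a directory exists on the running
system.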
Ioctls
======
There is some Ext4 specific functionality which can be accessed by applications
through the system call interfaces. The list of all Ext4 specific ioctls is
shown in the table below.
Table of Ext4 specific ioctls
..............................................................................
Ioctl                    Description
EXT4_IOC_GETFLAGS        Get additional attributes associated with inode.
                         The ioctl argument is an integer bitfield, with
                         bit values described in ext4.h. This ioctl is an
                         alias for FS_IOC_GETFLAGS.
EXT4_IOC_SETFLAGS        Set additional attributes associated with inode.
                         The ioctl argument is an integer bitfield, with
                         bit values described in ext4.h. This ioctl is an
                         alias for FS_IOC_SETFLAGS.
EXT4_IOC_GETVERSION
EXT4_IOC_GETVERSION_OLD
                         Get the inode i_generation number stored for
                         each inode. The i_generation number is normally
                         changed only when a new inode is created and it is
                         particularly useful for network filesystems. The
                         '_OLD' version of this ioctl is an alias for
                         FS_IOC_GETVERSION.
EXT4_IOC_SETVERSION
EXT4_IOC_SETVERSION_OLD
                         Set the inode i_generation number stored for
                         each inode. The '_OLD' version of this ioctl
                         is an alias for FS_IOC_SETVERSION.
EXT4_IOC_GROUP_EXTEND    This ioctl has the same purpose as the resize
                         mount option. It allows the filesystem to be
                         resized to the end of the last existing block
                         group; any further resize has to be done with
                         resize2fs, either online or offline. The argument
                         points to the unsigned long number representing
                         the filesystem's new block count.
EXT4_IOC_MOVE_EXT        Move the block extents from orig_fd (the one
                         this ioctl is pointing to) to the donor_fd (the
                         one specified in move_extent structure passed
                         as an argument to this ioctl). Then, exchange
                         inode metadata between orig_fd and donor_fd.
                         This is especially useful for online
                         defragmentation, because the allocator has the
                         opportunity to allocate moved blocks better,
                         ideally into one contiguous extent.
EXT4_IOC_GROUP_ADD       Add a new group descriptor to an existing or
                         new group descriptor block. The new group
                         descriptor is described by the
                         ext4_new_group_input structure, which is passed
                         as an argument to this ioctl. This is especially
                         useful in conjunction with EXT4_IOC_GROUP_EXTEND,
                         which allows online resize of the filesystem
                         to the end of the last existing block group.
                         These two ioctls combined are used by userspace
                         online resize tools (e.g. resize2fs).
EXT4_IOC_MIGRATE         This ioctl operates on the filesystem itself.
                         It converts (migrates) an ext3 indirect block
                         mapped inode to an ext4 extent mapped inode by
                         walking through the indirect block mapping of the
                         original inode and converting contiguous block
                         ranges into ext4 extents of a temporary inode.
                         Then, the inodes are swapped. This ioctl might
                         help when migrating from an ext3 to an ext4
                         filesystem; however, the suggested approach is to
                         create a fresh ext4 filesystem and copy data from
                         the backup. Note that the filesystem has to
                         support extents for this ioctl to work.
EXT4_IOC_ALLOC_DA_BLKS   Force all of the delay allocated blocks to be
                         allocated to preserve application-expected ext3
                         behaviour. Note that this will also start
                         triggering a write of the data blocks, but this
                         behaviour may change in the future as it is
                         not necessary and has been done this way only
                         for the sake of simplicity.
..............................................................................
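To make the table more concrete, here is a minimal userspace sketch of
the GETFLAGS ioctl using its FS_IOC_GETFLAGS alias. Several things in
it are assumptions rather than part of this document: the request
number is built with the asm-generic ioctl encoding (as used on x86
and ARM; some architectures encode it differently), the helper names
are hypothetical, and only a handful of flag bits from the kernel
headers are decoded.

```python
import fcntl
import os
import struct

# Assumption: asm-generic _IOR() layout (direction in bits 30-31,
# argument size in bits 16-29, type in bits 8-15, number in bits 0-7).
_IOC_READ = 2


def _ior(type_char, nr, size):
    return (_IOC_READ << 30) | (size << 16) | (ord(type_char) << 8) | nr


# FS_IOC_GETFLAGS is declared as _IOR('f', 1, long) in the kernel headers.
FS_IOC_GETFLAGS = _ior("f", 1, struct.calcsize("l"))

# Illustrative subset of the inode flag bits, in ascending bit order.
INODE_FLAGS = {
    0x00000010: "IMMUTABLE",
    0x00000020: "APPEND",
    0x00000080: "NOATIME",
    0x00001000: "INDEX",
    0x00080000: "EXTENTS",
}


def decode_inode_flags(flags):
    """Return the names of the known bits set in an inode flags word."""
    return [name for bit, name in INODE_FLAGS.items() if flags & bit]


def get_inode_flags(path):
    """Fetch the inode flags bitfield of path via FS_IOC_GETFLAGS."""
    buf = bytearray(struct.calcsize("l"))
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)
    finally:
        os.close(fd)
    # The flags are an integer bitfield; read it from the buffer start.
    return struct.unpack_from("I", buf)[0]
```

On a filesystem that supports the ioctl, something like
decode_inode_flags(get_inode_flags("/some/file")) would list the
recognized attributes; on filesystems that do not, fcntl.ioctl() raises
OSError.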
References
==========