[CachePruning] Clarify the per-directory entry limit on Linux ext4.

Summary:
508   root node entries (root_limit)
510   internal node entries (node_limit)

For a 40-byte filename, sizeof(ext4_dir_entry_2) is 48 (an 8-byte header plus the name), so one 4096-byte linear directory block can hold at most floor(4096/48) = 85 entries.
The real per-directory entry limit is therefore 508*510*85 = 22021800.
The limit varies with the average filename length.
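
The arithmetic above can be sketched as a small C++ helper. This is an illustration only: the names RootLimit, NodeLimit, BlockSize, dirEntrySize and maxDirEntries are hypothetical and do not appear in LLVM or the kernel, and rounding the entry size up to a multiple of 4 is an assumption about ext4's record alignment (it does not change the result for 40-byte names).

```cpp
#include <cstdint>

// Figures from the summary above; they assume the default ext4 setup
// (4096-byte blocks, large_dir disabled).
constexpr std::uint64_t RootLimit = 508;  // entries in the htree root node
constexpr std::uint64_t NodeLimit = 510;  // entries in each internal node
constexpr std::uint64_t BlockSize = 4096; // bytes in one leaf block

// Bytes used by one directory entry: an 8-byte fixed header plus the name.
// Rounding up to a multiple of 4 is an assumption, not taken from the
// commit message.
constexpr std::uint64_t dirEntrySize(std::uint64_t NameLen) {
  return (NameLen + 8 + 3) / 4 * 4;
}

// Upper bound on the number of entries in one directory when every name
// is NameLen bytes long and every leaf block is completely full.
constexpr std::uint64_t maxDirEntries(std::uint64_t NameLen) {
  return RootLimit * NodeLimit * (BlockSize / dirEntrySize(NameLen));
}

// Reproduce the numbers quoted in the commit message.
static_assert(dirEntrySize(40) == 48, "40-byte name needs 48 bytes");
static_assert(BlockSize / dirEntrySize(40) == 85, "85 entries per block");
static_assert(maxDirEntries(40) == 22021800, "508*510*85");
```

Shorter names give a larger bound and longer names a smaller one, which is why the limit is stated for an average filename length.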

However, the Linux ext4 code does not try to rebalance the htree, so a new filename cannot be created if it belongs in a leaf node that is already full. The following example demonstrates this: some filenames are rejected while others succeed.

  % touch d/0000000000000000000000000000000000816a6f
  touch: cannot touch 'd/0000000000000000000000000000000000816a6f': No space left on device
  % touch d/0000000000000000000000000000000000816a70
  # succeeded

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45546

llvm-svn: 329966
Fangrui Song 2018-04-12 22:27:38 +00:00
parent 01d349bab1
commit 6cf69128a1
1 changed files with 5 additions and 2 deletions


@@ -52,8 +52,11 @@ struct CachePruningPolicy {
/// the number of files based pruning.
///
/// This defaults to 1000000 because with that many files there are
-/// diminishing returns on the effectiveness of the cache, and file
-/// systems have a limit on total number of files.
+/// diminishing returns on the effectiveness of the cache. Some systems have a
+/// limit on total number of files, and some also limit the number of files
+/// per directory, such as Linux ext4, with the default setting (block size is
+/// 4096 and large_dir disabled), there is a per-directory entry limit of
+/// 508*510*floor(4096/(40+8))~=20M for average filename length of 40.
uint64_t MaxSizeFiles = 1000000;
};