Changes since last update:

 - Leave compressed inodes unsupported in fscache mode for now;
 - Avoid crash when using tracepoint cachefiles_prep_read;
 - Fix `backmost' behavior due to a recent cleanup;
 - Update documentation for better description of recent new features;
 - Several decompression cleanups w/o logical change.

Merge tag 'erofs-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull more erofs updates from Gao Xiang:
 "This is a follow-up to the main updates, including some fixes of
  fscache mode related to compressed inodes and a cachefiles tracepoint.
  There is also a patch to fix an unexpected decompression strategy
  change due to a cleanup in the past. All the fixes are quite small.

  Apart from these, documentation is also updated for a better
  description of recent new features.

  In addition, this has some trivial cleanups without actual code logic
  changes, so I could have a more recent codebase to work on folios and
  avoiding the PG_error page flag for the next cycle.

  Summary:

   - Leave compressed inodes unsupported in fscache mode for now

   - Avoid crash when using tracepoint cachefiles_prep_read

   - Fix `backmost' behavior due to a recent cleanup

   - Update documentation for better description of recent new features

   - Several decompression cleanups w/o logical change"

* tag 'erofs-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix 'backmost' member of z_erofs_decompress_frontend
  erofs: simplify z_erofs_pcluster_readmore()
  erofs: get rid of label `restart_now'
  erofs: get rid of `struct z_erofs_collection'
  erofs: update documentation
  erofs: fix crash when enable tracepoint cachefiles_prep_read
  erofs: leave compressed inodes unsupported in fscache mode for now
This commit is contained in:
commit 8171acb8bc
@@ -1,63 +1,82 @@
 .. SPDX-License-Identifier: GPL-2.0

 ======================================
-Enhanced Read-Only File System - EROFS
+EROFS - Enhanced Read-Only File System
 ======================================

 Overview
 ========

-EROFS file-system stands for Enhanced Read-Only File System. Different
-from other read-only file systems, it aims to be designed for flexibility,
-scalability, but be kept simple and high performance.
+EROFS filesystem stands for Enhanced Read-Only File System. It aims to form a
+generic read-only filesystem solution for various read-only use cases instead
+of just focusing on storage space saving without considering any side effects
+of runtime performance.

-It is designed as a better filesystem solution for the following scenarios:
+It is designed to meet the needs of flexibility, feature extendability and user
+payload friendly, etc. Apart from those, it is still kept as a simple
+random-access friendly high-performance filesystem to get rid of unneeded I/O
+amplification and memory-resident overhead compared to similar approaches.
+
+It is implemented to be a better choice for the following scenarios:

 - read-only storage media or

 - part of a fully trusted read-only solution, which means it needs to be
   immutable and bit-for-bit identical to the official golden image for
-  their releases due to security and other considerations and
+  their releases due to security or other considerations and

 - hope to minimize extra storage space with guaranteed end-to-end performance
   by using compact layout, transparent file compression and direct access,
   especially for those embedded devices with limited memory and high-density
-  hosts with numerous containers;
+  hosts with numerous containers.

 Here is the main features of EROFS:

 - Little endian on-disk design;

-- Currently 4KB block size (nobh) and therefore maximum 16TB address space;
+- 4KiB block size and 32-bit block addresses, therefore 16TiB address space
+  at most for now;

-- Metadata & data could be mixed by design;
-
-- 2 inode versions for different requirements:
+- Two inode layouts for different requirements:

-   =====================  ============  =====================================
+   =====================  ============  ======================================
                           compact (v1)  extended (v2)
-   =====================  ============  =====================================
+   =====================  ============  ======================================
    Inode metadata size    32 bytes      64 bytes
-   Max file size          4 GB          16 EB (also limited by max. vol size)
+   Max file size          4 GiB         16 EiB (also limited by max. vol size)
    Max uids/gids          65536         4294967296
    Per-inode timestamp    no            yes (64 + 32-bit timestamp)
    Max hardlinks          65536         4294967296
-   Metadata reserved      4 bytes       14 bytes
-   =====================  ============  =====================================
+   Metadata reserved      8 bytes       18 bytes
+   =====================  ============  ======================================
+
+- Metadata and data could be mixed as an option;

 - Support extended attributes (xattrs) as an option;

-- Support xattr inline and tail-end data inline for all files;
+- Support tailpacking data and xattr inline compared to byte-addressed
+  unaligned metadata or smaller block size alternatives;

 - Support POSIX.1e ACLs by using xattrs;

 - Support transparent data compression as an option:
-  LZ4 algorithm with the fixed-sized output compression for high performance;
+  LZ4 and MicroLZMA algorithms can be used on a per-file basis; In addition,
+  inplace decompression is also supported to avoid bounce compressed buffers
+  and page cache thrashing.

-- Multiple device support for multi-layer container images.
+- Support direct I/O on uncompressed files to avoid double caching for loop
+  devices;
+
+- Support FSDAX on uncompressed images for secure containers and ramdisks in
+  order to get rid of unnecessary page cache.
+
+- Support multiple devices for multi blob container images;
+
+- Support file-based on-demand loading with the Fscache infrastructure.

 The following git tree provides the file system user-space tools under
-development (ex, formatting tool mkfs.erofs):
+development, such as a formatting tool (mkfs.erofs), an on-disk consistency &
+compatibility checking tool (fsck.erofs), and a debugging tool (dump.erofs):

 - git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
@@ -91,6 +110,7 @@ dax={always,never}      Use direct access (no page cache). See
                         Documentation/filesystems/dax.rst.
 dax                     A legacy option which is an alias for ``dax=always``.
 device=%s               Specify a path to an extra device to be used together.
+fsid=%s                 Specify a filesystem image ID for Fscache back-end.
 ===================     =========================================================

 Sysfs Entries
@@ -226,8 +246,8 @@ Note that apart from the offset of the first filename, nameoff0 also indicates
 the total number of directory entries in this block since it is no need to
 introduce another on-disk field at all.

-Chunk-based file
-----------------
+Chunk-based files
+-----------------
 In order to support chunk-based data deduplication, a new inode data layout has
 been supported since Linux v5.15: Files are split in equal-sized data chunks
 with ``extents`` area of the inode metadata indicating how to get the chunk
@@ -17,6 +17,7 @@ static struct netfs_io_request *erofs_fscache_alloc_request(struct address_space
        rreq->start   = start;
        rreq->len     = len;
        rreq->mapping = mapping;
+       rreq->inode   = mapping->host;
        INIT_LIST_HEAD(&rreq->subrequests);
        refcount_set(&rreq->ref, 1);
        return rreq;
@@ -288,7 +288,10 @@ static int erofs_fill_inode(struct inode *inode, int isdir)
        }

        if (erofs_inode_is_data_compressed(vi->datalayout)) {
-               err = z_erofs_fill_inode(inode);
+               if (!erofs_is_fscache_mode(inode->i_sb))
+                       err = z_erofs_fill_inode(inode);
+               else
+                       err = -EOPNOTSUPP;
                goto out_unlock;
        }
        inode->i_mapping->a_ops = &erofs_raw_access_aops;
fs/erofs/zdata.c (135 changed lines)
@@ -199,7 +199,6 @@ struct z_erofs_decompress_frontend {
        struct z_erofs_pagevec_ctor vector;

        struct z_erofs_pcluster *pcl, *tailpcl;
-       struct z_erofs_collection *cl;
        /* a pointer used to pick up inplace I/O pages */
        struct page **icpage_ptr;
        z_erofs_next_pcluster_t owned_head;
@@ -214,7 +213,7 @@ struct z_erofs_decompress_frontend {

 #define DECOMPRESS_FRONTEND_INIT(__i) { \
        .inode = __i, .owned_head = Z_EROFS_PCLUSTER_TAIL, \
-       .mode = COLLECT_PRIMARY_FOLLOWED }
+       .mode = COLLECT_PRIMARY_FOLLOWED, .backmost = true }

 static struct page *z_pagemap_global[Z_EROFS_VMAP_GLOBAL_PAGES];
 static DEFINE_MUTEX(z_pagemap_global_lock);
@@ -357,7 +356,7 @@ static bool z_erofs_try_inplace_io(struct z_erofs_decompress_frontend *fe,
        return false;
 }

-/* callers must be with collection lock held */
+/* callers must be with pcluster lock held */
 static int z_erofs_attach_page(struct z_erofs_decompress_frontend *fe,
                               struct page *page, enum z_erofs_page_type type,
                               bool pvec_safereuse)
@@ -372,7 +371,7 @@ static int z_erofs_attach_page(struct z_erofs_decompress_frontend *fe,

        ret = z_erofs_pagevec_enqueue(&fe->vector, page, type,
                                      pvec_safereuse);
-       fe->cl->vcnt += (unsigned int)ret;
+       fe->pcl->vcnt += (unsigned int)ret;
        return ret ? 0 : -EAGAIN;
 }

@@ -405,12 +404,11 @@ static void z_erofs_try_to_claim_pcluster(struct z_erofs_decompress_frontend *f)
                f->mode = COLLECT_PRIMARY;
 }

-static int z_erofs_lookup_collection(struct z_erofs_decompress_frontend *fe,
-                                    struct inode *inode,
-                                    struct erofs_map_blocks *map)
+static int z_erofs_lookup_pcluster(struct z_erofs_decompress_frontend *fe,
+                                  struct inode *inode,
+                                  struct erofs_map_blocks *map)
 {
        struct z_erofs_pcluster *pcl = fe->pcl;
-       struct z_erofs_collection *cl;
        unsigned int length;

        /* to avoid unexpected loop formed by corrupted images */
@@ -419,8 +417,7 @@ static int z_erofs_lookup_pcluster(struct z_erofs_decompress_frontend *fe,
                return -EFSCORRUPTED;
        }

-       cl = z_erofs_primarycollection(pcl);
-       if (cl->pageofs != (map->m_la & ~PAGE_MASK)) {
+       if (pcl->pageofs_out != (map->m_la & ~PAGE_MASK)) {
                DBG_BUGON(1);
                return -EFSCORRUPTED;
        }
@@ -443,23 +440,21 @@ static int z_erofs_lookup_pcluster(struct z_erofs_decompress_frontend *fe,
                        length = READ_ONCE(pcl->length);
                }
        }
-       mutex_lock(&cl->lock);
+       mutex_lock(&pcl->lock);
        /* used to check tail merging loop due to corrupted images */
        if (fe->owned_head == Z_EROFS_PCLUSTER_TAIL)
                fe->tailpcl = pcl;

        z_erofs_try_to_claim_pcluster(fe);
-       fe->cl = cl;
        return 0;
 }

-static int z_erofs_register_collection(struct z_erofs_decompress_frontend *fe,
-                                      struct inode *inode,
-                                      struct erofs_map_blocks *map)
+static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe,
+                                    struct inode *inode,
+                                    struct erofs_map_blocks *map)
 {
        bool ztailpacking = map->m_flags & EROFS_MAP_META;
        struct z_erofs_pcluster *pcl;
-       struct z_erofs_collection *cl;
        struct erofs_workgroup *grp;
        int err;

@@ -482,17 +477,15 @@ static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe,

        /* new pclusters should be claimed as type 1, primary and followed */
        pcl->next = fe->owned_head;
+       pcl->pageofs_out = map->m_la & ~PAGE_MASK;
        fe->mode = COLLECT_PRIMARY_FOLLOWED;

-       cl = z_erofs_primarycollection(pcl);
-       cl->pageofs = map->m_la & ~PAGE_MASK;
-
        /*
         * lock all primary followed works before visible to others
         * and mutex_trylock *never* fails for a new pcluster.
         */
-       mutex_init(&cl->lock);
-       DBG_BUGON(!mutex_trylock(&cl->lock));
+       mutex_init(&pcl->lock);
+       DBG_BUGON(!mutex_trylock(&pcl->lock));

        if (ztailpacking) {
                pcl->obj.index = 0;     /* which indicates ztailpacking */
@@ -519,11 +512,10 @@ static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe,
                fe->tailpcl = pcl;
        fe->owned_head = &pcl->next;
        fe->pcl = pcl;
-       fe->cl = cl;
        return 0;

 err_out:
-       mutex_unlock(&cl->lock);
+       mutex_unlock(&pcl->lock);
        z_erofs_free_pcluster(pcl);
        return err;
 }
@@ -535,9 +527,9 @@ static int z_erofs_collector_begin(struct z_erofs_decompress_frontend *fe,
        struct erofs_workgroup *grp;
        int ret;

-       DBG_BUGON(fe->cl);
+       DBG_BUGON(fe->pcl);

-       /* must be Z_EROFS_PCLUSTER_TAIL or pointed to previous collection */
+       /* must be Z_EROFS_PCLUSTER_TAIL or pointed to previous pcluster */
        DBG_BUGON(fe->owned_head == Z_EROFS_PCLUSTER_NIL);
        DBG_BUGON(fe->owned_head == Z_EROFS_PCLUSTER_TAIL_CLOSED);

@@ -554,14 +546,14 @@ static int z_erofs_collector_begin(struct z_erofs_decompress_frontend *fe,
                fe->pcl = container_of(grp, struct z_erofs_pcluster, obj);
        } else {
 tailpacking:
-               ret = z_erofs_register_collection(fe, inode, map);
+               ret = z_erofs_register_pcluster(fe, inode, map);
                if (!ret)
                        goto out;
                if (ret != -EEXIST)
                        return ret;
        }

-       ret = z_erofs_lookup_collection(fe, inode, map);
+       ret = z_erofs_lookup_pcluster(fe, inode, map);
        if (ret) {
                erofs_workgroup_put(&fe->pcl->obj);
                return ret;
@@ -569,7 +561,7 @@ tailpacking:

 out:
        z_erofs_pagevec_ctor_init(&fe->vector, Z_EROFS_NR_INLINE_PAGEVECS,
-                                 fe->cl->pagevec, fe->cl->vcnt);
+                                 fe->pcl->pagevec, fe->pcl->vcnt);
        /* since file-backed online pages are traversed in reverse order */
        fe->icpage_ptr = fe->pcl->compressed_pages +
                        z_erofs_pclusterpages(fe->pcl);
@@ -582,48 +574,36 @@ out:
  */
 static void z_erofs_rcu_callback(struct rcu_head *head)
 {
-       struct z_erofs_collection *const cl =
-               container_of(head, struct z_erofs_collection, rcu);
-
-       z_erofs_free_pcluster(container_of(cl, struct z_erofs_pcluster,
-                                          primary_collection));
+       z_erofs_free_pcluster(container_of(head,
+                       struct z_erofs_pcluster, rcu));
 }

 void erofs_workgroup_free_rcu(struct erofs_workgroup *grp)
 {
        struct z_erofs_pcluster *const pcl =
                container_of(grp, struct z_erofs_pcluster, obj);
-       struct z_erofs_collection *const cl = z_erofs_primarycollection(pcl);
-
-       call_rcu(&cl->rcu, z_erofs_rcu_callback);
-}
-
-static void z_erofs_collection_put(struct z_erofs_collection *cl)
-{
-       struct z_erofs_pcluster *const pcl =
-               container_of(cl, struct z_erofs_pcluster, primary_collection);

-       erofs_workgroup_put(&pcl->obj);
+       call_rcu(&pcl->rcu, z_erofs_rcu_callback);
 }

 static bool z_erofs_collector_end(struct z_erofs_decompress_frontend *fe)
 {
-       struct z_erofs_collection *cl = fe->cl;
+       struct z_erofs_pcluster *pcl = fe->pcl;

-       if (!cl)
+       if (!pcl)
                return false;

        z_erofs_pagevec_ctor_exit(&fe->vector, false);
-       mutex_unlock(&cl->lock);
+       mutex_unlock(&pcl->lock);

        /*
         * if all pending pages are added, don't hold its reference
         * any longer if the pcluster isn't hosted by ourselves.
         */
        if (fe->mode < COLLECT_PRIMARY_FOLLOWED_NOINPLACE)
-               z_erofs_collection_put(cl);
+               erofs_workgroup_put(&pcl->obj);

-       fe->cl = NULL;
+       fe->pcl = NULL;
        return true;
 }
@@ -663,28 +643,23 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
 repeat:
        cur = end - 1;

-       /* lucky, within the range of the current map_blocks */
-       if (offset + cur >= map->m_la &&
-           offset + cur < map->m_la + map->m_llen) {
-               /* didn't get a valid collection previously (very rare) */
-               if (!fe->cl)
-                       goto restart_now;
-               goto hitted;
-       }
-
-       /* go ahead the next map_blocks */
-       erofs_dbg("%s: [out-of-range] pos %llu", __func__, offset + cur);
+       if (offset + cur < map->m_la ||
+           offset + cur >= map->m_la + map->m_llen) {
+               erofs_dbg("out-of-range map @ pos %llu", offset + cur);

-       if (z_erofs_collector_end(fe))
-               fe->backmost = false;
+               if (z_erofs_collector_end(fe))
+                       fe->backmost = false;

-       map->m_la = offset + cur;
-       map->m_llen = 0;
-       err = z_erofs_map_blocks_iter(inode, map, 0);
-       if (err)
-               goto err_out;
+               map->m_la = offset + cur;
+               map->m_llen = 0;
+               err = z_erofs_map_blocks_iter(inode, map, 0);
+               if (err)
+                       goto err_out;
+       } else {
+               if (fe->pcl)
+                       goto hitted;
+               /* didn't get a valid pcluster previously (very rare) */
+       }

-restart_now:
        if (!(map->m_flags & EROFS_MAP_MAPPED))
                goto hitted;
@@ -766,7 +741,7 @@ retry:
        /* bump up the number of spiltted parts of a page */
        ++spiltted;
        /* also update nr_pages */
-       fe->cl->nr_pages = max_t(pgoff_t, fe->cl->nr_pages, index + 1);
+       fe->pcl->nr_pages = max_t(pgoff_t, fe->pcl->nr_pages, index + 1);
 next_part:
        /* can be used for verification */
        map->m_llen = offset + cur - map->m_la;
@@ -821,15 +796,13 @@ static int z_erofs_decompress_pcluster(struct super_block *sb,

        enum z_erofs_page_type page_type;
        bool overlapped, partial;
-       struct z_erofs_collection *cl;
        int err;

        might_sleep();
-       cl = z_erofs_primarycollection(pcl);
-       DBG_BUGON(!READ_ONCE(cl->nr_pages));
+       DBG_BUGON(!READ_ONCE(pcl->nr_pages));

-       mutex_lock(&cl->lock);
-       nr_pages = cl->nr_pages;
+       mutex_lock(&pcl->lock);
+       nr_pages = pcl->nr_pages;

        if (nr_pages <= Z_EROFS_VMAP_ONSTACK_PAGES) {
                pages = pages_onstack;
@@ -857,9 +830,9 @@ static int z_erofs_decompress_pcluster(struct super_block *sb,

        err = 0;
        z_erofs_pagevec_ctor_init(&ctor, Z_EROFS_NR_INLINE_PAGEVECS,
-                                 cl->pagevec, 0);
+                                 pcl->pagevec, 0);

-       for (i = 0; i < cl->vcnt; ++i) {
+       for (i = 0; i < pcl->vcnt; ++i) {
                unsigned int pagenr;

                page = z_erofs_pagevec_dequeue(&ctor, &page_type);
@@ -945,11 +918,11 @@ static int z_erofs_decompress_pcluster(struct super_block *sb,
                goto out;

        llen = pcl->length >> Z_EROFS_PCLUSTER_LENGTH_BIT;
-       if (nr_pages << PAGE_SHIFT >= cl->pageofs + llen) {
+       if (nr_pages << PAGE_SHIFT >= pcl->pageofs_out + llen) {
                outputsize = llen;
                partial = !(pcl->length & Z_EROFS_PCLUSTER_FULL_LENGTH);
        } else {
-               outputsize = (nr_pages << PAGE_SHIFT) - cl->pageofs;
+               outputsize = (nr_pages << PAGE_SHIFT) - pcl->pageofs_out;
                partial = true;
        }

@@ -963,7 +936,7 @@ static int z_erofs_decompress_pcluster(struct super_block *sb,
                                        .in = compressed_pages,
                                        .out = pages,
                                        .pageofs_in = pcl->pageofs_in,
-                                       .pageofs_out = cl->pageofs,
+                                       .pageofs_out = pcl->pageofs_out,
                                        .inputsize = inputsize,
                                        .outputsize = outputsize,
                                        .alg = pcl->algorithmformat,
@@ -1012,16 +985,12 @@ out:
        else if (pages != pages_onstack)
                kvfree(pages);

-       cl->nr_pages = 0;
-       cl->vcnt = 0;
+       pcl->nr_pages = 0;
+       pcl->vcnt = 0;

-       /* all cl locks MUST be taken before the following line */
+       /* pcluster lock MUST be taken before the following line */
        WRITE_ONCE(pcl->next, Z_EROFS_PCLUSTER_NIL);

-       /* all cl locks SHOULD be released right now */
-       mutex_unlock(&cl->lock);
-
-       z_erofs_collection_put(cl);
+       mutex_unlock(&pcl->lock);
        return err;
 }
@@ -1043,6 +1012,7 @@ static void z_erofs_decompress_queue(const struct z_erofs_decompressqueue *io,
                owned = READ_ONCE(pcl->next);

                z_erofs_decompress_pcluster(io->sb, pcl, pagepool);
+               erofs_workgroup_put(&pcl->obj);
        }
 }

@@ -1466,22 +1436,19 @@ static void z_erofs_pcluster_readmore(struct z_erofs_decompress_frontend *f,
                struct page *page;

                page = erofs_grab_cache_page_nowait(inode->i_mapping, index);
-               if (!page)
-                       goto skip;
-
-               if (PageUptodate(page)) {
-                       unlock_page(page);
-                       put_page(page);
-                       goto skip;
-               }
-
-               err = z_erofs_do_read_page(f, page, pagepool);
-               if (err)
-                       erofs_err(inode->i_sb,
-                                 "readmore error at page %lu @ nid %llu",
-                                 index, EROFS_I(inode)->nid);
-               put_page(page);
-skip:
+               if (page) {
+                       if (PageUptodate(page)) {
+                               unlock_page(page);
+                       } else {
+                               err = z_erofs_do_read_page(f, page, pagepool);
+                               if (err)
+                                       erofs_err(inode->i_sb,
+                                                 "readmore error at page %lu @ nid %llu",
+                                                 index, EROFS_I(inode)->nid);
+                       }
+                       put_page(page);
+               }
+
                if (cur < PAGE_SIZE)
                        break;
                cur = (index << PAGE_SHIFT) - 1;
@@ -12,21 +12,40 @@
 #define Z_EROFS_PCLUSTER_MAX_PAGES     (Z_EROFS_PCLUSTER_MAX_SIZE / PAGE_SIZE)
 #define Z_EROFS_NR_INLINE_PAGEVECS      3

+#define Z_EROFS_PCLUSTER_FULL_LENGTH    0x00000001
+#define Z_EROFS_PCLUSTER_LENGTH_BIT     1
+
+/*
+ * let's leave a type here in case of introducing
+ * another tagged pointer later.
+ */
+typedef void *z_erofs_next_pcluster_t;
+
 /*
  * Structure fields follow one of the following exclusion rules.
  *
  * I: Modifiable by initialization/destruction paths and read-only
  *    for everyone else;
  *
- * L: Field should be protected by pageset lock;
+ * L: Field should be protected by the pcluster lock;
  *
  * A: Field should be accessed / updated in atomic for parallelized code.
  */
-struct z_erofs_collection {
+struct z_erofs_pcluster {
+       struct erofs_workgroup obj;
        struct mutex lock;

+       /* A: point to next chained pcluster or TAILs */
+       z_erofs_next_pcluster_t next;
+
+       /* A: lower limit of decompressed length and if full length or not */
+       unsigned int length;
+
        /* I: page offset of start position of decompression */
-       unsigned short pageofs;
+       unsigned short pageofs_out;
+
+       /* I: page offset of inline compressed data */
+       unsigned short pageofs_in;

        /* L: maximum relative page index in pagevec[] */
        unsigned short nr_pages;
@@ -41,29 +60,6 @@ struct z_erofs_collection {
                /* I: can be used to free the pcluster by RCU. */
                struct rcu_head rcu;
        };
-};
-
-#define Z_EROFS_PCLUSTER_FULL_LENGTH    0x00000001
-#define Z_EROFS_PCLUSTER_LENGTH_BIT     1
-
-/*
- * let's leave a type here in case of introducing
- * another tagged pointer later.
- */
-typedef void *z_erofs_next_pcluster_t;
-
-struct z_erofs_pcluster {
-       struct erofs_workgroup obj;
-       struct z_erofs_collection primary_collection;
-
-       /* A: point to next chained pcluster or TAILs */
-       z_erofs_next_pcluster_t next;
-
-       /* A: lower limit of decompressed length and if full length or not */
-       unsigned int length;
-
-       /* I: page offset of inline compressed data */
-       unsigned short pageofs_in;
-

        union {
                /* I: physical cluster size in pages */
@@ -80,8 +76,6 @@ struct z_erofs_pcluster {
        struct page *compressed_pages[];
 };

-#define z_erofs_primarycollection(pcluster) (&(pcluster)->primary_collection)
-
 /* let's avoid the valid 32-bit kernel addresses */

 /* the chained workgroup has't submitted io (still open) */