Merge tag 'xfs-5.1-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "Here are a number of new features and bug fixes for 5.1. They've
  undergone a week's worth of fstesting and merge cleanly with master
  as of this morning. Most of the changes center on improving metadata
  validation and fixing problems with online fsck, though there's also
  a new cache to speed up unlinked inode handling and cleanup of the
  copy on write code in preparation for future features.

  Changes for Linux 5.1:

   - Fix online fsck to handle inode btrees correctly on 64k block
     filesystems

   - Teach online fsck to check directory and attribute names for
     invalid characters

   - Miscellaneous fixes for online fsck

   - Introduce a new panic mask so that we can halt immediately on
     metadata corruption (for debugging purposes)

   - Fix a block mapping race during writeback

   - Cache unlinked inode list backrefs in memory to speed up list
     processing

   - Separate the bnobt/cntbt and inobt/finobt buffer verifiers so that
     we can detect crosslinked btrees

   - Refactor magic number verification so that we can standardize it

   - Strengthen ondisk metadata structure offset build time verification

   - Fix a memory corruption problem in the listxattr code

   - Fix a shutdown problem during log recovery due to unreserved finobt
     expansion

   - Fix a referential integrity problem where O_TMPFILE inodes were put
     on the unlinked list with nlink > 0, which would cause asserts
     during log recovery if the system went down immediately

   - Refactor the delayed allocation allocator to be more clever about
     the possibility that its mapping might be stale

   - Various fixes to the copy on write mechanism

   - Make CoW preallocation suitable for use even with writes that
     wouldn't otherwise require it

   - Refactor an internal API

   - Fix some statx implementation bugs

   - Fix miscellaneous compiler and static checker complaints"

* tag 'xfs-5.1-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (70 commits)
  xfs: fix reporting supported extra file attributes for statx()
  xfs: fix backwards endian conversion in scrub
  xfs: fix uninitialized error variables
  xfs: rework breaking of shared extents in xfs_file_iomap_begin
  xfs: don't pass iomap flags to xfs_reflink_allocate_cow
  xfs: fix uninitialized error variable
  xfs: introduce an always_cow mode
  xfs: report IOMAP_F_SHARED from xfs_file_iomap_begin_delay
  xfs: make COW fork unwritten extent conversions more robust
  xfs: merge COW handling into xfs_file_iomap_begin_delay
  xfs: also truncate holes covered by COW blocks
  xfs: don't use delalloc extents for COW on files with extsize hints
  xfs: fix SEEK_DATA for speculative COW fork preallocation
  xfs: make xfs_bmbt_to_iomap more useful
  xfs: fix xfs_buf magic number endian checks
  xfs: retry COW fork delalloc conversion when no extent was found
  xfs: remove the truncate short cut in xfs_map_blocks
  xfs: move xfs_iomap_write_allocate to xfs_aops.c
  xfs: move stat accounting to xfs_bmapi_convert_delalloc
  xfs: move transaction handling to xfs_bmapi_convert_delalloc
  ...
commit 9e1fd794cb
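Most of the verifier changes below funnel each block type's magic-number check through a common helper driven by a new .magic/.magic16 pair in struct xfs_buf_ops: entry [0] holds the expected pre-CRC (v4) magic and entry [1] the CRC-enabled (v5) magic. The following self-contained userspace sketch shows the shape of that lookup; the names mirror the kernel code, but this is an illustration, not the verbatim fs/xfs implementation.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct buf_ops {
	const char	*name;
	uint32_t	magic[2];	/* [0] = pre-CRC magic, [1] = CRC magic */
};

struct buf {
	const struct buf_ops	*ops;
	bool			has_crc;	/* stands in for xfs_sb_version_hascrc() */
};

/* Pick the expected magic based on the filesystem's CRC feature bit. */
static bool verify_magic(const struct buf *bp, uint32_t dmagic)
{
	int idx = bp->has_crc ? 1 : 0;

	if (!bp->ops || !bp->ops->magic[idx])
		return false;
	return dmagic == bp->ops->magic[idx];
}

int main(void)
{
	static const struct buf_ops bnobt_ops = {
		.name	= "xfs_bnobt",
		.magic	= { 0x41425442 /* "ABTB" */, 0x41423342 /* "AB3B" */ },
	};
	struct buf bp = { .ops = &bnobt_ops, .has_crc = true };

	printf("%d\n", verify_magic(&bp, 0x41423342));	/* 1: CRC magic matches */
	printf("%d\n", verify_magic(&bp, 0x41425442));	/* 0: wrong format variant */
	return 0;
}

Because every verifier now consults its own ops table, a cntbt block that lands in a buffer attached to the bnobt ops fails the magic check outright, which is how the split verifiers catch crosslinked btrees.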
@@ -272,7 +272,7 @@ The following sysctls are available for the XFS filesystem:
 	XFS_ERRLEVEL_LOW:	1
 	XFS_ERRLEVEL_HIGH:	5
 
-  fs.xfs.panic_mask		(Min: 0  Default: 0  Max: 255)
+  fs.xfs.panic_mask		(Min: 0  Default: 0  Max: 256)
 	Causes certain error conditions to call BUG(). Value is a bitmask;
 	OR together the tags which represent errors which should cause panics:
 
@@ -285,6 +285,7 @@ The following sysctls are available for the XFS filesystem:
 	XFS_PTAG_SHUTDOWN_IOERROR	0x00000020
 	XFS_PTAG_SHUTDOWN_LOGERROR	0x00000040
 	XFS_PTAG_FSBLOCK_ZERO		0x00000080
+	XFS_PTAG_VERIFIER_ERROR		0x00000100
 
 	This option is intended for debugging only.
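With the new XFS_PTAG_VERIFIER_ERROR bit, a debugging host can be made to BUG() the moment a write verifier trips. A small illustrative helper that ORs tag bits together and writes them to the sysctl; the tag values come from the table above, while the program itself is an assumption for illustration, not part of the kernel patch:

#include <stdio.h>

#define XFS_PTAG_SHUTDOWN_LOGERROR	0x00000040
#define XFS_PTAG_VERIFIER_ERROR		0x00000100

int main(void)
{
	unsigned int mask = XFS_PTAG_SHUTDOWN_LOGERROR | XFS_PTAG_VERIFIER_ERROR;
	FILE *f = fopen("/proc/sys/fs/xfs/panic_mask", "w");

	if (!f)
		return 1;
	fprintf(f, "%u\n", mask);	/* 0x140 = 320 */
	return fclose(f) ? 1 : 0;
}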
@@ -339,14 +339,14 @@ xfs_ag_init_headers(
 	{ /* BNO root block */
 		.daddr = XFS_AGB_TO_DADDR(mp, id->agno, XFS_BNO_BLOCK(mp)),
 		.numblks = BTOBB(mp->m_sb.sb_blocksize),
-		.ops = &xfs_allocbt_buf_ops,
+		.ops = &xfs_bnobt_buf_ops,
 		.work = &xfs_bnoroot_init,
 		.need_init = true
 	},
 	{ /* CNT root block */
 		.daddr = XFS_AGB_TO_DADDR(mp, id->agno, XFS_CNT_BLOCK(mp)),
 		.numblks = BTOBB(mp->m_sb.sb_blocksize),
-		.ops = &xfs_allocbt_buf_ops,
+		.ops = &xfs_cntbt_buf_ops,
 		.work = &xfs_cntroot_init,
 		.need_init = true
 	},
@@ -361,7 +361,7 @@ xfs_ag_init_headers(
 	{ /* FINO root block */
 		.daddr = XFS_AGB_TO_DADDR(mp, id->agno, XFS_FIBT_BLOCK(mp)),
 		.numblks = BTOBB(mp->m_sb.sb_blocksize),
-		.ops = &xfs_inobt_buf_ops,
+		.ops = &xfs_finobt_buf_ops,
 		.work = &xfs_btroot_init,
 		.type = XFS_BTNUM_FINO,
 		.need_init = xfs_sb_version_hasfinobt(&mp->m_sb)
@@ -281,7 +281,7 @@ xfs_ag_resv_init(
 	 */
 	ask = used = 0;
 
-	mp->m_inotbt_nores = true;
+	mp->m_finobt_nores = true;
 
 	error = xfs_refcountbt_calc_reserves(mp, tp, agno, &ask,
 			&used);
@@ -568,9 +568,9 @@ xfs_agfl_verify(
 	if (!xfs_sb_version_hascrc(&mp->m_sb))
 		return NULL;
 
-	if (!uuid_equal(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid))
+	if (!xfs_verify_magic(bp, agfl->agfl_magicnum))
 		return __this_address;
-	if (be32_to_cpu(agfl->agfl_magicnum) != XFS_AGFL_MAGIC)
+	if (!uuid_equal(&agfl->agfl_uuid, &mp->m_sb.sb_meta_uuid))
 		return __this_address;
 	/*
 	 * during growfs operations, the perag is not fully initialised,
@@ -643,6 +643,7 @@ xfs_agfl_write_verify(
 
 const struct xfs_buf_ops xfs_agfl_buf_ops = {
 	.name = "xfs_agfl",
+	.magic = { cpu_to_be32(XFS_AGFL_MAGIC), cpu_to_be32(XFS_AGFL_MAGIC) },
 	.verify_read = xfs_agfl_read_verify,
 	.verify_write = xfs_agfl_write_verify,
 	.verify_struct = xfs_agfl_verify,
@@ -2587,8 +2588,10 @@ xfs_agf_verify(
 		return __this_address;
 	}
 
-	if (!(agf->agf_magicnum == cpu_to_be32(XFS_AGF_MAGIC) &&
-	      XFS_AGF_GOOD_VERSION(be32_to_cpu(agf->agf_versionnum)) &&
+	if (!xfs_verify_magic(bp, agf->agf_magicnum))
+		return __this_address;
+
+	if (!(XFS_AGF_GOOD_VERSION(be32_to_cpu(agf->agf_versionnum)) &&
 	      be32_to_cpu(agf->agf_freeblks) <= be32_to_cpu(agf->agf_length) &&
 	      be32_to_cpu(agf->agf_flfirst) < xfs_agfl_size(mp) &&
 	      be32_to_cpu(agf->agf_fllast) < xfs_agfl_size(mp) &&
@@ -2670,6 +2673,7 @@ xfs_agf_write_verify(
 
 const struct xfs_buf_ops xfs_agf_buf_ops = {
 	.name = "xfs_agf",
+	.magic = { cpu_to_be32(XFS_AGF_MAGIC), cpu_to_be32(XFS_AGF_MAGIC) },
 	.verify_read = xfs_agf_read_verify,
 	.verify_write = xfs_agf_write_verify,
 	.verify_struct = xfs_agf_verify,
@@ -297,48 +297,34 @@ xfs_allocbt_verify(
 	struct xfs_perag	*pag = bp->b_pag;
 	xfs_failaddr_t		fa;
 	unsigned int		level;
+	xfs_btnum_t		btnum = XFS_BTNUM_BNOi;
+
+	if (!xfs_verify_magic(bp, block->bb_magic))
+		return __this_address;
+
+	if (xfs_sb_version_hascrc(&mp->m_sb)) {
+		fa = xfs_btree_sblock_v5hdr_verify(bp);
+		if (fa)
+			return fa;
+	}
 
 	/*
-	 * magic number and level verification
+	 * The perag may not be attached during grow operations or fully
+	 * initialized from the AGF during log recovery. Therefore we can only
+	 * check against maximum tree depth from those contexts.
 	 *
-	 * During growfs operations, we can't verify the exact level or owner as
-	 * the perag is not fully initialised and hence not attached to the
-	 * buffer. In this case, check against the maximum tree depth.
-	 *
-	 * Similarly, during log recovery we will have a perag structure
-	 * attached, but the agf information will not yet have been initialised
-	 * from the on disk AGF. Again, we can only check against maximum limits
-	 * in this case.
+	 * Otherwise check against the per-tree limit. Peek at one of the
+	 * verifier magic values to determine the type of tree we're verifying
+	 * against.
 	 */
 	level = be16_to_cpu(block->bb_level);
-	switch (block->bb_magic) {
-	case cpu_to_be32(XFS_ABTB_CRC_MAGIC):
-		fa = xfs_btree_sblock_v5hdr_verify(bp);
-		if (fa)
-			return fa;
-		/* fall through */
-	case cpu_to_be32(XFS_ABTB_MAGIC):
-		if (pag && pag->pagf_init) {
-			if (level >= pag->pagf_levels[XFS_BTNUM_BNOi])
-				return __this_address;
-		} else if (level >= mp->m_ag_maxlevels)
+	if (bp->b_ops->magic[0] == cpu_to_be32(XFS_ABTC_MAGIC))
+		btnum = XFS_BTNUM_CNTi;
+	if (pag && pag->pagf_init) {
+		if (level >= pag->pagf_levels[btnum])
 			return __this_address;
-		break;
-	case cpu_to_be32(XFS_ABTC_CRC_MAGIC):
-		fa = xfs_btree_sblock_v5hdr_verify(bp);
-		if (fa)
-			return fa;
-		/* fall through */
-	case cpu_to_be32(XFS_ABTC_MAGIC):
-		if (pag && pag->pagf_init) {
-			if (level >= pag->pagf_levels[XFS_BTNUM_CNTi])
-				return __this_address;
-		} else if (level >= mp->m_ag_maxlevels)
-			return __this_address;
-		break;
-	default:
+	} else if (level >= mp->m_ag_maxlevels)
 		return __this_address;
-	}
 
 	return xfs_btree_sblock_verify(bp, mp->m_alloc_mxr[level != 0]);
 }
@@ -377,13 +363,23 @@ xfs_allocbt_write_verify(
 
 }
 
-const struct xfs_buf_ops xfs_allocbt_buf_ops = {
-	.name = "xfs_allocbt",
+const struct xfs_buf_ops xfs_bnobt_buf_ops = {
+	.name = "xfs_bnobt",
+	.magic = { cpu_to_be32(XFS_ABTB_MAGIC),
+		   cpu_to_be32(XFS_ABTB_CRC_MAGIC) },
+	.verify_read = xfs_allocbt_read_verify,
+	.verify_write = xfs_allocbt_write_verify,
+	.verify_struct = xfs_allocbt_verify,
+};
+
+const struct xfs_buf_ops xfs_cntbt_buf_ops = {
+	.name = "xfs_cntbt",
+	.magic = { cpu_to_be32(XFS_ABTC_MAGIC),
+		   cpu_to_be32(XFS_ABTC_CRC_MAGIC) },
 	.verify_read = xfs_allocbt_read_verify,
 	.verify_write = xfs_allocbt_write_verify,
 	.verify_struct = xfs_allocbt_verify,
 };
 
 STATIC int
 xfs_bnobt_keys_inorder(
@@ -448,7 +444,7 @@ static const struct xfs_btree_ops xfs_bnobt_ops = {
 	.init_rec_from_cur	= xfs_allocbt_init_rec_from_cur,
 	.init_ptr_from_cur	= xfs_allocbt_init_ptr_from_cur,
 	.key_diff		= xfs_bnobt_key_diff,
-	.buf_ops		= &xfs_allocbt_buf_ops,
+	.buf_ops		= &xfs_bnobt_buf_ops,
 	.diff_two_keys		= xfs_bnobt_diff_two_keys,
 	.keys_inorder		= xfs_bnobt_keys_inorder,
 	.recs_inorder		= xfs_bnobt_recs_inorder,
@@ -470,7 +466,7 @@ static const struct xfs_btree_ops xfs_cntbt_ops = {
 	.init_rec_from_cur	= xfs_allocbt_init_rec_from_cur,
 	.init_ptr_from_cur	= xfs_allocbt_init_ptr_from_cur,
 	.key_diff		= xfs_cntbt_key_diff,
-	.buf_ops		= &xfs_allocbt_buf_ops,
+	.buf_ops		= &xfs_cntbt_buf_ops,
 	.diff_two_keys		= xfs_cntbt_diff_two_keys,
 	.keys_inorder		= xfs_cntbt_keys_inorder,
 	.recs_inorder		= xfs_cntbt_recs_inorder,
@@ -1336,3 +1336,20 @@ xfs_attr_node_get(xfs_da_args_t *args)
 	xfs_da_state_free(state);
 	return retval;
 }
+
+/* Returns true if the attribute entry name is valid. */
+bool
+xfs_attr_namecheck(
+	const void	*name,
+	size_t		length)
+{
+	/*
+	 * MAXNAMELEN includes the trailing null, but (name/length) leave it
+	 * out, so use >= for the length check.
+	 */
+	if (length >= MAXNAMELEN)
+		return false;
+
+	/* There shouldn't be any nulls here */
+	return !memchr(name, 0, length);
+}
@@ -145,6 +145,6 @@ int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
 int xfs_attr_remove_args(struct xfs_da_args *args);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
+bool xfs_attr_namecheck(const void *name, size_t length);
 
 #endif	/* __XFS_ATTR_H__ */
@@ -245,25 +245,14 @@ xfs_attr3_leaf_verify(
 	struct xfs_attr_leaf_entry	*entries;
 	uint32_t			end;	/* must be 32bit - see below */
 	int				i;
+	xfs_failaddr_t			fa;
 
 	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &ichdr, leaf);
 
-	if (xfs_sb_version_hascrc(&mp->m_sb)) {
-		struct xfs_da3_node_hdr *hdr3 = bp->b_addr;
-
-		if (ichdr.magic != XFS_ATTR3_LEAF_MAGIC)
-			return __this_address;
+	fa = xfs_da3_blkinfo_verify(bp, bp->b_addr);
+	if (fa)
+		return fa;
 
-		if (!uuid_equal(&hdr3->info.uuid, &mp->m_sb.sb_meta_uuid))
-			return __this_address;
-		if (be64_to_cpu(hdr3->info.blkno) != bp->b_bn)
-			return __this_address;
-		if (!xfs_log_check_lsn(mp, be64_to_cpu(hdr3->info.lsn)))
-			return __this_address;
-	} else {
-		if (ichdr.magic != XFS_ATTR_LEAF_MAGIC)
-			return __this_address;
-	}
 	/*
 	 * In recovery there is a transient state where count == 0 is valid
 	 * because we may have transitioned an empty shortform attr to a leaf
@@ -369,6 +358,8 @@ xfs_attr3_leaf_read_verify(
 
 const struct xfs_buf_ops xfs_attr3_leaf_buf_ops = {
 	.name = "xfs_attr3_leaf",
+	.magic16 = { cpu_to_be16(XFS_ATTR_LEAF_MAGIC),
+		     cpu_to_be16(XFS_ATTR3_LEAF_MAGIC) },
 	.verify_read = xfs_attr3_leaf_read_verify,
 	.verify_write = xfs_attr3_leaf_write_verify,
 	.verify_struct = xfs_attr3_leaf_verify,
@@ -79,6 +79,7 @@ xfs_attr3_rmt_hdr_ok(
 static xfs_failaddr_t
 xfs_attr3_rmt_verify(
 	struct xfs_mount	*mp,
+	struct xfs_buf		*bp,
 	void			*ptr,
 	int			fsbsize,
 	xfs_daddr_t		bno)
@@ -87,7 +88,7 @@ xfs_attr3_rmt_verify(
 
 	if (!xfs_sb_version_hascrc(&mp->m_sb))
 		return __this_address;
-	if (rmt->rm_magic != cpu_to_be32(XFS_ATTR3_RMT_MAGIC))
+	if (!xfs_verify_magic(bp, rmt->rm_magic))
 		return __this_address;
 	if (!uuid_equal(&rmt->rm_uuid, &mp->m_sb.sb_meta_uuid))
 		return __this_address;
@@ -131,7 +132,7 @@ __xfs_attr3_rmt_read_verify(
 			*failaddr = __this_address;
 			return -EFSBADCRC;
 		}
-		*failaddr = xfs_attr3_rmt_verify(mp, ptr, blksize, bno);
+		*failaddr = xfs_attr3_rmt_verify(mp, bp, ptr, blksize, bno);
 		if (*failaddr)
 			return -EFSCORRUPTED;
 		len -= blksize;
@@ -193,7 +194,7 @@ xfs_attr3_rmt_write_verify(
 	while (len > 0) {
 		struct xfs_attr3_rmt_hdr *rmt = (struct xfs_attr3_rmt_hdr *)ptr;
 
-		fa = xfs_attr3_rmt_verify(mp, ptr, blksize, bno);
+		fa = xfs_attr3_rmt_verify(mp, bp, ptr, blksize, bno);
 		if (fa) {
 			xfs_verifier_error(bp, -EFSCORRUPTED, fa);
 			return;
@@ -220,6 +221,7 @@ xfs_attr3_rmt_write_verify(
 
 const struct xfs_buf_ops xfs_attr3_rmt_buf_ops = {
 	.name = "xfs_attr3_rmt",
+	.magic = { 0, cpu_to_be32(XFS_ATTR3_RMT_MAGIC) },
 	.verify_read = xfs_attr3_rmt_read_verify,
 	.verify_write = xfs_attr3_rmt_write_verify,
 	.verify_struct = xfs_attr3_rmt_verify_struct,
@@ -577,42 +577,44 @@ __xfs_bmap_add_free(
  */
 
 /*
- * Transform a btree format file with only one leaf node, where the
- * extents list will fit in the inode, into an extents format file.
- * Since the file extents are already in-core, all we have to do is
- * give up the space for the btree root and pitch the leaf block.
+ * Convert the inode format to extent format if it currently is in btree format,
+ * but the extent list is small enough that it fits into the extent format.
+ *
+ * Since the extents are already in-core, all we have to do is give up the space
+ * for the btree root and pitch the leaf block.
  */
 STATIC int				/* error */
 xfs_bmap_btree_to_extents(
-	xfs_trans_t		*tp,	/* transaction pointer */
-	xfs_inode_t		*ip,	/* incore inode pointer */
-	xfs_btree_cur_t		*cur,	/* btree cursor */
+	struct xfs_trans	*tp,	/* transaction pointer */
+	struct xfs_inode	*ip,	/* incore inode pointer */
+	struct xfs_btree_cur	*cur,	/* btree cursor */
 	int			*logflagsp, /* inode logging flags */
 	int			whichfork)  /* data or attr fork */
 {
 	/* REFERENCED */
+	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_btree_block	*rblock = ifp->if_broot;
 	struct xfs_btree_block	*cblock;/* child btree block */
 	xfs_fsblock_t		cbno;	/* child block number */
 	xfs_buf_t		*cbp;	/* child block's buffer */
 	int			error;	/* error return value */
-	struct xfs_ifork	*ifp;	/* inode fork data */
-	xfs_mount_t		*mp;	/* mount point structure */
 	__be64			*pp;	/* ptr to block address */
-	struct xfs_btree_block	*rblock;/* root btree block */
 	struct xfs_owner_info	oinfo;
 
-	mp = ip->i_mount;
-	ifp = XFS_IFORK_PTR(ip, whichfork);
+	/* check if we actually need the extent format first: */
+	if (!xfs_bmap_wants_extents(ip, whichfork))
+		return 0;
+
+	ASSERT(cur);
 	ASSERT(whichfork != XFS_COW_FORK);
 	ASSERT(ifp->if_flags & XFS_IFEXTENTS);
 	ASSERT(XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_BTREE);
-	rblock = ifp->if_broot;
 	ASSERT(be16_to_cpu(rblock->bb_level) == 1);
 	ASSERT(be16_to_cpu(rblock->bb_numrecs) == 1);
 	ASSERT(xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, 0) == 1);
 
 	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, rblock, 1, ifp->if_broot_bytes);
 	cbno = be64_to_cpu(*pp);
+	*logflagsp = 0;
 #ifdef DEBUG
 	XFS_WANT_CORRUPTED_RETURN(cur->bc_mp,
 			xfs_btree_check_lptr(cur, cbno, 1));
@@ -635,7 +637,7 @@ xfs_bmap_btree_to_extents(
 	ASSERT(ifp->if_broot == NULL);
 	ASSERT((ifp->if_flags & XFS_IFBROOT) == 0);
 	XFS_IFORK_FMT_SET(ip, whichfork, XFS_DINODE_FMT_EXTENTS);
-	*logflagsp = XFS_ILOG_CORE | xfs_ilog_fext(whichfork);
+	*logflagsp |= XFS_ILOG_CORE | xfs_ilog_fext(whichfork);
 	return 0;
 }
@@ -2029,7 +2031,7 @@ done:
 /*
  * Convert an unwritten allocation to a real allocation or vice versa.
  */
-STATIC int				/* error */
+int					/* error */
 xfs_bmap_add_extent_unwritten_real(
 	struct xfs_trans	*tp,
 	xfs_inode_t		*ip,	/* incore inode pointer */
@@ -3685,17 +3687,6 @@ xfs_trim_extent(
 	}
 }
 
-/* trim extent to within eof */
-void
-xfs_trim_extent_eof(
-	struct xfs_bmbt_irec	*irec,
-	struct xfs_inode	*ip)
-
-{
-	xfs_trim_extent(irec, 0, XFS_B_TO_FSB(ip->i_mount,
-			i_size_read(VFS_I(ip))));
-}
-
 /*
  * Trim the returned map to the required bounds
  */
@@ -4203,6 +4194,44 @@ xfs_bmapi_convert_unwritten(
 	return 0;
 }
 
+static inline xfs_extlen_t
+xfs_bmapi_minleft(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	int			fork)
+{
+	if (tp && tp->t_firstblock != NULLFSBLOCK)
+		return 0;
+	if (XFS_IFORK_FORMAT(ip, fork) != XFS_DINODE_FMT_BTREE)
+		return 1;
+	return be16_to_cpu(XFS_IFORK_PTR(ip, fork)->if_broot->bb_level) + 1;
+}
+
+/*
+ * Log whatever the flags say, even if error. Otherwise we might miss detecting
+ * a case where the data is changed, there's an error, and it's not logged so we
+ * don't shutdown when we should. Don't bother logging extents/btree changes if
+ * we converted to the other format.
+ */
+static void
+xfs_bmapi_finish(
+	struct xfs_bmalloca	*bma,
+	int			whichfork,
+	int			error)
+{
+	if ((bma->logflags & xfs_ilog_fext(whichfork)) &&
+	    XFS_IFORK_FORMAT(bma->ip, whichfork) != XFS_DINODE_FMT_EXTENTS)
+		bma->logflags &= ~xfs_ilog_fext(whichfork);
+	else if ((bma->logflags & xfs_ilog_fbroot(whichfork)) &&
+		 XFS_IFORK_FORMAT(bma->ip, whichfork) != XFS_DINODE_FMT_BTREE)
+		bma->logflags &= ~xfs_ilog_fbroot(whichfork);
+
+	if (bma->logflags)
+		xfs_trans_log_inode(bma->tp, bma->ip, bma->logflags);
+	if (bma->cur)
+		xfs_btree_del_cursor(bma->cur, error);
+}
+
 /*
  * Map file blocks to filesystem blocks, and allocate blocks or convert the
  * extent state if necessary.  Details behaviour is controlled by the flags
@@ -4247,9 +4276,7 @@ xfs_bmapi_write(
 
 	ASSERT(*nmap >= 1);
 	ASSERT(*nmap <= XFS_BMAP_MAX_NMAP);
-	ASSERT(tp != NULL ||
-	       (flags & (XFS_BMAPI_CONVERT | XFS_BMAPI_COWFORK)) ==
-			(XFS_BMAPI_CONVERT | XFS_BMAPI_COWFORK));
+	ASSERT(tp != NULL);
 	ASSERT(len > 0);
 	ASSERT(XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_LOCAL);
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
@@ -4282,25 +4309,12 @@ xfs_bmapi_write(
 
 	XFS_STATS_INC(mp, xs_blk_mapw);
 
-	if (!tp || tp->t_firstblock == NULLFSBLOCK) {
-		if (XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_BTREE)
-			bma.minleft = be16_to_cpu(ifp->if_broot->bb_level) + 1;
-		else
-			bma.minleft = 1;
-	} else {
-		bma.minleft = 0;
-	}
-
 	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
 		error = xfs_iread_extents(tp, ip, whichfork);
 		if (error)
 			goto error0;
 	}
 
-	n = 0;
-	end = bno + len;
-	obno = bno;
-
 	if (!xfs_iext_lookup_extent(ip, ifp, bno, &bma.icur, &bma.got))
 		eof = true;
 	if (!xfs_iext_peek_prev_extent(ifp, &bma.icur, &bma.prev))
@@ -4309,7 +4323,11 @@ xfs_bmapi_write(
 	bma.ip = ip;
 	bma.total = total;
 	bma.datatype = 0;
+	bma.minleft = xfs_bmapi_minleft(tp, ip, whichfork);
+
+	n = 0;
+	end = bno + len;
+	obno = bno;
 
 	while (bno < end && n < *nmap) {
 		bool			need_alloc = false, wasdelay = false;
@@ -4323,26 +4341,7 @@ xfs_bmapi_write(
 			ASSERT(!((flags & XFS_BMAPI_CONVERT) &&
 				 (flags & XFS_BMAPI_COWFORK)));
 
-			if (flags & XFS_BMAPI_DELALLOC) {
-				/*
-				 * For the COW fork we can reasonably get a
-				 * request for converting an extent that races
-				 * with other threads already having converted
-				 * part of it, as there converting COW to
-				 * regular blocks is not protected using the
-				 * IOLOCK.
-				 */
-				ASSERT(flags & XFS_BMAPI_COWFORK);
-				if (!(flags & XFS_BMAPI_COWFORK)) {
-					error = -EIO;
-					goto error0;
-				}
-
-				if (eof || bno >= end)
-					break;
-			} else {
-				need_alloc = true;
-			}
+			need_alloc = true;
 		} else if (isnullstartblock(bma.got.br_startblock)) {
 			wasdelay = true;
 		}
@@ -4351,8 +4350,7 @@ xfs_bmapi_write(
 		 * First, deal with the hole before the allocated space
 		 * that we found, if any.
 		 */
-		if ((need_alloc || wasdelay) &&
-		    !(flags & XFS_BMAPI_CONVERT_ONLY)) {
+		if (need_alloc || wasdelay) {
 			bma.eof = eof;
 			bma.conv = !!(flags & XFS_BMAPI_CONVERT);
 			bma.wasdel = wasdelay;
@@ -4420,49 +4418,130 @@ xfs_bmapi_write(
 	}
 	*nmap = n;
 
-	/*
-	 * Transform from btree to extents, give it cur.
-	 */
-	if (xfs_bmap_wants_extents(ip, whichfork)) {
-		int		tmp_logflags = 0;
-
-		ASSERT(bma.cur);
-		error = xfs_bmap_btree_to_extents(tp, ip, bma.cur,
-			&tmp_logflags, whichfork);
-		bma.logflags |= tmp_logflags;
-		if (error)
-			goto error0;
-	}
+	error = xfs_bmap_btree_to_extents(tp, ip, bma.cur, &bma.logflags,
+			whichfork);
+	if (error)
+		goto error0;
 
 	ASSERT(XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_BTREE ||
 	       XFS_IFORK_NEXTENTS(ip, whichfork) >
 		XFS_IFORK_MAXEXT(ip, whichfork));
-	error = 0;
+	xfs_bmapi_finish(&bma, whichfork, 0);
+	xfs_bmap_validate_ret(orig_bno, orig_len, orig_flags, orig_mval,
+			orig_nmap, *nmap);
+	return 0;
 error0:
-	/*
-	 * Log everything. Do this after conversion, there's no point in
-	 * logging the extent records if we've converted to btree format.
-	 */
-	if ((bma.logflags & xfs_ilog_fext(whichfork)) &&
-	    XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_EXTENTS)
-		bma.logflags &= ~xfs_ilog_fext(whichfork);
-	else if ((bma.logflags & xfs_ilog_fbroot(whichfork)) &&
-		 XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_BTREE)
-		bma.logflags &= ~xfs_ilog_fbroot(whichfork);
-	/*
-	 * Log whatever the flags say, even if error. Otherwise we might miss
-	 * detecting a case where the data is changed, there's an error,
-	 * and it's not logged so we don't shutdown when we should.
-	 */
-	if (bma.logflags)
-		xfs_trans_log_inode(tp, ip, bma.logflags);
+	xfs_bmapi_finish(&bma, whichfork, error);
+	return error;
+}
 
-	if (bma.cur) {
-		xfs_btree_del_cursor(bma.cur, error);
-	}
-	if (!error)
-		xfs_bmap_validate_ret(orig_bno, orig_len, orig_flags, orig_mval,
-			orig_nmap, *nmap);
-	return error;
-}
+/*
+ * Convert an existing delalloc extent to real blocks based on file offset. This
+ * attempts to allocate the entire delalloc extent and may require multiple
+ * invocations to allocate the target offset if a large enough physical extent
+ * is not available.
+ */
+int
+xfs_bmapi_convert_delalloc(
+	struct xfs_inode	*ip,
+	int			whichfork,
+	xfs_fileoff_t		offset_fsb,
+	struct xfs_bmbt_irec	*imap,
+	unsigned int		*seq)
+{
+	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_bmalloca	bma = { NULL };
+	struct xfs_trans	*tp;
+	int			error;
+
+	/*
+	 * Space for the extent and indirect blocks was reserved when the
+	 * delalloc extent was created so there's no need to do so here.
+	 */
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0, 0,
+				XFS_TRANS_RESERVE, &tp);
+	if (error)
+		return error;
+
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(tp, ip, 0);
+
+	if (!xfs_iext_lookup_extent(ip, ifp, offset_fsb, &bma.icur, &bma.got) ||
+	    bma.got.br_startoff > offset_fsb) {
+		/*
+		 * No extent found in the range we are trying to convert. This
+		 * should only happen for the COW fork, where another thread
+		 * might have moved the extent to the data fork in the meantime.
+		 */
+		WARN_ON_ONCE(whichfork != XFS_COW_FORK);
+		error = -EAGAIN;
+		goto out_trans_cancel;
+	}
+
+	/*
+	 * If we find a real extent here we raced with another thread converting
+	 * the extent. Just return the real extent at this offset.
+	 */
+	if (!isnullstartblock(bma.got.br_startblock)) {
+		*imap = bma.got;
+		*seq = READ_ONCE(ifp->if_seq);
+		goto out_trans_cancel;
+	}
+
+	bma.tp = tp;
+	bma.ip = ip;
+	bma.wasdel = true;
+	bma.offset = bma.got.br_startoff;
+	bma.length = max_t(xfs_filblks_t, bma.got.br_blockcount, MAXEXTLEN);
+	bma.total = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK);
+	bma.minleft = xfs_bmapi_minleft(tp, ip, whichfork);
+	if (whichfork == XFS_COW_FORK)
+		bma.flags = XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;
+
+	if (!xfs_iext_peek_prev_extent(ifp, &bma.icur, &bma.prev))
+		bma.prev.br_startoff = NULLFILEOFF;
+
+	error = xfs_bmapi_allocate(&bma);
+	if (error)
+		goto out_finish;
+
+	error = -ENOSPC;
+	if (WARN_ON_ONCE(bma.blkno == NULLFSBLOCK))
+		goto out_finish;
+	error = -EFSCORRUPTED;
+	if (WARN_ON_ONCE(!bma.got.br_startblock && !XFS_IS_REALTIME_INODE(ip)))
+		goto out_finish;
+
+	XFS_STATS_ADD(mp, xs_xstrat_bytes, XFS_FSB_TO_B(mp, bma.length));
+	XFS_STATS_INC(mp, xs_xstrat_quick);
+
+	ASSERT(!isnullstartblock(bma.got.br_startblock));
+	*imap = bma.got;
+	*seq = READ_ONCE(ifp->if_seq);
+
+	if (whichfork == XFS_COW_FORK) {
+		error = xfs_refcount_alloc_cow_extent(tp, bma.blkno,
+				bma.length);
+		if (error)
+			goto out_finish;
+	}
+
+	error = xfs_bmap_btree_to_extents(tp, ip, bma.cur, &bma.logflags,
+			whichfork);
+	if (error)
+		goto out_finish;
+
+	xfs_bmapi_finish(&bma, whichfork, 0);
+	error = xfs_trans_commit(tp);
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	return error;
+
+out_finish:
+	xfs_bmapi_finish(&bma, whichfork, error);
+out_trans_cancel:
+	xfs_trans_cancel(tp);
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	return error;
+}
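As its comment says, xfs_bmapi_convert_delalloc() allocates from the start of the delalloc extent it finds, not from the requested offset, so callers must be prepared to invoke it again until the returned mapping finally covers the offset they care about. A userspace analogue of that retry contract, with invented names and a fixed-size stand-in allocator:

#include <stdio.h>

struct map {
	long	start;	/* first mapped block */
	long	len;	/* number of mapped blocks */
};

/* Stand-in converter: maps at most `chunk` blocks from the current cursor. */
static struct map convert_some(long *cursor, long chunk)
{
	struct map m = { .start = *cursor, .len = chunk };

	*cursor += chunk;
	return m;
}

int main(void)
{
	long target = 25, cursor = 0;
	struct map m;

	/* Retry until the returned mapping covers the target offset. */
	do {
		m = convert_some(&cursor, 10);
		printf("mapped [%ld, %ld)\n", m.start, m.start + m.len);
	} while (target < m.start || target >= m.start + m.len);
	return 0;
}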
@@ -4536,13 +4615,7 @@ xfs_bmapi_remap(
 	if (error)
 		goto error0;
 
-	if (xfs_bmap_wants_extents(ip, whichfork)) {
-		int		tmp_logflags = 0;
-
-		error = xfs_bmap_btree_to_extents(tp, ip, cur,
-			&tmp_logflags, whichfork);
-		logflags |= tmp_logflags;
-	}
+	error = xfs_bmap_btree_to_extents(tp, ip, cur, &logflags, whichfork);
 
 error0:
 	if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS)
@@ -5406,24 +5479,11 @@ nodelete:
 		error = xfs_bmap_extents_to_btree(tp, ip, &cur, 0,
 				&tmp_logflags, whichfork);
 		logflags |= tmp_logflags;
 		if (error)
 			goto error0;
-	}
-	/*
-	 * transform from btree to extents, give it cur
-	 */
-	else if (xfs_bmap_wants_extents(ip, whichfork)) {
-		ASSERT(cur != NULL);
-		error = xfs_bmap_btree_to_extents(tp, ip, cur, &tmp_logflags,
+	} else {
+		error = xfs_bmap_btree_to_extents(tp, ip, cur, &logflags,
 			whichfork);
-		logflags |= tmp_logflags;
-		if (error)
-			goto error0;
 	}
-	/*
-	 * transform from extents to local?
-	 */
-	error = 0;
 
 error0:
 	/*
 	 * Log everything. Do this after conversion, there's no point in
@@ -95,12 +95,6 @@ struct xfs_extent_free_item
 /* Map something in the CoW fork. */
 #define XFS_BMAPI_COWFORK	0x200
 
-/* Only convert delalloc space, don't allocate entirely new extents */
-#define XFS_BMAPI_DELALLOC	0x400
-
-/* Only convert unwritten extents, don't allocate new blocks */
-#define XFS_BMAPI_CONVERT_ONLY	0x800
-
 /* Skip online discard of freed extents */
 #define XFS_BMAPI_NODISCARD	0x1000
 
@@ -117,8 +111,6 @@ struct xfs_extent_free_item
 	{ XFS_BMAPI_ZERO,	"ZERO" }, \
 	{ XFS_BMAPI_REMAP,	"REMAP" }, \
 	{ XFS_BMAPI_COWFORK,	"COWFORK" }, \
-	{ XFS_BMAPI_DELALLOC,	"DELALLOC" }, \
-	{ XFS_BMAPI_CONVERT_ONLY, "CONVERT_ONLY" }, \
 	{ XFS_BMAPI_NODISCARD,	"NODISCARD" }, \
 	{ XFS_BMAPI_NORMAP,	"NORMAP" }
 
@@ -181,7 +173,6 @@ static inline bool xfs_bmap_is_real_extent(struct xfs_bmbt_irec *irec)
 
 void	xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno,
 		xfs_filblks_t len);
-void	xfs_trim_extent_eof(struct xfs_bmbt_irec *, struct xfs_inode *);
 int	xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
 int	xfs_bmap_set_attrforkoff(struct xfs_inode *ip, int size, int *version);
 void	xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
@@ -228,6 +219,13 @@ int xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork,
 		xfs_fileoff_t off, xfs_filblks_t len, xfs_filblks_t prealloc,
 		struct xfs_bmbt_irec *got, struct xfs_iext_cursor *cur,
 		int eof);
+int	xfs_bmapi_convert_delalloc(struct xfs_inode *ip, int whichfork,
+		xfs_fileoff_t offset_fsb, struct xfs_bmbt_irec *imap,
+		unsigned int *seq);
+int	xfs_bmap_add_extent_unwritten_real(struct xfs_trans *tp,
+		struct xfs_inode *ip, int whichfork,
+		struct xfs_iext_cursor *icur, struct xfs_btree_cur **curp,
+		struct xfs_bmbt_irec *new, int *logflagsp);
 
 static inline void
 xfs_bmap_add_free(
@@ -416,8 +416,10 @@ xfs_bmbt_verify(
 	xfs_failaddr_t		fa;
 	unsigned int		level;
 
-	switch (block->bb_magic) {
-	case cpu_to_be32(XFS_BMAP_CRC_MAGIC):
+	if (!xfs_verify_magic(bp, block->bb_magic))
+		return __this_address;
+
+	if (xfs_sb_version_hascrc(&mp->m_sb)) {
 		/*
 		 * XXX: need a better way of verifying the owner here. Right now
 		 * just make sure there has been one set.
@@ -425,11 +427,6 @@ xfs_bmbt_verify(
 		fa = xfs_btree_lblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN);
 		if (fa)
 			return fa;
-		/* fall through */
-	case cpu_to_be32(XFS_BMAP_MAGIC):
-		break;
-	default:
-		return __this_address;
 	}
 
 	/*
@@ -481,6 +478,8 @@ xfs_bmbt_write_verify(
 
 const struct xfs_buf_ops xfs_bmbt_buf_ops = {
 	.name = "xfs_bmbt",
+	.magic = { cpu_to_be32(XFS_BMAP_MAGIC),
+		   cpu_to_be32(XFS_BMAP_CRC_MAGIC) },
 	.verify_read = xfs_bmbt_read_verify,
 	.verify_write = xfs_bmbt_write_verify,
 	.verify_struct = xfs_bmbt_verify,
@@ -116,6 +116,34 @@ xfs_da_state_free(xfs_da_state_t *state)
 	kmem_zone_free(xfs_da_state_zone, state);
 }
 
+/*
+ * Verify an xfs_da3_blkinfo structure. Note that the da3 fields are only
+ * accessible on v5 filesystems. This header format is common across da node,
+ * attr leaf and dir leaf blocks.
+ */
+xfs_failaddr_t
+xfs_da3_blkinfo_verify(
+	struct xfs_buf		*bp,
+	struct xfs_da3_blkinfo	*hdr3)
+{
+	struct xfs_mount	*mp = bp->b_target->bt_mount;
+	struct xfs_da_blkinfo	*hdr = &hdr3->hdr;
+
+	if (!xfs_verify_magic16(bp, hdr->magic))
+		return __this_address;
+
+	if (xfs_sb_version_hascrc(&mp->m_sb)) {
+		if (!uuid_equal(&hdr3->uuid, &mp->m_sb.sb_meta_uuid))
+			return __this_address;
+		if (be64_to_cpu(hdr3->blkno) != bp->b_bn)
+			return __this_address;
+		if (!xfs_log_check_lsn(mp, be64_to_cpu(hdr3->lsn)))
+			return __this_address;
+	}
+
+	return NULL;
+}
+
 static xfs_failaddr_t
 xfs_da3_node_verify(
 	struct xfs_buf		*bp)
@@ -124,27 +152,16 @@ xfs_da3_node_verify(
 	struct xfs_da_intnode	*hdr = bp->b_addr;
 	struct xfs_da3_icnode_hdr ichdr;
 	const struct xfs_dir_ops *ops;
+	xfs_failaddr_t		fa;
 
 	ops = xfs_dir_get_ops(mp, NULL);
 
 	ops->node_hdr_from_disk(&ichdr, hdr);
 
-	if (xfs_sb_version_hascrc(&mp->m_sb)) {
-		struct xfs_da3_node_hdr *hdr3 = bp->b_addr;
+	fa = xfs_da3_blkinfo_verify(bp, bp->b_addr);
+	if (fa)
+		return fa;
 
-		if (ichdr.magic != XFS_DA3_NODE_MAGIC)
-			return __this_address;
-
-		if (!uuid_equal(&hdr3->info.uuid, &mp->m_sb.sb_meta_uuid))
-			return __this_address;
-		if (be64_to_cpu(hdr3->info.blkno) != bp->b_bn)
-			return __this_address;
-		if (!xfs_log_check_lsn(mp, be64_to_cpu(hdr3->info.lsn)))
-			return __this_address;
-	} else {
-		if (ichdr.magic != XFS_DA_NODE_MAGIC)
-			return __this_address;
-	}
 	if (ichdr.level == 0)
 		return __this_address;
 	if (ichdr.level > XFS_DA_NODE_MAXDEPTH)
@@ -257,6 +274,8 @@ xfs_da3_node_verify_struct(
 
 const struct xfs_buf_ops xfs_da3_node_buf_ops = {
 	.name = "xfs_da3_node",
+	.magic16 = { cpu_to_be16(XFS_DA_NODE_MAGIC),
+		     cpu_to_be16(XFS_DA3_NODE_MAGIC) },
 	.verify_read = xfs_da3_node_read_verify,
 	.verify_write = xfs_da3_node_write_verify,
 	.verify_struct = xfs_da3_node_verify_struct,
@@ -869,4 +869,7 @@ static inline unsigned int xfs_dir2_dirblock_bytes(struct xfs_sb *sbp)
 	return 1 << (sbp->sb_blocklog + sbp->sb_dirblklog);
 }
 
+xfs_failaddr_t xfs_da3_blkinfo_verify(struct xfs_buf *bp,
+				      struct xfs_da3_blkinfo *hdr3);
+
 #endif /* __XFS_DA_FORMAT_H__ */
@@ -703,3 +703,20 @@ xfs_dir2_shrink_inode(
 	xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE);
 	return 0;
 }
+
+/* Returns true if the directory entry name is valid. */
+bool
+xfs_dir2_namecheck(
+	const void	*name,
+	size_t		length)
+{
+	/*
+	 * MAXNAMELEN includes the trailing null, but (name/length) leave it
+	 * out, so use >= for the length check.
+	 */
+	if (length >= MAXNAMELEN)
+		return false;
+
+	/* There shouldn't be any slashes or nulls here */
+	return !memchr(name, '/', length) && !memchr(name, 0, length);
+}
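The directory and attribute namecheck helpers share the same length-then-scan shape and are easy to exercise in isolation. Below is a standalone userspace rendition of the directory variant; the MAXNAMELEN value of 256 is an assumption chosen to match the kernel definition:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAXNAMELEN	256

static bool dir2_namecheck(const void *name, size_t length)
{
	/* MAXNAMELEN counts the trailing null that (name, length) omits. */
	if (length >= MAXNAMELEN)
		return false;

	/* Reject embedded slashes and nulls. */
	return !memchr(name, '/', length) && !memchr(name, 0, length);
}

int main(void)
{
	printf("%d\n", dir2_namecheck("good_name", 9));	/* 1 */
	printf("%d\n", dir2_namecheck("bad/name", 8));	/* 0: slash */
	printf("%d\n", dir2_namecheck("bad\0name", 8));	/* 0: embedded null */
	return 0;
}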
@@ -326,5 +326,6 @@ xfs_dir2_leaf_tail_p(struct xfs_da_geometry *geo, struct xfs_dir2_leaf *lp)
 unsigned char xfs_dir3_get_dtype(struct xfs_mount *mp, uint8_t filetype);
 void *xfs_dir3_data_endp(struct xfs_da_geometry *geo,
 		struct xfs_dir2_data_hdr *hdr);
+bool xfs_dir2_namecheck(const void *name, size_t length);
 
 #endif	/* __XFS_DIR2_H__ */
@@ -53,18 +53,16 @@ xfs_dir3_block_verify(
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	struct xfs_dir3_blk_hdr	*hdr3 = bp->b_addr;
 
+	if (!xfs_verify_magic(bp, hdr3->magic))
+		return __this_address;
+
 	if (xfs_sb_version_hascrc(&mp->m_sb)) {
-		if (hdr3->magic != cpu_to_be32(XFS_DIR3_BLOCK_MAGIC))
-			return __this_address;
 		if (!uuid_equal(&hdr3->uuid, &mp->m_sb.sb_meta_uuid))
 			return __this_address;
 		if (be64_to_cpu(hdr3->blkno) != bp->b_bn)
 			return __this_address;
 		if (!xfs_log_check_lsn(mp, be64_to_cpu(hdr3->lsn)))
 			return __this_address;
-	} else {
-		if (hdr3->magic != cpu_to_be32(XFS_DIR2_BLOCK_MAGIC))
-			return __this_address;
 	}
 	return __xfs_dir3_data_check(NULL, bp);
 }
@@ -112,6 +110,8 @@ xfs_dir3_block_write_verify(
 
 const struct xfs_buf_ops xfs_dir3_block_buf_ops = {
 	.name = "xfs_dir3_block",
+	.magic = { cpu_to_be32(XFS_DIR2_BLOCK_MAGIC),
+		   cpu_to_be32(XFS_DIR3_BLOCK_MAGIC) },
 	.verify_read = xfs_dir3_block_read_verify,
 	.verify_write = xfs_dir3_block_write_verify,
 	.verify_struct = xfs_dir3_block_verify,
@@ -252,18 +252,16 @@ xfs_dir3_data_verify(
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	struct xfs_dir3_blk_hdr	*hdr3 = bp->b_addr;
 
+	if (!xfs_verify_magic(bp, hdr3->magic))
+		return __this_address;
+
 	if (xfs_sb_version_hascrc(&mp->m_sb)) {
-		if (hdr3->magic != cpu_to_be32(XFS_DIR3_DATA_MAGIC))
-			return __this_address;
 		if (!uuid_equal(&hdr3->uuid, &mp->m_sb.sb_meta_uuid))
 			return __this_address;
 		if (be64_to_cpu(hdr3->blkno) != bp->b_bn)
 			return __this_address;
 		if (!xfs_log_check_lsn(mp, be64_to_cpu(hdr3->lsn)))
 			return __this_address;
-	} else {
-		if (hdr3->magic != cpu_to_be32(XFS_DIR2_DATA_MAGIC))
-			return __this_address;
 	}
 	return __xfs_dir3_data_check(NULL, bp);
 }
@@ -339,6 +337,8 @@ xfs_dir3_data_write_verify(
 
 const struct xfs_buf_ops xfs_dir3_data_buf_ops = {
 	.name = "xfs_dir3_data",
+	.magic = { cpu_to_be32(XFS_DIR2_DATA_MAGIC),
+		   cpu_to_be32(XFS_DIR3_DATA_MAGIC) },
 	.verify_read = xfs_dir3_data_read_verify,
 	.verify_write = xfs_dir3_data_write_verify,
 	.verify_struct = xfs_dir3_data_verify,
@@ -346,6 +346,8 @@ const struct xfs_buf_ops xfs_dir3_data_buf_ops = {
 
 static const struct xfs_buf_ops xfs_dir3_data_reada_buf_ops = {
 	.name = "xfs_dir3_data_reada",
+	.magic = { cpu_to_be32(XFS_DIR2_DATA_MAGIC),
+		   cpu_to_be32(XFS_DIR3_DATA_MAGIC) },
 	.verify_read = xfs_dir3_data_reada_verify,
 	.verify_write = xfs_dir3_data_write_verify,
 };
@@ -142,41 +142,22 @@ xfs_dir3_leaf_check_int(
  */
 static xfs_failaddr_t
 xfs_dir3_leaf_verify(
-	struct xfs_buf		*bp,
-	uint16_t		magic)
+	struct xfs_buf		*bp)
 {
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	struct xfs_dir2_leaf	*leaf = bp->b_addr;
+	xfs_failaddr_t		fa;
 
-	ASSERT(magic == XFS_DIR2_LEAF1_MAGIC || magic == XFS_DIR2_LEAFN_MAGIC);
-
-	if (xfs_sb_version_hascrc(&mp->m_sb)) {
-		struct xfs_dir3_leaf_hdr *leaf3 = bp->b_addr;
-		uint16_t		magic3;
-
-		magic3 = (magic == XFS_DIR2_LEAF1_MAGIC) ? XFS_DIR3_LEAF1_MAGIC
-							 : XFS_DIR3_LEAFN_MAGIC;
-
-		if (leaf3->info.hdr.magic != cpu_to_be16(magic3))
-			return __this_address;
-		if (!uuid_equal(&leaf3->info.uuid, &mp->m_sb.sb_meta_uuid))
-			return __this_address;
-		if (be64_to_cpu(leaf3->info.blkno) != bp->b_bn)
-			return __this_address;
-		if (!xfs_log_check_lsn(mp, be64_to_cpu(leaf3->info.lsn)))
-			return __this_address;
-	} else {
-		if (leaf->hdr.info.magic != cpu_to_be16(magic))
-			return __this_address;
-	}
+	fa = xfs_da3_blkinfo_verify(bp, bp->b_addr);
+	if (fa)
+		return fa;
 
 	return xfs_dir3_leaf_check_int(mp, NULL, NULL, leaf);
 }
 
 static void
-__read_verify(
-	struct xfs_buf  *bp,
-	uint16_t	magic)
+xfs_dir3_leaf_read_verify(
+	struct xfs_buf  *bp)
 {
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	xfs_failaddr_t		fa;
@@ -185,23 +166,22 @@ xfs_dir3_leaf_read_verify(
 	     !xfs_buf_verify_cksum(bp, XFS_DIR3_LEAF_CRC_OFF))
 		xfs_verifier_error(bp, -EFSBADCRC, __this_address);
 	else {
-		fa = xfs_dir3_leaf_verify(bp, magic);
+		fa = xfs_dir3_leaf_verify(bp);
 		if (fa)
 			xfs_verifier_error(bp, -EFSCORRUPTED, fa);
 	}
 }
 
 static void
-__write_verify(
-	struct xfs_buf  *bp,
-	uint16_t	magic)
+xfs_dir3_leaf_write_verify(
+	struct xfs_buf  *bp)
 {
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	struct xfs_buf_log_item	*bip = bp->b_log_item;
 	struct xfs_dir3_leaf_hdr *hdr3 = bp->b_addr;
 	xfs_failaddr_t		fa;
 
-	fa = xfs_dir3_leaf_verify(bp, magic);
+	fa = xfs_dir3_leaf_verify(bp);
 	if (fa) {
 		xfs_verifier_error(bp, -EFSCORRUPTED, fa);
 		return;
@@ -216,60 +196,22 @@ xfs_dir3_leaf_write_verify(
 		xfs_buf_update_cksum(bp, XFS_DIR3_LEAF_CRC_OFF);
 }
 
-static xfs_failaddr_t
-xfs_dir3_leaf1_verify(
-	struct xfs_buf	*bp)
-{
-	return xfs_dir3_leaf_verify(bp, XFS_DIR2_LEAF1_MAGIC);
-}
-
-static void
-xfs_dir3_leaf1_read_verify(
-	struct xfs_buf	*bp)
-{
-	__read_verify(bp, XFS_DIR2_LEAF1_MAGIC);
-}
-
-static void
-xfs_dir3_leaf1_write_verify(
-	struct xfs_buf	*bp)
-{
-	__write_verify(bp, XFS_DIR2_LEAF1_MAGIC);
-}
-
-static xfs_failaddr_t
-xfs_dir3_leafn_verify(
-	struct xfs_buf	*bp)
-{
-	return xfs_dir3_leaf_verify(bp, XFS_DIR2_LEAFN_MAGIC);
-}
-
-static void
-xfs_dir3_leafn_read_verify(
-	struct xfs_buf	*bp)
-{
-	__read_verify(bp, XFS_DIR2_LEAFN_MAGIC);
-}
-
-static void
-xfs_dir3_leafn_write_verify(
-	struct xfs_buf	*bp)
-{
-	__write_verify(bp, XFS_DIR2_LEAFN_MAGIC);
-}
-
 const struct xfs_buf_ops xfs_dir3_leaf1_buf_ops = {
 	.name = "xfs_dir3_leaf1",
-	.verify_read = xfs_dir3_leaf1_read_verify,
-	.verify_write = xfs_dir3_leaf1_write_verify,
-	.verify_struct = xfs_dir3_leaf1_verify,
+	.magic16 = { cpu_to_be16(XFS_DIR2_LEAF1_MAGIC),
+		     cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) },
+	.verify_read = xfs_dir3_leaf_read_verify,
+	.verify_write = xfs_dir3_leaf_write_verify,
+	.verify_struct = xfs_dir3_leaf_verify,
 };
 
 const struct xfs_buf_ops xfs_dir3_leafn_buf_ops = {
 	.name = "xfs_dir3_leafn",
-	.verify_read = xfs_dir3_leafn_read_verify,
-	.verify_write = xfs_dir3_leafn_write_verify,
-	.verify_struct = xfs_dir3_leafn_verify,
+	.magic16 = { cpu_to_be16(XFS_DIR2_LEAFN_MAGIC),
+		     cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) },
+	.verify_read = xfs_dir3_leaf_read_verify,
+	.verify_write = xfs_dir3_leaf_write_verify,
+	.verify_struct = xfs_dir3_leaf_verify,
 };
 
 int
@@ -87,20 +87,18 @@ xfs_dir3_free_verify(
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	struct xfs_dir2_free_hdr *hdr = bp->b_addr;
 
+	if (!xfs_verify_magic(bp, hdr->magic))
+		return __this_address;
+
 	if (xfs_sb_version_hascrc(&mp->m_sb)) {
 		struct xfs_dir3_blk_hdr *hdr3 = bp->b_addr;
 
-		if (hdr3->magic != cpu_to_be32(XFS_DIR3_FREE_MAGIC))
-			return __this_address;
 		if (!uuid_equal(&hdr3->uuid, &mp->m_sb.sb_meta_uuid))
 			return __this_address;
 		if (be64_to_cpu(hdr3->blkno) != bp->b_bn)
 			return __this_address;
 		if (!xfs_log_check_lsn(mp, be64_to_cpu(hdr3->lsn)))
 			return __this_address;
-	} else {
-		if (hdr->magic != cpu_to_be32(XFS_DIR2_FREE_MAGIC))
-			return __this_address;
 	}
 
 	/* XXX: should bounds check the xfs_dir3_icfree_hdr here */
@@ -151,6 +149,8 @@ xfs_dir3_free_write_verify(
 
 const struct xfs_buf_ops xfs_dir3_free_buf_ops = {
 	.name = "xfs_dir3_free",
+	.magic = { cpu_to_be32(XFS_DIR2_FREE_MAGIC),
+		   cpu_to_be32(XFS_DIR3_FREE_MAGIC) },
 	.verify_read = xfs_dir3_free_read_verify,
 	.verify_write = xfs_dir3_free_write_verify,
 	.verify_struct = xfs_dir3_free_verify,
@@ -277,6 +277,8 @@ xfs_dquot_buf_write_verify(
 
 const struct xfs_buf_ops xfs_dquot_buf_ops = {
 	.name = "xfs_dquot",
+	.magic16 = { cpu_to_be16(XFS_DQUOT_MAGIC),
+		     cpu_to_be16(XFS_DQUOT_MAGIC) },
 	.verify_read = xfs_dquot_buf_read_verify,
 	.verify_write = xfs_dquot_buf_write_verify,
 	.verify_struct = xfs_dquot_buf_verify_struct,
@@ -284,6 +286,8 @@ const struct xfs_buf_ops xfs_dquot_buf_ops = {
 
 const struct xfs_buf_ops xfs_dquot_buf_ra_ops = {
 	.name = "xfs_dquot_ra",
+	.magic16 = { cpu_to_be16(XFS_DQUOT_MAGIC),
+		     cpu_to_be16(XFS_DQUOT_MAGIC) },
 	.verify_read = xfs_dquot_buf_readahead_verify,
 	.verify_write = xfs_dquot_buf_write_verify,
 };
@@ -54,7 +54,8 @@
 #define XFS_ERRTAG_BUF_LRU_REF				31
 #define XFS_ERRTAG_FORCE_SCRUB_REPAIR			32
 #define XFS_ERRTAG_FORCE_SUMMARY_RECALC			33
-#define XFS_ERRTAG_MAX					34
+#define XFS_ERRTAG_IUNLINK_FALLBACK			34
+#define XFS_ERRTAG_MAX					35
 
 /*
  * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -93,5 +94,6 @@
 #define XFS_RANDOM_BUF_LRU_REF				2
 #define XFS_RANDOM_FORCE_SCRUB_REPAIR			1
 #define XFS_RANDOM_FORCE_SUMMARY_RECALC			1
+#define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
 
 #endif /* __XFS_ERRORTAG_H_ */
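The random factor paired with each error tag is a 1-in-N trigger: a factor of 1 fires on every attempt, while XFS_RANDOM_DEFAULT/10 fires ten times as often as the default. A toy userspace model of that semantics, with invented names, purely to illustrate the comment above:

#include <stdio.h>
#include <stdlib.h>

/* Fires roughly once every `factor` calls, like the errortag machinery. */
static int should_inject(unsigned int factor)
{
	return factor && (rand() % factor) == 0;
}

int main(void)
{
	unsigned int hits = 0;

	srand(42);
	for (int i = 0; i < 1000000; i++)
		hits += should_inject(100);
	printf("fired %u of 1000000 tries (expect ~10000)\n", hits);
	return 0;
}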
@@ -2508,7 +2508,7 @@ xfs_agi_verify(
 	/*
 	 * Validate the magic number of the agi block.
 	 */
-	if (agi->agi_magicnum != cpu_to_be32(XFS_AGI_MAGIC))
+	if (!xfs_verify_magic(bp, agi->agi_magicnum))
 		return __this_address;
 	if (!XFS_AGI_GOOD_VERSION(be32_to_cpu(agi->agi_versionnum)))
 		return __this_address;
@@ -2582,6 +2582,7 @@ xfs_agi_write_verify(
 
 const struct xfs_buf_ops xfs_agi_buf_ops = {
 	.name = "xfs_agi",
+	.magic = { cpu_to_be32(XFS_AGI_MAGIC), cpu_to_be32(XFS_AGI_MAGIC) },
 	.verify_read = xfs_agi_read_verify,
 	.verify_write = xfs_agi_write_verify,
 	.verify_struct = xfs_agi_verify,
@@ -124,7 +124,7 @@ xfs_finobt_alloc_block(
 	union xfs_btree_ptr	*new,
 	int			*stat)
 {
-	if (cur->bc_mp->m_inotbt_nores)
+	if (cur->bc_mp->m_finobt_nores)
 		return xfs_inobt_alloc_block(cur, start, new, stat);
 	return __xfs_inobt_alloc_block(cur, start, new, stat,
 			XFS_AG_RESV_METADATA);
@@ -154,7 +154,7 @@ xfs_finobt_free_block(
 	struct xfs_btree_cur	*cur,
 	struct xfs_buf		*bp)
 {
-	if (cur->bc_mp->m_inotbt_nores)
+	if (cur->bc_mp->m_finobt_nores)
 		return xfs_inobt_free_block(cur, bp);
 	return __xfs_inobt_free_block(cur, bp, XFS_AG_RESV_METADATA);
 }
@@ -260,6 +260,9 @@ xfs_inobt_verify(
 	xfs_failaddr_t		fa;
 	unsigned int		level;
 
+	if (!xfs_verify_magic(bp, block->bb_magic))
+		return __this_address;
+
 	/*
 	 * During growfs operations, we can't verify the exact owner as the
 	 * perag is not fully initialised and hence not attached to the buffer.
@@ -270,18 +273,10 @@ xfs_inobt_verify(
 	 * but beware of the landmine (i.e. need to check pag->pagi_init) if we
 	 * ever do.
 	 */
-	switch (block->bb_magic) {
-	case cpu_to_be32(XFS_IBT_CRC_MAGIC):
-	case cpu_to_be32(XFS_FIBT_CRC_MAGIC):
+	if (xfs_sb_version_hascrc(&mp->m_sb)) {
 		fa = xfs_btree_sblock_v5hdr_verify(bp);
 		if (fa)
 			return fa;
-		/* fall through */
-	case cpu_to_be32(XFS_IBT_MAGIC):
-	case cpu_to_be32(XFS_FIBT_MAGIC):
-		break;
-	default:
-		return __this_address;
 	}
 
 	/* level verification */
@@ -328,6 +323,16 @@ xfs_inobt_write_verify(
 
 const struct xfs_buf_ops xfs_inobt_buf_ops = {
 	.name = "xfs_inobt",
+	.magic = { cpu_to_be32(XFS_IBT_MAGIC), cpu_to_be32(XFS_IBT_CRC_MAGIC) },
+	.verify_read = xfs_inobt_read_verify,
+	.verify_write = xfs_inobt_write_verify,
+	.verify_struct = xfs_inobt_verify,
+};
+
+const struct xfs_buf_ops xfs_finobt_buf_ops = {
+	.name = "xfs_finobt",
+	.magic = { cpu_to_be32(XFS_FIBT_MAGIC),
+		   cpu_to_be32(XFS_FIBT_CRC_MAGIC) },
 	.verify_read = xfs_inobt_read_verify,
 	.verify_write = xfs_inobt_write_verify,
 	.verify_struct = xfs_inobt_verify,
@@ -389,7 +394,7 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
 	.init_rec_from_cur	= xfs_inobt_init_rec_from_cur,
 	.init_ptr_from_cur	= xfs_finobt_init_ptr_from_cur,
 	.key_diff		= xfs_inobt_key_diff,
-	.buf_ops		= &xfs_inobt_buf_ops,
+	.buf_ops		= &xfs_finobt_buf_ops,
 	.diff_two_keys		= xfs_inobt_diff_two_keys,
 	.keys_inorder		= xfs_inobt_keys_inorder,
 	.recs_inorder		= xfs_inobt_recs_inorder,
@@ -614,16 +614,15 @@ xfs_iext_realloc_root(
 }
 
 /*
- * Increment the sequence counter if we are on a COW fork.  This allows
- * the writeback code to skip looking for a COW extent if the COW fork
- * hasn't changed.  We use WRITE_ONCE here to ensure the update to the
- * sequence counter is seen before the modifications to the extent
- * tree itself take effect.
+ * Increment the sequence counter on extent tree changes. If we are on a COW
+ * fork, this allows the writeback code to skip looking for a COW extent if the
+ * COW fork hasn't changed. We use WRITE_ONCE here to ensure the update to the
+ * sequence counter is seen before the modifications to the extent tree itself
+ * take effect.
  */
 static inline void xfs_iext_inc_seq(struct xfs_ifork *ifp, int state)
 {
-	if (state & BMAP_COWFORK)
-		WRITE_ONCE(ifp->if_seq, READ_ONCE(ifp->if_seq) + 1);
+	WRITE_ONCE(ifp->if_seq, READ_ONCE(ifp->if_seq) + 1);
 }
 
 void
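The change above makes if_seq tick on every extent tree modification, not just COW fork ones, so any cached mapping can be revalidated cheaply by comparing counters. A minimal userspace analogue of the snapshot-and-compare pattern the counter enables; C11 atomics stand in for the kernel's WRITE_ONCE/READ_ONCE, and the names are illustrative:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static _Atomic unsigned int if_seq;

/* Writer side: bump the counter whenever the extent tree changes. */
static void tree_changed(void)
{
	atomic_fetch_add_explicit(&if_seq, 1, memory_order_release);
}

struct cached_map {
	unsigned int	seq;	/* if_seq value when the mapping was taken */
	/* ... mapping data would live here ... */
};

/* Reader side: the cache is stale as soon as the counters disagree. */
static bool map_still_valid(const struct cached_map *map)
{
	return atomic_load_explicit(&if_seq, memory_order_acquire) == map->seq;
}

int main(void)
{
	struct cached_map map = { .seq = atomic_load(&if_seq) };

	printf("valid before change: %d\n", map_still_valid(&map));	/* 1 */
	tree_changed();
	printf("valid after change: %d\n", map_still_valid(&map));	/* 0 */
	return 0;
}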
@@ -97,10 +97,9 @@ xfs_inode_buf_verify(
 
 		dip = xfs_buf_offset(bp, (i << mp->m_sb.sb_inodelog));
 		unlinked_ino = be32_to_cpu(dip->di_next_unlinked);
-		di_ok = dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
+		di_ok = xfs_verify_magic16(bp, dip->di_magic) &&
 			xfs_dinode_good_version(mp, dip->di_version) &&
-			(unlinked_ino == NULLAGINO ||
-			 xfs_verify_agino(mp, agno, unlinked_ino));
+			xfs_verify_agino_or_null(mp, agno, unlinked_ino);
 		if (unlikely(XFS_TEST_ERROR(!di_ok, mp,
 						XFS_ERRTAG_ITOBP_INOTOBP))) {
 			if (readahead) {
@@ -147,12 +146,16 @@ xfs_inode_buf_write_verify(
 
 const struct xfs_buf_ops xfs_inode_buf_ops = {
 	.name = "xfs_inode",
+	.magic16 = { cpu_to_be16(XFS_DINODE_MAGIC),
+		     cpu_to_be16(XFS_DINODE_MAGIC) },
 	.verify_read = xfs_inode_buf_read_verify,
 	.verify_write = xfs_inode_buf_write_verify,
 };
 
 const struct xfs_buf_ops xfs_inode_buf_ra_ops = {
-	.name = "xxfs_inode_ra",
+	.name = "xfs_inode_ra",
+	.magic16 = { cpu_to_be16(XFS_DINODE_MAGIC),
+		     cpu_to_be16(XFS_DINODE_MAGIC) },
 	.verify_read = xfs_inode_buf_readahead_verify,
 	.verify_write = xfs_inode_buf_write_verify,
 };
@@ -14,7 +14,7 @@ struct xfs_dinode;
  */
 struct xfs_ifork {
 	int			if_bytes;	/* bytes in if_u1 */
-	unsigned int		if_seq;		/* cow fork mod counter */
+	unsigned int		if_seq;		/* fork mod counter */
 	struct xfs_btree_block	*if_broot;	/* file's incore btree root */
 	short			if_broot_bytes;	/* bytes allocated for root */
 	unsigned char		if_flags;	/* per-fork flags */
@@ -209,7 +209,7 @@ xfs_refcountbt_verify(
 	xfs_failaddr_t		fa;
 	unsigned int		level;
 
-	if (block->bb_magic != cpu_to_be32(XFS_REFC_CRC_MAGIC))
+	if (!xfs_verify_magic(bp, block->bb_magic))
 		return __this_address;
 
 	if (!xfs_sb_version_hasreflink(&mp->m_sb))
@@ -264,6 +264,7 @@ xfs_refcountbt_write_verify(
 
 const struct xfs_buf_ops xfs_refcountbt_buf_ops = {
 	.name = "xfs_refcountbt",
+	.magic = { 0, cpu_to_be32(XFS_REFC_CRC_MAGIC) },
 	.verify_read = xfs_refcountbt_read_verify,
 	.verify_write = xfs_refcountbt_write_verify,
 	.verify_struct = xfs_refcountbt_verify,
@@ -310,7 +310,7 @@ xfs_rmapbt_verify(
 	 * from the on disk AGF. Again, we can only check against maximum limits
 	 * in this case.
 	 */
-	if (block->bb_magic != cpu_to_be32(XFS_RMAP_CRC_MAGIC))
+	if (!xfs_verify_magic(bp, block->bb_magic))
 		return __this_address;
 
 	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
@@ -365,6 +365,7 @@ xfs_rmapbt_write_verify(
 
 const struct xfs_buf_ops xfs_rmapbt_buf_ops = {
 	.name = "xfs_rmapbt",
+	.magic = { 0, cpu_to_be32(XFS_RMAP_CRC_MAGIC) },
 	.verify_read = xfs_rmapbt_read_verify,
 	.verify_write = xfs_rmapbt_write_verify,
 	.verify_struct = xfs_rmapbt_verify,
@@ -225,10 +225,11 @@ xfs_validate_sb_common(
 	struct xfs_buf		*bp,
 	struct xfs_sb		*sbp)
 {
+	struct xfs_dsb		*dsb = XFS_BUF_TO_SBP(bp);
 	uint32_t		agcount = 0;
 	uint32_t		rem;
 
-	if (sbp->sb_magicnum != XFS_SB_MAGIC) {
+	if (!xfs_verify_magic(bp, dsb->sb_magicnum)) {
 		xfs_warn(mp, "bad magic number");
 		return -EWRONGFS;
 	}
@@ -781,12 +782,14 @@ out_error:
 
 const struct xfs_buf_ops xfs_sb_buf_ops = {
 	.name = "xfs_sb",
+	.magic = { cpu_to_be32(XFS_SB_MAGIC), cpu_to_be32(XFS_SB_MAGIC) },
 	.verify_read = xfs_sb_read_verify,
 	.verify_write = xfs_sb_write_verify,
 };
 
 const struct xfs_buf_ops xfs_sb_quiet_buf_ops = {
 	.name = "xfs_sb_quiet",
+	.magic = { cpu_to_be32(XFS_SB_MAGIC), cpu_to_be32(XFS_SB_MAGIC) },
 	.verify_read = xfs_sb_quiet_read_verify,
 	.verify_write = xfs_sb_write_verify,
 };
@@ -874,7 +877,7 @@ xfs_initialize_perag_data(
 	uint64_t	bfreelst = 0;
 	uint64_t	btree = 0;
 	uint64_t	fdblocks;
-	int		error;
+	int		error = 0;
 
 	for (index = 0; index < agcount; index++) {
 		/*
@@ -25,7 +25,8 @@ extern const struct xfs_buf_ops xfs_agf_buf_ops;
 extern const struct xfs_buf_ops xfs_agi_buf_ops;
 extern const struct xfs_buf_ops xfs_agf_buf_ops;
 extern const struct xfs_buf_ops xfs_agfl_buf_ops;
-extern const struct xfs_buf_ops xfs_allocbt_buf_ops;
+extern const struct xfs_buf_ops xfs_bnobt_buf_ops;
+extern const struct xfs_buf_ops xfs_cntbt_buf_ops;
 extern const struct xfs_buf_ops xfs_rmapbt_buf_ops;
 extern const struct xfs_buf_ops xfs_refcountbt_buf_ops;
 extern const struct xfs_buf_ops xfs_attr3_leaf_buf_ops;
@@ -36,6 +37,7 @@ extern const struct xfs_buf_ops xfs_dquot_buf_ops;
 extern const struct xfs_buf_ops xfs_symlink_buf_ops;
 extern const struct xfs_buf_ops xfs_agi_buf_ops;
 extern const struct xfs_buf_ops xfs_inobt_buf_ops;
+extern const struct xfs_buf_ops xfs_finobt_buf_ops;
 extern const struct xfs_buf_ops xfs_inode_buf_ops;
 extern const struct xfs_buf_ops xfs_inode_buf_ra_ops;
 extern const struct xfs_buf_ops xfs_dquot_buf_ops;
@@ -95,7 +95,7 @@ xfs_symlink_verify(
 
 	if (!xfs_sb_version_hascrc(&mp->m_sb))
 		return __this_address;
-	if (dsl->sl_magic != cpu_to_be32(XFS_SYMLINK_MAGIC))
+	if (!xfs_verify_magic(bp, dsl->sl_magic))
 		return __this_address;
 	if (!uuid_equal(&dsl->sl_uuid, &mp->m_sb.sb_meta_uuid))
 		return __this_address;
@@ -159,6 +159,7 @@ xfs_symlink_write_verify(
 
 const struct xfs_buf_ops xfs_symlink_buf_ops = {
 	.name = "xfs_symlink",
+	.magic = { 0, cpu_to_be32(XFS_SYMLINK_MAGIC) },
 	.verify_read = xfs_symlink_read_verify,
 	.verify_write = xfs_symlink_write_verify,
 	.verify_struct = xfs_symlink_verify,
@@ -115,6 +115,19 @@ xfs_verify_agino(
 	return agino >= first && agino <= last;
 }
 
+/*
+ * Verify that an AG inode number pointer neither points outside the AG
+ * nor points at static metadata, or is NULLAGINO.
+ */
+bool
+xfs_verify_agino_or_null(
+	struct xfs_mount	*mp,
+	xfs_agnumber_t		agno,
+	xfs_agino_t		agino)
+{
+	return agino == NULLAGINO || xfs_verify_agino(mp, agno, agino);
+}
+
 /*
  * Verify that an FS inode number pointer neither points outside the
  * filesystem nor points at static AG metadata.
@@ -204,3 +217,14 @@ xfs_verify_icount(
 	xfs_icount_range(mp, &min, &max);
 	return icount >= min && icount <= max;
 }
+
+/* Sanity-checking of dir/attr block offsets. */
+bool
+xfs_verify_dablk(
+	struct xfs_mount	*mp,
+	xfs_fileoff_t		dabno)
+{
+	xfs_dablk_t		max_dablk = -1U;
+
+	return dabno <= max_dablk;
+}
@@ -183,10 +183,13 @@ void xfs_agino_range(struct xfs_mount *mp, xfs_agnumber_t agno,
 		xfs_agino_t *first, xfs_agino_t *last);
 bool xfs_verify_agino(struct xfs_mount *mp, xfs_agnumber_t agno,
 		xfs_agino_t agino);
+bool xfs_verify_agino_or_null(struct xfs_mount *mp, xfs_agnumber_t agno,
+		xfs_agino_t agino);
 bool xfs_verify_ino(struct xfs_mount *mp, xfs_ino_t ino);
 bool xfs_internal_inum(struct xfs_mount *mp, xfs_ino_t ino);
 bool xfs_verify_dir_ino(struct xfs_mount *mp, xfs_ino_t ino);
 bool xfs_verify_rtbno(struct xfs_mount *mp, xfs_rtblock_t rtbno);
 bool xfs_verify_icount(struct xfs_mount *mp, unsigned long long icount);
+bool xfs_verify_dablk(struct xfs_mount *mp, xfs_fileoff_t off);
 
 #endif /* __XFS_TYPES_H__ */
@@ -399,7 +399,7 @@ xchk_agf_xref_cntbt(
 	if (!xchk_should_check_xref(sc, &error, &sc->sa.cnt_cur))
 		return;
 	if (!have) {
-		if (agf->agf_freeblks != be32_to_cpu(0))
+		if (agf->agf_freeblks != cpu_to_be32(0))
 			xchk_block_xref_set_corrupt(sc, sc->sa.agf_bp);
 		return;
 	}
@@ -864,19 +864,17 @@ xchk_agi(
 
 	/* Check inode pointers */
 	agino = be32_to_cpu(agi->agi_newino);
-	if (agino != NULLAGINO && !xfs_verify_agino(mp, agno, agino))
+	if (!xfs_verify_agino_or_null(mp, agno, agino))
 		xchk_block_set_corrupt(sc, sc->sa.agi_bp);
 
 	agino = be32_to_cpu(agi->agi_dirino);
-	if (agino != NULLAGINO && !xfs_verify_agino(mp, agno, agino))
+	if (!xfs_verify_agino_or_null(mp, agno, agino))
 		xchk_block_set_corrupt(sc, sc->sa.agi_bp);
 
 	/* Check unlinked inode buckets */
 	for (i = 0; i < XFS_AGI_UNLINKED_BUCKETS; i++) {
 		agino = be32_to_cpu(agi->agi_unlinked[i]);
-		if (agino == NULLAGINO)
-			continue;
-		if (!xfs_verify_agino(mp, agno, agino))
+		if (!xfs_verify_agino_or_null(mp, agno, agino))
 			xchk_block_set_corrupt(sc, sc->sa.agi_bp);
 	}
 
@@ -341,23 +341,19 @@ xrep_agf(
 	struct xrep_find_ag_btree	fab[XREP_AGF_MAX] = {
 		[XREP_AGF_BNOBT] = {
 			.rmap_owner = XFS_RMAP_OWN_AG,
-			.buf_ops = &xfs_allocbt_buf_ops,
-			.magic = XFS_ABTB_CRC_MAGIC,
+			.buf_ops = &xfs_bnobt_buf_ops,
 		},
 		[XREP_AGF_CNTBT] = {
 			.rmap_owner = XFS_RMAP_OWN_AG,
-			.buf_ops = &xfs_allocbt_buf_ops,
-			.magic = XFS_ABTC_CRC_MAGIC,
+			.buf_ops = &xfs_cntbt_buf_ops,
 		},
 		[XREP_AGF_RMAPBT] = {
 			.rmap_owner = XFS_RMAP_OWN_AG,
 			.buf_ops = &xfs_rmapbt_buf_ops,
-			.magic = XFS_RMAP_CRC_MAGIC,
 		},
 		[XREP_AGF_REFCOUNTBT] = {
 			.rmap_owner = XFS_RMAP_OWN_REFC,
 			.buf_ops = &xfs_refcountbt_buf_ops,
-			.magic = XFS_REFC_CRC_MAGIC,
 		},
 		[XREP_AGF_END] = {
 			.buf_ops = NULL,
@@ -875,12 +871,10 @@ xrep_agi(
 		[XREP_AGI_INOBT] = {
 			.rmap_owner = XFS_RMAP_OWN_INOBT,
 			.buf_ops = &xfs_inobt_buf_ops,
-			.magic = XFS_IBT_CRC_MAGIC,
 		},
 		[XREP_AGI_FINOBT] = {
 			.rmap_owner = XFS_RMAP_OWN_INOBT,
-			.buf_ops = &xfs_inobt_buf_ops,
-			.magic = XFS_FIBT_CRC_MAGIC,
+			.buf_ops = &xfs_finobt_buf_ops,
 		},
 		[XREP_AGI_END] = {
 			.buf_ops = NULL
@@ -82,12 +82,23 @@ xchk_xattr_listent(
 
 	sx = container_of(context, struct xchk_xattr, context);
 
+	if (xchk_should_terminate(sx->sc, &error)) {
+		context->seen_enough = 1;
+		return;
+	}
+
 	if (flags & XFS_ATTR_INCOMPLETE) {
 		/* Incomplete attr key, just mark the inode for preening. */
 		xchk_ino_set_preen(sx->sc, context->dp->i_ino);
 		return;
 	}
 
+	/* Does this name make sense? */
+	if (!xfs_attr_namecheck(name, namelen)) {
+		xchk_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK, args.blkno);
+		return;
+	}
+
 	args.flags = ATTR_KERNOTIME;
 	if (flags & XFS_ATTR_ROOT)
 		args.flags |= ATTR_ROOT;
@@ -281,6 +281,31 @@ xchk_bmap_extent_xref(
 	xchk_ag_free(info->sc, &info->sc->sa);
 }
 
+/*
+ * Directories and attr forks should never have blocks that can't be addressed
+ * by a xfs_dablk_t.
+ */
+STATIC void
+xchk_bmap_dirattr_extent(
+	struct xfs_inode	*ip,
+	struct xchk_bmap_info	*info,
+	struct xfs_bmbt_irec	*irec)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+	xfs_fileoff_t		off;
+
+	if (!S_ISDIR(VFS_I(ip)->i_mode) && info->whichfork != XFS_ATTR_FORK)
+		return;
+
+	if (!xfs_verify_dablk(mp, irec->br_startoff))
+		xchk_fblock_set_corrupt(info->sc, info->whichfork,
+				irec->br_startoff);
+
+	off = irec->br_startoff + irec->br_blockcount - 1;
+	if (!xfs_verify_dablk(mp, off))
+		xchk_fblock_set_corrupt(info->sc, info->whichfork, off);
+}
+
 /* Scrub a single extent record. */
 STATIC int
 xchk_bmap_extent(
@@ -305,6 +330,8 @@ xchk_bmap_extent(
 		xchk_fblock_set_corrupt(info->sc, info->whichfork,
 				irec->br_startoff);
 
+	xchk_bmap_dirattr_extent(ip, info, irec);
+
 	/* There should never be a "hole" extent in either extent list. */
 	if (irec->br_startblock == HOLESTARTBLOCK)
 		xchk_fblock_set_corrupt(info->sc, info->whichfork,
@@ -129,6 +129,12 @@ xchk_dir_actor(
 		goto out;
 	}
 
+	/* Does this name make sense? */
+	if (!xfs_dir2_namecheck(name, namelen)) {
+		xchk_fblock_set_corrupt(sdc->sc, XFS_DATA_FORK, offset);
+		goto out;
+	}
+
 	if (!strncmp(".", name, namelen)) {
 		/* If this is "." then check that the inum matches the dir. */
 		if (xfs_sb_version_hasftype(&mp->m_sb) && type != DT_DIR)
@@ -47,6 +47,12 @@ xchk_setup_ag_iallocbt(
 struct xchk_iallocbt {
 	/* Number of inodes we see while scanning inobt. */
 	unsigned long long	inodes;
+
+	/* Expected next startino, for big block filesystems. */
+	xfs_agino_t		next_startino;
+
+	/* Expected end of the current inode cluster. */
+	xfs_agino_t		next_cluster_ino;
 };
 
 /*
@@ -128,41 +134,57 @@ xchk_iallocbt_freecount(
 	return hweight64(freemask);
 }
 
-/* Check a particular inode with ir_free. */
+/*
+ * Check that an inode's allocation status matches ir_free in the inobt
+ * record.  First we try querying the in-core inode state, and if the inode
+ * isn't loaded we examine the on-disk inode directly.
+ *
+ * Since there can be 1:M and M:1 mappings between inobt records and inode
+ * clusters, we pass in the inode location information as an inobt record;
+ * the index of an inode cluster within the inobt record (as well as the
+ * cluster buffer itself); and the index of the inode within the cluster.
+ *
+ * @irec is the inobt record.
+ * @irec_ino is the inode offset from the start of the record.
+ * @dip is the on-disk inode.
+ */
 STATIC int
-xchk_iallocbt_check_cluster_freemask(
+xchk_iallocbt_check_cluster_ifree(
 	struct xchk_btree		*bs,
-	xfs_ino_t			fsino,
-	xfs_agino_t			chunkino,
-	xfs_agino_t			clusterino,
 	struct xfs_inobt_rec_incore	*irec,
-	struct xfs_buf			*bp)
+	unsigned int			irec_ino,
+	struct xfs_dinode		*dip)
 {
-	struct xfs_dinode		*dip;
 	struct xfs_mount		*mp = bs->cur->bc_mp;
-	bool				inode_is_free = false;
+	xfs_ino_t			fsino;
+	xfs_agino_t			agino;
+	bool				irec_free;
+	bool				ino_inuse;
 	bool				freemask_ok;
-	bool				inuse;
 	int				error = 0;
 
 	if (xchk_should_terminate(bs->sc, &error))
 		return error;
 
-	dip = xfs_buf_offset(bp, clusterino * mp->m_sb.sb_inodesize);
+	/*
+	 * Given an inobt record and the offset of an inode from the start of
+	 * the record, compute which fs inode we're talking about.
+	 */
+	agino = irec->ir_startino + irec_ino;
+	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
+	irec_free = (irec->ir_free & XFS_INOBT_MASK(irec_ino));
+
 	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC ||
-	    (dip->di_version >= 3 &&
-	     be64_to_cpu(dip->di_ino) != fsino + clusterino)) {
+	    (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)) {
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 		goto out;
 	}
 
-	if (irec->ir_free & XFS_INOBT_MASK(chunkino + clusterino))
-		inode_is_free = true;
-	error = xfs_icache_inode_is_allocated(mp, bs->cur->bc_tp,
-			fsino + clusterino, &inuse);
+	error = xfs_icache_inode_is_allocated(mp, bs->cur->bc_tp, fsino,
+			&ino_inuse);
 	if (error == -ENODATA) {
 		/* Not cached, just read the disk buffer */
-		freemask_ok = inode_is_free ^ !!(dip->di_mode);
+		freemask_ok = irec_free ^ !!(dip->di_mode);
 		if (!bs->sc->try_harder && !freemask_ok)
 			return -EDEADLOCK;
 	} else if (error < 0) {
@@ -174,7 +196,7 @@ xchk_iallocbt_check_cluster_freemask(
 		goto out;
 	} else {
 		/* Inode is all there. */
-		freemask_ok = inode_is_free ^ inuse;
+		freemask_ok = irec_free ^ ino_inuse;
 	}
 	if (!freemask_ok)
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
@@ -182,88 +204,223 @@ out:
 	return 0;
 }
 
-/* Make sure the free mask is consistent with what the inodes think. */
+/*
+ * Check that the holemask and freemask of a hypothetical inode cluster match
+ * what's actually on disk.  If sparse inodes are enabled, the cluster does
+ * not actually have to map to inodes if the corresponding holemask bit is set.
+ *
+ * @cluster_base is the first inode in the cluster within the @irec.
+ */
 STATIC int
-xchk_iallocbt_check_freemask(
+xchk_iallocbt_check_cluster(
 	struct xchk_btree		*bs,
-	struct xfs_inobt_rec_incore	*irec)
+	struct xfs_inobt_rec_incore	*irec,
+	unsigned int			cluster_base)
 {
 	struct xfs_imap			imap;
 	struct xfs_mount		*mp = bs->cur->bc_mp;
 	struct xfs_dinode		*dip;
-	struct xfs_buf			*bp;
-	xfs_ino_t			fsino;
-	xfs_agino_t			nr_inodes;
-	xfs_agino_t			agino;
-	xfs_agino_t			chunkino;
-	xfs_agino_t			clusterino;
+	struct xfs_buf			*cluster_bp;
+	unsigned int			nr_inodes;
+	xfs_agnumber_t			agno = bs->cur->bc_private.a.agno;
 	xfs_agblock_t			agbno;
-	uint16_t			holemask;
+	unsigned int			cluster_index;
+	uint16_t			cluster_mask = 0;
 	uint16_t			ir_holemask;
 	int				error = 0;
 
-	/* Make sure the freemask matches the inode records. */
-	nr_inodes = mp->m_inodes_per_cluster;
+	nr_inodes = min_t(unsigned int, XFS_INODES_PER_CHUNK,
+			mp->m_inodes_per_cluster);
 
-	for (agino = irec->ir_startino;
-	     agino < irec->ir_startino + XFS_INODES_PER_CHUNK;
-	     agino += mp->m_inodes_per_cluster) {
-		fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
-		chunkino = agino - irec->ir_startino;
-		agbno = XFS_AGINO_TO_AGBNO(mp, agino);
+	/* Map this inode cluster */
+	agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino + cluster_base);
 
-		/* Compute the holemask mask for this cluster. */
-		for (clusterino = 0, holemask = 0; clusterino < nr_inodes;
-		     clusterino += XFS_INODES_PER_HOLEMASK_BIT)
-			holemask |= XFS_INOBT_MASK((chunkino + clusterino) /
-					XFS_INODES_PER_HOLEMASK_BIT);
+	/* Compute a bitmask for this cluster that can be used for holemask. */
+	for (cluster_index = 0;
+	     cluster_index < nr_inodes;
+	     cluster_index += XFS_INODES_PER_HOLEMASK_BIT)
+		cluster_mask |= XFS_INOBT_MASK((cluster_base + cluster_index) /
+				XFS_INODES_PER_HOLEMASK_BIT);
 
-		/* The whole cluster must be a hole or not a hole. */
-		ir_holemask = (irec->ir_holemask & holemask);
-		if (ir_holemask != holemask && ir_holemask != 0) {
-			xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
-			continue;
-		}
+	/*
+	 * Map the first inode of this cluster to a buffer and offset.
+	 * Be careful about inobt records that don't align with the start of
+	 * the inode buffer when block sizes are large enough to hold multiple
+	 * inode chunks.  When this happens, cluster_base will be zero but
+	 * ir_startino can be large enough to make im_boffset nonzero.
+	 */
+	ir_holemask = (irec->ir_holemask & cluster_mask);
+	imap.im_blkno = XFS_AGB_TO_DADDR(mp, agno, agbno);
+	imap.im_len = XFS_FSB_TO_BB(mp, mp->m_blocks_per_cluster);
+	imap.im_boffset = XFS_INO_TO_OFFSET(mp, irec->ir_startino);
 
-		/* If any part of this is a hole, skip it. */
-		if (ir_holemask) {
-			xchk_xref_is_not_owned_by(bs->sc, agbno,
-					mp->m_blocks_per_cluster,
-					&XFS_RMAP_OINFO_INODES);
-			continue;
-		}
+	if (imap.im_boffset != 0 && cluster_base != 0) {
+		ASSERT(imap.im_boffset == 0 || cluster_base == 0);
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return 0;
+	}
 
-		xchk_xref_is_owned_by(bs->sc, agbno, mp->m_blocks_per_cluster,
-				&XFS_RMAP_OINFO_INODES);
+	trace_xchk_iallocbt_check_cluster(mp, agno, irec->ir_startino,
+			imap.im_blkno, imap.im_len, cluster_base, nr_inodes,
+			cluster_mask, ir_holemask,
+			XFS_INO_TO_OFFSET(mp, irec->ir_startino +
+					  cluster_base));
 
+	/* The whole cluster must be a hole or not a hole. */
+	if (ir_holemask != cluster_mask && ir_holemask != 0) {
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return 0;
+	}
+
+	/* If any part of this is a hole, skip it. */
+	if (ir_holemask) {
+		xchk_xref_is_not_owned_by(bs->sc, agbno,
+				mp->m_blocks_per_cluster,
+				&XFS_RMAP_OINFO_INODES);
+		return 0;
+	}
+
-		/* Grab the inode cluster buffer. */
-		imap.im_blkno = XFS_AGB_TO_DADDR(mp, bs->cur->bc_private.a.agno,
-				agbno);
-		imap.im_len = XFS_FSB_TO_BB(mp, mp->m_blocks_per_cluster);
-		imap.im_boffset = 0;
+	xchk_xref_is_owned_by(bs->sc, agbno, mp->m_blocks_per_cluster,
+			&XFS_RMAP_OINFO_INODES);
 
-		error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap,
-				&dip, &bp, 0, 0);
-		if (!xchk_btree_xref_process_error(bs->sc, bs->cur, 0,
-				&error))
-			continue;
+	/* Grab the inode cluster buffer. */
+	error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap, &dip, &cluster_bp,
+			0, 0);
+	if (!xchk_btree_xref_process_error(bs->sc, bs->cur, 0, &error))
+		return error;
 
-		/* Which inodes are free? */
-		for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
-			error = xchk_iallocbt_check_cluster_freemask(bs,
-					fsino, chunkino, clusterino, irec, bp);
-			if (error) {
-				xfs_trans_brelse(bs->cur->bc_tp, bp);
-				return error;
-			}
+	/* Check free status of each inode within this cluster. */
+	for (cluster_index = 0; cluster_index < nr_inodes; cluster_index++) {
+		struct xfs_dinode	*dip;
+
+		if (imap.im_boffset >= BBTOB(cluster_bp->b_length)) {
+			xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+			break;
 		}
 
-		xfs_trans_brelse(bs->cur->bc_tp, bp);
+		dip = xfs_buf_offset(cluster_bp, imap.im_boffset);
+		error = xchk_iallocbt_check_cluster_ifree(bs, irec,
+				cluster_base + cluster_index, dip);
+		if (error)
+			break;
+		imap.im_boffset += mp->m_sb.sb_inodesize;
 	}
 
+	xfs_trans_brelse(bs->cur->bc_tp, cluster_bp);
 	return error;
 }
 
+/*
+ * For all the inode clusters that could map to this inobt record, make sure
+ * that the holemask makes sense and that the allocation status of each inode
+ * matches the freemask.
+ */
+STATIC int
+xchk_iallocbt_check_clusters(
+	struct xchk_btree		*bs,
+	struct xfs_inobt_rec_incore	*irec)
+{
+	unsigned int			cluster_base;
+	int				error = 0;
+
+	/*
+	 * For the common case where this inobt record maps to multiple inode
+	 * clusters this will call _check_cluster for each cluster.
+	 *
+	 * For the case that multiple inobt records map to a single cluster,
+	 * this will call _check_cluster once.
+	 */
+	for (cluster_base = 0;
+	     cluster_base < XFS_INODES_PER_CHUNK;
+	     cluster_base += bs->sc->mp->m_inodes_per_cluster) {
+		error = xchk_iallocbt_check_cluster(bs, irec, cluster_base);
+		if (error)
+			break;
+	}
+
+	return error;
+}
+
+/*
+ * Make sure this inode btree record is aligned properly.  Because a fs block
+ * contains multiple inodes, we check that the inobt record is aligned to the
+ * correct inode, not just the correct block on disk.  This results in a finer
+ * grained corruption check.
+ */
+STATIC void
+xchk_iallocbt_rec_alignment(
+	struct xchk_btree		*bs,
+	struct xfs_inobt_rec_incore	*irec)
+{
+	struct xfs_mount		*mp = bs->sc->mp;
+	struct xchk_iallocbt		*iabt = bs->private;
+
+	/*
+	 * finobt records have different positioning requirements than inobt
+	 * records: each finobt record must have a corresponding inobt record.
+	 * That is checked in the xref function, so for now we only catch the
+	 * obvious case where the record isn't at all aligned properly.
+	 *
+	 * Note that if a fs block contains more than a single chunk of inodes,
+	 * we will have finobt records only for those chunks containing free
+	 * inodes, and therefore expect chunk alignment of finobt records.
+	 * Otherwise, we expect that the finobt record is aligned to the
+	 * cluster alignment as told by the superblock.
+	 */
+	if (bs->cur->bc_btnum == XFS_BTNUM_FINO) {
+		unsigned int	imask;
+
+		imask = min_t(unsigned int, XFS_INODES_PER_CHUNK,
+				mp->m_cluster_align_inodes) - 1;
+		if (irec->ir_startino & imask)
+			xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return;
+	}
+
+	if (iabt->next_startino != NULLAGINO) {
+		/*
+		 * We're midway through a cluster of inodes that is mapped by
+		 * multiple inobt records.  Did we get the record for the next
+		 * irec in the sequence?
+		 */
+		if (irec->ir_startino != iabt->next_startino) {
+			xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+			return;
+		}
+
+		iabt->next_startino += XFS_INODES_PER_CHUNK;
+
+		/* Are we done with the cluster? */
+		if (iabt->next_startino >= iabt->next_cluster_ino) {
+			iabt->next_startino = NULLAGINO;
+			iabt->next_cluster_ino = NULLAGINO;
+		}
+		return;
+	}
+
+	/* inobt records must be aligned to cluster and inoalignmnt size. */
+	if (irec->ir_startino & (mp->m_cluster_align_inodes - 1)) {
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return;
+	}
+
+	if (irec->ir_startino & (mp->m_inodes_per_cluster - 1)) {
+		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+		return;
+	}
+
+	if (mp->m_inodes_per_cluster <= XFS_INODES_PER_CHUNK)
+		return;
+
+	/*
+	 * If this is the start of an inode cluster that can be mapped by
+	 * multiple inobt records, the next inobt record must follow exactly
+	 * after this one.
+	 */
+	iabt->next_startino = irec->ir_startino + XFS_INODES_PER_CHUNK;
+	iabt->next_cluster_ino = irec->ir_startino + mp->m_inodes_per_cluster;
+}
+
 /* Scrub an inobt/finobt record. */
 STATIC int
 xchk_iallocbt_rec(
@@ -276,7 +433,6 @@ xchk_iallocbt_rec(
 	uint64_t			holes;
 	xfs_agnumber_t			agno = bs->cur->bc_private.a.agno;
 	xfs_agino_t			agino;
-	xfs_agblock_t			agbno;
 	xfs_extlen_t			len;
 	int				holecount;
 	int				i;
@@ -303,11 +459,9 @@ xchk_iallocbt_rec(
 		goto out;
 	}
 
-	/* Make sure this record is aligned to cluster and inoalignmnt size. */
-	agbno = XFS_AGINO_TO_AGBNO(mp, irec.ir_startino);
-	if ((agbno & (mp->m_cluster_align - 1)) ||
-	    (agbno & (mp->m_blocks_per_cluster - 1)))
-		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
+	xchk_iallocbt_rec_alignment(bs, &irec);
+	if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		goto out;
 
 	iabt->inodes += irec.ir_count;
 
@@ -320,7 +474,7 @@ xchk_iallocbt_rec(
 
 		if (!xchk_iallocbt_chunk(bs, &irec, agino, len))
 			goto out;
-		goto check_freemask;
+		goto check_clusters;
 	}
 
 	/* Check each chunk of a sparse inode cluster. */
@@ -346,8 +500,8 @@ xchk_iallocbt_rec(
 	    holecount + irec.ir_count != XFS_INODES_PER_CHUNK)
 		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
 
-check_freemask:
-	error = xchk_iallocbt_check_freemask(bs, &irec);
+check_clusters:
+	error = xchk_iallocbt_check_clusters(bs, &irec);
 	if (error)
 		goto out;
 
@@ -429,6 +583,8 @@ xchk_iallocbt(
 	struct xfs_btree_cur	*cur;
 	struct xchk_iallocbt	iabt = {
 		.inodes		= 0,
		.next_startino	= NULLAGINO,
		.next_cluster_ino = NULLAGINO,
 	};
 	int			error;
 
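The cluster checks above exist because inobt records (always 64 inodes) and
inode clusters need not map 1:1; either side can span the other depending on
block and inode size. A small standalone C sketch of the arithmetic, using
made-up example geometries rather than values read from a real superblock:

#include <stdio.h>

#define XFS_INODES_PER_CHUNK	64

static void show(unsigned int blocksize, unsigned int inodesize,
		 unsigned int blocks_per_cluster)
{
	unsigned int inodes_per_cluster =
		blocks_per_cluster * blocksize / inodesize;

	if (inodes_per_cluster <= XFS_INODES_PER_CHUNK)
		printf("%uk blocks: %u cluster(s) per inobt record\n",
		       blocksize >> 10,
		       XFS_INODES_PER_CHUNK / inodes_per_cluster);
	else
		printf("%uk blocks: %u inobt records per cluster\n",
		       blocksize >> 10,
		       inodes_per_cluster / XFS_INODES_PER_CHUNK);
}

int main(void)
{
	show(4096, 512, 2);	/* 16 inodes/cluster -> 4 clusters/record */
	show(65536, 512, 1);	/* 128 inodes/cluster -> 2 records/cluster */
	return 0;
}

The second case is the one that needs the next_startino/next_cluster_ino
state: two consecutive inobt records describe one cluster, so the scrubber
must remember where the cluster ends.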
@@ -743,7 +743,8 @@ xrep_findroot_block(
 
 	/* Ensure the block magic matches the btree type we're looking for. */
 	btblock = XFS_BUF_TO_BLOCK(bp);
-	if (be32_to_cpu(btblock->bb_magic) != fab->magic)
+	ASSERT(fab->buf_ops->magic[1] != 0);
+	if (btblock->bb_magic != fab->buf_ops->magic[1])
 		goto out;
 
 	/*
@@ -42,9 +42,6 @@ struct xrep_find_ag_btree {
 	/* in: buffer ops */
 	const struct xfs_buf_ops	*buf_ops;
 
-	/* in: magic number of the btree */
-	uint32_t			magic;
-
 	/* out: the highest btree block found and the tree height */
 	xfs_agblock_t			root;
 	unsigned int			height;
@@ -141,9 +141,8 @@ xchk_xref_is_used_rt_space(
 	startext = fsbno;
 	endext = fsbno + len - 1;
 	do_div(startext, sc->mp->m_sb.sb_rextsize);
-	if (do_div(endext, sc->mp->m_sb.sb_rextsize))
-		endext++;
-	extcount = endext - startext;
+	do_div(endext, sc->mp->m_sb.sb_rextsize);
+	extcount = endext - startext + 1;
 	xfs_ilock(sc->mp->m_rbmip, XFS_ILOCK_SHARED | XFS_ILOCK_RTBITMAP);
 	error = xfs_rtalloc_extent_is_free(sc->mp, sc->tp, startext, extcount,
 			&is_free);
@@ -545,6 +545,51 @@ TRACE_EVENT(xchk_xref_error,
 		  __entry->ret_ip)
 );
 
+TRACE_EVENT(xchk_iallocbt_check_cluster,
+	TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno,
+		 xfs_agino_t startino, xfs_daddr_t map_daddr,
+		 unsigned short map_len, unsigned int chunk_ino,
+		 unsigned int nr_inodes, uint16_t cluster_mask,
+		 uint16_t holemask, unsigned int cluster_ino),
+	TP_ARGS(mp, agno, startino, map_daddr, map_len, chunk_ino, nr_inodes,
+		cluster_mask, holemask, cluster_ino),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(xfs_agino_t, startino)
+		__field(xfs_daddr_t, map_daddr)
+		__field(unsigned short, map_len)
+		__field(unsigned int, chunk_ino)
+		__field(unsigned int, nr_inodes)
+		__field(unsigned int, cluster_ino)
+		__field(uint16_t, cluster_mask)
+		__field(uint16_t, holemask)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->agno = agno;
+		__entry->startino = startino;
+		__entry->map_daddr = map_daddr;
+		__entry->map_len = map_len;
+		__entry->chunk_ino = chunk_ino;
+		__entry->nr_inodes = nr_inodes;
+		__entry->cluster_mask = cluster_mask;
+		__entry->holemask = holemask;
+		__entry->cluster_ino = cluster_ino;
+	),
+	TP_printk("dev %d:%d agno %d startino %u daddr 0x%llx len %d chunkino %u nr_inodes %u cluster_mask 0x%x holemask 0x%x cluster_ino %u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->agno,
+		  __entry->startino,
+		  __entry->map_daddr,
+		  __entry->map_len,
+		  __entry->chunk_ino,
+		  __entry->nr_inodes,
+		  __entry->cluster_mask,
+		  __entry->holemask,
+		  __entry->cluster_ino)
+)
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 
@@ -28,7 +28,8 @@
  */
 struct xfs_writepage_ctx {
 	struct xfs_bmbt_irec    imap;
-	unsigned int		io_type;
+	int			fork;
+	unsigned int		data_seq;
 	unsigned int		cow_seq;
 	struct xfs_ioend	*ioend;
 };
@@ -255,30 +256,20 @@ xfs_end_io(
 	 */
 	error = blk_status_to_errno(ioend->io_bio->bi_status);
 	if (unlikely(error)) {
-		switch (ioend->io_type) {
-		case XFS_IO_COW:
+		if (ioend->io_fork == XFS_COW_FORK)
 			xfs_reflink_cancel_cow_range(ip, offset, size, true);
-			break;
-		}
-
 		goto done;
 	}
 
 	/*
-	 * Success:  commit the COW or unwritten blocks if needed.
+	 * Success: commit the COW or unwritten blocks if needed.
 	 */
-	switch (ioend->io_type) {
-	case XFS_IO_COW:
+	if (ioend->io_fork == XFS_COW_FORK)
 		error = xfs_reflink_end_cow(ip, offset, size);
-		break;
-	case XFS_IO_UNWRITTEN:
-		/* writeback should never update isize */
+	else if (ioend->io_state == XFS_EXT_UNWRITTEN)
 		error = xfs_iomap_write_unwritten(ip, offset, size, false);
-		break;
-	default:
+	else
 		ASSERT(!xfs_ioend_is_append(ioend) || ioend->io_append_trans);
-		break;
-	}
 
 done:
 	if (ioend->io_append_trans)
@@ -293,7 +284,8 @@ xfs_end_bio(
 	struct xfs_ioend	*ioend = bio->bi_private;
 	struct xfs_mount	*mp = XFS_I(ioend->io_inode)->i_mount;
 
-	if (ioend->io_type == XFS_IO_UNWRITTEN || ioend->io_type == XFS_IO_COW)
+	if (ioend->io_fork == XFS_COW_FORK ||
+	    ioend->io_state == XFS_EXT_UNWRITTEN)
 		queue_work(mp->m_unwritten_workqueue, &ioend->io_work);
 	else if (ioend->io_append_trans)
 		queue_work(mp->m_data_workqueue, &ioend->io_work);
@@ -301,6 +293,75 @@ xfs_end_bio(
 	xfs_destroy_ioend(ioend, blk_status_to_errno(bio->bi_status));
 }
 
+/*
+ * Fast revalidation of the cached writeback mapping. Return true if the current
+ * mapping is valid, false otherwise.
+ */
+static bool
+xfs_imap_valid(
+	struct xfs_writepage_ctx	*wpc,
+	struct xfs_inode		*ip,
+	xfs_fileoff_t			offset_fsb)
+{
+	if (offset_fsb < wpc->imap.br_startoff ||
+	    offset_fsb >= wpc->imap.br_startoff + wpc->imap.br_blockcount)
+		return false;
+	/*
+	 * If this is a COW mapping, it is sufficient to check that the mapping
+	 * covers the offset. Be careful to check this first because the caller
+	 * can revalidate a COW mapping without updating the data seqno.
+	 */
+	if (wpc->fork == XFS_COW_FORK)
+		return true;
+
+	/*
+	 * This is not a COW mapping. Check the sequence number of the data fork
+	 * because concurrent changes could have invalidated the extent. Check
+	 * the COW fork because concurrent changes since the last time we
+	 * checked (and found nothing at this offset) could have added
+	 * overlapping blocks.
+	 */
+	if (wpc->data_seq != READ_ONCE(ip->i_df.if_seq))
+		return false;
+	if (xfs_inode_has_cow_data(ip) &&
+	    wpc->cow_seq != READ_ONCE(ip->i_cowfp->if_seq))
+		return false;
+	return true;
+}
+
+/*
+ * Pass in a dellalloc extent and convert it to real extents, return the real
+ * extent that maps offset_fsb in wpc->imap.
+ *
+ * The current page is held locked so nothing could have removed the block
+ * backing offset_fsb, although it could have moved from the COW to the data
+ * fork by another thread.
+ */
+static int
+xfs_convert_blocks(
+	struct xfs_writepage_ctx *wpc,
+	struct xfs_inode	*ip,
+	xfs_fileoff_t		offset_fsb)
+{
+	int			error;
+
+	/*
+	 * Attempt to allocate whatever delalloc extent currently backs
+	 * offset_fsb and put the result into wpc->imap.  Allocate in a loop
+	 * because it may take several attempts to allocate real blocks for a
+	 * contiguous delalloc extent if free space is sufficiently fragmented.
+	 */
+	do {
+		error = xfs_bmapi_convert_delalloc(ip, wpc->fork, offset_fsb,
+				&wpc->imap, wpc->fork == XFS_COW_FORK ?
+					&wpc->cow_seq : &wpc->data_seq);
+		if (error)
+			return error;
+	} while (wpc->imap.br_startoff + wpc->imap.br_blockcount <= offset_fsb);
+
+	return 0;
+}
+
 STATIC int
 xfs_map_blocks(
 	struct xfs_writepage_ctx *wpc,
@@ -310,26 +371,16 @@ xfs_map_blocks(
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
 	ssize_t			count = i_blocksize(inode);
-	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset), end_fsb;
+	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
+	xfs_fileoff_t		end_fsb = XFS_B_TO_FSB(mp, offset + count);
 	xfs_fileoff_t		cow_fsb = NULLFILEOFF;
 	struct xfs_bmbt_irec	imap;
-	int			whichfork = XFS_DATA_FORK;
 	struct xfs_iext_cursor	icur;
-	bool			imap_valid;
+	int			retries = 0;
 	int			error = 0;
 
-	/*
-	 * We have to make sure the cached mapping is within EOF to protect
-	 * against eofblocks trimming on file release leaving us with a stale
-	 * mapping. Otherwise, a page for a subsequent file extending buffered
-	 * write could get picked up by this writeback cycle and written to the
-	 * wrong blocks.
-	 *
-	 * Note that what we really want here is a generic mapping invalidation
-	 * mechanism to protect us from arbitrary extent modifying contexts, not
-	 * just eofblocks.
-	 */
-	xfs_trim_extent_eof(&wpc->imap, ip);
-
 	if (XFS_FORCED_SHUTDOWN(mp))
 		return -EIO;
 
 	/*
 	 * COW fork blocks can overlap data fork blocks even if the blocks
@@ -346,31 +397,19 @@ xfs_map_blocks(
 	 * against concurrent updates and provides a memory barrier on the way
 	 * out that ensures that we always see the current value.
 	 */
-	imap_valid = offset_fsb >= wpc->imap.br_startoff &&
-		     offset_fsb < wpc->imap.br_startoff + wpc->imap.br_blockcount;
-	if (imap_valid &&
-	    (!xfs_inode_has_cow_data(ip) ||
-	     wpc->io_type == XFS_IO_COW ||
-	     wpc->cow_seq == READ_ONCE(ip->i_cowfp->if_seq)))
+	if (xfs_imap_valid(wpc, ip, offset_fsb))
 		return 0;
 
-	if (XFS_FORCED_SHUTDOWN(mp))
-		return -EIO;
-
 	/*
 	 * If we don't have a valid map, now it's time to get a new one for this
 	 * offset.  This will convert delayed allocations (including COW ones)
 	 * into real extents.  If we return without a valid map, it means we
 	 * landed in a hole and we skip the block.
 	 */
+retry:
 	xfs_ilock(ip, XFS_ILOCK_SHARED);
 	ASSERT(ip->i_d.di_format != XFS_DINODE_FMT_BTREE ||
 	       (ip->i_df.if_flags & XFS_IFEXTENTS));
 	ASSERT(offset <= mp->m_super->s_maxbytes);
 
-	if (offset > mp->m_super->s_maxbytes - count)
-		count = mp->m_super->s_maxbytes - offset;
-	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)offset + count);
-
 	/*
 	 * Check if this is offset is covered by a COW extents, and if yes use
@@ -382,30 +421,16 @@ xfs_map_blocks(
 	if (cow_fsb != NULLFILEOFF && cow_fsb <= offset_fsb) {
 		wpc->cow_seq = READ_ONCE(ip->i_cowfp->if_seq);
 		xfs_iunlock(ip, XFS_ILOCK_SHARED);
-		/*
-		 * Truncate can race with writeback since writeback doesn't
-		 * take the iolock and truncate decreases the file size before
-		 * it starts truncating the pages between new_size and old_size.
-		 * Therefore, we can end up in the situation where writeback
-		 * gets a CoW fork mapping but the truncate makes the mapping
-		 * invalid and we end up in here trying to get a new mapping.
-		 * bail out here so that we simply never get a valid mapping
-		 * and so we drop the write altogether. The page truncation
-		 * will kill the contents anyway.
-		 */
-		if (offset > i_size_read(inode)) {
-			wpc->io_type = XFS_IO_HOLE;
-			return 0;
-		}
-		whichfork = XFS_COW_FORK;
-		wpc->io_type = XFS_IO_COW;
+
+		wpc->fork = XFS_COW_FORK;
 		goto allocate_blocks;
 	}
 
 	/*
-	 * Map valid and no COW extent in the way?  We're done.
+	 * No COW extent overlap. Revalidate now that we may have updated
+	 * ->cow_seq. If the data mapping is still valid, we're done.
 	 */
-	if (imap_valid) {
+	if (xfs_imap_valid(wpc, ip, offset_fsb)) {
 		xfs_iunlock(ip, XFS_ILOCK_SHARED);
 		return 0;
 	}
@@ -417,51 +442,65 @@ xfs_map_blocks(
 	 */
 	if (!xfs_iext_lookup_extent(ip, &ip->i_df, offset_fsb, &icur, &imap))
 		imap.br_startoff = end_fsb;	/* fake a hole past EOF */
+	wpc->data_seq = READ_ONCE(ip->i_df.if_seq);
 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
 
+	wpc->fork = XFS_DATA_FORK;
+
+	/* landed in a hole or beyond EOF? */
 	if (imap.br_startoff > offset_fsb) {
-		/* landed in a hole or beyond EOF */
 		imap.br_blockcount = imap.br_startoff - offset_fsb;
 		imap.br_startoff = offset_fsb;
 		imap.br_startblock = HOLESTARTBLOCK;
-		wpc->io_type = XFS_IO_HOLE;
-	} else {
-		/*
-		 * Truncate to the next COW extent if there is one.  This is the
-		 * only opportunity to do this because we can skip COW fork
-		 * lookups for the subsequent blocks in the mapping; however,
-		 * the requirement to treat the COW range separately remains.
-		 */
-		if (cow_fsb != NULLFILEOFF &&
-		    cow_fsb < imap.br_startoff + imap.br_blockcount)
-			imap.br_blockcount = cow_fsb - imap.br_startoff;
-
-		if (isnullstartblock(imap.br_startblock)) {
-			/* got a delalloc extent */
-			wpc->io_type = XFS_IO_DELALLOC;
-			goto allocate_blocks;
-		}
-
-		if (imap.br_state == XFS_EXT_UNWRITTEN)
-			wpc->io_type = XFS_IO_UNWRITTEN;
-		else
-			wpc->io_type = XFS_IO_OVERWRITE;
+		imap.br_state = XFS_EXT_NORM;
 	}
 
+	/*
+	 * Truncate to the next COW extent if there is one.  This is the only
+	 * opportunity to do this because we can skip COW fork lookups for the
+	 * subsequent blocks in the mapping; however, the requirement to treat
+	 * the COW range separately remains.
+	 */
+	if (cow_fsb != NULLFILEOFF &&
+	    cow_fsb < imap.br_startoff + imap.br_blockcount)
+		imap.br_blockcount = cow_fsb - imap.br_startoff;
+
+	/* got a delalloc extent? */
+	if (imap.br_startblock != HOLESTARTBLOCK &&
+	    isnullstartblock(imap.br_startblock))
+		goto allocate_blocks;
+
 	wpc->imap = imap;
-	xfs_trim_extent_eof(&wpc->imap, ip);
-	trace_xfs_map_blocks_found(ip, offset, count, wpc->io_type, &imap);
+	trace_xfs_map_blocks_found(ip, offset, count, wpc->fork, &imap);
 	return 0;
 allocate_blocks:
-	error = xfs_iomap_write_allocate(ip, whichfork, offset, &imap,
-			&wpc->cow_seq);
-	if (error)
+	error = xfs_convert_blocks(wpc, ip, offset_fsb);
+	if (error) {
+		/*
+		 * If we failed to find the extent in the COW fork we might have
+		 * raced with a COW to data fork conversion or truncate.
+		 * Restart the lookup to catch the extent in the data fork for
+		 * the former case, but prevent additional retries to avoid
+		 * looping forever for the latter case.
+		 */
+		if (error == -EAGAIN && wpc->fork == XFS_COW_FORK && !retries++)
+			goto retry;
+		ASSERT(error != -EAGAIN);
 		return error;
-	ASSERT(whichfork == XFS_COW_FORK || cow_fsb == NULLFILEOFF ||
-	       imap.br_startoff + imap.br_blockcount <= cow_fsb);
-	wpc->imap = imap;
-	xfs_trim_extent_eof(&wpc->imap, ip);
-	trace_xfs_map_blocks_alloc(ip, offset, count, wpc->io_type, &imap);
+	}
+
+	/*
+	 * Due to merging the return real extent might be larger than the
+	 * original delalloc one.  Trim the return extent to the next COW
+	 * boundary again to force a re-lookup.
+	 */
+	if (wpc->fork != XFS_COW_FORK && cow_fsb != NULLFILEOFF &&
+	    cow_fsb < wpc->imap.br_startoff + wpc->imap.br_blockcount)
+		wpc->imap.br_blockcount = cow_fsb - wpc->imap.br_startoff;
+
+	ASSERT(wpc->imap.br_startoff <= offset_fsb);
+	ASSERT(wpc->imap.br_startoff + wpc->imap.br_blockcount > offset_fsb);
+	trace_xfs_map_blocks_alloc(ip, offset, count, wpc->fork, &imap);
 	return 0;
 }
 
@@ -486,7 +525,7 @@ xfs_submit_ioend(
 	int			status)
 {
 	/* Convert CoW extents to regular */
-	if (!status && ioend->io_type == XFS_IO_COW) {
+	if (!status && ioend->io_fork == XFS_COW_FORK) {
 		/*
 		 * Yuk. This can do memory allocation, but is not a
 		 * transactional operation so everything is done in GFP_KERNEL
@@ -504,7 +543,8 @@ xfs_submit_ioend(
 
 	/* Reserve log space if we might write beyond the on-disk inode size. */
 	if (!status &&
-	    ioend->io_type != XFS_IO_UNWRITTEN &&
+	    (ioend->io_fork == XFS_COW_FORK ||
+	     ioend->io_state != XFS_EXT_UNWRITTEN) &&
 	    xfs_ioend_is_append(ioend) &&
 	    !ioend->io_append_trans)
 		status = xfs_setfilesize_trans_alloc(ioend);
@@ -533,7 +573,8 @@ xfs_submit_ioend(
 static struct xfs_ioend *
 xfs_alloc_ioend(
 	struct inode		*inode,
-	unsigned int		type,
+	int			fork,
+	xfs_exntst_t		state,
 	xfs_off_t		offset,
 	struct block_device	*bdev,
 	sector_t		sector)
@@ -547,7 +588,8 @@ xfs_alloc_ioend(
 
 	ioend = container_of(bio, struct xfs_ioend, io_inline_bio);
 	INIT_LIST_HEAD(&ioend->io_list);
-	ioend->io_type = type;
+	ioend->io_fork = fork;
+	ioend->io_state = state;
 	ioend->io_inode = inode;
 	ioend->io_size = 0;
 	ioend->io_offset = offset;
@@ -608,13 +650,15 @@ xfs_add_to_ioend(
 	sector = xfs_fsb_to_db(ip, wpc->imap.br_startblock) +
 		((offset - XFS_FSB_TO_B(mp, wpc->imap.br_startoff)) >> 9);
 
-	if (!wpc->ioend || wpc->io_type != wpc->ioend->io_type ||
+	if (!wpc->ioend ||
+	    wpc->fork != wpc->ioend->io_fork ||
+	    wpc->imap.br_state != wpc->ioend->io_state ||
 	    sector != bio_end_sector(wpc->ioend->io_bio) ||
 	    offset != wpc->ioend->io_offset + wpc->ioend->io_size) {
 		if (wpc->ioend)
 			list_add(&wpc->ioend->io_list, iolist);
-		wpc->ioend = xfs_alloc_ioend(inode, wpc->io_type, offset,
-				bdev, sector);
+		wpc->ioend = xfs_alloc_ioend(inode, wpc->fork,
+				wpc->imap.br_state, offset, bdev, sector);
 	}
 
 	if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff)) {
@@ -723,7 +767,7 @@ xfs_writepage_map(
 		error = xfs_map_blocks(wpc, inode, file_offset);
 		if (error)
 			break;
-		if (wpc->io_type == XFS_IO_HOLE)
+		if (wpc->imap.br_startblock == HOLESTARTBLOCK)
 			continue;
 		xfs_add_to_ioend(inode, file_offset, page, iop, wpc, wbc,
 				&submit_list);
@@ -918,9 +962,7 @@ xfs_vm_writepage(
 	struct page		*page,
 	struct writeback_control *wbc)
 {
-	struct xfs_writepage_ctx wpc = {
-		.io_type = XFS_IO_HOLE,
-	};
+	struct xfs_writepage_ctx wpc = { };
 	int			ret;
 
 	ret = xfs_do_writepage(page, wbc, &wpc);
@@ -934,9 +976,7 @@ xfs_vm_writepages(
 	struct address_space	*mapping,
 	struct writeback_control *wbc)
 {
-	struct xfs_writepage_ctx wpc = {
-		.io_type = XFS_IO_HOLE,
-	};
+	struct xfs_writepage_ctx wpc = { };
 	int			ret;
 
 	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
@@ -983,7 +1023,7 @@ xfs_vm_bmap(
 	 * Since we don't pass back blockdev info, we can't return bmap
 	 * information for rt files either.
 	 */
-	if (xfs_is_reflink_inode(ip) || XFS_IS_REALTIME_INODE(ip))
+	if (xfs_is_cow_inode(ip) || XFS_IS_REALTIME_INODE(ip))
 		return 0;
 	return iomap_bmap(mapping, block, &xfs_iomap_ops);
 }
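The writeback rework above replaces the flat io_type with an (io_fork,
io_state) pair, so ioend completion dispatches on two orthogonal facts
instead of a five-way enum. A hedged standalone sketch of that dispatch,
with simplified enums standing in for the kernel types:

#include <stdio.h>

enum fork { DATA_FORK, COW_FORK };
enum state { EXT_NORM, EXT_UNWRITTEN };

/* Completion work is keyed off the pair, mirroring the xfs_end_io() shape. */
static const char *completion_action(enum fork fork, enum state state)
{
	if (fork == COW_FORK)
		return "remap COW blocks into the data fork";
	if (state == EXT_UNWRITTEN)
		return "convert unwritten extents";
	return "update on-disk size if appending";
}

int main(void)
{
	printf("%s\n", completion_action(COW_FORK, EXT_NORM));
	printf("%s\n", completion_action(DATA_FORK, EXT_UNWRITTEN));
	printf("%s\n", completion_action(DATA_FORK, EXT_NORM));
	return 0;
}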
@@ -8,33 +8,13 @@
 
 extern struct bio_set xfs_ioend_bioset;
 
-/*
- * Types of I/O for bmap clustering and I/O completion tracking.
- *
- * This enum is used in string mapping in xfs_trace.h; please keep the
- * TRACE_DEFINE_ENUMs for it up to date.
- */
-enum {
-	XFS_IO_HOLE,		/* covers region without any block allocation */
-	XFS_IO_DELALLOC,	/* covers delalloc region */
-	XFS_IO_UNWRITTEN,	/* covers allocated but uninitialized data */
-	XFS_IO_OVERWRITE,	/* covers already allocated extent */
-	XFS_IO_COW,		/* covers copy-on-write extent */
-};
-
-#define XFS_IO_TYPES \
-	{ XFS_IO_HOLE,		"hole" },	\
-	{ XFS_IO_DELALLOC,	"delalloc" },	\
-	{ XFS_IO_UNWRITTEN,	"unwritten" },	\
-	{ XFS_IO_OVERWRITE,	"overwrite" },	\
-	{ XFS_IO_COW,		"CoW" }
-
 /*
  * Structure for buffered I/O completions.
  */
 struct xfs_ioend {
 	struct list_head	io_list;	/* next ioend in chain */
-	unsigned int		io_type;	/* delalloc / unwritten */
+	int			io_fork;	/* inode fork written back */
+	xfs_exntst_t		io_state;	/* extent state */
 	struct inode		*io_inode;	/* file being written to */
 	size_t			io_size;	/* size of the extent */
 	xfs_off_t		io_offset;	/* offset in the file */
@@ -555,6 +555,7 @@ xfs_attr_put_listent(
 	attrlist_ent_t *aep;
 	int arraytop;
 
+	ASSERT(!context->seen_enough);
 	ASSERT(!(context->flags & ATTR_KERNOVAL));
 	ASSERT(context->count >= 0);
 	ASSERT(context->count < (ATTR_MAX_VALUELEN/8));
@@ -1162,16 +1162,13 @@ xfs_zero_file_space(
 	 * by virtue of the hole punch.
 	 */
 	error = xfs_free_file_space(ip, offset, len);
-	if (error)
-		goto out;
+	if (error || xfs_is_always_cow_inode(ip))
+		return error;
 
-	error = xfs_alloc_file_space(ip, round_down(offset, blksize),
+	return xfs_alloc_file_space(ip, round_down(offset, blksize),
 				 round_up(offset + len, blksize) -
 				 round_down(offset, blksize),
 				 XFS_BMAPI_PREALLOC);
-out:
-	return error;
-
 }
 
 static int
@@ -776,29 +776,24 @@ _xfs_buf_read(
 }
 
 /*
- * Set buffer ops on an unchecked buffer and validate it, if possible.
+ * Reverify a buffer found in cache without an attached ->b_ops.
  *
- * If the caller passed in an ops structure and the buffer doesn't have ops
- * assigned, set the ops and use them to verify the contents. If the contents
- * cannot be verified, we'll clear XBF_DONE. We assume the buffer has no
- * recorded errors and is already in XBF_DONE state.
+ * If the caller passed an ops structure and the buffer doesn't have ops
+ * assigned, set the ops and use it to verify the contents. If verification
+ * fails, clear XBF_DONE. We assume the buffer has no recorded errors and is
+ * already in XBF_DONE state on entry.
  *
- * Under normal operations, every in-core buffer must have buffer ops assigned
- * to them when the buffer is read in from disk so that we can validate the
- * metadata.
- *
- * However, there are two scenarios where one can encounter in-core buffers
- * that don't have buffer ops. The first is during log recovery of buffers on
- * a V4 filesystem, though these buffers are purged at the end of recovery.
- *
- * The other is online repair, which tries to match arbitrary metadata blocks
- * with btree types in order to find the root. If online repair doesn't match
- * the buffer with /any/ btree type, the buffer remains in memory in DONE state
- * with no ops, and a subsequent read_buf call from elsewhere will not set the
- * ops. This function helps us fix this situation.
+ * Under normal operations, every in-core buffer is verified on read I/O
+ * completion. There are two scenarios that can lead to in-core buffers without
+ * an assigned ->b_ops. The first is during log recovery of buffers on a V4
+ * filesystem, though these buffers are purged at the end of recovery. The
+ * other is online repair, which intentionally reads with a NULL buffer ops to
+ * run several verifiers across an in-core buffer in order to establish buffer
+ * type. If repair can't establish that, the buffer will be left in memory
+ * with NULL buffer ops.
  */
 int
-xfs_buf_ensure_ops(
+xfs_buf_reverify(
 	struct xfs_buf		*bp,
 	const struct xfs_buf_ops *ops)
 {
@@ -840,7 +835,7 @@ xfs_buf_read_map(
 		return bp;
 	}
 
-	xfs_buf_ensure_ops(bp, ops);
+	xfs_buf_reverify(bp, ops);
 
 	if (flags & XBF_ASYNC) {
 		/*
@@ -2209,3 +2204,40 @@ void xfs_buf_set_ref(struct xfs_buf *bp, int lru_ref)
 
 	atomic_set(&bp->b_lru_ref, lru_ref);
 }
+
+/*
+ * Verify an on-disk magic value against the magic value specified in the
+ * verifier structure. The verifier magic is in disk byte order so the caller is
+ * expected to pass the value directly from disk.
+ */
+bool
+xfs_verify_magic(
+	struct xfs_buf		*bp,
+	__be32			dmagic)
+{
+	struct xfs_mount	*mp = bp->b_target->bt_mount;
+	int			idx;
+
+	idx = xfs_sb_version_hascrc(&mp->m_sb);
+	if (unlikely(WARN_ON(!bp->b_ops || !bp->b_ops->magic[idx])))
+		return false;
+	return dmagic == bp->b_ops->magic[idx];
+}
+/*
+ * Verify an on-disk magic value against the magic value specified in the
+ * verifier structure. The verifier magic is in disk byte order so the caller is
+ * expected to pass the value directly from disk.
+ */
+bool
+xfs_verify_magic16(
+	struct xfs_buf		*bp,
+	__be16			dmagic)
+{
+	struct xfs_mount	*mp = bp->b_target->bt_mount;
+	int			idx;
+
+	idx = xfs_sb_version_hascrc(&mp->m_sb);
+	if (unlikely(WARN_ON(!bp->b_ops || !bp->b_ops->magic16[idx])))
+		return false;
+	return dmagic == bp->b_ops->magic16[idx];
+}
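xfs_verify_magic()/xfs_verify_magic16() above turn per-verifier magic compares
into a table lookup keyed by whether the filesystem has CRCs (v5). A
standalone mock of the idea, assuming a plain has_crc flag in place of
xfs_sb_version_hascrc():

#include <stdint.h>
#include <stdio.h>

struct buf_ops {
	const char	*name;
	uint32_t	magic[2];	/* [0]=v4, [1]=v5; 0 = unsupported */
};

static int verify_magic(const struct buf_ops *ops, int has_crc,
			uint32_t dmagic)
{
	int idx = !!has_crc;

	if (ops->magic[idx] == 0)	/* no such block on this format */
		return 0;
	return dmagic == ops->magic[idx];
}

int main(void)
{
	/* "RMB3", in the style of the XFS rmapbt magic; v4 slot is 0. */
	struct buf_ops rmapbt = { "rmapbt", { 0, 0x524d4233 } };

	printf("v5 ok: %d\n", verify_magic(&rmapbt, 1, 0x524d4233));
	printf("v4 ok: %d\n", verify_magic(&rmapbt, 0, 0x524d4233));
	return 0;
}

Keeping both format slots in the ops table is also what lets repair assert on
magic[1] directly when hunting for btree roots, as in the xrep_findroot_block
hunk earlier.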
@@ -125,6 +125,10 @@ struct xfs_buf_map {
 
 struct xfs_buf_ops {
 	char *name;
+	union {
+		__be32 magic[2];	/* v4 and v5 on disk magic values */
+		__be16 magic16[2];	/* v4 and v5 on disk magic values */
+	};
 	void (*verify_read)(struct xfs_buf *);
 	void (*verify_write)(struct xfs_buf *);
 	xfs_failaddr_t (*verify_struct)(struct xfs_buf *bp);
@@ -385,6 +389,8 @@ extern int xfs_setsize_buftarg(xfs_buftarg_t *, unsigned int);
 #define xfs_getsize_buftarg(buftarg)	block_size((buftarg)->bt_bdev)
 #define xfs_readonly_buftarg(buftarg)	bdev_read_only((buftarg)->bt_bdev)
 
-int xfs_buf_ensure_ops(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
+int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
+bool xfs_verify_magic(struct xfs_buf *bp, __be32 dmagic);
+bool xfs_verify_magic16(struct xfs_buf *bp, __be16 dmagic);
 
 #endif /* __XFS_BUF_H__ */
@@ -51,6 +51,7 @@ static unsigned int xfs_errortag_random_default[] = {
 	XFS_RANDOM_BUF_LRU_REF,
 	XFS_RANDOM_FORCE_SCRUB_REPAIR,
 	XFS_RANDOM_FORCE_SUMMARY_RECALC,
+	XFS_RANDOM_IUNLINK_FALLBACK,
 };
 
 struct xfs_errortag_attr {
@@ -159,6 +160,7 @@ XFS_ERRORTAG_ATTR_RW(log_item_pin, XFS_ERRTAG_LOG_ITEM_PIN);
 XFS_ERRORTAG_ATTR_RW(buf_lru_ref,	XFS_ERRTAG_BUF_LRU_REF);
 XFS_ERRORTAG_ATTR_RW(force_repair,	XFS_ERRTAG_FORCE_SCRUB_REPAIR);
 XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
+XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
 
 static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -195,6 +197,7 @@ static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(buf_lru_ref),
 	XFS_ERRORTAG_ATTR_LIST(force_repair),
 	XFS_ERRORTAG_ATTR_LIST(bad_summary),
+	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
 	NULL,
 };
 
@@ -357,7 +360,8 @@ xfs_buf_verifier_error(
 	fa = failaddr ? failaddr : __return_address;
 	__xfs_buf_ioerror(bp, error, fa);
 
-	xfs_alert(mp, "Metadata %s detected at %pS, %s block 0x%llx %s",
+	xfs_alert_tag(mp, XFS_PTAG_VERIFIER_ERROR,
+		 "Metadata %s detected at %pS, %s block 0x%llx %s",
 		  bp->b_error == -EFSBADCRC ? "CRC error" : "corruption",
 		  fa, bp->b_ops->name, bp->b_bn, name);
 
@@ -98,5 +98,6 @@ extern int xfs_errortag_clearall(struct xfs_mount *mp);
 #define XFS_PTAG_SHUTDOWN_IOERROR	0x00000020
 #define XFS_PTAG_SHUTDOWN_LOGERROR	0x00000040
 #define XFS_PTAG_FSBLOCK_ZERO		0x00000080
+#define XFS_PTAG_VERIFIER_ERROR		0x00000100
 
 #endif /* __XFS_ERROR_H__ */
@@ -507,7 +507,7 @@ xfs_file_dio_aio_write(
 	 * We can't properly handle unaligned direct I/O to reflink
 	 * files yet, as we can't unshare a partial block.
 	 */
-	if (xfs_is_reflink_inode(ip)) {
+	if (xfs_is_cow_inode(ip)) {
 		trace_xfs_reflink_bounce_dio_write(ip, iocb->ki_pos, count);
 		return -EREMCHG;
 	}
@@ -872,14 +872,27 @@ xfs_file_fallocate(
 			goto out_unlock;
 	}
 
-	if (mode & FALLOC_FL_ZERO_RANGE)
+	if (mode & FALLOC_FL_ZERO_RANGE) {
 		error = xfs_zero_file_space(ip, offset, len);
-	else {
-		if (mode & FALLOC_FL_UNSHARE_RANGE) {
-			error = xfs_reflink_unshare(ip, offset, len);
-			if (error)
-				goto out_unlock;
+	} else if (mode & FALLOC_FL_UNSHARE_RANGE) {
+		error = xfs_reflink_unshare(ip, offset, len);
+		if (error)
+			goto out_unlock;
+
+		if (!xfs_is_always_cow_inode(ip)) {
+			error = xfs_alloc_file_space(ip, offset, len,
+					XFS_BMAPI_PREALLOC);
 		}
+	} else {
+		/*
+		 * If always_cow mode we can't use preallocations and
+		 * thus should not create them.
+		 */
+		if (xfs_is_always_cow_inode(ip)) {
+			error = -EOPNOTSUPP;
+			goto out_unlock;
+		}
+
 		error = xfs_alloc_file_space(ip, offset, len,
 					     XFS_BMAPI_PREALLOC);
 	}
@@ -1068,10 +1081,10 @@ xfs_file_llseek(
 	default:
 		return generic_file_llseek(file, offset, whence);
 	case SEEK_HOLE:
-		offset = iomap_seek_hole(inode, offset, &xfs_iomap_ops);
+		offset = iomap_seek_hole(inode, offset, &xfs_seek_iomap_ops);
 		break;
 	case SEEK_DATA:
-		offset = iomap_seek_data(inode, offset, &xfs_iomap_ops);
+		offset = iomap_seek_data(inode, offset, &xfs_seek_iomap_ops);
 		break;
 	}
 
@@ -533,6 +533,7 @@ xfs_fs_reserve_ag_blocks(
 	int			error = 0;
 	int			err2;
 
+	mp->m_finobt_nores = false;
 	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
 		pag = xfs_perag_get(mp, agno);
 		err2 = xfs_ag_resv_init(pag, NULL);

@@ -16,7 +16,7 @@ xfs_param_t xfs_params = {
/* MIN DFLT MAX */
.sgid_inherit = { 0, 0, 1 },
.symlink_mode = { 0, 0, 1 },
.panic_mask = { 0, 0, 255 },
.panic_mask = { 0, 0, 256 },
.error_level = { 0, 3, 11 },
.syncd_timer = { 1*100, 30*100, 7200*100},
.stats_clear = { 0, 0, 1 },
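
The sysctl maximum above is raised so that the new verifier-error bit,
0x100 (XFS_PTAG_VERIFIER_ERROR, defined earlier in this pull), can be set
in fs.xfs.panic_mask. A hedged illustration of how a debugging setup might
opt in from userspace, equivalent to "sysctl fs.xfs.panic_mask=256":

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/fs/xfs/panic_mask", "w");

	if (!f)
		return 1;
	/* 0x100 requests an immediate panic on metadata verifier errors. */
	fprintf(f, "%d\n", 0x100);
	return fclose(f) ? 1 : 0;
}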

@@ -1332,7 +1332,7 @@ xfs_create_tmpfile(
if (error)
goto out_trans_cancel;

error = xfs_dir_ialloc(&tp, dp, mode, 1, 0, prid, &ip);
error = xfs_dir_ialloc(&tp, dp, mode, 0, 0, prid, &ip);
if (error)
goto out_trans_cancel;

@@ -1754,7 +1754,7 @@ xfs_inactive_ifree(
* now remains allocated and sits on the unlinked list until the fs is
* repaired.
*/
if (unlikely(mp->m_inotbt_nores)) {
if (unlikely(mp->m_finobt_nores)) {
error = xfs_trans_alloc(mp, &M_RES(mp)->tr_ifree,
XFS_IFREE_SPACE_RES(mp), 0, XFS_TRANS_RESERVE,
&tp);

@@ -1907,86 +1907,510 @@ xfs_inactive(
}

/*
* This is called when the inode's link count goes to 0 or we are creating a
* tmpfile via O_TMPFILE. In the case of a tmpfile, @ignore_linkcount will be
* set to true as the link count is dropped to zero by the VFS after we've
* created the file successfully, so we have to add it to the unlinked list
* while the link count is non-zero.
* In-Core Unlinked List Lookups
* =============================
*
* Every inode is supposed to be reachable from some other piece of metadata
* with the exception of the root directory. Inodes with a connection to a
* file descriptor but not linked from anywhere in the on-disk directory tree
* are collectively known as unlinked inodes, though the filesystem itself
* maintains links to these inodes so that on-disk metadata are consistent.
*
* XFS implements a per-AG on-disk hash table of unlinked inodes. The AGI
* header contains a number of buckets that point to an inode, and each inode
* record has a pointer to the next inode in the hash chain. This
* singly-linked list causes scaling problems in the iunlink remove function
* because we must walk that list to find the inode that points to the inode
* being removed from the unlinked hash bucket list.
*
* What if we modelled the unlinked list as a collection of records capturing
* "X.next_unlinked = Y" relations? If we indexed those records on Y, we'd
* have a fast way to look up unlinked list predecessors, which avoids the
* slow list walk. That's exactly what we do here (in-core) with a per-AG
* rhashtable.
*
* Because this is a backref cache, we ignore operational failures since the
* iunlink code can fall back to the slow bucket walk. The only errors that
* should bubble out are for obviously incorrect situations.
*
* All users of the backref cache MUST hold the AGI buffer lock to serialize
* access or have otherwise provided for concurrency control.
*/

/* Capture a "X.next_unlinked = Y" relationship. */
struct xfs_iunlink {
struct rhash_head iu_rhash_head;
xfs_agino_t iu_agino; /* X */
xfs_agino_t iu_next_unlinked; /* Y */
};
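
The design comment above is the heart of this series' unlinked-list change.
As a rough, self-contained illustration of the idea (ordinary userspace C,
not the kernel's rhashtable API), storing each "X.next_unlinked = Y" record
keyed on Y makes predecessor lookup a hash probe instead of a list walk:

#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS 64

struct backref {
	unsigned int agino;		/* X */
	unsigned int next_unlinked;	/* Y, the hash key */
	struct backref *hash_next;
};

static struct backref *table[NBUCKETS];

static void remember(unsigned int x, unsigned int y)
{
	struct backref *b = malloc(sizeof(*b));

	b->agino = x;
	b->next_unlinked = y;
	b->hash_next = table[y % NBUCKETS];
	table[y % NBUCKETS] = b;
}

/* Return X where X.next_unlinked == y, or 0 if unknown (NULLAGINO analogue). */
static unsigned int lookup_backref(unsigned int y)
{
	struct backref *b;

	for (b = table[y % NBUCKETS]; b; b = b->hash_next)
		if (b->next_unlinked == y)
			return b->agino;
	return 0;
}

int main(void)
{
	remember(12, 34);	/* inode 12 points at inode 34 */
	remember(34, 56);
	printf("predecessor of 56 is %u\n", lookup_backref(56)); /* 34 */
	return 0;
}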

/* Unlinked list predecessor lookup hashtable construction */
static int
xfs_iunlink_obj_cmpfn(
struct rhashtable_compare_arg *arg,
const void *obj)
{
const xfs_agino_t *key = arg->key;
const struct xfs_iunlink *iu = obj;

if (iu->iu_next_unlinked != *key)
return 1;
return 0;
}

static const struct rhashtable_params xfs_iunlink_hash_params = {
.min_size = XFS_AGI_UNLINKED_BUCKETS,
.key_len = sizeof(xfs_agino_t),
.key_offset = offsetof(struct xfs_iunlink, iu_next_unlinked),
.head_offset = offsetof(struct xfs_iunlink, iu_rhash_head),
.automatic_shrinking = true,
.obj_cmpfn = xfs_iunlink_obj_cmpfn,
};

/*
* Return X, where X.next_unlinked == @agino. Returns NULLAGINO if no such
* relation is found.
*/
static xfs_agino_t
xfs_iunlink_lookup_backref(
struct xfs_perag *pag,
xfs_agino_t agino)
{
struct xfs_iunlink *iu;

iu = rhashtable_lookup_fast(&pag->pagi_unlinked_hash, &agino,
xfs_iunlink_hash_params);
return iu ? iu->iu_agino : NULLAGINO;
}

/*
* Take ownership of an iunlink cache entry and insert it into the hash table.
* If successful, the entry will be owned by the cache; if not, it is freed.
* Either way, the caller does not own @iu after this call.
*/
static int
xfs_iunlink_insert_backref(
struct xfs_perag *pag,
struct xfs_iunlink *iu)
{
int error;

error = rhashtable_insert_fast(&pag->pagi_unlinked_hash,
&iu->iu_rhash_head, xfs_iunlink_hash_params);
/*
* Fail loudly if there already was an entry because that's a sign of
* corruption of in-memory data. Also fail loudly if we see an error
* code we didn't anticipate from the rhashtable code. Currently we
* only anticipate ENOMEM.
*/
if (error) {
WARN(error != -ENOMEM, "iunlink cache insert error %d", error);
kmem_free(iu);
}
/*
* Absorb any runtime errors that aren't a result of corruption because
* this is a cache and we can always fall back to bucket list scanning.
*/
if (error != 0 && error != -EEXIST)
error = 0;
return error;
}

/* Remember that @prev_agino.next_unlinked = @this_agino. */
static int
xfs_iunlink_add_backref(
struct xfs_perag *pag,
xfs_agino_t prev_agino,
xfs_agino_t this_agino)
{
struct xfs_iunlink *iu;

if (XFS_TEST_ERROR(false, pag->pag_mount, XFS_ERRTAG_IUNLINK_FALLBACK))
return 0;

iu = kmem_zalloc(sizeof(*iu), KM_SLEEP | KM_NOFS);
iu->iu_agino = prev_agino;
iu->iu_next_unlinked = this_agino;

return xfs_iunlink_insert_backref(pag, iu);
}

/*
* Replace X.next_unlinked = @agino with X.next_unlinked = @next_unlinked.
* If @next_unlinked is NULLAGINO, we drop the backref and exit. If there
* wasn't any such entry then we don't bother.
*/
static int
xfs_iunlink_change_backref(
struct xfs_perag *pag,
xfs_agino_t agino,
xfs_agino_t next_unlinked)
{
struct xfs_iunlink *iu;
int error;

/* Look up the old entry; if there wasn't one then exit. */
iu = rhashtable_lookup_fast(&pag->pagi_unlinked_hash, &agino,
xfs_iunlink_hash_params);
if (!iu)
return 0;

/*
* Remove the entry. This shouldn't ever return an error, but if we
* couldn't remove the old entry we don't want to add it again to the
* hash table, and if the entry disappeared on us then someone's
* violated the locking rules and we need to fail loudly. Either way
* we cannot remove the inode because internal state is or would have
* been corrupt.
*/
error = rhashtable_remove_fast(&pag->pagi_unlinked_hash,
&iu->iu_rhash_head, xfs_iunlink_hash_params);
if (error)
return error;

/* If there is no new next entry just free our item and return. */
if (next_unlinked == NULLAGINO) {
kmem_free(iu);
return 0;
}

/* Update the entry and re-add it to the hash table. */
iu->iu_next_unlinked = next_unlinked;
return xfs_iunlink_insert_backref(pag, iu);
}

/* Set up the in-core predecessor structures. */
int
xfs_iunlink_init(
struct xfs_perag *pag)
{
return rhashtable_init(&pag->pagi_unlinked_hash,
&xfs_iunlink_hash_params);
}

/* Free the in-core predecessor structures. */
static void
xfs_iunlink_free_item(
void *ptr,
void *arg)
{
struct xfs_iunlink *iu = ptr;
bool *freed_anything = arg;

*freed_anything = true;
kmem_free(iu);
}

void
xfs_iunlink_destroy(
struct xfs_perag *pag)
{
bool freed_anything = false;

rhashtable_free_and_destroy(&pag->pagi_unlinked_hash,
xfs_iunlink_free_item, &freed_anything);

ASSERT(freed_anything == false || XFS_FORCED_SHUTDOWN(pag->pag_mount));
}

/*
* Point the AGI unlinked bucket at an inode and log the results. The caller
* is responsible for validating the old value.
*/
STATIC int
xfs_iunlink_update_bucket(
struct xfs_trans *tp,
xfs_agnumber_t agno,
struct xfs_buf *agibp,
unsigned int bucket_index,
xfs_agino_t new_agino)
{
struct xfs_agi *agi = XFS_BUF_TO_AGI(agibp);
xfs_agino_t old_value;
int offset;

ASSERT(xfs_verify_agino_or_null(tp->t_mountp, agno, new_agino));

old_value = be32_to_cpu(agi->agi_unlinked[bucket_index]);
trace_xfs_iunlink_update_bucket(tp->t_mountp, agno, bucket_index,
old_value, new_agino);

/*
* We should never find the head of the list already set to the value
* passed in because either we're adding or removing ourselves from the
* head of the list.
*/
if (old_value == new_agino)
return -EFSCORRUPTED;

agi->agi_unlinked[bucket_index] = cpu_to_be32(new_agino);
offset = offsetof(struct xfs_agi, agi_unlinked) +
(sizeof(xfs_agino_t) * bucket_index);
xfs_trans_log_buf(tp, agibp, offset, offset + sizeof(xfs_agino_t) - 1);
return 0;
}

/* Set an on-disk inode's next_unlinked pointer. */
STATIC void
xfs_iunlink_update_dinode(
struct xfs_trans *tp,
xfs_agnumber_t agno,
xfs_agino_t agino,
struct xfs_buf *ibp,
struct xfs_dinode *dip,
struct xfs_imap *imap,
xfs_agino_t next_agino)
{
struct xfs_mount *mp = tp->t_mountp;
int offset;

ASSERT(xfs_verify_agino_or_null(mp, agno, next_agino));

trace_xfs_iunlink_update_dinode(mp, agno, agino,
be32_to_cpu(dip->di_next_unlinked), next_agino);

dip->di_next_unlinked = cpu_to_be32(next_agino);
offset = imap->im_boffset +
offsetof(struct xfs_dinode, di_next_unlinked);

/* need to recalc the inode CRC if appropriate */
xfs_dinode_calc_crc(mp, dip);
xfs_trans_inode_buf(tp, ibp);
xfs_trans_log_buf(tp, ibp, offset, offset + sizeof(xfs_agino_t) - 1);
xfs_inobp_check(mp, ibp);
}

/* Set an in-core inode's unlinked pointer and return the old value. */
STATIC int
xfs_iunlink_update_inode(
struct xfs_trans *tp,
struct xfs_inode *ip,
xfs_agnumber_t agno,
xfs_agino_t next_agino,
xfs_agino_t *old_next_agino)
{
struct xfs_mount *mp = tp->t_mountp;
struct xfs_dinode *dip;
struct xfs_buf *ibp;
xfs_agino_t old_value;
int error;

ASSERT(xfs_verify_agino_or_null(mp, agno, next_agino));

error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &ibp, 0, 0);
if (error)
return error;

/* Make sure the old pointer isn't garbage. */
old_value = be32_to_cpu(dip->di_next_unlinked);
if (!xfs_verify_agino_or_null(mp, agno, old_value)) {
error = -EFSCORRUPTED;
goto out;
}

/*
* Since we're updating a linked list, we should never find that the
* current pointer is the same as the new value, unless we're
* terminating the list.
*/
*old_next_agino = old_value;
if (old_value == next_agino) {
if (next_agino != NULLAGINO)
error = -EFSCORRUPTED;
goto out;
}

/* Ok, update the new pointer. */
xfs_iunlink_update_dinode(tp, agno, XFS_INO_TO_AGINO(mp, ip->i_ino),
ibp, dip, &ip->i_imap, next_agino);
return 0;
out:
xfs_trans_brelse(tp, ibp);
return error;
}

/*
* This is called when the inode's link count has gone to 0 or we are creating
* a tmpfile via O_TMPFILE. The inode @ip must have nlink == 0.
*
* We place the on-disk inode on a list in the AGI. It will be pulled from this
* list when the inode is freed.
*/
STATIC int
xfs_iunlink(
struct xfs_trans *tp,
struct xfs_inode *ip)
{
xfs_mount_t *mp = tp->t_mountp;
xfs_agi_t *agi;
xfs_dinode_t *dip;
xfs_buf_t *agibp;
xfs_buf_t *ibp;
xfs_agino_t agino;
short bucket_index;
int offset;
int error;
struct xfs_mount *mp = tp->t_mountp;
struct xfs_agi *agi;
struct xfs_buf *agibp;
xfs_agino_t next_agino;
xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
short bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS;
int error;

ASSERT(VFS_I(ip)->i_nlink == 0);
ASSERT(VFS_I(ip)->i_mode != 0);
trace_xfs_iunlink(ip);

/*
* Get the agi buffer first. It ensures lock ordering
* on the list.
*/
error = xfs_read_agi(mp, tp, XFS_INO_TO_AGNO(mp, ip->i_ino), &agibp);
/* Get the agi buffer first. It ensures lock ordering on the list. */
error = xfs_read_agi(mp, tp, agno, &agibp);
if (error)
return error;
agi = XFS_BUF_TO_AGI(agibp);

/*
* Get the index into the agi hash table for the
* list this inode will go on.
* Get the index into the agi hash table for the list this inode will
* go on. Make sure the pointer isn't garbage and that this inode
* isn't already on the list.
*/
agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
ASSERT(agino != 0);
bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS;
ASSERT(agi->agi_unlinked[bucket_index]);
ASSERT(be32_to_cpu(agi->agi_unlinked[bucket_index]) != agino);
next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
if (next_agino == agino ||
!xfs_verify_agino_or_null(mp, agno, next_agino))
return -EFSCORRUPTED;

if (next_agino != NULLAGINO) {
struct xfs_perag *pag;
xfs_agino_t old_agino;

if (agi->agi_unlinked[bucket_index] != cpu_to_be32(NULLAGINO)) {
/*
* There is already another inode in the bucket we need
* to add ourselves to. Add us at the front of the list.
* Here we put the head pointer into our next pointer,
* and then we fall through to point the head at us.
* There is already another inode in the bucket, so point this
* inode to the current head of the list.
*/
error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &ibp,
0, 0);
error = xfs_iunlink_update_inode(tp, ip, agno, next_agino,
&old_agino);
if (error)
return error;
ASSERT(old_agino == NULLAGINO);

/*
* agino has been unlinked, add a backref from the next inode
* back to agino.
*/
pag = xfs_perag_get(mp, agno);
error = xfs_iunlink_add_backref(pag, agino, next_agino);
xfs_perag_put(pag);
if (error)
return error;
}

/* Point the head of the list to point to this inode. */
return xfs_iunlink_update_bucket(tp, agno, agibp, bucket_index, agino);
}

/* Return the imap, dinode pointer, and buffer for an inode. */
STATIC int
xfs_iunlink_map_ino(
struct xfs_trans *tp,
xfs_agnumber_t agno,
xfs_agino_t agino,
struct xfs_imap *imap,
struct xfs_dinode **dipp,
struct xfs_buf **bpp)
{
struct xfs_mount *mp = tp->t_mountp;
int error;

imap->im_blkno = 0;
error = xfs_imap(mp, tp, XFS_AGINO_TO_INO(mp, agno, agino), imap, 0);
if (error) {
xfs_warn(mp, "%s: xfs_imap returned error %d.",
__func__, error);
return error;
}

error = xfs_imap_to_bp(mp, tp, imap, dipp, bpp, 0, 0);
if (error) {
xfs_warn(mp, "%s: xfs_imap_to_bp returned error %d.",
__func__, error);
return error;
}

return 0;
}

/*
* Walk the unlinked chain from @head_agino until we find the inode that
* points to @target_agino. Return the inode number, map, dinode pointer,
* and inode cluster buffer of that inode as @agino, @imap, @dipp, and @bpp.
*
* @tp, @pag, @head_agino, and @target_agino are input parameters.
* @agino, @imap, @dipp, and @bpp are all output parameters.
*
* Do not call this function if @target_agino is the head of the list.
*/
STATIC int
xfs_iunlink_map_prev(
struct xfs_trans *tp,
xfs_agnumber_t agno,
xfs_agino_t head_agino,
xfs_agino_t target_agino,
xfs_agino_t *agino,
struct xfs_imap *imap,
struct xfs_dinode **dipp,
struct xfs_buf **bpp,
struct xfs_perag *pag)
{
struct xfs_mount *mp = tp->t_mountp;
xfs_agino_t next_agino;
int error;

ASSERT(head_agino != target_agino);
*bpp = NULL;

/* See if our backref cache can find it faster. */
*agino = xfs_iunlink_lookup_backref(pag, target_agino);
if (*agino != NULLAGINO) {
error = xfs_iunlink_map_ino(tp, agno, *agino, imap, dipp, bpp);
if (error)
return error;

ASSERT(dip->di_next_unlinked == cpu_to_be32(NULLAGINO));
dip->di_next_unlinked = agi->agi_unlinked[bucket_index];
offset = ip->i_imap.im_boffset +
offsetof(xfs_dinode_t, di_next_unlinked);
if (be32_to_cpu((*dipp)->di_next_unlinked) == target_agino)
return 0;

/* need to recalc the inode CRC if appropriate */
xfs_dinode_calc_crc(mp, dip);

xfs_trans_inode_buf(tp, ibp);
xfs_trans_log_buf(tp, ibp, offset,
(offset + sizeof(xfs_agino_t) - 1));
xfs_inobp_check(mp, ibp);
/*
* If we get here the cache contents were corrupt, so drop the
* buffer and fall back to walking the bucket list.
*/
xfs_trans_brelse(tp, *bpp);
*bpp = NULL;
WARN_ON_ONCE(1);
}

trace_xfs_iunlink_map_prev_fallback(mp, agno);

/* Otherwise, walk the entire bucket until we find it. */
next_agino = head_agino;
while (next_agino != target_agino) {
xfs_agino_t unlinked_agino;

if (*bpp)
xfs_trans_brelse(tp, *bpp);

*agino = next_agino;
error = xfs_iunlink_map_ino(tp, agno, next_agino, imap, dipp,
bpp);
if (error)
return error;

unlinked_agino = be32_to_cpu((*dipp)->di_next_unlinked);
/*
* Make sure this pointer is valid and isn't an obvious
* infinite loop.
*/
if (!xfs_verify_agino(mp, agno, unlinked_agino) ||
next_agino == unlinked_agino) {
XFS_CORRUPTION_ERROR(__func__,
XFS_ERRLEVEL_LOW, mp,
*dipp, sizeof(**dipp));
error = -EFSCORRUPTED;
return error;
}
next_agino = unlinked_agino;
}

/*
* Point the bucket head pointer at the inode being inserted.
*/
ASSERT(agino != 0);
agi->agi_unlinked[bucket_index] = cpu_to_be32(agino);
offset = offsetof(xfs_agi_t, agi_unlinked) +
(sizeof(xfs_agino_t) * bucket_index);
xfs_trans_log_buf(tp, agibp, offset,
(offset + sizeof(xfs_agino_t) - 1));
return 0;
}

@@ -1995,181 +2419,106 @@ xfs_iunlink(
*/
STATIC int
xfs_iunlink_remove(
xfs_trans_t *tp,
xfs_inode_t *ip)
struct xfs_trans *tp,
struct xfs_inode *ip)
{
xfs_ino_t next_ino;
xfs_mount_t *mp;
xfs_agi_t *agi;
xfs_dinode_t *dip;
xfs_buf_t *agibp;
xfs_buf_t *ibp;
xfs_agnumber_t agno;
xfs_agino_t agino;
xfs_agino_t next_agino;
xfs_buf_t *last_ibp;
xfs_dinode_t *last_dip = NULL;
short bucket_index;
int offset, last_offset = 0;
int error;
struct xfs_mount *mp = tp->t_mountp;
struct xfs_agi *agi;
struct xfs_buf *agibp;
struct xfs_buf *last_ibp;
struct xfs_dinode *last_dip = NULL;
struct xfs_perag *pag = NULL;
xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
xfs_agino_t next_agino;
xfs_agino_t head_agino;
short bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS;
int error;

mp = tp->t_mountp;
agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
trace_xfs_iunlink_remove(ip);

/*
* Get the agi buffer first. It ensures lock ordering
* on the list.
*/
/* Get the agi buffer first. It ensures lock ordering on the list. */
error = xfs_read_agi(mp, tp, agno, &agibp);
if (error)
return error;

agi = XFS_BUF_TO_AGI(agibp);

/*
* Get the index into the agi hash table for the
* list this inode will go on.
* Get the index into the agi hash table for the list this inode will
* go on. Make sure the head pointer isn't garbage.
*/
agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
if (!xfs_verify_agino(mp, agno, agino))
return -EFSCORRUPTED;
bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS;
if (!xfs_verify_agino(mp, agno,
be32_to_cpu(agi->agi_unlinked[bucket_index]))) {
head_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
if (!xfs_verify_agino(mp, agno, head_agino)) {
XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
agi, sizeof(*agi));
return -EFSCORRUPTED;
}

if (be32_to_cpu(agi->agi_unlinked[bucket_index]) == agino) {
/*
* We're at the head of the list. Get the inode's on-disk
* buffer to see if there is anyone after us on the list.
* Only modify our next pointer if it is not already NULLAGINO.
* This saves us the overhead of dealing with the buffer when
* there is no need to change it.
*/
error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &ibp,
0, 0);
if (error) {
xfs_warn(mp, "%s: xfs_imap_to_bp returned error %d.",
__func__, error);
return error;
}
next_agino = be32_to_cpu(dip->di_next_unlinked);
ASSERT(next_agino != 0);
if (next_agino != NULLAGINO) {
dip->di_next_unlinked = cpu_to_be32(NULLAGINO);
offset = ip->i_imap.im_boffset +
offsetof(xfs_dinode_t, di_next_unlinked);
/*
* Set our inode's next_unlinked pointer to NULL and then return
* the old pointer value so that we can update whatever was previous
* to us in the list to point to whatever was next in the list.
*/
error = xfs_iunlink_update_inode(tp, ip, agno, NULLAGINO, &next_agino);
if (error)
return error;

/* need to recalc the inode CRC if appropriate */
xfs_dinode_calc_crc(mp, dip);

xfs_trans_inode_buf(tp, ibp);
xfs_trans_log_buf(tp, ibp, offset,
(offset + sizeof(xfs_agino_t) - 1));
xfs_inobp_check(mp, ibp);
} else {
xfs_trans_brelse(tp, ibp);
}
/*
* Point the bucket head pointer at the next inode.
*/
ASSERT(next_agino != 0);
ASSERT(next_agino != agino);
agi->agi_unlinked[bucket_index] = cpu_to_be32(next_agino);
offset = offsetof(xfs_agi_t, agi_unlinked) +
(sizeof(xfs_agino_t) * bucket_index);
xfs_trans_log_buf(tp, agibp, offset,
(offset + sizeof(xfs_agino_t) - 1));
} else {
/*
* We need to search the list for the inode being freed.
*/
next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
last_ibp = NULL;
while (next_agino != agino) {
struct xfs_imap imap;

if (last_ibp)
xfs_trans_brelse(tp, last_ibp);

imap.im_blkno = 0;
next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);

error = xfs_imap(mp, tp, next_ino, &imap, 0);
if (error) {
xfs_warn(mp,
"%s: xfs_imap returned error %d.",
__func__, error);
return error;
}

error = xfs_imap_to_bp(mp, tp, &imap, &last_dip,
&last_ibp, 0, 0);
if (error) {
xfs_warn(mp,
"%s: xfs_imap_to_bp returned error %d.",
__func__, error);
return error;
}

last_offset = imap.im_boffset;
next_agino = be32_to_cpu(last_dip->di_next_unlinked);
if (!xfs_verify_agino(mp, agno, next_agino)) {
XFS_CORRUPTION_ERROR(__func__,
XFS_ERRLEVEL_LOW, mp,
last_dip, sizeof(*last_dip));
return -EFSCORRUPTED;
}
}

/*
* Now last_ibp points to the buffer previous to us on the
* unlinked list. Pull us from the list.
*/
error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &ibp,
0, 0);
if (error) {
xfs_warn(mp, "%s: xfs_imap_to_bp(2) returned error %d.",
__func__, error);
return error;
}
next_agino = be32_to_cpu(dip->di_next_unlinked);
ASSERT(next_agino != 0);
ASSERT(next_agino != agino);
if (next_agino != NULLAGINO) {
dip->di_next_unlinked = cpu_to_be32(NULLAGINO);
offset = ip->i_imap.im_boffset +
offsetof(xfs_dinode_t, di_next_unlinked);

/* need to recalc the inode CRC if appropriate */
xfs_dinode_calc_crc(mp, dip);

xfs_trans_inode_buf(tp, ibp);
xfs_trans_log_buf(tp, ibp, offset,
(offset + sizeof(xfs_agino_t) - 1));
xfs_inobp_check(mp, ibp);
} else {
xfs_trans_brelse(tp, ibp);
}
/*
* Point the previous inode on the list to the next inode.
*/
last_dip->di_next_unlinked = cpu_to_be32(next_agino);
ASSERT(next_agino != 0);
offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);

/* need to recalc the inode CRC if appropriate */
xfs_dinode_calc_crc(mp, last_dip);

xfs_trans_inode_buf(tp, last_ibp);
xfs_trans_log_buf(tp, last_ibp, offset,
(offset + sizeof(xfs_agino_t) - 1));
xfs_inobp_check(mp, last_ibp);
/*
* If there was a backref pointing from the next inode back to this
* one, remove it because we've removed this inode from the list.
*
* Later, if this inode was in the middle of the list we'll update
* this inode's backref to point from the next inode.
*/
if (next_agino != NULLAGINO) {
pag = xfs_perag_get(mp, agno);
error = xfs_iunlink_change_backref(pag, next_agino,
NULLAGINO);
if (error)
goto out;
}
return 0;

if (head_agino == agino) {
/* Point the head of the list to the next unlinked inode. */
error = xfs_iunlink_update_bucket(tp, agno, agibp, bucket_index,
next_agino);
if (error)
goto out;
} else {
struct xfs_imap imap;
xfs_agino_t prev_agino;

if (!pag)
pag = xfs_perag_get(mp, agno);

/* We need to search the list for the inode being freed. */
error = xfs_iunlink_map_prev(tp, agno, head_agino, agino,
&prev_agino, &imap, &last_dip, &last_ibp,
pag);
if (error)
goto out;

/* Point the previous inode on the list to the next inode. */
xfs_iunlink_update_dinode(tp, agno, prev_agino, last_ibp,
last_dip, &imap, next_agino);

/*
* Now we deal with the backref for this inode. If this inode
* pointed at a real inode, change the backref that pointed to
* us to point to our old next. If this inode was the end of
* the list, delete the backref that pointed to us. Note that
* change_backref takes care of deleting the backref if
* next_agino is NULLAGINO.
*/
error = xfs_iunlink_change_backref(pag, agino, next_agino);
if (error)
goto out;
}

out:
if (pag)
xfs_perag_put(pag);
return error;
}

/*

@@ -2833,11 +3182,9 @@ xfs_rename_alloc_whiteout(

/*
* Prepare the tmpfile inode as if it were created through the VFS.
* Otherwise, the link increment paths will complain about nlink 0->1.
* Drop the link count as done by d_tmpfile(), complete the inode setup
* and flag it as linkable.
* Complete the inode setup and flag it as linkable. nlink is already
* zero, so we can skip the drop_nlink.
*/
drop_nlink(VFS_I(tmpfile));
xfs_setup_iops(tmpfile);
xfs_finish_inode_setup(tmpfile);
VFS_I(tmpfile)->i_state |= I_LINKABLE;

@@ -500,4 +500,7 @@ extern struct kmem_zone *xfs_inode_zone;

bool xfs_inode_verify_forks(struct xfs_inode *ip);

int xfs_iunlink_init(struct xfs_perag *pag);
void xfs_iunlink_destroy(struct xfs_perag *pag);

#endif /* __XFS_INODE_H__ */

@@ -35,18 +35,40 @@
#define XFS_WRITEIO_ALIGN(mp,off) (((off) >> mp->m_writeio_log) \
<< mp->m_writeio_log)

void
static int
xfs_alert_fsblock_zero(
xfs_inode_t *ip,
xfs_bmbt_irec_t *imap)
{
xfs_alert_tag(ip->i_mount, XFS_PTAG_FSBLOCK_ZERO,
"Access to block zero in inode %llu "
"start_block: %llx start_off: %llx "
"blkcnt: %llx extent-state: %x",
(unsigned long long)ip->i_ino,
(unsigned long long)imap->br_startblock,
(unsigned long long)imap->br_startoff,
(unsigned long long)imap->br_blockcount,
imap->br_state);
return -EFSCORRUPTED;
}

int
xfs_bmbt_to_iomap(
struct xfs_inode *ip,
struct iomap *iomap,
struct xfs_bmbt_irec *imap)
struct xfs_bmbt_irec *imap,
bool shared)
{
struct xfs_mount *mp = ip->i_mount;

if (unlikely(!imap->br_startblock && !XFS_IS_REALTIME_INODE(ip)))
return xfs_alert_fsblock_zero(ip, imap);

if (imap->br_startblock == HOLESTARTBLOCK) {
iomap->addr = IOMAP_NULL_ADDR;
iomap->type = IOMAP_HOLE;
} else if (imap->br_startblock == DELAYSTARTBLOCK) {
} else if (imap->br_startblock == DELAYSTARTBLOCK ||
isnullstartblock(imap->br_startblock)) {
iomap->addr = IOMAP_NULL_ADDR;
iomap->type = IOMAP_DELALLOC;
} else {

@@ -60,6 +82,13 @@ xfs_bmbt_to_iomap(
iomap->length = XFS_FSB_TO_B(mp, imap->br_blockcount);
iomap->bdev = xfs_find_bdev_for_inode(VFS_I(ip));
iomap->dax_dev = xfs_find_daxdev_for_inode(VFS_I(ip));

if (xfs_ipincount(ip) &&
(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
iomap->flags |= IOMAP_F_DIRTY;
if (shared)
iomap->flags |= IOMAP_F_SHARED;
return 0;
}

static void

@@ -138,23 +167,6 @@ xfs_iomap_eof_align_last_fsb(
return 0;
}

STATIC int
xfs_alert_fsblock_zero(
xfs_inode_t *ip,
xfs_bmbt_irec_t *imap)
{
xfs_alert_tag(ip->i_mount, XFS_PTAG_FSBLOCK_ZERO,
"Access to block zero in inode %llu "
"start_block: %llx start_off: %llx "
"blkcnt: %llx extent-state: %x",
(unsigned long long)ip->i_ino,
(unsigned long long)imap->br_startblock,
(unsigned long long)imap->br_startoff,
(unsigned long long)imap->br_blockcount,
imap->br_state);
return -EFSCORRUPTED;
}

int
xfs_iomap_write_direct(
xfs_inode_t *ip,

@@ -383,12 +395,13 @@ xfs_quota_calc_throttle(
STATIC xfs_fsblock_t
xfs_iomap_prealloc_size(
struct xfs_inode *ip,
int whichfork,
loff_t offset,
loff_t count,
struct xfs_iext_cursor *icur)
{
struct xfs_mount *mp = ip->i_mount;
struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork);
xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset);
struct xfs_bmbt_irec prev;
int shift = 0;

@@ -522,15 +535,16 @@ xfs_file_iomap_begin_delay(
{
struct xfs_inode *ip = XFS_I(inode);
struct xfs_mount *mp = ip->i_mount;
struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset);
xfs_fileoff_t maxbytes_fsb =
XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
xfs_fileoff_t end_fsb;
int error = 0, eof = 0;
struct xfs_bmbt_irec got;
struct xfs_iext_cursor icur;
struct xfs_bmbt_irec imap, cmap;
struct xfs_iext_cursor icur, ccur;
xfs_fsblock_t prealloc_blocks = 0;
bool eof = false, cow_eof = false, shared = false;
int whichfork = XFS_DATA_FORK;
int error = 0;

ASSERT(!XFS_IS_REALTIME_INODE(ip));
ASSERT(!xfs_get_extsz_hint(ip));

@@ -548,7 +562,7 @@ xfs_file_iomap_begin_delay(

XFS_STATS_INC(mp, xs_blk_mapw);

if (!(ifp->if_flags & XFS_IFEXTENTS)) {
if (!(ip->i_df.if_flags & XFS_IFEXTENTS)) {
error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
if (error)
goto out_unlock;

@@ -556,53 +570,101 @@ xfs_file_iomap_begin_delay(

end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);

eof = !xfs_iext_lookup_extent(ip, ifp, offset_fsb, &icur, &got);
/*
* Search the data fork first to look up our source mapping. We
* always need the data fork map, as we have to return it to the
* iomap code so that the higher level write code can read data in to
* perform read-modify-write cycles for unaligned writes.
*/
eof = !xfs_iext_lookup_extent(ip, &ip->i_df, offset_fsb, &icur, &imap);
if (eof)
got.br_startoff = end_fsb; /* fake hole until the end */
imap.br_startoff = end_fsb; /* fake hole until the end */

if (got.br_startoff <= offset_fsb) {
/* We never need to allocate blocks for zeroing a hole. */
if ((flags & IOMAP_ZERO) && imap.br_startoff > offset_fsb) {
xfs_hole_to_iomap(ip, iomap, offset_fsb, imap.br_startoff);
goto out_unlock;
}

/*
* Search the COW fork extent list even if we did not find a data fork
* extent. This serves two purposes: first this implements the
* speculative preallocation using cowextsize, so that we also unshare
* blocks adjacent to shared blocks instead of just the shared blocks
* themselves. Second the lookup in the extent list is generally faster
* than going out to the shared extent tree.
*/
if (xfs_is_cow_inode(ip)) {
if (!ip->i_cowfp) {
ASSERT(!xfs_is_reflink_inode(ip));
xfs_ifork_init_cow(ip);
}
cow_eof = !xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb,
&ccur, &cmap);
if (!cow_eof && cmap.br_startoff <= offset_fsb) {
trace_xfs_reflink_cow_found(ip, &cmap);
whichfork = XFS_COW_FORK;
goto done;
}
}

if (imap.br_startoff <= offset_fsb) {
/*
* For reflink files we may need a delalloc reservation when
* overwriting shared extents. This includes zeroing of
* existing extents that contain data.
*/
if (xfs_is_reflink_inode(ip) &&
((flags & IOMAP_WRITE) ||
got.br_state != XFS_EXT_UNWRITTEN)) {
xfs_trim_extent(&got, offset_fsb, end_fsb - offset_fsb);
error = xfs_reflink_reserve_cow(ip, &got);
if (error)
goto out_unlock;
if (!xfs_is_cow_inode(ip) ||
((flags & IOMAP_ZERO) && imap.br_state != XFS_EXT_NORM)) {
trace_xfs_iomap_found(ip, offset, count, XFS_DATA_FORK,
&imap);
goto done;
}

trace_xfs_iomap_found(ip, offset, count, 0, &got);
goto done;
}
xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);

if (flags & IOMAP_ZERO) {
xfs_hole_to_iomap(ip, iomap, offset_fsb, got.br_startoff);
goto out_unlock;
/* Trim the mapping to the nearest shared extent boundary. */
error = xfs_inode_need_cow(ip, &imap, &shared);
if (error)
goto out_unlock;

/* Not shared? Just report the (potentially capped) extent. */
if (!shared) {
trace_xfs_iomap_found(ip, offset, count, XFS_DATA_FORK,
&imap);
goto done;
}

/*
* Fork all the shared blocks from our write offset until the
* end of the extent.
*/
whichfork = XFS_COW_FORK;
end_fsb = imap.br_startoff + imap.br_blockcount;
} else {
/*
* We cap the maximum length we map here to MAX_WRITEBACK_PAGES
* pages to keep the chunks of work done here somewhat
* symmetric with the work writeback does. This is a completely
* arbitrary number pulled out of thin air.
*
* Note that the values need to be less than 32-bits wide until
* the lower level functions are updated.
*/
count = min_t(loff_t, count, 1024 * PAGE_SIZE);
end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);

if (xfs_is_always_cow_inode(ip))
whichfork = XFS_COW_FORK;
}

error = xfs_qm_dqattach_locked(ip, false);
if (error)
goto out_unlock;

/*
* We cap the maximum length we map here to MAX_WRITEBACK_PAGES pages
* to keep the chunks of work done here somewhat symmetric with the
* work writeback does. This is a completely arbitrary number pulled
* out of thin air as a best guess for initial testing.
*
* Note that the values need to be less than 32-bits wide until
* the lower level functions are updated.
*/
count = min_t(loff_t, count, 1024 * PAGE_SIZE);
end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);

if (eof) {
prealloc_blocks = xfs_iomap_prealloc_size(ip, offset, count,
&icur);
prealloc_blocks = xfs_iomap_prealloc_size(ip, whichfork, offset,
count, &icur);
if (prealloc_blocks) {
xfs_extlen_t align;
xfs_off_t end_offset;

@@ -623,9 +685,11 @@ xfs_file_iomap_begin_delay(
}

retry:
error = xfs_bmapi_reserve_delalloc(ip, XFS_DATA_FORK, offset_fsb,
end_fsb - offset_fsb, prealloc_blocks, &got, &icur,
eof);
error = xfs_bmapi_reserve_delalloc(ip, whichfork, offset_fsb,
end_fsb - offset_fsb, prealloc_blocks,
whichfork == XFS_DATA_FORK ? &imap : &cmap,
whichfork == XFS_DATA_FORK ? &icur : &ccur,
whichfork == XFS_DATA_FORK ? eof : cow_eof);
switch (error) {
case 0:
break;

@@ -647,190 +711,26 @@ retry:
* them out if the write happens to fail.
*/
iomap->flags |= IOMAP_F_NEW;
trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
trace_xfs_iomap_alloc(ip, offset, count, whichfork,
whichfork == XFS_DATA_FORK ? &imap : &cmap);
done:
if (isnullstartblock(got.br_startblock))
got.br_startblock = DELAYSTARTBLOCK;

if (!got.br_startblock) {
error = xfs_alert_fsblock_zero(ip, &got);
if (error)
if (whichfork == XFS_COW_FORK) {
if (imap.br_startoff > offset_fsb) {
xfs_trim_extent(&cmap, offset_fsb,
imap.br_startoff - offset_fsb);
error = xfs_bmbt_to_iomap(ip, iomap, &cmap, true);
goto out_unlock;
}
/* ensure we only report blocks we have a reservation for */
xfs_trim_extent(&imap, cmap.br_startoff, cmap.br_blockcount);
shared = true;
}

xfs_bmbt_to_iomap(ip, iomap, &got);

error = xfs_bmbt_to_iomap(ip, iomap, &imap, shared);
out_unlock:
xfs_iunlock(ip, XFS_ILOCK_EXCL);
return error;
}

/*
* Pass in a delayed allocate extent, convert it to real extents;
* return to the caller the extent we create which maps on top of
* the originating callers request.
*
* Called without a lock on the inode.
*
* We no longer bother to look at the incoming map - all we have to
* guarantee is that whatever we allocate fills the required range.
*/
int
xfs_iomap_write_allocate(
xfs_inode_t *ip,
int whichfork,
xfs_off_t offset,
xfs_bmbt_irec_t *imap,
unsigned int *cow_seq)
{
xfs_mount_t *mp = ip->i_mount;
struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork);
xfs_fileoff_t offset_fsb, last_block;
xfs_fileoff_t end_fsb, map_start_fsb;
xfs_filblks_t count_fsb;
xfs_trans_t *tp;
int nimaps;
int error = 0;
int flags = XFS_BMAPI_DELALLOC;
int nres;

if (whichfork == XFS_COW_FORK)
flags |= XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;

/*
* Make sure that the dquots are there.
*/
error = xfs_qm_dqattach(ip);
if (error)
return error;

offset_fsb = XFS_B_TO_FSBT(mp, offset);
count_fsb = imap->br_blockcount;
map_start_fsb = imap->br_startoff;

XFS_STATS_ADD(mp, xs_xstrat_bytes, XFS_FSB_TO_B(mp, count_fsb));

while (count_fsb != 0) {
/*
* Set up a transaction with which to allocate the
* backing store for the file. Do allocations in a
* loop until we get some space in the range we are
* interested in. The other space that might be allocated
* is in the delayed allocation extent on which we sit
* but before our buffer starts.
*/
nimaps = 0;
while (nimaps == 0) {
nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
/*
* We have already reserved space for the extent and any
* indirect blocks when creating the delalloc extent,
* there is no need to reserve space in this transaction
* again.
*/
error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0,
0, XFS_TRANS_RESERVE, &tp);
if (error)
return error;

xfs_ilock(ip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, ip, 0);

/*
* it is possible that the extents have changed since
* we did the read call as we dropped the ilock for a
* while. We have to be careful about truncates or hole
* punchs here - we are not allowed to allocate
* non-delalloc blocks here.
*
* The only protection against truncation is the pages
* for the range we are being asked to convert are
* locked and hence a truncate will block on them
* first.
*
* As a result, if we go beyond the range we really
* need and hit an delalloc extent boundary followed by
* a hole while we have excess blocks in the map, we
* will fill the hole incorrectly and overrun the
* transaction reservation.
*
* Using a single map prevents this as we are forced to
* check each map we look for overlap with the desired
* range and abort as soon as we find it. Also, given
* that we only return a single map, having one beyond
* what we can return is probably a bit silly.
*
* We also need to check that we don't go beyond EOF;
* this is a truncate optimisation as a truncate sets
* the new file size before block on the pages we
* currently have locked under writeback. Because they
* are about to be tossed, we don't need to write them
* back....
*/
nimaps = 1;
end_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
error = xfs_bmap_last_offset(ip, &last_block,
XFS_DATA_FORK);
if (error)
goto trans_cancel;

last_block = XFS_FILEOFF_MAX(last_block, end_fsb);
if ((map_start_fsb + count_fsb) > last_block) {
count_fsb = last_block - map_start_fsb;
if (count_fsb == 0) {
error = -EAGAIN;
goto trans_cancel;
}
}

/*
* From this point onwards we overwrite the imap
* pointer that the caller gave to us.
*/
error = xfs_bmapi_write(tp, ip, map_start_fsb,
count_fsb, flags, nres, imap,
&nimaps);
if (error)
goto trans_cancel;

error = xfs_trans_commit(tp);
if (error)
goto error0;

if (whichfork == XFS_COW_FORK)
*cow_seq = READ_ONCE(ifp->if_seq);
xfs_iunlock(ip, XFS_ILOCK_EXCL);
}

/*
* See if we were able to allocate an extent that
* covers at least part of the callers request
*/
if (!(imap->br_startblock || XFS_IS_REALTIME_INODE(ip)))
return xfs_alert_fsblock_zero(ip, imap);

if ((offset_fsb >= imap->br_startoff) &&
(offset_fsb < (imap->br_startoff +
imap->br_blockcount))) {
XFS_STATS_INC(mp, xs_xstrat_quick);
return 0;
}

/*
* So far we have not mapped the requested part of the
* file, just surrounding data, try again.
*/
count_fsb -= imap->br_blockcount;
map_start_fsb = imap->br_startoff + imap->br_blockcount;
}

trans_cancel:
xfs_trans_cancel(tp);
error0:
xfs_iunlock(ip, XFS_ILOCK_EXCL);
return error;
}

int
xfs_iomap_write_unwritten(
xfs_inode_t *ip,

@@ -975,7 +875,7 @@ xfs_ilock_for_iomap(
* COW writes may allocate delalloc space or convert unwritten COW
* extents, so we need to make sure to take the lock exclusively here.
*/
if (xfs_is_reflink_inode(ip) && is_write) {
if (xfs_is_cow_inode(ip) && is_write) {
/*
* FIXME: It could still overwrite on unshared extents and not
* need allocation.

@@ -1009,7 +909,7 @@ relock:
* check, so if we got ILOCK_SHARED for a write but we're now a
* reflink inode we have to switch to ILOCK_EXCL and relock.
*/
if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_reflink_inode(ip)) {
if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
xfs_iunlock(ip, mode);
mode = XFS_ILOCK_EXCL;
goto relock;

@@ -1081,23 +981,33 @@ xfs_file_iomap_begin(
* Break shared extents if necessary. Checks for non-blocking IO have
* been done up front, so we don't need to do them here.
*/
if (xfs_is_reflink_inode(ip)) {
if (xfs_is_cow_inode(ip)) {
struct xfs_bmbt_irec cmap;
bool directio = (flags & IOMAP_DIRECT);

/* if zeroing doesn't need COW allocation, then we are done. */
if ((flags & IOMAP_ZERO) &&
!needs_cow_for_zeroing(&imap, nimaps))
goto out_found;

if (flags & IOMAP_DIRECT) {
/* may drop and re-acquire the ilock */
error = xfs_reflink_allocate_cow(ip, &imap, &shared,
&lockmode);
if (error)
goto out_unlock;
} else {
error = xfs_reflink_reserve_cow(ip, &imap);
if (error)
goto out_unlock;
}
/* may drop and re-acquire the ilock */
cmap = imap;
error = xfs_reflink_allocate_cow(ip, &cmap, &shared, &lockmode,
directio);
if (error)
goto out_unlock;

/*
* For buffered writes we need to report the address of the
* previous block (if there was any) so that the higher level
* write code can perform read-modify-write operations; we
* won't need the CoW fork mapping until writeback. For direct
* I/O, which must be block aligned, we need to report the
* newly allocated address. If the data fork has a hole, copy
* the COW fork mapping to avoid allocating to the data fork.
*/
if (directio || imap.br_startblock == HOLESTARTBLOCK)
imap = cmap;

end_fsb = imap.br_startoff + imap.br_blockcount;
length = XFS_FSB_TO_B(mp, end_fsb) - offset;

@@ -1139,23 +1049,15 @@ xfs_file_iomap_begin(
return error;

iomap->flags |= IOMAP_F_NEW;
trace_xfs_iomap_alloc(ip, offset, length, 0, &imap);
trace_xfs_iomap_alloc(ip, offset, length, XFS_DATA_FORK, &imap);

out_finish:
if (xfs_ipincount(ip) && (ip->i_itemp->ili_fsync_fields
& ~XFS_ILOG_TIMESTAMP))
iomap->flags |= IOMAP_F_DIRTY;

xfs_bmbt_to_iomap(ip, iomap, &imap);

if (shared)
iomap->flags |= IOMAP_F_SHARED;
return 0;
return xfs_bmbt_to_iomap(ip, iomap, &imap, shared);

out_found:
ASSERT(nimaps);
xfs_iunlock(ip, lockmode);
trace_xfs_iomap_found(ip, offset, length, 0, &imap);
trace_xfs_iomap_found(ip, offset, length, XFS_DATA_FORK, &imap);
goto out_finish;

out_unlock:

@@ -1240,6 +1142,92 @@ const struct iomap_ops xfs_iomap_ops = {
.iomap_end = xfs_file_iomap_end,
};

static int
xfs_seek_iomap_begin(
struct inode *inode,
loff_t offset,
loff_t length,
unsigned flags,
struct iomap *iomap)
{
struct xfs_inode *ip = XFS_I(inode);
struct xfs_mount *mp = ip->i_mount;
xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset);
xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + length);
xfs_fileoff_t cow_fsb = NULLFILEOFF, data_fsb = NULLFILEOFF;
struct xfs_iext_cursor icur;
struct xfs_bmbt_irec imap, cmap;
int error = 0;
unsigned lockmode;

if (XFS_FORCED_SHUTDOWN(mp))
return -EIO;

lockmode = xfs_ilock_data_map_shared(ip);
if (!(ip->i_df.if_flags & XFS_IFEXTENTS)) {
error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
if (error)
goto out_unlock;
}

if (xfs_iext_lookup_extent(ip, &ip->i_df, offset_fsb, &icur, &imap)) {
/*
* If we found a data extent we are done.
*/
if (imap.br_startoff <= offset_fsb)
goto done;
data_fsb = imap.br_startoff;
} else {
/*
* Fake a hole until the end of the file.
*/
data_fsb = min(XFS_B_TO_FSB(mp, offset + length),
XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes));
}

/*
* If a COW fork extent covers the hole, report it - capped to the next
* data fork extent:
*/
if (xfs_inode_has_cow_data(ip) &&
xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap))
cow_fsb = cmap.br_startoff;
if (cow_fsb != NULLFILEOFF && cow_fsb <= offset_fsb) {
if (data_fsb < cow_fsb + cmap.br_blockcount)
end_fsb = min(end_fsb, data_fsb);
xfs_trim_extent(&cmap, offset_fsb, end_fsb);
error = xfs_bmbt_to_iomap(ip, iomap, &cmap, true);
/*
* This is a COW extent, so we must probe the page cache
* because there could be dirty page cache being backed
* by this extent.
*/
iomap->type = IOMAP_UNWRITTEN;
goto out_unlock;
}

/*
* Else report a hole, capped to the next found data or COW extent.
*/
if (cow_fsb != NULLFILEOFF && cow_fsb < data_fsb)
imap.br_blockcount = cow_fsb - offset_fsb;
else
imap.br_blockcount = data_fsb - offset_fsb;
imap.br_startoff = offset_fsb;
imap.br_startblock = HOLESTARTBLOCK;
imap.br_state = XFS_EXT_NORM;
done:
xfs_trim_extent(&imap, offset_fsb, end_fsb);
error = xfs_bmbt_to_iomap(ip, iomap, &imap, false);
out_unlock:
xfs_iunlock(ip, lockmode);
return error;
}

const struct iomap_ops xfs_seek_iomap_ops = {
.iomap_begin = xfs_seek_iomap_begin,
};
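
xfs_seek_iomap_begin() above is what SEEK_HOLE/SEEK_DATA now reach via the
llseek hunk earlier in this pull; note how COW fork extents are reported as
unwritten so the generic code probes the page cache for dirty data. From
userspace the behavior is the standard lseek(2) contract. A small sketch
(the file path is an assumption):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/mnt/xfs/sparse_file", O_RDONLY);
	off_t data, hole;

	if (fd < 0)
		return 1;
	data = lseek(fd, 0, SEEK_DATA);	/* first data at or after offset 0 */
	hole = lseek(fd, 0, SEEK_HOLE);	/* first hole at or after offset 0 */
	printf("data at %lld, hole at %lld\n",
	       (long long)data, (long long)hole);
	close(fd);
	return 0;
}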

static int
xfs_xattr_iomap_begin(
struct inode *inode,

@@ -1273,12 +1261,10 @@ xfs_xattr_iomap_begin(
out_unlock:
xfs_iunlock(ip, lockmode);

if (!error) {
ASSERT(nimaps);
xfs_bmbt_to_iomap(ip, iomap, &imap);
}

return error;
if (error)
return error;
ASSERT(nimaps);
return xfs_bmbt_to_iomap(ip, iomap, &imap, false);
}

const struct iomap_ops xfs_xattr_iomap_ops = {

@@ -13,12 +13,10 @@ struct xfs_bmbt_irec;

int xfs_iomap_write_direct(struct xfs_inode *, xfs_off_t, size_t,
struct xfs_bmbt_irec *, int);
int xfs_iomap_write_allocate(struct xfs_inode *, int, xfs_off_t,
struct xfs_bmbt_irec *, unsigned int *);
int xfs_iomap_write_unwritten(struct xfs_inode *, xfs_off_t, xfs_off_t, bool);

void xfs_bmbt_to_iomap(struct xfs_inode *, struct iomap *,
struct xfs_bmbt_irec *);
int xfs_bmbt_to_iomap(struct xfs_inode *, struct iomap *,
struct xfs_bmbt_irec *, bool shared);
xfs_extlen_t xfs_eof_alignment(struct xfs_inode *ip, xfs_extlen_t extsize);

static inline xfs_filblks_t

@@ -42,6 +40,7 @@ xfs_aligned_fsb_count(
}

extern const struct iomap_ops xfs_iomap_ops;
extern const struct iomap_ops xfs_seek_iomap_ops;
extern const struct iomap_ops xfs_xattr_iomap_ops;

#endif /* __XFS_IOMAP_H__*/

@@ -191,9 +191,18 @@ xfs_generic_create(

xfs_setup_iops(ip);

if (tmpfile)
if (tmpfile) {
/*
* The VFS requires that any inode fed to d_tmpfile must have
* nlink == 1 so that it can decrement the nlink in d_tmpfile.
* However, we created the temp file with nlink == 0 because
* we're not allowed to put an inode with nlink > 0 on the
* unlinked list. Therefore we have to set nlink to 1 so that
* d_tmpfile can immediately set it back to zero.
*/
set_nlink(inode, 1);
d_tmpfile(dentry, inode);
else
} else
d_instantiate(dentry, inode);

xfs_finish_inode_setup(ip);
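
The d_tmpfile() dance above serves the usual O_TMPFILE pattern, sketched
below with a hypothetical mount point: the anonymous inode sits on the AGI
unlinked list until linkat() gives it a name, and if the system crashes
first, log recovery frees it.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Create an unnamed file inside the (hypothetical) XFS mount. */
	int fd = open("/mnt/xfs", O_TMPFILE | O_WRONLY, 0600);
	char path[64];

	if (fd < 0)
		return 1;
	write(fd, "data\n", 5);
	/* Give the anonymous inode a name; requires /proc mounted. */
	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	linkat(AT_FDCWD, path, AT_FDCWD, "/mnt/xfs/file",
	       AT_SYMLINK_FOLLOW);
	close(fd);
	return 0;
}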

@@ -522,6 +531,10 @@ xfs_vn_getattr(
}
}

/*
* Note: If you add another clause to set an attribute flag, please
* update attributes_mask below.
*/
if (ip->i_d.di_flags & XFS_DIFLAG_IMMUTABLE)
stat->attributes |= STATX_ATTR_IMMUTABLE;
if (ip->i_d.di_flags & XFS_DIFLAG_APPEND)

@@ -529,6 +542,10 @@ xfs_vn_getattr(
if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP)
stat->attributes |= STATX_ATTR_NODUMP;

stat->attributes_mask |= (STATX_ATTR_IMMUTABLE |
STATX_ATTR_APPEND |
STATX_ATTR_NODUMP);

switch (inode->i_mode & S_IFMT) {
case S_IFBLK:
case S_IFCHR:
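
The attributes_mask addition fixes a statx() contract bug: a caller may
only trust an attribute bit if the filesystem declares it in the mask. A
short consumer-side sketch (file name hypothetical; statx() needs a
reasonably recent glibc):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
	struct statx stx;

	if (statx(AT_FDCWD, "somefile", 0, STATX_BASIC_STATS, &stx))
		return 1;
	/* Only consult the bit if the filesystem says it is meaningful. */
	if (stx.stx_attributes_mask & STATX_ATTR_IMMUTABLE)
		printf("immutable: %s\n",
		       (stx.stx_attributes & STATX_ATTR_IMMUTABLE) ?
		       "yes" : "no");
	else
		printf("filesystem does not report STATX_ATTR_IMMUTABLE\n");
	return 0;
}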

@@ -2439,17 +2439,21 @@ xlog_recover_validate_buf_type(
case XFS_BLFT_BTREE_BUF:
switch (magic32) {
case XFS_ABTB_CRC_MAGIC:
case XFS_ABTC_CRC_MAGIC:
case XFS_ABTB_MAGIC:
bp->b_ops = &xfs_bnobt_buf_ops;
break;
case XFS_ABTC_CRC_MAGIC:
case XFS_ABTC_MAGIC:
bp->b_ops = &xfs_allocbt_buf_ops;
bp->b_ops = &xfs_cntbt_buf_ops;
break;
case XFS_IBT_CRC_MAGIC:
case XFS_FIBT_CRC_MAGIC:
case XFS_IBT_MAGIC:
case XFS_FIBT_MAGIC:
bp->b_ops = &xfs_inobt_buf_ops;
break;
case XFS_FIBT_CRC_MAGIC:
case XFS_FIBT_MAGIC:
bp->b_ops = &xfs_finobt_buf_ops;
break;
case XFS_BMAP_CRC_MAGIC:
case XFS_BMAP_MAGIC:
bp->b_ops = &xfs_bmbt_buf_ops;
@ -3045,7 +3049,7 @@ xlog_recover_inode_pass2(
|
|||
* Make sure the place we're flushing out to really looks
|
||||
* like an inode!
|
||||
*/
|
||||
if (unlikely(dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))) {
|
||||
if (unlikely(!xfs_verify_magic16(bp, dip->di_magic))) {
|
||||
xfs_alert(mp,
|
||||
"%s: Bad inode magic number, dip = "PTR_FMT", dino bp = "PTR_FMT", ino = %Ld",
|
||||
__func__, dip, bp, in_f->ilf_ino);
|
||||
|
|
|
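xfs_verify_magic16() comes from this series' magic-number refactoring. A close sketch of the helper, assuming each xfs_buf_ops now carries a magic16[2] pair indexed by whether the superblock is v5 (CRC-enabled); treat the field names as recalled rather than quoted:

    bool
    xfs_verify_magic16(
        struct xfs_buf *bp,
        __be16 dmagic)
    {
        struct xfs_mount *mp = bp->b_target->bt_mount;
        int idx;

        /* magic16[0] holds the v4 magic, magic16[1] the v5 (CRC) magic */
        idx = xfs_sb_version_hascrc(&mp->m_sb);
        if (WARN_ON(!bp->b_ops || !bp->b_ops->magic16[idx]))
            return false;
        return dmagic == bp->b_ops->magic16[idx];
    }

Centralizing the check means verifiers no longer open-code per-version magic comparisons, which is what made the bnobt/cntbt and inobt/finobt split above practical.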
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -149,6 +149,7 @@ xfs_free_perag(
 		spin_unlock(&mp->m_perag_lock);
 		ASSERT(pag);
 		ASSERT(atomic_read(&pag->pag_ref) == 0);
+		xfs_iunlink_destroy(pag);
 		xfs_buf_hash_destroy(pag);
 		mutex_destroy(&pag->pag_ici_reclaim_lock);
 		call_rcu(&pag->rcu_head, __xfs_free_perag);
@@ -227,6 +228,9 @@ xfs_initialize_perag(
 		/* first new pag is fully initialized */
 		if (first_initialised == NULLAGNUMBER)
 			first_initialised = index;
+		error = xfs_iunlink_init(pag);
+		if (error)
+			goto out_hash_destroy;
 	}

 	index = xfs_set_inode_alloc(mp, agcount);
@@ -249,6 +253,7 @@ out_unwind_new_pags:
 		if (!pag)
 			break;
 		xfs_buf_hash_destroy(pag);
+		xfs_iunlink_destroy(pag);
 		mutex_destroy(&pag->pag_ici_reclaim_lock);
 		kmem_free(pag);
 	}
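xfs_iunlink_init() and xfs_iunlink_destroy() bracket the new per-AG unlinked-list backref cache. A hedged sketch of the lifecycle (helper and parameter names follow the series as best recalled; bodies simplified):

    int
    xfs_iunlink_init(
        struct xfs_perag *pag)
    {
        /* one hash table per AG, keyed by unlinked-list pointers */
        return rhashtable_init(&pag->pagi_unlinked_hash,
                &xfs_iunlink_hash_params);
    }

    void
    xfs_iunlink_destroy(
        struct xfs_perag *pag)
    {
        /* entries are freed as inodes leave the list; drain any leftovers */
        rhashtable_free_and_destroy(&pag->pagi_unlinked_hash,
                xfs_iunlink_free_item, NULL);
    }

Note how the error path in xfs_initialize_perag() and both teardown paths destroy the table, mirroring the existing buf-hash lifecycle.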
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -138,7 +138,7 @@ typedef struct xfs_mount {
 	struct mutex		m_growlock;	/* growfs mutex */
 	int			m_fixedfsid[2];	/* unchanged for life of FS */
 	uint64_t		m_flags;	/* global mount flags */
-	bool			m_inotbt_nores; /* no per-AG finobt resv. */
+	bool			m_finobt_nores; /* no per-AG finobt resv. */
 	int			m_ialloc_inos;	/* inodes in inode allocation */
 	int			m_ialloc_blks;	/* blocks in inode allocation */
 	int			m_ialloc_min_blks;/* min blocks in sparse inode
@@ -194,6 +194,7 @@ typedef struct xfs_mount {
 	 */
 	uint32_t		m_generation;

+	bool			m_always_cow;
 	bool			m_fail_unmount;
 #ifdef DEBUG
 	/*
@@ -396,6 +397,13 @@ typedef struct xfs_perag {

 	/* reference count */
 	uint8_t			pagf_refcount_level;
+
+	/*
+	 * Unlinked inode information. This incore information reflects
+	 * data stored in the AGI, so callers must hold the AGI buffer lock
+	 * or have some other means to control concurrency.
+	 */
+	struct rhashtable	pagi_unlinked_hash;
 } xfs_perag_t;

 static inline struct xfs_ag_resv *
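The rhashtable holds small backref entries: for every on-disk unlinked-list link A -> B, the cache can answer "which inode points at B?", so a deletion can splice the singly linked list without walking it from the AGI bucket head. A sketch of the entry (field names as best recalled from the series; treat as illustrative):

    struct xfs_iunlink {
        struct rhash_head iu_rhash_head;
        xfs_agino_t       iu_agino;          /* inode holding the pointer... */
        xfs_agino_t       iu_next_unlinked;  /* ...and the inode it points at */
    };

Because the incore table only mirrors what the AGI already stores, it can be rebuilt at any time and never needs to be logged.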
--- a/fs/xfs/xfs_ondisk.h
+++ b/fs/xfs/xfs_ondisk.h
@@ -125,6 +125,27 @@ xfs_check_ondisk_structs(void)
 	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format,	56);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_qoff_logformat,	20);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_trans_header,		16);
+
+	/*
+	 * The v5 superblock format extended several v4 header structures with
+	 * additional data. While new fields are only accessible on v5
+	 * superblocks, it's important that the v5 structures place original v4
+	 * fields/headers in the correct location on-disk. For example, we must
+	 * be able to find magic values at the same location in certain blocks
+	 * regardless of superblock version.
+	 *
+	 * The following checks ensure that various v5 data structures place the
+	 * subset of v4 metadata associated with the same type of block at the
+	 * start of the on-disk block. If there is no data structure definition
+	 * for certain types of v4 blocks, traverse down to the first field of
+	 * common metadata (e.g., magic value) and make sure it is at offset
+	 * zero.
+	 */
+	XFS_CHECK_OFFSET(struct xfs_dir3_leaf, hdr.info.hdr,	0);
+	XFS_CHECK_OFFSET(struct xfs_da3_intnode, hdr.info.hdr,	0);
+	XFS_CHECK_OFFSET(struct xfs_dir3_data_hdr, hdr.magic,	0);
+	XFS_CHECK_OFFSET(struct xfs_dir3_free, hdr.hdr.magic,	0);
+	XFS_CHECK_OFFSET(struct xfs_attr3_leafblock, hdr.info.hdr, 0);
 }

 #endif /* __XFS_ONDISK_H */
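For reference, XFS_CHECK_OFFSET is this header's existing build-time assertion; it boils down to an offsetof() check that compiles away entirely, so these new lines cost nothing at runtime:

    #define XFS_CHECK_OFFSET(structname, member, off) \
        BUILD_BUG_ON_MSG(offsetof(structname, member) != (off), \
            "XFS: offsetof(" #structname ", " #member ") is wrong, " \
            "expected " #off)

A kernel that miscompiles any of these on-disk layouts simply fails to build, which is the strengthened offset verification the pull summary mentions.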
--- a/fs/xfs/xfs_pnfs.c
+++ b/fs/xfs/xfs_pnfs.c
@@ -185,7 +185,7 @@ xfs_fs_map_blocks(
 	}
 	xfs_iunlock(ip, XFS_IOLOCK_EXCL);

-	xfs_bmbt_to_iomap(ip, iomap, &imap);
+	error = xfs_bmbt_to_iomap(ip, iomap, &imap, false);
 	*device_generation = mp->m_generation;
 	return error;
 out_unlock:
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -192,7 +192,7 @@ xfs_reflink_trim_around_shared(
 	int			error = 0;

 	/* Holes, unwritten, and delalloc extents cannot be shared */
-	if (!xfs_is_reflink_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
+	if (!xfs_is_cow_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
 		*shared = false;
 		return 0;
 	}
@@ -234,93 +234,59 @@
 		}
 	}
 }

-/*
- * Trim the passed in imap to the next shared/unshared extent boundary, and
- * if imap->br_startoff points to a shared extent reserve space for it in the
- * COW fork.
- *
- * Note that imap will always contain the block numbers for the existing blocks
- * in the data fork, as the upper layers need them for read-modify-write
- * operations.
- */
-int
-xfs_reflink_reserve_cow(
+bool
+xfs_inode_need_cow(
 	struct xfs_inode	*ip,
-	struct xfs_bmbt_irec	*imap)
+	struct xfs_bmbt_irec	*imap,
+	bool			*shared)
 {
-	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
-	struct xfs_bmbt_irec	got;
-	int			error = 0;
-	bool			eof = false;
-	struct xfs_iext_cursor	icur;
-	bool			shared;
-
-	/*
-	 * Search the COW fork extent list first. This serves two purposes:
-	 * first this implement the speculative preallocation using cowextisze,
-	 * so that we also unshared block adjacent to shared blocks instead
-	 * of just the shared blocks themselves. Second the lookup in the
-	 * extent list is generally faster than going out to the shared extent
-	 * tree.
-	 */
-
-	if (!xfs_iext_lookup_extent(ip, ifp, imap->br_startoff, &icur, &got))
-		eof = true;
-	if (!eof && got.br_startoff <= imap->br_startoff) {
-		trace_xfs_reflink_cow_found(ip, imap);
-		xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
+	/* We can't update any real extents in always COW mode. */
+	if (xfs_is_always_cow_inode(ip) &&
+	    !isnullstartblock(imap->br_startblock)) {
+		*shared = true;
 		return 0;
 	}

 	/* Trim the mapping to the nearest shared extent boundary. */
-	error = xfs_reflink_trim_around_shared(ip, imap, &shared);
-	if (error)
-		return error;
-
-	/* Not shared? Just report the (potentially capped) extent. */
-	if (!shared)
-		return 0;
-
-	/*
-	 * Fork all the shared blocks from our write offset until the end of
-	 * the extent.
-	 */
-	error = xfs_qm_dqattach_locked(ip, false);
-	if (error)
-		return error;
-
-	error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, imap->br_startoff,
-			imap->br_blockcount, 0, &got, &icur, eof);
-	if (error == -ENOSPC || error == -EDQUOT)
-		trace_xfs_reflink_cow_enospc(ip, imap);
-	if (error)
-		return error;
-
-	xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
-	trace_xfs_reflink_cow_alloc(ip, &got);
-	return 0;
+	return xfs_reflink_trim_around_shared(ip, imap, shared);
 }

 /* Convert part of an unwritten CoW extent to a real one. */
-STATIC int
-xfs_reflink_convert_cow_extent(
-	struct xfs_inode	*ip,
-	struct xfs_bmbt_irec	*imap,
-	xfs_fileoff_t		offset_fsb,
-	xfs_filblks_t		count_fsb)
+static int
+xfs_reflink_convert_cow_locked(
+	struct xfs_inode	*ip,
+	xfs_fileoff_t		offset_fsb,
+	xfs_filblks_t		count_fsb)
 {
-	int			nimaps = 1;
+	struct xfs_iext_cursor	icur;
+	struct xfs_bmbt_irec	got;
+	struct xfs_btree_cur	*dummy_cur = NULL;
+	int			dummy_logflags;
+	int			error = 0;

-	if (imap->br_state == XFS_EXT_NORM)
+	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got))
 		return 0;

-	xfs_trim_extent(imap, offset_fsb, count_fsb);
-	trace_xfs_reflink_convert_cow(ip, imap);
-	if (imap->br_blockcount == 0)
-		return 0;
-	return xfs_bmapi_write(NULL, ip, imap->br_startoff, imap->br_blockcount,
-			XFS_BMAPI_COWFORK | XFS_BMAPI_CONVERT, 0, imap,
-			&nimaps);
+	do {
+		if (got.br_startoff >= offset_fsb + count_fsb)
+			break;
+		if (got.br_state == XFS_EXT_NORM)
+			continue;
+		if (WARN_ON_ONCE(isnullstartblock(got.br_startblock)))
+			return -EIO;
+
+		xfs_trim_extent(&got, offset_fsb, count_fsb);
+		if (!got.br_blockcount)
+			continue;
+
+		got.br_state = XFS_EXT_NORM;
+		error = xfs_bmap_add_extent_unwritten_real(NULL, ip,
+				XFS_COW_FORK, &icur, &dummy_cur, &got,
+				&dummy_logflags);
+		if (error)
+			return error;
+	} while (xfs_iext_next_extent(ip->i_cowfp, &icur, &got));
+
+	return error;
 }

 /* Convert all of the unwritten CoW extents in a file's range to real ones. */
@@ -334,15 +300,12 @@ xfs_reflink_convert_cow(
 	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
 	xfs_fileoff_t		end_fsb = XFS_B_TO_FSB(mp, offset + count);
 	xfs_filblks_t		count_fsb = end_fsb - offset_fsb;
-	struct xfs_bmbt_irec	imap;
-	int			nimaps = 1, error = 0;
+	int			error;

 	ASSERT(count != 0);

 	xfs_ilock(ip, XFS_ILOCK_EXCL);
-	error = xfs_bmapi_write(NULL, ip, offset_fsb, count_fsb,
-			XFS_BMAPI_COWFORK | XFS_BMAPI_CONVERT |
-			XFS_BMAPI_CONVERT_ONLY, 0, &imap, &nimaps);
+	error = xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }
@@ -375,7 +338,7 @@ xfs_find_trim_cow_extent(
 	if (got.br_startoff > offset_fsb) {
 		xfs_trim_extent(imap, imap->br_startoff,
 				got.br_startoff - imap->br_startoff);
-		return xfs_reflink_trim_around_shared(ip, imap, shared);
+		return xfs_inode_need_cow(ip, imap, shared);
 	}

 	*shared = true;
@@ -397,7 +360,8 @@ xfs_reflink_allocate_cow(
 	struct xfs_inode	*ip,
 	struct xfs_bmbt_irec	*imap,
 	bool			*shared,
-	uint			*lockmode)
+	uint			*lockmode,
+	bool			convert_now)
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	xfs_fileoff_t		offset_fsb = imap->br_startoff;
@@ -409,7 +373,10 @@ xfs_reflink_allocate_cow(
 	xfs_extlen_t		resblks = 0;

 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-	ASSERT(xfs_is_reflink_inode(ip));
+	if (!ip->i_cowfp) {
+		ASSERT(!xfs_is_reflink_inode(ip));
+		xfs_ifork_init_cow(ip);
+	}

 	error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
 	if (error || !*shared)
@@ -471,7 +438,16 @@ xfs_reflink_allocate_cow(
 	if (nimaps == 0)
 		return -ENOSPC;
 convert:
-	return xfs_reflink_convert_cow_extent(ip, imap, offset_fsb, count_fsb);
+	xfs_trim_extent(imap, offset_fsb, count_fsb);
+	/*
+	 * COW fork extents are supposed to remain unwritten until we're ready
+	 * to initiate a disk write. For direct I/O we are going to write the
+	 * data and need the conversion, but for buffered writes we're done.
+	 */
+	if (!convert_now || imap->br_state == XFS_EXT_NORM)
+		return 0;
+	trace_xfs_reflink_convert_cow(ip, imap);
+	return xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb);

 out_unreserve:
 	xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0,
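The new convert_now argument decides whether freshly allocated COW fork extents are converted to written state immediately. A hedged sketch of the intended call sites (the callers shown are illustrative; in the series the flag derives from the I/O path taken):

    /* direct I/O writes the data right away, so convert up front */
    error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode, true);

    /* buffered I/O leaves the COW fork unwritten until writeback */
    error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode, false);

Deferring conversion for buffered writes keeps stale data from becoming readable if the system goes down between allocation and writeback.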
@@ -586,7 +562,7 @@ xfs_reflink_cancel_cow_range(
 	int			error;

 	trace_xfs_reflink_cancel_cow_range(ip, offset, count);
-	ASSERT(xfs_is_reflink_inode(ip));
+	ASSERT(ip->i_cowfp);

 	offset_fsb = XFS_B_TO_FSBT(ip->i_mount, offset);
 	if (count == NULLFILEOFF)
@@ -1192,7 +1168,7 @@ xfs_reflink_remap_blocks(
 			break;
 		ASSERT(nimaps == 1);

-		trace_xfs_reflink_remap_imap(src, srcoff, len, XFS_IO_OVERWRITE,
+		trace_xfs_reflink_remap_imap(src, srcoff, len, XFS_DATA_FORK,
 				&imap);

 		/* Translate imap into the destination file. */
--- a/fs/xfs/xfs_reflink.h
+++ b/fs/xfs/xfs_reflink.h
@@ -6,16 +6,28 @@
 #ifndef __XFS_REFLINK_H
 #define __XFS_REFLINK_H 1

+static inline bool xfs_is_always_cow_inode(struct xfs_inode *ip)
+{
+	return ip->i_mount->m_always_cow &&
+		xfs_sb_version_hasreflink(&ip->i_mount->m_sb);
+}
+
+static inline bool xfs_is_cow_inode(struct xfs_inode *ip)
+{
+	return xfs_is_reflink_inode(ip) || xfs_is_always_cow_inode(ip);
+}
+
 extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp,
 		xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t aglen,
 		xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_maximal);
 extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip,
 		struct xfs_bmbt_irec *irec, bool *shared);
+bool xfs_inode_need_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap,
+		bool *shared);

-extern int xfs_reflink_reserve_cow(struct xfs_inode *ip,
-		struct xfs_bmbt_irec *imap);
 extern int xfs_reflink_allocate_cow(struct xfs_inode *ip,
-		struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode);
+		struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode,
+		bool convert_now);
 extern int xfs_reflink_convert_cow(struct xfs_inode *ip, xfs_off_t offset,
 		xfs_off_t count);
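The predicate split keeps xfs_is_reflink_inode() meaning "has shared extents" while xfs_is_cow_inode() answers the broader question "must this write go through the COW fork?". A minimal illustrative use (the caller is hypothetical):

    /* reflink'ed inodes and always_cow inodes both take the COW path */
    if (xfs_is_cow_inode(ip)) {
        /* reserve or allocate in the COW fork before writing */
    } else {
        /* plain overwrite in place */
    }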
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1594,6 +1594,13 @@ xfs_mount_alloc(
 	INIT_DELAYED_WORK(&mp->m_eofblocks_work, xfs_eofblocks_worker);
 	INIT_DELAYED_WORK(&mp->m_cowblocks_work, xfs_cowblocks_worker);
 	mp->m_kobj.kobject.kset = xfs_kset;
+	/*
+	 * We don't create the finobt per-ag space reservation until after log
+	 * recovery, so we must set this to true so that an ifree transaction
+	 * started during log recovery will not depend on space reservations
+	 * for finobt expansion.
+	 */
+	mp->m_finobt_nores = true;
 	return mp;
 }

@@ -1729,11 +1736,18 @@ xfs_fs_fill_super(
 		}
 	}

-	if (xfs_sb_version_hasreflink(&mp->m_sb) && mp->m_sb.sb_rblocks) {
-		xfs_alert(mp,
+	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
+		if (mp->m_sb.sb_rblocks) {
+			xfs_alert(mp,
 	"reflink not compatible with realtime device!");
-		error = -EINVAL;
-		goto out_filestream_unmount;
+			error = -EINVAL;
+			goto out_filestream_unmount;
+		}
+
+		if (xfs_globals.always_cow) {
+			xfs_info(mp, "using DEBUG-only always_cow mode.");
+			mp->m_always_cow = true;
+		}
 	}

 	if (xfs_sb_version_hasrmapbt(&mp->m_sb) && mp->m_sb.sb_rblocks) {
--- a/fs/xfs/xfs_sysctl.h
+++ b/fs/xfs/xfs_sysctl.h
@@ -85,6 +85,7 @@ struct xfs_globals {
 	int	log_recovery_delay;	/* log recovery delay (secs) */
 	int	mount_delay;		/* mount setup delay (secs) */
 	bool	bug_on_assert;		/* BUG() the kernel on assert failure */
+	bool	always_cow;		/* use COW fork for all overwrites */
 };
 extern struct xfs_globals	xfs_globals;
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -183,10 +183,34 @@ mount_delay_show(
 }
 XFS_SYSFS_ATTR_RW(mount_delay);

+static ssize_t
+always_cow_store(
+	struct kobject	*kobject,
+	const char	*buf,
+	size_t		count)
+{
+	ssize_t		ret;
+
+	ret = kstrtobool(buf, &xfs_globals.always_cow);
+	if (ret < 0)
+		return ret;
+	return count;
+}
+
+static ssize_t
+always_cow_show(
+	struct kobject	*kobject,
+	char		*buf)
+{
+	return snprintf(buf, PAGE_SIZE, "%d\n", xfs_globals.always_cow);
+}
+XFS_SYSFS_ATTR_RW(always_cow);
+
 static struct attribute *xfs_dbg_attrs[] = {
 	ATTR_LIST(bug_on_assert),
 	ATTR_LIST(log_recovery_delay),
 	ATTR_LIST(mount_delay),
+	ATTR_LIST(always_cow),
 	NULL,
 };
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1218,23 +1218,17 @@ DEFINE_EVENT(xfs_readpage_class, name, \
 DEFINE_READPAGE_EVENT(xfs_vm_readpage);
 DEFINE_READPAGE_EVENT(xfs_vm_readpages);

-TRACE_DEFINE_ENUM(XFS_IO_HOLE);
-TRACE_DEFINE_ENUM(XFS_IO_DELALLOC);
-TRACE_DEFINE_ENUM(XFS_IO_UNWRITTEN);
-TRACE_DEFINE_ENUM(XFS_IO_OVERWRITE);
-TRACE_DEFINE_ENUM(XFS_IO_COW);
-
 DECLARE_EVENT_CLASS(xfs_imap_class,
 	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count,
-		 int type, struct xfs_bmbt_irec *irec),
-	TP_ARGS(ip, offset, count, type, irec),
+		 int whichfork, struct xfs_bmbt_irec *irec),
+	TP_ARGS(ip, offset, count, whichfork, irec),
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
 		__field(xfs_ino_t, ino)
 		__field(loff_t, size)
 		__field(loff_t, offset)
 		__field(size_t, count)
-		__field(int, type)
+		__field(int, whichfork)
 		__field(xfs_fileoff_t, startoff)
 		__field(xfs_fsblock_t, startblock)
 		__field(xfs_filblks_t, blockcount)
@@ -1245,33 +1239,33 @@ DECLARE_EVENT_CLASS(xfs_imap_class,
 		__entry->size = ip->i_d.di_size;
 		__entry->offset = offset;
 		__entry->count = count;
-		__entry->type = type;
+		__entry->whichfork = whichfork;
 		__entry->startoff = irec ? irec->br_startoff : 0;
 		__entry->startblock = irec ? irec->br_startblock : 0;
 		__entry->blockcount = irec ? irec->br_blockcount : 0;
 	),
 	TP_printk("dev %d:%d ino 0x%llx size 0x%llx offset 0x%llx count %zd "
-		  "type %s startoff 0x%llx startblock %lld blockcount 0x%llx",
+		  "fork %s startoff 0x%llx startblock %lld blockcount 0x%llx",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  __entry->ino,
 		  __entry->size,
 		  __entry->offset,
 		  __entry->count,
-		  __print_symbolic(__entry->type, XFS_IO_TYPES),
+		  __entry->whichfork == XFS_COW_FORK ? "cow" : "data",
 		  __entry->startoff,
 		  (int64_t)__entry->startblock,
 		  __entry->blockcount)
 )

-#define DEFINE_IOMAP_EVENT(name)	\
+#define DEFINE_IMAP_EVENT(name)	\
 DEFINE_EVENT(xfs_imap_class, name,	\
 	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count,	\
-		 int type, struct xfs_bmbt_irec *irec), \
-	TP_ARGS(ip, offset, count, type, irec))
-DEFINE_IOMAP_EVENT(xfs_map_blocks_found);
-DEFINE_IOMAP_EVENT(xfs_map_blocks_alloc);
-DEFINE_IOMAP_EVENT(xfs_iomap_alloc);
-DEFINE_IOMAP_EVENT(xfs_iomap_found);
+		 int whichfork, struct xfs_bmbt_irec *irec), \
+	TP_ARGS(ip, offset, count, whichfork, irec))
+DEFINE_IMAP_EVENT(xfs_map_blocks_found);
+DEFINE_IMAP_EVENT(xfs_map_blocks_alloc);
+DEFINE_IMAP_EVENT(xfs_iomap_alloc);
+DEFINE_IMAP_EVENT(xfs_iomap_found);

 DECLARE_EVENT_CLASS(xfs_simple_io_class,
 	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count),
@@ -3078,7 +3072,7 @@ DEFINE_EVENT(xfs_inode_irec_class, name, \
 DEFINE_INODE_EVENT(xfs_reflink_set_inode_flag);
 DEFINE_INODE_EVENT(xfs_reflink_unset_inode_flag);
 DEFINE_ITRUNC_EVENT(xfs_reflink_update_inode_size);
-DEFINE_IOMAP_EVENT(xfs_reflink_remap_imap);
+DEFINE_IMAP_EVENT(xfs_reflink_remap_imap);
 TRACE_EVENT(xfs_reflink_remap_blocks_loop,
 	TP_PROTO(struct xfs_inode *src, xfs_fileoff_t soffset,
 		 xfs_filblks_t len, struct xfs_inode *dest,
@@ -3202,13 +3196,10 @@ DEFINE_INODE_ERROR_EVENT(xfs_reflink_unshare_error);

 /* copy on write */
 DEFINE_INODE_IREC_EVENT(xfs_reflink_trim_around_shared);
-DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_alloc);
 DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_found);
 DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_enospc);
 DEFINE_INODE_IREC_EVENT(xfs_reflink_convert_cow);

-DEFINE_RW_EVENT(xfs_reflink_reserve_cow);
-
 DEFINE_SIMPLE_IO_EVENT(xfs_reflink_bounce_dio_write);

 DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range);
@@ -3371,6 +3362,84 @@ DEFINE_TRANS_EVENT(xfs_trans_roll);
 DEFINE_TRANS_EVENT(xfs_trans_add_item);
 DEFINE_TRANS_EVENT(xfs_trans_free_items);

+TRACE_EVENT(xfs_iunlink_update_bucket,
+	TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, unsigned int bucket,
+		 xfs_agino_t old_ptr, xfs_agino_t new_ptr),
+	TP_ARGS(mp, agno, bucket, old_ptr, new_ptr),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(unsigned int, bucket)
+		__field(xfs_agino_t, old_ptr)
+		__field(xfs_agino_t, new_ptr)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->agno = agno;
+		__entry->bucket = bucket;
+		__entry->old_ptr = old_ptr;
+		__entry->new_ptr = new_ptr;
+	),
+	TP_printk("dev %d:%d agno %u bucket %u old 0x%x new 0x%x",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->agno,
+		  __entry->bucket,
+		  __entry->old_ptr,
+		  __entry->new_ptr)
+);
+
+TRACE_EVENT(xfs_iunlink_update_dinode,
+	TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, xfs_agino_t agino,
+		 xfs_agino_t old_ptr, xfs_agino_t new_ptr),
+	TP_ARGS(mp, agno, agino, old_ptr, new_ptr),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(xfs_agino_t, agino)
+		__field(xfs_agino_t, old_ptr)
+		__field(xfs_agino_t, new_ptr)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->agno = agno;
+		__entry->agino = agino;
+		__entry->old_ptr = old_ptr;
+		__entry->new_ptr = new_ptr;
+	),
+	TP_printk("dev %d:%d agno %u agino 0x%x old 0x%x new 0x%x",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->agno,
+		  __entry->agino,
+		  __entry->old_ptr,
+		  __entry->new_ptr)
+);
+
+DECLARE_EVENT_CLASS(xfs_ag_inode_class,
+	TP_PROTO(struct xfs_inode *ip),
+	TP_ARGS(ip),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(xfs_agino_t, agino)
+	),
+	TP_fast_assign(
+		__entry->dev = VFS_I(ip)->i_sb->s_dev;
+		__entry->agno = XFS_INO_TO_AGNO(ip->i_mount, ip->i_ino);
+		__entry->agino = XFS_INO_TO_AGINO(ip->i_mount, ip->i_ino);
+	),
+	TP_printk("dev %d:%d agno %u agino %u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->agno, __entry->agino)
+)
+
+#define DEFINE_AGINODE_EVENT(name) \
+DEFINE_EVENT(xfs_ag_inode_class, name, \
+	TP_PROTO(struct xfs_inode *ip), \
+	TP_ARGS(ip))
+DEFINE_AGINODE_EVENT(xfs_iunlink);
+DEFINE_AGINODE_EVENT(xfs_iunlink_remove);
+DEFINE_AG_EVENT(xfs_iunlink_map_prev_fallback);
+
 #endif /* _TRACE_XFS_H */

 #undef TRACE_INCLUDE_PATH
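These definitions are emitted through the usual generated tracepoint calls; a sketch of the call sites this series adds in the unlinked-list code in xfs_inode.c (argument names illustrative):

    trace_xfs_iunlink_update_bucket(mp, agno, bucket, old_agino, new_agino);
    trace_xfs_iunlink_update_dinode(mp, agno, agino, old_agino, new_agino);
    trace_xfs_iunlink(ip);          /* inode added to an unlinked list */
    trace_xfs_iunlink_remove(ip);   /* inode taken off the list */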
--- a/fs/xfs/xfs_trans_bmap.c
+++ b/fs/xfs/xfs_trans_bmap.c
@@ -17,7 +17,6 @@
 #include "xfs_alloc.h"
 #include "xfs_bmap.h"
 #include "xfs_inode.h"
-#include "xfs_defer.h"

 /*
  * This routine is called to allocate a "bmap update done"
--- a/fs/xfs/xfs_trans_buf.c
+++ b/fs/xfs/xfs_trans_buf.c
@@ -277,7 +277,7 @@ xfs_trans_read_buf_map(
 	 * release this buffer when it kills the transaction.
 	 */
 	ASSERT(bp->b_ops != NULL);
-	error = xfs_buf_ensure_ops(bp, ops);
+	error = xfs_buf_reverify(bp, ops);
 	if (error) {
 		xfs_buf_ioerror_alert(bp, __func__);
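xfs_buf_reverify() is the renamed xfs_buf_ensure_ops(). A sketch of the logic under that assumption: a buffer that already carries verifier ops is trusted, otherwise the ops are attached and the read verifier is run on the spot:

    int
    xfs_buf_reverify(
        struct xfs_buf *bp,
        const struct xfs_buf_ops *ops)
    {
        ASSERT(bp->b_flags & XBF_DONE);
        ASSERT(bp->b_error == 0);

        if (!ops || bp->b_ops)
            return 0;               /* already verified once */

        bp->b_ops = ops;
        bp->b_ops->verify_read(bp); /* sets bp->b_error on corruption */
        if (bp->b_error)
            bp->b_flags &= ~XBF_DONE;
        return bp->b_error;
    }

This covers the case where a buffer was first read without verifiers (for example, uncached reads) and is later re-requested with them.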
--- a/fs/xfs/xfs_trans_extfree.c
+++ b/fs/xfs/xfs_trans_extfree.c
@@ -18,7 +18,6 @@
 #include "xfs_alloc.h"
 #include "xfs_bmap.h"
 #include "xfs_trace.h"
-#include "xfs_defer.h"

 /*
  * This routine is called to allocate an "extent free done"
--- a/fs/xfs/xfs_trans_refcount.c
+++ b/fs/xfs/xfs_trans_refcount.c
@@ -16,7 +16,6 @@
 #include "xfs_refcount_item.h"
 #include "xfs_alloc.h"
 #include "xfs_refcount.h"
-#include "xfs_defer.h"

 /*
  * This routine is called to allocate a "refcount update done"
--- a/fs/xfs/xfs_trans_rmap.c
+++ b/fs/xfs/xfs_trans_rmap.c
@@ -16,7 +16,6 @@
 #include "xfs_rmap_item.h"
 #include "xfs_alloc.h"
 #include "xfs_rmap.h"
-#include "xfs_defer.h"

 /* Set the map extent flags for this reverse mapping. */
 static void
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -129,6 +129,9 @@ __xfs_xattr_put_listent(
 	char *offset;
 	int arraytop;

+	if (context->count < 0 || context->seen_enough)
+		return;
+
 	if (!context->alist)
 		goto compute_size;
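Why the new guard matters: context->count is a signed running total, and this function's existing overflow branch parks it at -1; without the early return, a later call would size the copy from that negative count and write past the user buffer. The pre-existing overflow branch, for reference:

    arraytop = context->count + prefix_len + namelen + 1;
    if (arraytop > context->firstu) {
        context->count = -1;    /* insufficient space */
        context->seen_enough = 1;
        return;
    }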