OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
David Chinner	6b1d1a732f	[XFS] Fix lock inversion in forced shutdown. Recent changes to xlog_state_release_iclog() placed the grant_lock inside the icloglock. forced unmount of the log does this the opposite way around, but does not depend on the order for correct working. Fix the inversion by changing the order locks are gained in xfs_log_force_umount(). SGI-PV: 979661 SGI-Modid: xfs-linux-melb:xfs-kern:30773a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:51:04 +10:00
David Chinner	4679b2d36d	[XFS] Reorganise xlog_t for better cacheline isolation of contention To reduce contention on the log in large CPU count, separate out different parts of the xlog_t structure onto different cachelines. Move each lock onto a different cacheline along with all the members that are accessed/modified while that lock is held. Also, move the debugging code into debug code. SGI-PV: 978729 SGI-Modid: xfs-linux-melb:xfs-kern:30772a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:50:53 +10:00
David Chinner	eb01c9cd87	[XFS] Remove the xlog_ticket allocator The ticket allocator is just a simple slab implementation internal to the log. It requires the icloglock to be held when manipulating it and this contributes to contention on that lock. Just kill the entire allocator and use a memory zone instead. While there, allow us to gracefully fail allocation with ENOMEM. SGI-PV: 978729 SGI-Modid: xfs-linux-melb:xfs-kern:30771a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:50:39 +10:00
David Chinner	114d23aae5	[XFS] Per iclog callback chain lock Rather than use the icloglock for protecting the iclog completion callback chain, use a new per-iclog lock so that walking the callback chain doesn't require holding a global lock. This reduces contention on the icloglock during transaction commit and log I/O completion by reducing the number of times we need to hold the global icloglock during these operations. SGI-PV: 978729 SGI-Modid: xfs-linux-melb:xfs-kern:30770a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:50:22 +10:00
Lachlan McIlroy	2abdb8c881	[XFS] Prevent xfs_bmap_check_leaf_extents() referencing unmapped memory. While investigating the extent corruption bug I ran into this bug in debug only code. xfs_bmap_check_leaf_extents() loops through the leaf blocks of the extent btree checking that every extent is entirely before the next extent. It also compares the last extent in the previous block to the first extent in the current block when the previous block has been released and potentially unmapped. So take a copy of the last extent instead of a pointer. Also move the last extent check out of the loop because we only need to do it once. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30718a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-04-18 11:49:51 +10:00
Christoph Hellwig	433550990e	[XFS] remove most calls to VN_RELE Most VN_RELE calls either directly contain a XFS_ITOV or have the corresponding xfs_inode already in scope. Use the IRELE helper instead of VN_RELE to clarify the code. With a little more work we can kill VN_RELE altogether and define IRELE in terms of iput directly. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30710a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:49:08 +10:00
Lachlan McIlroy	df26cfe849	[XFS] split xfs_ioc_xattr The three subcases of xfs_ioc_xattr don't share any semantics and almost no code, so split it into three separate helpers. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30709a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:44:03 +10:00
Christoph Hellwig	f3dcc13f6f	[XFS] cleanup root inode handling in xfs_fs_fill_super - rename rootvp to root for clarify - remove useless vn_to_inode call - check is_bad_inode before calling d_alloc_root - use iput instead of VN_RELE in the error case SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30708a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:42:36 +10:00
David Chinner	59a33f9f77	[XFS] Ensure a btree insert returns a valid cursor. When writing into preallocated regions there is a case where XFS can oops or hang doing the unwritten extent conversion on I/O completion. It turns out that the problem is related to the btree cursor being invalid. When we do an insert into the tree, we may need to split blocks in the tree. When we only split at the leaf level (i.e. level 0), everything works just fine. However, if we have a multi-level split in the btreee, the cursor passed to the insert function is no longer valid once the insert is complete. The leaf level split is handled correctly because all the operations at level 0 are done using the original cursor, hence it is updated correctly. However, when we need to update the next level up the tree, we don't use that cursor - we use a cloned cursor that points to the index in the next level up where we need to do the insert. Hence if we need to split a second level, the changes to the tree are reflected in the cloned cursor and not the original cursor. This clone-and-move-up-a-level-on-split behaviour recurses all the way to the top of the tree. The complexity here is that these cloned cursors do not point to the original index that was inserted - they point to the newly allocated block (the right block) and the original cursor pointer to that level may still point to the left block. Hence, without deep examination of the cloned cursor and buffers, we cannot update the original cursor with the new path from the cloned cursor. In these cases the original cursor could be pointing to the wrong block(s) and hence a subsequent modification to the tree using that cursor will lead to corruption of the tree. The crash case occurs when the tree changes height - we insert a new level in the tree, and the cursor does not have a buffer in it's path for that level. Hence any attempt to walk back up the cursor to the root block will result in a null pointer dereference. To make matters even more complex, the BMAP BT is rooted in an inode, so we can have a change of height in the btree without a root split. That is, if the root block in the inode is full when we split a leaf node, we cannot fit the pointer to the new block in the root, so we allocate a new block, migrate all the ptrs out of the inode into the new block and point the inode root block at the newly allocated block. This changes the height of the tree without a root split having occurred and hence invalidates the path in the original cursor. The patch below prevents xfs_bmbt_insert() from returning with an invalid cursor by detecting the cases that invalidate the original cursor and refresh it by do a lookup into the btree for the original index we were inserting at. Note that the INOBT, AGFBNO and AGFCNT btree implementations also have this bug, but the cursor is currently always destroyed or revalidated after an insert for those trees. Hence this patch only address the problem in the BMBT code. SGI-PV: 979339 SGI-Modid: xfs-linux-melb:xfs-kern:30701a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:42:21 +10:00
David Chinner	75de2a91c9	[XFS] Account for inode cluster alignment in all allocations At ENOSPC, we can get a filesystem shutdown due to a cancelling a dirty transaction in xfs_mkdir or xfs_create. This is due to the initial allocation attempt not taking into account inode alignment and hence we can prepare the AGF freelist for allocation when it's not actually possible to do an allocation. This results in inode allocation returning ENOSPC with a dirty transaction, and hence we shut down the filesystem. Because the first allocation is an exact allocation attempt, we must tell the allocator that the alignment does not affect the allocation attempt. i.e. we will accept any extent alignment as long as the extent starts at the block we want. Unfortunately, this means that if the longest free extent is less than the length + alignment necessary for fallback allocation attempts but is long enough to attempt a non-aligned allocation, we will modify the free list. If we then have the exact allocation fail, all other allocation attempts will also fail due to the alignment constraint being taken into account. Hence the initial attempt needs to set the "alignment slop" field so that alignment, while not required, must be taken into account when determining if there is enough space left in the AG to do the allocation. That means if the exact allocation fails, we will not dirty the freelist if there is not enough space available fo a subsequent allocation to succeed. Hence we get an ENOSPC error back to userspace without shutting down the filesystem. SGI-PV: 978886 SGI-Modid: xfs-linux-melb:xfs-kern:30699a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:42:09 +10:00
Josef 'Jeff' Sipek	535f6b3735	[XFS] Replace custom AIL linked-list code with struct list_head Replace the xfs_ail_entry_t with a struct list_head and clean the surrounding code up. Also fixes a livelock in xfs_trans_first_push_ail() by terminating the loop at the head of the list correctly. SGI-PV: 978682 SGI-Modid: xfs-linux-melb:xfs-kern:30636a Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:41:57 +10:00
Christoph Hellwig	a45c796867	[XFS] Remove superflous xfs_readsb call in xfs_mountfs. When xfs_mountfs is called by xfs_mount xfs_readsb was called 35 lines above unconditionally, so there is no need to try to read the superblock if it's not present. If any other port doesn't have the superblock read at this point it should just call it directly from it's xfs_mount equivalent. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30603a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:41:46 +10:00
Niv Sardi	dfa18b1179	[XFS] kill t_sema member of struct xfs_trans It's completely unused so we might aswell kill it. Note that there is another t_sema in struct xlog_ticket, which is used and actually an sv_t despite the name. That one is left untouched by this patch. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30591a Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:41:35 +10:00
Christoph Hellwig	5f90150aba	[XFS] cleanup vnode use in xfs_bmap.c SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30553a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:41:25 +10:00
Christoph Hellwig	af048193fc	[XFS] cleanup vnode use in xfs_iops.c SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30552a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:41:14 +10:00
Christoph Hellwig	dcf49cc5cf	[XFS] cleanup vnode use in xfs_lrw.c SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30551a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:41:04 +10:00
Christoph Hellwig	ef1f5e7ad3	[XFS] cleanup vnode use in xfs_lookup SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30550a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:40:55 +10:00
Christoph Hellwig	3937be5ba8	[XFS] cleanup vnode use in xfs_symlink and xfs_rename SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30548a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:40:45 +10:00
Christoph Hellwig	a3da789640	[XFS] cleanup vnode use in xfs_link SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30547a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:40:35 +10:00
Christoph Hellwig	979ebab116	[XFS] cleanup vnode use in xfs_create/mknod/mkdir SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30546a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:40:25 +10:00
Christoph Hellwig	bc4ac74a4e	[XFS] cleanup vnode use in dmapi calls SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30545a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:40:15 +10:00
David Chinner	d234154125	[XFS] Use power-of-2 sized buffers to reduce overhead Now that the ktrace_enter() code is using atomics, the non-power-of-2 buffer sizes - which require modulus operations to get the index - are showing up as using substantial CPU in the profiles. Force the buffer sizes to be rounded up to the nearest power of two and use masking rather than modulus operations to convert the index counter to the buffer index. This reduces ktrace_enter overhead to 8% of a CPU time, and again almost halves the trace intensive test runtime. SGI-PV: 977546 SGI-Modid: xfs-linux-melb:xfs-kern:30538a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:40:04 +10:00
David Chinner	6ee4752ffe	[XFS] Use atomic counters for ktrace buffer indexes ktrace_enter() is consuming vast amounts of CPU time due to the use of a single global lock for protecting buffer index increments. Change it to use per-buffer atomic counters - this reduces ktrace_enter() overhead during a trace intensive test on a 4p machine from 58% of all CPU time to 12% and halves test runtime. SGI-PV: 977546 SGI-Modid: xfs-linux-melb:xfs-kern:30537a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:39:55 +10:00
David Chinner	44d814ced4	[XFS] Update c/mtime correctly on truncates XFS changes the c/mtime of an inode when truncating it to the same size. The c/mtime is only supposed to change if the size is changed. Not to be confused with ftruncate, where the c/mtime is supposed to be changed even if the size is not changed. The Linux VFS encodes this semantic difference in the flags it sends down to ->setattr, which XFS currently ignores. We need to make XFS pay attention to the VFS flags and hence Do The Right Thing. SGI-PV: 977547 SGI-Modid: xfs-linux-melb:xfs-kern:30536a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:39:45 +10:00
Christoph Hellwig	24bd861d1c	[XFS] don't encode parent in nfs filehandles unless nessecary As Dave pointed out after the export ops changes we now always encode the parent into the filehandle for regular files, but it's not actually needed when the filesystem is export with no_subtree_check. This one-liner fixes xfs_fs_encode_fh to skip encoding the parent unless nessecary. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30535a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:39:35 +10:00
Christoph Hellwig	126468b115	[XFS] kill xfs_rwlock/xfs_rwunlock We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly bhv_vrwlock_t. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30533a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:39:25 +10:00
Christoph Hellwig	43973964a3	[XFS] kill xfs_get_dir_entry Instead of of xfs_get_dir_entry use a macro to get the xfs_inode from the dentry in the callers and grab the reference manually. Only grab the reference once as it's fine to keep it over the dmapi calls. (And even that reference is actually superflous in Linux but I'll leave that for another patch) SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30531a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:39:14 +10:00
Christoph Hellwig	a8b3acd57e	[XFS] vnode cleanup in xfs_fs_subr.c Cleanup the unneeded intermediate vnode step in the flushing helpers and go directly from the xfs_inode to the struct address_space. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30530a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:39:03 +10:00
Christoph Hellwig	db0bb7baa1	[XFS] cleanup xfs_vn_mknod - use proper goto based unwinding instead of the current mess of multiple conditionals - rename ip to inode because that's the normal convention for Linux inodes while ip is the convention for xfs_inodes - remove unlikely checks for the default_acl - branches marked unlikely might lead to extreme branch bredictor slowdons if taken and for some workloads a default acl is quite common - properly indent the switch statements - remove xfs_has_fs_struct as nfsd has a fs_struct in any semi-recent kernel SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30529a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:38:53 +10:00
David Chinner	155cc6b784	[XFS] Use atomics for iclog reference counting Now that we update the log tail LSN less frequently on transaction completion, we pass the contention straight to the global log state lock (l_iclog_lock) during transaction completion. We currently have to take this lock to decrement the iclog reference count. there is a reference count on each iclog, so we need to take �he global lock for all refcount changes. When large numbers of processes are all doing small trnasctions, the iclog reference counts will be quite high, and the state change that absolutely requires the l_iclog_lock is the except rather than the norm. Change the reference counting on the iclogs to use atomic_inc/dec so that we can use atomic_dec_and_lock during transaction completion and avoid the need for grabbing the l_iclog_lock for every reference count decrement except the one that matters - the last. SGI-PV: 975671 SGI-Modid: xfs-linux-melb:xfs-kern:30505a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:38:10 +10:00
David Chinner	b589334c7a	[XFS] Prevent AIL lock contention during transaction completion When hundreds of processors attempt to commit transactions at the same time, they can contend on the AIL lock when updating the tail LSN held in the in-core log structure. At the moment, the tail LSN is only needed when actually writing out an iclog, so it really does not need to be updated on every single transaction completion - only those that result in switching iclogs and flushing them to disk. The result is that we reduce the number of times we need to grab the AIL lock and the log grant lock by up to two orders of magnitude on large processor count machines. The problem has previously been hidden by AIL lock contention walking the AIL list which was recently solved and uncovered this issue. SGI-PV: 975671 SGI-Modid: xfs-linux-melb:xfs-kern:30504a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:38:01 +10:00
David Chinner	3354040897	[XFS] Use xfs_inode_clean() in more places Remove open coded checks for the whether the inode is clean and replace them with an inlined function. SGI-PV: 977461 SGI-Modid: xfs-linux-melb:xfs-kern:30503a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:37:51 +10:00
David Chinner	bad5584332	[XFS] Remove the xfs_icluster structure Remove the xfs_icluster structure and replace with a radix tree lookup. We don't need to keep a list of inodes in each cluster around anymore as we can look them up quickly when we need to. The only time we need to do this now is during inode writeback. Factor the inode cluster writeback code out of xfs_iflush and convert it to use radix_tree_gang_lookup() instead of walking a list of inodes built when we first read in the inodes. This remove 3 pointers from each xfs_inode structure and the xfs_icluster structure per inode cluster. Hence we reduce the cache footprint of the xfs_inodes by between 5-10% depending on cluster sparseness. To be truly efficient we need a radix_tree_gang_lookup_range() call to stop searching once we are past the end of the cluster instead of trying to find a full cluster's worth of inodes. Before (ia64): $ cat /sys/slab/xfs_inode/object_size 536 After: $ cat /sys/slab/xfs_inode/object_size 512 SGI-PV: 977460 SGI-Modid: xfs-linux-melb:xfs-kern:30502a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:37:41 +10:00
David Chinner	a3f74ffb6d	[XFS] Don't block pdflush when writing back inodes When pdflush is writing back inodes, it can get stuck on inode cluster buffers that are currently under I/O. This occurs when we write data to multiple inodes in the same inode cluster at the same time. Effectively, delayed allocation marks the inode dirty during the data writeback. Hence if the inode cluster was flushed during the writeback of the first inode, the writeback of the second inode will block waiting for the inode cluster write to complete before writing it again for the newly dirtied inode. Basically, we want to avoid this from happening so we don't block pdflush and slow down all of writeback. Hence we introduce a non-blocking async inode flush flag that pdflush uses. If this flag is set, we use non-blocking operations (e.g. try locks) whereever we can to avoid blocking or extra I/O being issued. SGI-PV: 970925 SGI-Modid: xfs-linux-melb:xfs-kern:30501a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:37:32 +10:00
David Chinner	4ae29b4321	[XFS] Factor xfs_itobp() and xfs_inotobp(). The only difference between the functions is one passes an inode for the lookup, the other passes an inode number. However, they don't do the same validity checking or set all the same state on the buffer that is returned yet they should. Factor the functions into a common implementation. SGI-PV: 970925 SGI-Modid: xfs-linux-melb:xfs-kern:30500a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:37:19 +10:00
Lachlan McIlroy	e9a56b7cda	[XFS] Fix regression due to refcache removal SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30490a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Donald Douwsma <donaldd@sgi.com>	2008-04-18 11:37:06 +10:00
Donald Douwsma	163d3686bb	[XFS] Remove the xfs_refcache Remove the xfs_refcache, it was only needed while we were still building for 2.4 kernels. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30472a Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:36:55 +10:00
Lachlan McIlroy	461aa8a225	[XFS] make inode reclaim synchronise with xfs_iflush_done() On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode. If the inode flush lock is not already held and there is an outstanding xfs_iflush_done() then we might free the inode prematurely. By acquiring and releasing the flush lock we will synchronise with xfs_iflush_done(). SGI-PV: 909874 SGI-Modid: xfs-linux-melb:xfs-kern:30468a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com>	2008-04-18 11:34:54 +10:00
Niv Sardi	e12070a5dc	[XFS] actually check error returned by xfs_flush_pages, clean up and bailout if fails. SGI-PV: 973041 SGI-Modid: xfs-linux-melb:xfs-kern:30462a Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-18 11:34:47 +10:00
Eric Sandeen	e6957ea484	[XFS] Ensure "both" features2 slots are consistent Since older kernels may look in the sb_bad_features2 slot for flags, rather than zeroing it out on fixup, we should make it equal to the sb_features2 value. Also, if the ATTR2 flag was not found prior to features2 fixup, it was not set in the mount flags, so re-check after the fixup so that the current session will use the feature. Also fix up the comments to reflect these changes. SGI-PV: 980085 SGI-Modid: xfs-linux-melb:xfs-kern:30778a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-10 16:25:26 +10:00
David Chinner	ee1c090825	[XFS] Fix superblock features2 field alignment problem Due to the xfs_dsb_t structure not being 64 bit aligned, the last field of the on-disk superblock can vary in location This causes problems when the filesystem gets moved to a different platform, or there is a 32 bit userspace and 64 bit kernel. This patch detects the defect at mount time, logs a warning such as: XFS: correcting sb_features alignment problem in dmesg and corrects the problem so that everything is OK. it also blacklists the bad field in the superblock so it does not get used for something else later on. SGI-PV: 977636 SGI-Modid: xfs-linux-melb:xfs-kern:30539a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-10 16:25:15 +10:00
Eric Sandeen	6211870992	[XFS] remove shouting-indirection macros from xfs_sb.h Remove macro-to-small-function indirection from xfs_sb.h, and remove some which are completely unused. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30528a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-10 16:24:45 +10:00
David Chinner	72772a3b5b	[XFS] fix inode leak in xfs_iget_core() If the radix_tree_preload() fails, we need to destroy the inode we just read in before trying again. This could leak xfs_vnode structures when there is memory pressure. Noticed by Christoph Hellwig. SGI-PV: 977823 SGI-Modid: xfs-linux-melb:xfs-kern:30606a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-03-06 16:38:50 +11:00
David Chinner	92d9cd1059	[XFS] 977545 977545 977545 977545 977545 977545 xfsaild causing too many wakeups Idle state is not being detected properly by the xfsaild push code. The current idle state is detected by an empty list which may never happen with mostly idle filesystem or one using lazy superblock counters. A single dirty item in the list that exists beyond the push target can result repeated looping attempting to push up to the target because it fails to check if the push target has been acheived or not. Fix by considering a dirty list with everything past the target as an idle state and set the timeout appropriately. SGI-PV: 977545 SGI-Modid: xfs-linux-melb:xfs-kern:30532a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-03-06 16:38:17 +11:00
Josef Jeff Sipek	1bd960ee2b	[XFS] If you mount an XFS filesystem with no mount options at all, then the "ikeep" option is set rather than "noikeep". This regression was introduced in 970451. With no mount options specified, xfs_parseargs() does the following: int ikeep = 0; args->flags \|= XFSMNT_BARRIER; args->flags2 \|= XFSMNT2_COMPAT_IOSIZE; if (!options) goto done; It only sets the above two options by default and before, it also used to set XFSMNT_IDELETE by default. If options are specified, then if (!(args->flags & XFSMNT_DMAPI) && !ikeep) args->flags \|= XFSMNT_IDELETE; is executed later on which is skipped by the "goto done;" above. The solution is to invert the logic. SGI-PV: 977771 SGI-Modid: xfs-linux-melb:xfs-kern:30590a Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-28 20:37:56 -08:00
Lachlan McIlroy	ef8ece55d9	[XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac platform. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30559a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-26 17:05:44 +11:00
Lachlan McIlroy	db69c915e6	[XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac platform. SGI-PV: 974005 SGI-Modid: xfs-linux-melb:xfs-kern:30558a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-26 17:05:37 +11:00
Lachlan McIlroy	6e5e93424d	Remove empty file fs/xfs/Makefile-linux-2.6.	2008-02-22 15:39:10 +11:00
Lachlan McIlroy	c58310bf49	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-linus	2008-02-18 13:51:42 +11:00
Lachlan McIlroy	269cdfaf76	[XFS] Added quota targets and removed dmapi directory Fixes build failures introduced by bad merge to mainline.	2008-02-18 13:06:17 +11:00
Eric Sandeen	794f744b22	[XFS] Fix up xfs out-of-tree builds. (a.k.a. external modules) Change -I include directives to find headers in the out-of-tree spot. This allows a directory containing only xfs files to be built as: SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29878a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-18 12:59:11 +11:00
Andi Kleen	58b7983d15	[XFS] Remove Makefile wrappers in XFS Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will really still compile on 2.4; so drop that. This moves Makefile-linux-26 into Makefile and drops Kbuild. Also having wrappers as both Kbuild and Makefile seemed redundant anyways. The patch is relatively large because it renames a file, but no functional changes. SGI-PV: 971050 SGI-Modid: xfs-linux-melb:xfs-kern:29781a Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-18 12:48:03 +11:00
Jan Blunck	1d957f9bf8	Introduce path_put() * Add path_put() functions for releasing a reference to the dentry and vfsmount of a struct path in the right order * Switch from path_release(nd) to path_put(&nd->path) * Rename dput_path() to path_put_conditional() [akpm@linux-foundation.org: fix cifs] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: <linux-fsdevel@vger.kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-14 21:13:33 -08:00
Jan Blunck	4ac9137858	Embed a struct path into struct nameidata instead of nd->{dentry,mnt} This is the central patch of a cleanup series. In most cases there is no good reason why someone would want to use a dentry for itself. This series reflects that fact and embeds a struct path into nameidata. Together with the other patches of this series - it enforced the correct order of getting/releasing the reference count on <dentry,vfsmount> pairs - it prepares the VFS for stacking support since it is essential to have a struct path in every place where the stack can be traversed - it reduces the overall code size: without patch series: text data bss dec hex filename 5321639 858418 715768 6895825 6938d1 vmlinux with patch series: text data bss dec hex filename 5320026 858418 715768 `6894212` 693284 vmlinux This patch: Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix cifs] [akpm@linux-foundation.org: fix smack] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-14 21:13:33 -08:00
Marcin Slusarz	413d57c990	xfs: convert beX_add to beX_add_cpu (new common API) remove beX_add functions and replace all uses with beX_add_cpu Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Dave Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-13 16:21:19 -08:00
Linus Torvalds	0b61a2ba5d	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (62 commits) [XFS] add __init/__exit mark to specific init/cleanup functions [XFS] Fix oops in xfs_file_readdir() [XFS] kill xfs_root [XFS] keep i_nlink updated and use proper accessors [XFS] stop updating inode->i_blocks [XFS] Make xfs_ail_check check less by default [XFS] Move AIL pushing into it's own thread [XFS] use generic_permission [XFS] stop re-checking permissions in xfs_swapext [XFS] clean up xfs_swapext [XFS] remove permission check from xfs_change_file_space [XFS] prevent panic during log recovery due to bogus op_hdr length [XFS] Cleanup various fid related bits: [XFS] Fix xfs_lowbit64 [XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros. [XFS] kill superflous buffer locking (2nd attempt) [XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity [XFS] Remove the BPCSHIFT and NB* based macros from XFS. [XFS] Remove bogus assert [XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config ...	2008-02-07 19:12:12 -08:00
Lachlan McIlroy	de2eeea609	[XFS] add __init/__exit mark to specific init/cleanup functions SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30459a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Denis Cheng <crquan@gmail.com>	2008-02-07 18:25:19 +11:00
David Chinner	450790a2c5	[XFS] Fix oops in xfs_file_readdir() When xfs_file_readdir() exactly fills a buffer, it can move it's index past the end of the buffer and dereference it even though the result of the dereference is never used. On some platforms this causes an oops. SGI-PV: 976923 SGI-Modid: xfs-linux-melb:xfs-kern:30458a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:24:13 +11:00
Christoph Hellwig	cbc89dcfd2	[XFS] kill xfs_root The only caller (xfs_fs_fill_super) can simplify call igrab on the root inode. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30393a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:24:00 +11:00
Christoph Hellwig	4188c78d95	[XFS] keep i_nlink updated and use proper accessors To get the read-only bind mounts in -mm to work correctly with XFS we need to call the drop_nlink and inc_nlink helpers to monitor the link count. Add calls to these to xfs_bumplink and xfs_droplink and stop copying over di_nlink to i_nlink in xfs_validate_fields and vn_revalidate. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30392a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:23:38 +11:00
Christoph Hellwig	222096ae7f	[XFS] stop updating inode->i_blocks The VFS doesn't use i_blocks, it's only used by generic_fillattr and the generic quota code which XFS doesn't use. In XFS there is one use to check whether we have an inline or out of line sumlink, but we can replace that with a check of the XFS_IFINLINE inode flag. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30391a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:23:15 +11:00
David Chinner	de08dbc197	[XFS] Make xfs_ail_check check less by default Checking the entire AIL on every insert and remove is prohibitively expensive - the sustained sequntial create rate on a single disk drops from about 1800/s to 60/s because of this checking resulting in the xfslogd becoming cpu bound. By default on debug builds, only check the next and previous entries in the list to ensure they are ordered correctly. If you really want, define XFS_TRANS_DEBUG to use the old behaviour. SGI-PV: 972759 SGI-Modid: xfs-linux-melb:xfs-kern:30372a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:23:05 +11:00
David Chinner	249a8c1124	[XFS] Move AIL pushing into it's own thread When many hundreds to thousands of threads all try to do simultaneous transactions and the log is in a tail-pushing situation (i.e. full), we can get multiple threads walking the AIL list and contending on the AIL lock. The AIL push is, in effect, a simple I/O dispatch algorithm complicated by the ordering constraints placed on it by the transaction subsystem. It really does not need multiple threads to push on it - even when only a single CPU is pushing the AIL, it can push the I/O out far faster that pretty much any disk subsystem can handle. So, to avoid contention problems stemming from multiple list walkers, move the list walk off into another thread and simply provide a "target" to push to. When a thread requires a push, it sets the target and wakes the push thread, then goes to sleep waiting for the required amount of space to become available in the log. This mechanism should also be a lot fairer under heavy load as the waiters will queue in arrival order, rather than queuing in "who completed a push first" order. Also, by moving the pushing to a separate thread we can do more effectively overload detection and prevention as we can keep context from loop iteration to loop iteration. That is, we can push only part of the list each loop and not have to loop back to the start of the list every time we run. This should also help by reducing the number of items we try to lock and/or push items that we cannot move. Note that this patch is not intended to solve the inefficiencies in the AIL structure and the associated issues with extremely large list contents. That needs to be addresses separately; parallel access would cause problems to any new structure as well, so I'm only aiming to isolate the structure from unbounded parallelism here. SGI-PV: 972759 SGI-Modid: xfs-linux-melb:xfs-kern:30371a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:22:51 +11:00
Christoph Hellwig	4576758db5	[XFS] use generic_permission Now that all direct caller of xfs_iaccess are gone we can kill xfs_iaccess and xfs_access and just use generic_permission with a check_acl callback. This is required for the per-mount read-only patchset in -mm to work properly with XFS. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30370a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:22:38 +11:00
Christoph Hellwig	f6aa7f2184	[XFS] stop re-checking permissions in xfs_swapext xfs_swapext should simplify check if we have a writeable file descriptor instead of re-checking the permissions using xfs_iaccess. Add an additional check to refuse O_APPEND file descriptors because swapext is not an append-only write operation. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30369a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:22:24 +11:00
Christoph Hellwig	35fec8df65	[XFS] clean up xfs_swapext - stop using vnodes - use proper multiple label goto unwinding - give the struct file * variables saner names SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30366a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:22:02 +11:00
Christoph Hellwig	199037c598	[XFS] remove permission check from xfs_change_file_space Both callers of xfs_change_file_space alreaedy do the file->f_mode & FMODE_WRITE check to ensure we have a file descriptor that has been opened for write mode, so there is no need to re-check that with xfs_iaccess. Especially as the later might wrongly deny it for corner cases like file descriptor passing through unix domain sockets. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30365a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:21:14 +11:00
Lachlan McIlroy	9742bb93da	[XFS] prevent panic during log recovery due to bogus op_hdr length A problem was reported where a system panicked in log recovery due to a corrupt log record. The cause of the corruption is not known but this change will at least prevent a crash for this specific scenario. Log recovery definitely needs some more work in this area. SGI-PV: 974151 SGI-Modid: xfs-linux-melb:xfs-kern:30318a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-02-07 18:20:58 +11:00
Christoph Hellwig	f71354bc3a	[XFS] Cleanup various fid related bits: - merge xfs_fid2 into it's only caller xfs_dm_inode_to_fh. - remove xfs_vget and opencode it in the two callers, simplifying both of them by avoiding the awkward calling convetion. - assign directly to the dm_fid_t members in various places in the dmapi code instead of casting them to xfs_fid_t first (which is identical to dm_fid_t) SGI-PV: 974747 SGI-Modid: xfs-linux-melb:xfs-kern:30258a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:20:11 +11:00
David Chinner	edd319dc52	[XFS] Fix xfs_lowbit64 xfs_lowbit64 was broken on 32 bit platforms in a recent cleanup of the xfs bitops. Fix it back up again. SGI-PV: 974005 SGI-Modid: xfs-linux-melb:xfs-kern:30202a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:19:41 +11:00
Christoph Hellwig	45ba598e56	[XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros. Currently XFS_IFORK_* and XFS_DFORK* are implemented by means of XFS_CFORK* macros. But given that XFS_IFORK_* operates on an xfs_inode that embedds and xfs_icdinode_core and XFS_DFORK_* operates on an xfs_dinode that embedds a xfs_dinode_core one will have to do endian swapping while the other doesn't. Instead of having the current mess with the CFORK macros that have byteswapping and non-byteswapping version (which are inconsistantly named while we're at it) just define each family of the macros to stand by itself and simplify the whole matter. A few direct references to the CFORK variants were cleaned up to use IFORK or DFORK to make this possible. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30163a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:19:24 +11:00
Christoph Hellwig	a9759f2de3	[XFS] kill superflous buffer locking (2nd attempt) There is no need to lock any page in xfs_buf.c because we operate on our own address_space and all locking is covered by the buffer semaphore. If we ever switch back to main blockdeive address_space as suggested e.g. for fsblock with a similar scheme the locking will have to be totally revised anyway because the current scheme is neither correct nor coherent with itself. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30156a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:18:50 +11:00
Robert P. J. Day	40ebd81d1a	[XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30098a Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:18:19 +11:00
Tim Shimmin	e6a4b37f38	[XFS] Remove the BPCSHIFT and NB* based macros from XFS. The BPCSHIFT based macros, btoc, ctob, offtoc* and ctooff are either not used or don't need to be used. The NDPP, NDPP, NBBY macros don't need to be used but instead are replaced directly by PAGE_SIZE and PAGE_CACHE_SIZE where appropriate. Initial patch and motivation from Nicolas Kaiser. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30096a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:17:58 +11:00
Niv Sardi	f7b7c3673e	[XFS] Remove bogus assert This assert is bogus. We can have a forced shutdown occur between the check for the XLOG_FORCED_SHUTDOWN and the ASSERT. Also, the logging system shouldn't care about the state of XFS_FORCED_SHUTDOWN, it should only check XLOG_FORCED_SHUTDOWN. The logging system has it's own forced shutdown flag so, for the case of a forced shutdown that's not due to a logging error, we can flush the log. SGI-PV: 972985 SGI-Modid: xfs-linux-melb:xfs-kern:30029a Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:17:39 +11:00
Eric Sandeen	71ddabb94a	[XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config Use XFS_IS_REALTIME_INODE in more places, and #define it to 0 if CONFIG_XFS_RT is off. This should be safe because mount checks in xfs_rtmount_init: so if we get mounted w/o CONFIG_XFS_RT, no realtime inodes should be encountered after that. Defining XFS_IS_REALTIME_INODE to 0 saves a bit of stack space, presumeably gcc can optimize around the various "if (0)" type checks: xfs_alloc_file_space -8 xfs_bmap_adjacent -16 xfs_bmapi -8 xfs_bmap_rtalloc -16 xfs_bunmapi -28 xfs_free_file_space -64 xfs_imap +8 <-- ? hmm. xfs_iomap_write_direct -12 xfs_qm_dqusage_adjust -4 xfs_qm_vop_chown_reserve -4 SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30014a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:16:43 +11:00
David Chinner	a67d7c5f5d	[XFS] Move platform specific mount option parse out of core XFS code Mount option parsing is platform specific. Move it out of core code into the platform specific superblock operation file. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30012a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:16:30 +11:00
David Chinner	3ed6526441	[XFS] Implement fallocate. Implement the new generic callout for file preallocation. Atomically change the file size if requested. SGI-PV: 972756 SGI-Modid: xfs-linux-melb:xfs-kern:30009a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:16:17 +11:00
David Chinner	5d51eff453	[XFS] Fix inode allocation latency The log force added in xfs_iget_core() has been a performance issue since it was introduced for tight loops that allocate then unlink a single file. under heavy writeback, this can introduce unnecessary latency due tothe log I/o getting stuck behind bulk data writes. Fix this latency problem by avoinding the need for the log force by moving the place we mark linux inode dirty to the transaction commit rather than on transaction completion. This also closes a potential hole in the sync code where a linux inode is not dirty between the time it is modified and the time the log buffer has been written to disk. SGI-PV: 972753 SGI-Modid: xfs-linux-melb:xfs-kern:30007a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:16:07 +11:00
David Chinner	e4143a1cf5	[XFS] Fix transaction overrun during writeback. Prevent transaction overrun in xfs_iomap_write_allocate() if we race with a truncate that overlaps the delalloc range we were planning to allocate. If we race, we may allocate into a hole and that requires block allocation. At this point in time we don't have a reservation for block allocation (apart from metadata blocks) and so allocating into a hole rather than a delalloc region results in overflowing the transaction block reservation. Fix it by only allowing a single extent to be allocated at a time. SGI-PV: 972757 SGI-Modid: xfs-linux-melb:xfs-kern:30005a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:15:55 +11:00
David Chinner	786f486f81	[XFS] Show all mount args in /proc/mounts There are several mount options that don't show up in /proc/mounts. Add them in and clean up the showargs code at the same time. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30004a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:15:43 +11:00
David Chinner	8ae2c0f64a	[XFS] Fix sparse warning in xlog_recover_do_efd_trans. Sparse trips over the locking order in xlog_recover_do_efd_trans() when xfs_trans_delete_ail() drops the ail lock. Because the unlock is conditional, we need to either annotate with a "fake unlock" or change the structure of the code so sparse thinks the function always unlocks. Reordering the code makes it simpler, so do that. SGI-PV: 972755 SGI-Modid: xfs-linux-melb:xfs-kern:30003a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:15:29 +11:00
David Chinner	a8272ce0c1	[XFS] Fix up sparse warnings. These are mostly locking annotations, marking things static, casts where needed and declaring stuff in header files. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30002a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:14:38 +11:00
David Chinner	a69b176df2	[XFS] Use the generic bitops rather than implementing them ourselves. Patch inspired by Andi Kleen. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30000a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:14:22 +11:00
Vlad Apostolov	c319b58b13	[XFS] Make xfs_bulkstat() to report unlinked but referenced inodes We need xfs_bulkstat() to report inode stat for inodes with link count zero but reference count non zero. The fix here: http://oss.sgi.com/archives/xfs/2007-09/msg00266.html changed this behavior and made xfs_bulkstat() to filter all unlinked inodes including those that are not destroyed yet but held by reference. The attached patch returns back to the original behavior by marking the on-disk inode buffer "dirty" when di_mode is cleared (at that time both inode link and reference counter are zero). SGI-PV: 972004 SGI-Modid: xfs-linux-melb:xfs-kern:29914a Signed-off-by: Vlad Apostolov <vapo@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:13:37 +11:00
Lachlan McIlroy	98ce2b5b1b	[XFS] 971186 Undo mod xfs-linux-melb:xfs-kern:29845a due to a regression SGI-PV: 971596 SGI-Modid: xfs-linux-melb:xfs-kern:29902a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-02-07 18:13:27 +11:00
Eric Sandeen	bc58f9bb6b	[XFS] fix 32-bit compat ioctls for GETXFLAGS, SETXFLAGS, GETVERSION XFS_IOC_GETVERSION, XFS_IOC_GETXFLAGS and XFS_IOC_SETXFLAGS all take a "long" which changes size between 32 and 64 bit platforms. So, the ioctl cmds that come in from a 32-bit app aren't as expected, for example on GETXFLAGS, unknown cmd fd(3) cmd(80046601){t:'f';sz:4} due to the size mismatch. So, use instead the 32-bit version of the commands for compat ioctls, and other than that it doesn't take any more manipulation. Also, for both native and compat versions, just define them to the values as defined in fs.h SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29849a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:13:17 +11:00
Eric Sandeen	d4f3cc016f	[XFS] lose xfs_hex_dump in favor of print_hex_dump No need for xfs to have its own hex dumping routine now that the kernel has one. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29847a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:13:05 +11:00
Christoph Hellwig	91906a882a	[XFS] kill XFS_INOBT_IS_FREE_DISK This macro is unused an all other acros in this family operate on native types, so we most likely won't grow a user either. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29846a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:12:41 +11:00
Christoph Hellwig	c40ea74101	[XFS] kill superflous buffer locking There is no need to lock any page in xfs_buf.c because we operate on our own address_space and all locking is covered by the buffer semaphore. If we ever switch back to main blockdeive address_space as suggested e.g. for fsblock with a similar scheme the locking will have to be totally revised anyway because the current scheme is neither correct nor coherent with itself. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29845a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:12:07 +11:00
Eric Sandeen	0771fb4515	[XFS] Refactor xfs_mountfs Refactoring xfs_mountfs() to call sub-functions for logical chunks can help save a bit of stack, and can make it easier to read this long function. The mount path is one of the longest common callchains, easily getting to within a few bytes of the end of a 4k stack when over lvm, quotas are enabled, and quotacheck must be done. With this change on top of the other stack-related changes I've sent, I can get xfs to survive a normal xfsqa run on 4k stacks over lvm. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29834a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:11:56 +11:00
Christoph Hellwig	b53e675dc8	[XFS] xlog_rec_header/xlog_rec_ext_header endianess annotations Mostly trivial conversion with one exceptions: h_num_logops was kept in native endian previously and only converted to big endian in xlog_sync, but we always keep it big endian now. With todays cpus fast byteswap instructions that's not an issue but the new variant keeps the code clean and maintainable. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29821a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:11:47 +11:00
Christoph Hellwig	67fcb7bfb6	[XFS] clean up some xfs_log_priv.h macros - the various assign lsn macros are replaced by a single inline, xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except for a more sane calling convention. ASSIGN_LSN_DISK is replaced by xlog_assign_lsn and a manual bytespap, and ASSIGN_LSN by the same, except we pass the cycle and block arguments explicitly instead of a log paramter. The latter two variants only had 2, respectively one user anyway. - the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the same calling conventions. - GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away the unused arch argument. Instead of conditional defintions depending on host endianess we now do an unconditional swap and shift then, which generates equal code. - the unused XLOG_SET macro is removed. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29820a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:11:38 +11:00
Christoph Hellwig	03bea6fe6c	[XFS] clean up some xfs_log_priv.h macros - the various assign lsn macros are replaced by a single inline, xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except for a more sane calling convention. ASSIGN_LSN_DISK is replaced by xlog_assign_lsn and a manual bytespap, and ASSIGN_LSN by the same, except we pass the cycle and block arguments explicitly instead of a log paramter. The latter two variants only had 2, respectively one user anyway. - the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the same calling conventions. - GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away the unused arch argument. Instead of conditional defintions depending on host endianess we now do an unconditional swap and shift then, which generates equal code. - the unused XLOG_SET macro is removed. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29819a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:10:31 +11:00
Christoph Hellwig	9909c4aa1a	[XFS] kill xfs_freeze. No need to have a wrapper just two call two more functions. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29816a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 18:09:56 +11:00
Christoph Hellwig	10090be25c	[XFS] cleanup vnode useage in xfs_iget.c Get rid of vnode useage in xfs_iget.c and pass Linux inode / xfs_inode where apropinquate. And kill some useless helpers while we're at it. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29808a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 16:55:46 +11:00
Christoph Hellwig	6e7f75eafb	[XFS] cleanup vnode useage in xfs_ioctl.c xfs_ioctl.c passes around vnode pointers quite a lot, but all places already have the Linux inode which is identical to the vnode these days. Clean the code up to always use the Linux inode. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29807a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 16:55:35 +11:00
Christoph Hellwig	4ca488eb45	[XFS] Kill off xfs_statvfs. We were already filling the Linux struct statfs anyway, and doing this trivial task directly in xfs_fs_statfs makes the code quite a bit cleaner. While I was at it I also moved copying attributes that don't change over the lifetime of the filesystem outside the superblock lock. xfs_fs_fill_super used to get the magic number and blocksize through xfs_statvfs, but assigning them directly is a lot cleaner and will save some stack space during mount. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29802a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 16:53:27 +11:00
Christoph Hellwig	c43f408795	[XFS] simplify xfs_vn_getattr Just fill in struct kstat directly from the xfs_inode instead of doing a detour through a bhv_vattr_t and xfs_getattr. SGI-PV: 970980 SGI-Modid: xfs-linux-melb:xfs-kern:29770a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 16:49:06 +11:00
Christoph Hellwig	613d70436c	[XFS] kill xfs_iocore_t xfs_iocore_t is a structure embedded in xfs_inode. Except for one field it just duplicates fields already in xfs_inode, and there is nothing this abstraction buys us on XFS/Linux. This patch removes it and shrinks source and binary size of xfs aswell as shrinking the size of xfs_inode by 60/44 bytes in debug/non-debug builds. SGI-PV: 970852 SGI-Modid: xfs-linux-melb:xfs-kern:29754a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2008-02-07 16:48:58 +11:00

1 2 3 4 5 ...

859 Commits