Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Slightly streamline SendReceive[2]
Remove an else branch by naming the error condition what it is
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This is no functional change, because in the "if" branch we do an early
"return 0;".
Signed-off-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Simplify allocate_mid() slightly: Remove some unnecessary "else" branches
Signed-off-by: Volker Lendecke <vl@samba.org>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
inbuf->smb_buf_length does not change in in wait_for_free_request() or in
allocate_mid(), so we can check it early.
Signed-off-by: Volker Lendecke <vl@samba.org>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: store password in tcon
Each tcon has its own password for share-level security. Store it in
the tcon and wipe it clean and free it when freeing the tcon. When
doing the tree connect with share-level security, use the tcon password
instead of the session password.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: have calc_lanman_hash take more granular args
We need to use this routine to encrypt passwords associated with the
tcon too. Don't assume that the password will be attached to the
smb_session.
Also, make some of the values in the lower encryption functions
const since they aren't changed.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: zero out session password before freeing it
...just to be on the safe side.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: fix wait_for_response to time out sleeping processes correctly
The current scheme that CIFS uses to sleep and wait for a response is
not quite what we want. After sending a request, wait_for_response puts
the task to sleep with wait_event(). One of the conditions for
wait_event is a timeout (using time_after()).
The problem with this is that there is no guarantee that the process
will ever be woken back up. If the server stops sending data, then
cifs_demultiplex_thread will leave its response queue sleeping.
I think the only thing that saves us here is the fact that
cifs_dnotify_thread periodically (every 15s) wakes up sleeping processes
on all response_q's that have calls in flight. This makes for
unnecessary wakeups of some processes. It also means large variability
in the timeouts since they're all woken up at once.
Instead of this, put the tasks to sleep with wait_event_timeout. This
makes them wake up on their own if they time out. With this change,
cifs_dnotify_thread should no longer be needed.
I've been testing this in conjunction with some other patches that I'm
working on. It doesn't seem to affect performance at all with with heavy
I/O. Identical iozone -ac runs complete in almost exactly the same time
(<1% difference in times).
Thanks to Wasrshi Nimara for initially pointing this out. Wasrshi, it
would be nice to know whether this patch also helps your testcase.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Wasrshi Nimara <warshinimara@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Windows allows you to deny access to the top of a share, but permit access to
a directory lower in the path. With the prefixpath feature of cifs
(ie mounting \\server\share\directory\subdirectory\etc.) this should have
worked if the user specified a prefixpath which put the root of the mount
at a directory to which he had access, but we still were doing a lookup
on the root of the share (null path) when we should have been doing it on
the prefixpath subdirectory.
This fixes Samba bug # 5925
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Some applications/subsystems require mandatory byte range locks
(as is used for Windows/DOS/OS2 etc). Sending advisory (posix style)
byte range lock requests (instead of mandatory byte range locks) can
lead to problems for these applications (which expect that other
clients be prevented from writing to portions of the file which
they have locked and are updating). This mount option allows
mounting cifs with the new mount option "forcemand" (or
"forcemandatorylock") in order to have the cifs client use mandatory
byte range locks (ie SMB/CIFS/Windows/NTFS style locks) rather than
posix byte range lock requests, even if the server would support
posix byte range lock requests. This has no effect if the server
does not support the CIFS Unix Extensions (since posix style locks
require support for the CIFS Unix Extensions), but for mounts
to Samba servers this can be helpful for Wine and applications
that require mandatory byte range locks.
Acked-by: Jeff Layton <jlayton@redhat.com>
CC: Alexander Bokovoy <ab@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
In order to unify the smb_send routines, we need to reorganize the
routines that connect the sockets. Have ipv4_connect take a
TCP_Server_Info pointer and get the necessary fields from that.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
struct smb_vol is fairly large, it's probably best to kzalloc it...
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Clean up cifs_mount a bit by moving the code that creates new TCP
sessions into a separate function. Have that function search for an
existing socket and then create a new one if one isn't found.
Also reorganize the initializion of TCP_Server_Info a bit to prepare
for cleanup of the socket connection code.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The current code for setting the session serverName is IPv4-specific.
Allow it to be an IPv6 address as well. Use NIP* macros to set the
format.
This also entails increasing the length of the serverName field, so
declare a new macro for RFC1001 name length and use it in the
appropriate places.
Finally, drop the unicode_server_Name field from TCP_Server_Info since
it's not used. We can add it back later if needed, but for now it just
wastes memory.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Now that tasks sleeping in wait_for_response will time out on their own,
we're not reliant on the dnotify thread to do this. Mark it as
experimental code for now.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifsd can outlive the last cifs mount. We need to hold a module
reference until it exits to prevent someone from unplugging
the module until we're ready.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Have cifs_show_options display the addr and prefixpath options in
/proc/mounts. Reduce struct dereferencing by adding some local
variables.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
arch_setup_additional_pages currently gets two arguments, the binary
format descripton and an indication if the process uses an executable
stack or not. The second argument is not used by anybody, it could
be removed without replacement.
What actually does make sense is to pass an indication if the process
uses the elf interpreter or not. The glibc code will not use anything
from the vdso if the process does not use the dynamic linker, so for
statically linked binaries the architecture backend can choose not
to map the vdso.
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The iolock is dropped and re-acquired around the call to XFS_SEND_NAMESP().
While the iolock is released the file can become cached. We then
'goto retry' and - if we are doing direct I/O - mapping->nrpages may now be
non zero but need_i_mutex will be zero and we will hit the WARN_ON().
Since we have dropped the I/O lock then the file size may have also changed
so what we need to do here is 'goto start' like we do for the XFS_SEND_DATA()
DMAPI event.
We also need to update the filesize before releasing the iolock so that
needs to be done before the XFS_SEND_NAMESP event. If we drop the iolock
before setting the filesize we could race with a truncate.
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
This patch adds server-side support for callbacks other than AUTH_SYS.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch adds client-side support to allow for callbacks other than
AUTH_SYS.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The rpc client needs to know the principal that the setclientid was done
as, so it can tell gssd who to authenticate to.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Two principals are involved in krb5 authentication: the target, who we
authenticate *to* (normally the name of the server, like
nfs/server.citi.umich.edu@CITI.UMICH.EDU), and the source, we we
authenticate *as* (normally a user, like bfields@UMICH.EDU)
In the case of NFSv4 callbacks, the target of the callback should be the
source of the client's setclientid call, and the source should be the
nfs server's own principal.
Therefore we allow svcgssd to pass down the name of the principal that
just authenticated, so that on setclientid we can store that principal
name with the new client, to be used later on callbacks.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
3 call sites look at hdr.status before returning success.
hdr.status must be zero in this case so there's no point in this.
Currently, hdr.status is correctly processed at decode_op_hdr time
if the op status cannot be decoded.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When there are no op replies encoded in the compound reply
hdr.status still contains the overall status of the compound
rpc. This can happen, e.g., when the server returns a
NFS4ERR_MINOR_VERS_MISMATCH error.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Marten Gajda <marten.gajda@fernuni-hagen.de> states:
I tracked the problem down to the function nfs4_do_open_expired.
Within this function _nfs4_open_expired is called and may return
-NFS4ERR_DELAY. When a further call to _nfs4_open_expired is
executed and does not return -NFS4ERR_DELAY the "exception.retry"
variable is not reset to 0, causing the loop to iterate again
(and as long as err != -NFS4ERR_DELAY, probably forever)
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Hi.
I've been looking at a bugzilla which describes a problem where
a customer was advised to use either the "noac" or "actimeo=0"
mount options to solve a consistency problem that they were
seeing in the file attributes. It turned out that this solution
did not work reliably for them because sometimes, the local
attribute cache was believed to be valid and not timed out.
(With an attribute cache timeout of 0, the cache should always
appear to be timed out.)
In looking at this situation, it appears to me that the problem
is that the attribute cache timeout code has an off-by-one
error in it. It is assuming that the cache is valid in the
region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The
cache should be considered valid only in the region,
[read_cache_jiffies, read_cache_jiffies + attrtimeo). With this
change, the options, "noac" and "actimeo=0", work as originally
expected.
This problem was previously addressed by special casing the
attrtimeo == 0 case. However, since the problem is only an off-
by-one error, the cleaner solution is address the off-by-one
error and thus, not require the special case.
Thanx...
ps
Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This ensures that we don't have to look up the dentry again after we return
the delegation if we know that the directory didn't change.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently, the callback server is listening on IPv6 if it is enabled. This
means that IPv4 addresses will always be mapped.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the client is not using a delegation, the right thing to do is to return
it as soon as possible. This helps reduce the amount of state the server
has to track, as well as reducing the potential for conflicts with other
clients.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Let the actual delegreturn stuff be run in the state manager thread rather
than allocating a separate kthread.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We really shouldn't be resetting the sequence ids when doing state
expiration recovery, since we don't know if the server still remembers our
previous state owners. There are servers out there that do attempt to
preserve client state even if the lease has expired. Such a server would
only release that state if a conflicting OPEN request occurs.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add a delegation cleanup phase to the state management loop, and do the
NFS4ERR_CB_PATH_DOWN recovery there.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add a flag to mark delegations as requiring return, then run a garbage
collector. In the future, this will allow for more flexible delegation
management, where delegations may be marked for return if it turns out
that they are not being referenced.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4 defines a number of state errors which the client does not currently
handle. Among those we should worry about are:
NFS4ERR_ADMIN_REVOKED - the server's administrator revoked our locks
and/or delegations.
NFS4ERR_BAD_STATEID - the client and server are out of sync, possibly
due to a delegation return racing with an OPEN
request.
NFS4ERR_OPENMODE - the client attempted to do something not sanctioned
by the open mode of the stateid. Should normally just
occur as a result of a delegation return race.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now that we're using the flags to indicate state that needs to be
recovered, as well as having implemented proper refcounting and spinlocking
on the state and open_owners, we can get rid of nfs_client->cl_sem. The
only remaining case that was dubious was the file locking, and that case is
now covered by the nfsi->rwsem.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The unlock path is currently failing to take the nfs_client->cl_sem read
lock, and hence the recovery path may see locks disappear from underneath
it.
Also ensure that it takes the nfs_inode->rwsem read lock so that it there
is no conflict with delegation recalls.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the client for some reason is not able to recover all its state within
the time allotted for the grace period, and the server reboots again, the
client is not allowed to recover the state that was 'lost' using reboot
recovery.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Without an extra lock, we cannot just assume that the delegation->inode is
valid when we're traversing the rcu-protected nfs_client lists. Use the
delegation->lock to ensure that it is truly valid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When we can update_open_stateid(), we need to be certain that we don't
race with a delegation return. While we could do this by grabbing the
nfs_client->cl_lock, a dedicated spin lock in the delegation structure
will scale better.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the admin has specified the "noresvport" option for an NFS mount
point, the kernel's NFS client uses an unprivileged source port for
the main NFS transport. The kernel's lockd client should use an
unprivileged port in this case as well.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the admin has specified the "noresvport" option for an NFS mount
point, the kernel's NFS client uses an unprivileged source port for
the main NFS transport. The kernel's mountd client should use an
unprivileged port in this case as well.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The standard default security setting for NFS is AUTH_SYS. An NFS
client connects to NFS servers via a privileged source port and a
fixed standard destination port (2049). The client sends raw uid and
gid numbers to identify users making NFS requests, and the server
assumes an appropriate authority on the client has vetted these
values because the source port is privileged.
On Linux, by default in-kernel RPC services use a privileged port in
the range between 650 and 1023 to avoid using source ports of well-
known IP services. Using such a small range limits the number of NFS
mount points and the number of unique NFS servers to which a client
can connect concurrently.
An NFS client can use unprivileged source ports to expand the range of
source port numbers, allowing more concurrent server connections and
more NFS mount points. Servers must explicitly allow NFS connections
from unprivileged ports for this to work.
In the past, bumping the value of the sunrpc.max_resvport sysctl on
the client would permit the NFS client to use unprivileged ports.
Bumping this setting also changes the maximum port number used by
other in-kernel RPC services, some of which still required a port
number less than 1023.
This is exacerbated by the way source port numbers are chosen by the
Linux RPC client, which starts at the top of the range and works
downwards. It means that bumping the maximum means all RPC services
requesting a source port will likely get an unprivileged port instead
of a privileged one.
Changing this setting effects all NFS mount points on a client. A
sysadmin could not selectively choose which mount points would use
non-privileged ports and which could not.
Lastly, this mechanism of expanding the limit on the number of NFS
mount points was entirely undocumented.
To address the need for the NFS client to use a large range of source
ports without interfering with the activity of other in-kernel RPC
services, we introduce a new NFS mount option. This option explicitly
tells only the NFS client to use a non-privileged source port when
communicating with the NFS server for one specific mount point.
This new mount option is called "resvport," like the similar NFS mount
option on FreeBSD and Mac OS X. A sister patch for nfs-utils will be
submitted that documents this new option in nfs(5).
The default setting for this new mount option requires the NFS client
to use a privileged port, as before. Explicitly specifying the
"noresvport" mount option allows the NFS client to use an unprivileged
source port for this mount point when connecting to the NFS server
port.
This mount option is supported only for text-based NFS mounts.
[ Sidebar: it is widely known that security mechanisms based on the
use of privileged source ports are ineffective. However, the NFS
client can combine the use of unprivileged ports with the use of
secure authentication mechanisms, such as Kerberos. This allows a
large number of connections and mount points while ensuring a useful
level of security.
Eventually we may change the default setting for this option
depending on the security flavor used for the mount. For example,
if the mount is using only AUTH_SYS, then the default setting will
be "resvport;" if the mount is using a strong security flavor such
as krb5, the default setting will be "noresvport." ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[Trond.Myklebust@netapp.com: Fixed a bug whereby nfs4_init_client()
was being called with incorrect arguments.]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Make it possible for the NFSv4 mount set up logic to pass mount option
flags down the stack to nfs_create_rpc_client().
This is immediately useful if we want NFS mount options to modulate
settings of the underlying RPC transport, but it may be useful at some
later point if other parts of the NFSv4 mount initialization logic
want to know what the mount options are.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The nfs_create_rpc_client() function sets up an RPC client for an NFS
mount point. Add an option that allows it to set up an RPC transport
from an unprivileged port.
Instead of having nfs_create_rpc_client()'s callers retain local
knowledge about how to set up an RPC client, create a couple of flag
arguments to control the use of RPC_CLNT_CREATE flags.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: convert nfs_mount() to take a single data structure argument to make
it simpler to add more arguments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: The nfs_mount() function is not to be used outside of the
NFS client. Move its public declaration to fs/nfs/internal.h.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: I'm about to move the declaration of nfs_mount into
fs/nfs/internal.h and include it in fs/nfs/nfsroot.c. There's a
conflicting definition of nfs_path in fs/nfs/internal.h and
fs/nfs/nfsroot.c, so rename the private one.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
My understanding is that there is a push to turn the kernel_thread
interface into a non-exported symbol and move all kernel threads to use
the kthread API. This patch changes lockd to use kthread_run to spawn
the reclaimer thread.
I've made the assumption here that the extra module references taken
when we spawn this thread are unnecessary and removed them. I've also
added a KERN_ERR printk that pops if the thread can't be spawned to warn
the admin that the locks won't be reclaimed.
In the future, it would be nice to be able to notify userspace that
locks have been lost (probably by implementing SIGLOST), and adding some
good policies about how long we should reattempt to reclaim the locks.
Finally, I removed a comment about memory leaks that I believe is
obsolete and added a new one to clarify the result of sending a SIGKILL
to the reclaimer thread. As best I can tell, doing so doesn't actually
cause a memory leak.
I consider this patch 2.6.29 material.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
aops->readpages() and its NFS helper readpage_async_filler() will only
be called to do readahead I/O for newly allocated pages. So it's not
necessary to test for the always 0 dirty/uptodate page flags.
The removal of nfs_wb_page() call also fixes a readahead bug: the NFS
readahead has been synchronous since 2.6.23, because that call will
clear PG_readahead, which is the reminder for asynchronous readahead.
More background: the PG_readahead page flag is shared with PG_reclaim,
one for read path and the other for write path. clear_page_dirty_for_io()
unconditionally clears PG_readahead to prevent possible readahead residuals,
assuming itself to be always called in the write path. However, NFS is one
and the only exception in that it _always_ calls clear_page_dirty_for_io()
in the read path, i.e. for readpages()/readpage().
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When we commit, but before we try to write anything to the flash
media, @c->min_idx_size is inaccurate, because we do not re-calculate
it after the commit. Do not forget to do this.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Take into account that 2 eraseblocks are never available because
they are reserved for the index. This gives more realistic count
of FS blocks.
To avoid future confusions like this, introduce a constant.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
In libxfs xfs_bmbt_disk_get_all needs to handle unaligned data and thus
has been updated to use get_unaligned_be64. In kernelspace we don't strictly
need it as the routine is only used for tracing and xfsidbg, but let's keep
the two implementations in sync.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
xfs_fs_vcmn_err can be called under a spinlock, but does a sleeping memory
allocation to create buffer for it's internal sprintf. Fortunately it's
the only caller of icmn_err, so we can merge the two and have one single
static buffer and spinlock protecting it. While we're at it make sure
we proper __attribute__ format annotations so that the compiler can detect
mismatched format strings.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Speculative allocation beyond eof doesn't work properly. It was
broken some time ago after a code cleanup that moved what is now
xfs_iomap_eof_align_last_fsb() and xfs_iomap_eof_want_preallocate()
out of xfs_iomap_write_delay() into separate functions. The code
used to use the current file size in various checks but got changed
to be max(file_size, i_new_size). Since i_new_size is the result
of 'offset + count' then in xfs_iomap_eof_want_preallocate() the
check for '(offset + count) <= isize' will always be true.
ie if 'offset + count' is > ip->i_size then isize will be i_new_size
and equal to 'offset + count'.
This change fixes all the places that used to use the current file
size.
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
We should be using the incore inode size here not the linux inode
size. The incore inode size is always up to date for directories
whereas the linux inode size is not updated for directories.
We've hit assertions in xfs_bmap() and traced it back to the linux
inode size being zero but the incore size being correct.
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Stephen Rothwell reported this new (harmless) build warning on platforms that
define u64 to long:
fs/proc/base.c: In function 'proc_pid_schedstat':
fs/proc/base.c:352: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64'
asm-generic/int-l64.h platforms strike again: that file should be eliminated.
Fix it by casting the parameters to long long.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since v9ses->uid is unsigned, it would seem better to use simple_strtoul that
simple_strtol.
A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r2@
long e;
position p;
@@
e = simple_strtol@p(...)
@@
position p != r2.p;
type T;
T e;
@@
e =
- simple_strtol@p
+ simple_strtoul
(...)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
d_iname is rubbish for long file names.
Use d_name.name in printks instead.
Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Pass mount flags to security_sb_kern_mount(), so security modules
can determine if a mount operation is being performed by the kernel.
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Impact: simplify code
When we turn on CONFIG_SCHEDSTATS, per-task cpu runtime is accumulated
twice. Once in task->se.sum_exec_runtime and once in sched_info.cpu_time.
These two stats are exactly the same.
Given that task->se.sum_exec_runtime is always accumulated by the core
scheduler, sched_info can reuse that data instead of duplicate the accounting.
Signed-off-by: Ken Chen <kenchen@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While testing a kernel with memory poisoning enabled, I saw some warnings
about the redzone getting clobbered when chasing DFS referrals. The
buffer allocation for the unicode converted version of the searchName is
too small and needs to take null termination into account.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Define the OCFS2_FEATURE_COMPAT_JBD2 bit in the filesystem header.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
When we create xattr bucket during the process of xattr set, we always
need to update the ocfs2_xattr_search since even if the bucket size is
the same as block size, the offset will change because of the removal
of the ocfs2_xattr_block header.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
This is an alternate fix for a bug reported and fixed by Duane Griffin.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Reported-by: Duane Griffin <duaneg@dghda.com>
Impact: restructure code to fix compiler warning
commit 240d367b4e moved desc usage point
into #ifdef CONFIG_SPARSE_IRQ.
Eliminate the desc variable, otherwise following warning happens:
fs/proc/stat.c: In function 'show_stat':
fs/proc/stat.c:31: warning: unused variable 'desc'
[ akpm: cleaned up the patch to remove #ifdef ]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The `have_of' variable is a relic from the arch/ppc time, it isn't
useful nowadays.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Impact: simplify code
de_thread() postpones release_task(leader) until after exit_itimers().
This was needed because !SIGEV_THREAD_ID timers could use ->group_leader
without get_task_struct(). With the recent changes we can release the
leader earlier and simplify the code.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Preserve any error returned by the bio layer.
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Instead of implementing our own checks use inode_change_ok to check for
necessary permission in setattr. There is a slight change in behaviour
as inode_change_ok doesn't allow i_mode updates to add the suid or sgid
without superuser privilegues while the old XFS code just stripped away
those bits from the file mode.
(First sent on Semptember 29th)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
XFS has a mode called invisble I/O that doesn't update any of the
timestamps. It's used for HSM-style applications and exposed through
the nasty open by handle ioctl.
Instead of doing directly assignment of file operations that set an
internal flag for it add a new FMODE_NOCMTIME flag that we can check
in the normal file operations.
(addition of the generic VFS flag has been ACKed by Al as an interims
solution)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
- xfs_sb.h add the XFS_SB_VERSION2_PARENTBIT features2 that has been
around in userspace for some time
- xfs_inode.h: move a few things out of __KERNEL__ that are needed by
userspace
- xfs_mount.h: only include xfs_sync.h under __KERNEL__
- xfs_inode.c: minor whitespace fixup. I accidentaly changes this when
importing this file for use by userspace.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Check for the project ID after attaching all inodes to the transaction.
That way the unlock in the error case is done by the transaction subsystem,
which guaratees that is uses the right flags (which was wrong from day one
of this check), and avoids having special code unlocking an array of inodes
with potential duplicates. Attaching the inode first is the method used
by xfs_rename and the other namespace methods all other error that require
multiple locked inodes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Replace the b_fspriv pointer and it's ugly accessors with a properly types
xfs_mount pointer. Also switch log reocvery over to it instead of using
b_fspriv for the mount pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked
to my 966c8c12dc sprint_symbol(): use
less stack exposing a bug in slub's list_locations() -
kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was
beyond the end of page provided.
The 100 slop which list_locations() allows at end of page looks roughly
enough for all the other stuff it might print after the symbol before
it checks again: break out KSYM_SYMBOL_LEN earlier than before.
Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they
need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer
where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies
them.
[akpm@linux-foundation.org: ftrace.h needs module.h]
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc Miles Lane <miles.lane@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On umount two event will be dispatched to watcher:
1: inotify_dev_queue_event(.., IN_UNMOUNT,..)
2: remove_watch(watch, dev)
->inotify_dev_queue_event(.., IN_IGNORED, ..)
But if watcher has IN_ONESHOT bit set then the watcher will be released
inside first event. Which result in accessing invalid object later. IMHO
it is not pure regression. This bug wasn't triggered while initial
inotify interface testing phase because of another bug in IN_ONESHOT
handling logic :)
commit ac74c00e49
Author: Ulisses Furquim <ulissesf@gmail.com>
Date: Fri Feb 8 04:18:16 2008 -0800
inotify: fix check for one-shot watches before destroying them
As the IN_ONESHOT bit is never set when an event is sent we must check it
in the watch's mask and not in the event's mask.
TESTCASE:
mkdir mnt
mount -ttmpfs none mnt
mkdir mnt/d
./inotify mnt/d&
umount mnt ## << lockup or crash here
TESTSOURCE:
/* gcc -oinotify inotify.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
int main(int argc, char **argv)
{
char buf[1024];
struct inotify_event *ie;
char *p;
int i;
ssize_t l;
p = argv[1];
i = inotify_init();
inotify_add_watch(i, p, ~0);
l = read(i, buf, sizeof(buf));
printf("read %d bytes\n", l);
ie = (struct inotify_event *) buf;
printf("event mask: %d\n", ie->mask);
return 0;
}
Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Cc: Ulisses Furquim <ulissesf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The large pages fix from bcf8039ed4 broke 32-bit pagemap by pulling the
pagemap entry code out into a function with the wrong return type.
Pagemap entries are 64 bits on all systems and unsigned long is only 32
bits on 32-bit systems.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Reported-by: Doug Graham <dgraham@nortel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert
commit e8ced39d5e
Author: Mingming Cao <cmm@us.ibm.com>
Date: Fri Jul 11 19:27:31 2008 -0400
percpu_counter: new function percpu_counter_sum_and_set
As described in
revert "percpu counter: clean up percpu_counter_sum_and_set()"
the new percpu_counter_sum_and_set() is racy against updates to the
cpu-local accumulators on other CPUs. Revert that change.
This means that ext4 will be slow again. But correct.
Reported-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org> [2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert
commit 1f7c14c62c
Author: Mingming Cao <cmm@us.ibm.com>
Date: Thu Oct 9 12:50:59 2008 -0400
percpu counter: clean up percpu_counter_sum_and_set()
Before this patch we had the following:
percpu_counter_sum(): return the percpu_counter's value
percpu_counter_sum_and_set(): return the percpu_counter's value, copying
that value into the central value and zeroing the per-cpu counters before
returning.
After this patch, percpu_counter_sum_and_set() has gone, and
percpu_counter_sum() gets the old percpu_counter_sum_and_set()
functionality.
Problem is, as Eric points out, the old percpu_counter_sum_and_set()
functionality was racy and wrong. It zeroes out counters on "other" cpus,
without holding any locks which will prevent races agaist updates from
those other CPUS.
This patch reverts 1f7c14c62c. This means
that percpu_counter_sum_and_set() still has the race, but
percpu_counter_sum() does not.
Note that this is not a simple revert - ext4 has since started using
percpu_counter_sum() for its dirty_blocks counter as well.
Note that this revert patch changes percpu_counter_sum() semantics.
Before the patch, a call to percpu_counter_sum() will bring the counter's
central counter mostly up-to-date, so a following percpu_counter_read()
will return a close value.
After this patch, a call to percpu_counter_sum() will leave the counter's
central accumulator unaltered, so a subsequent call to
percpu_counter_read() can now return a significantly inaccurate result.
If there is any code in the tree which was introduced after
e8ced39d5e was merged, and which depends
upon the new percpu_counter_sum() semantics, that code will break.
Reported-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch 6341c39 "tracehook: exec" introduced a small regression in
2.6.27 regarding binfmt_misc exec event reporting. Since the reporting
is now done in the common search_binary_handler() function, an exec
of a misc binary will result in two (or possibly multiple) exec events
being reported, instead of just a single one, because the misc handler
contains a recursive call to search_binary_handler.
To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple
SIGTRAP signals will in fact cause only a single ptrace intercept, as the
signals are not queued. However, if PTRACE_O_TRACEEXEC is on, the debugger
will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC).
The test program included below demonstrates the problem.
This change fixes the bug by calling tracehook_report_exec() only in the
outermost search_binary_handler() call (bprm->recursion_depth == 0).
The additional change to restore bprm->recursion_depth after each binfmt
load_binary call is actually superfluous for this bug, since we test the
value saved on entry to search_binary_handler(). But it keeps the use of
of the depth count to its most obvious expected meaning. Depending on what
binfmt handlers do in certain cases, there could have been false-positive
tests for recursion limits before this change.
/* Test program using PTRACE_O_TRACEEXEC.
This forks and exec's the first argument with the rest of the arguments,
while ptrace'ing. It expects to see one PTRACE_EVENT_EXEC stop and
then a successful exit, with no other signals or events in between.
Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec:
$ gcc -g traceexec.c -o traceexec
$ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register'
$ echo 'foobar test' > ./foobar
$ chmod +x ./foobar
$ ./traceexec ./foobar; echo $?
==> good <==
foobar test
0
$
==> bad <==
foobar test
unexpected status 0x4057f != 0
3
$
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
static void
wait_for (pid_t child, int expect)
{
int status;
pid_t p = wait (&status);
if (p != child)
{
perror ("wait");
exit (2);
}
if (status != expect)
{
fprintf (stderr, "unexpected status %#x != %#x\n", status, expect);
exit (3);
}
}
int
main (int argc, char **argv)
{
pid_t child = fork ();
if (child < 0)
{
perror ("fork");
return 127;
}
else if (child == 0)
{
ptrace (PTRACE_TRACEME);
raise (SIGUSR1);
execv (argv[1], &argv[1]);
perror ("execve");
_exit (127);
}
wait_for (child, W_STOPCODE (SIGUSR1));
if (ptrace (PTRACE_SETOPTIONS, child,
0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0)
{
perror ("PTRACE_SETOPTIONS");
return 4;
}
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 5;
}
wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8)));
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 6;
}
wait_for (child, W_EXITCODE (0, 0));
return 0;
}
Reported-by: Arnd Bergmann <arnd@arndb.de>
CC: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
None of this code appears to be used anywhere so remove it.
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
While 440037287c "[PATCH] switch all filesystems over to
d_obtain_alias" removed some cases where fh_to_dentry() and
fh_to_parent() could return NULL, there are still a few NULL returns
left in individual filesystems. Thus it was a mistake for that commit
to remove the handling of NULL returns in the callers.
Revert those parts of 440037287c which removed the NULL handling.
(We could, alternatively, modify all implementations to return -ESTALE
instead of NULL, but that proves to require fixing a number of
filesystems, and in some cases it's arguably more natural to return
NULL.)
Thanks to David for original patch and Linus, Christoph, and Hugh for
review.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: build fix on Alpha
-tip testing found this build failure on the Alpha defconfig:
/home/mingo/tip/fs/proc/stat.c: In function 'show_stat':
/home/mingo/tip/fs/proc/stat.c:48: error: implicit declaration of function 'for_each_irq_desc'
/home/mingo/tip/fs/proc/stat.c:48: error: expected ';' before '{' token
can not use irq_desc() in stat.c on older architectures.
Signed-off-by: Yinghai Lu <yinghai@kernel.orgg>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new feature
Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with
NR_CPUS set to large values. The goal is to be able to scale up to much
larger NR_IRQS value without impacting the (important) common case.
To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of
irq_desc pointers.
When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc,
this also makes the IRQ descriptors NUMA-local (to the site that calls
request_irq()).
This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now
uses desc->chip_data for x86 to store irq_cfg.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Changeset a238b790d5 (Call fasync()
functions without the BKL) introduced a race which could leave
file->f_flags in a state inconsistent with what the underlying
driver/filesystem believes. Revert that change, and also fix the same
races in ioctl_fioasync() and ioctl_fionbio().
This is a minimal, short-term fix; the real fix will not involve the
BKL.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev:
[PATCH] fix bogus argument of blkdev_put() in pktcdvd
[PATCH 2/2] documnt FMODE_ constants
[PATCH 1/2] kill FMODE_NDELAY_NOW
[PATCH] clean up blkdev_get a little bit
[PATCH] Fix block dev compat ioctl handling
[PATCH] kill obsolete temporary comment in swsusp_close()
When project quota is active and is being used for directory tree
quota control, we disallow rename outside the current directory
tree. This requires a check to be made after all the inodes
involved in the rename are locked. We fail to unlock the inodes
correctly if we disallow the rename when the target is outside the
current directory tree. This results in a hang on the next access
to the inodes involved in failed rename.
Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Hit this assert because an inode was tagged with XFS_ICI_RECLAIM_TAG but
not XFS_IRECLAIMABLE|XFS_IRECLAIM. This is because xfs_iget_cache_hit()
first clears XFS_IRECLAIMABLE and then calls __xfs_inode_clear_reclaim_tag()
while only holding the pag_ici_lock in read mode so we can race with
xfs_reclaim_inodes_ag(). Looks like xfs_reclaim_inodes_ag() will do the
right thing anyway so just remove the assert.
Thanks to Christoph for pointing out where the problem was.
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
entries_size is probably left over from when we used to pass the
size to kmem_free().
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
We check the return value of all other calls to xfs_buf_get_noaddr().
Make sense to do it here too.
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
When project quota is active and is being used for directory tree
quota control, we disallow rename outside the current directory
tree. This requires a check to be made after all the inodes
involved in the rename are locked. We fail to unlock the inodes
correctly if we disallow the rename when the target is outside the
current directory tree. This results in a hang on the next access
to the inodes involved in failed rename.
Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
This patch fixes the following section mismatch:
WARNING: fs/ubifs/ubifs.o(.init.text+0xec): Section mismatch in reference from the function init_module() to the function .exit.text:ubifs_compressors_exit()
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Update FMODE_NDELAY before each ioctl call so that we can kill the
magic FMODE_NDELAY_NOW. It would be even better to do this directly
in setfl(), but for that we'd need to have FMODE_NDELAY for all files,
not just block special files.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The way the bd_claim for the FMODE_EXCL case is implemented is rather
confusing. Clean it up to the most logical style.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>