deal with d_move() races properly; rename_lock read-retry loop,
rcu_read_lock() held while walking to root, d_lock held over
subtraction from namelen and copying the component to stabilize
->d_name.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now we point superblock to a server share root and set a root dentry
appropriately. This let us share superblock between mounts like
//server/sharename/foo/bar and //server/sharename/foo further.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We need it to make them work with mandatory locking style because
we can fail in a situation like when kernel need to flush dirty pages
and there is a lock held by a process who opened file.
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Make CIFS use the new d_automount() dentry operation rather than abusing
follow_link() on directories.
[NOTE: THIS IS UNTESTED!]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We call CIFSSMBUnixSetPathInfo in these functions, but we have a
filehandle since an open was just done. Switch these functions to
use CIFSSMBUnixSetFileInfo instead.
In practice, these codepaths are only used if posix opens are broken.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Require filesystems be aware of .d_revalidate being called in rcu-walk
mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning
-ECHILD from all implementations.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.
Patched with:
git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_hash so it may be called from lock-free RCU lookups. See similar
patch for d_compare for details.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_compare so it may be called from lock-free RCU lookups. This
does put significant restrictions on what may be done from the callback,
however there don't seem to have been any problems with in-tree fses.
If some strange use case pops up that _really_ cannot cope with the
rcu-walk rules, we can just add new rcu-unaware callbacks, which would
cause name lookup to drop out of rcu-walk mode.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Use vfat's method for dealing with negative dentries to preserve case,
rather than overwrite dentry name in d_revalidate, which is a bit ugly
and also gets in the way of doing lock-free path walking.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Instead, use fatfs's method for dealing with negative dentries to
preserve case, rather than overwrite dentry name in d_revalidate, which
is a bit ugly and also gets in the way of doing lock-free path walking.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Steve French <sfrench@us.ibm.com>
It's currently in dir.c which makes little sense...
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
All the remaining users of cifsFileInfo->pfile just use it to get
at the f_flags/f_mode. Now that we store that separately in the
cifsFileInfo, there's no need to consult the pfile at all from
a cifsFileInfo pointer.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Add a f_flags field that holds the f_flags field from the filp. We'll
need this info in case the filp ever goes away before the cifsFileInfo
does. Have cifs_reopen_file use that value instead of filp->f_flags
too and have it take a cifsFileInfo arg instead of a filp.
While we're at it, get rid of some bogus cargo-cult NULL pointer
checks in that function and reduce the level of indentation.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
It already takes a file pointer. The inode associated with that had damn
well better be the same one we're passing in anyway. Thus, there's no
need for a separate argument here.
Also, get rid of the bogus check for a null pCifsInode pointer. The
CIFS_I macro uses container_of(), and that will virtually never return a
NULL pointer anyway.
Finally, move the setting of the canCache* flags outside of the lock.
Other places in the code don't hold that lock when setting it, so I
assume it's not really needed here either.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Eliminate the poor, misunderstood "oflags" option from cifs_new_fileinfo.
The callers mostly pass in the filp->f_flags here.
That's not correct however since we're checking that value for
the presence of FMODE_READ. Luckily that only affects how the f_list is
ordered. What it really wants here is the file->f_mode. Just use that
field from the filp to determine it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The way flags are passed and converted for cifs_posix_open is rather
non-sensical. Some callers call cifs_posix_convert_flags on the flags
before they pass them to cifs_posix_open, whereas some don't. Two flag
conversion steps is just confusing though.
Change the function instead to clearly expect input in f_flags format,
and fix the callers to pass that in. Then, have cifs_posix_open call
cifs_convert_posix_flags to do the conversion. Move cifs_posix_open to
file.c as well so we can keep cifs_convert_posix_flags as a static
function.
Fix it also to not ignore O_CREAT, O_EXCL and O_TRUNC, and instead have
cifs_reopen_file mask those bits off before calling cifs_posix_open.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Filesystems aren't really supposed to do anything with a vfsmount. It's
considered a layering violation since vfsmounts are entirely managed at
the VFS layer.
CIFS currently keeps an active reference to a vfsmount in order to
prevent the superblock vanishing before an oplock break has completed.
What we really want to do instead is to keep sb->s_active high until the
oplock break has completed. This patch borrows the scheme that NFS uses
for handling sillyrenames.
An atomic_t is added to the cifs_sb_info. When it transitions from 0 to
1, an extra reference to the superblock is taken (by bumping the
s_active value). When it transitions from 1 to 0, that reference is
dropped and a the superblock teardown may proceed if there are no more
references to it.
Also, the vfsmount pointer is removed from cifsFileInfo and from
cifs_new_fileinfo, and some bogus forward declarations are removed from
cifsfs.h.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifsFileInfo is a bit problematic. It contains a reference back to the
struct file itself. This makes it difficult for a cifsFileInfo to exist
without a corresponding struct file.
It would be better instead of the cifsFileInfo just held info pertaining
to the open file on the server instead without any back refrences to the
struct file. This would allow it to exist after the filp to which it was
originally attached was closed.
Much of the use of the file pointer in this struct is to get at the
dentry. Begin divorcing the cifsFileInfo from the struct file by
keeping a reference to the dentry. Since the dentry will have a
reference to the inode, we can eliminate the "pInode" field too and
convert the igrab/iput to dget/dput.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When we implement multiuser mounts, we'll need to filter filehandles
by fsuid. Add a flag for multiuser mounts and code to filter by
fsuid when it's set.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifsFileInfo needs a pointer to a tcon, but it doesn't currently hold a
reference to it. Change it to keep a pointer to a tcon_link instead and
hold a reference to it.
That will keep the tcon from being freed until the file is closed.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Eventually, we'll need to track the use of tcons on a per-sb basis, so that
we know when it's ok to tear them down. Begin this conversion by adding a
new "tcon_link" struct and accessors that get it. For now, the core data
structures are untouched -- cifs_sb still just points to a single tcon and
the pointers are just cast to deal with the accessor functions. A later
patch will flesh this out.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
To minimize calls to cifs_sb_tcon and to allow for a clear error path if
a tcon can't be acquired.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
At mount time, we'll always need to create a tcon that will serve as a
template for others that are associated with the mount. This tcon is
known as the "master" tcon.
In some cases, we'll need to use that tcon regardless of who's accessing
the mount. Add an accessor function for the master tcon and go ahead and
switch the appropriate places to use it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When we convert cifs to do multiple sessions per mount, we'll need more
than one tcon per superblock. At that point "cifs_sb->tcon" will make
no sense. Add a new accessor function that gets a tcon given a cifs_sb.
For now, it just returns cifs_sb->tcon. Later it'll do more.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Eventually, we'll have more than one tcon per superblock. At that point,
we'll need to know which one is associated with a particular fid. For
now, this is just set from the cifs_sb->tcon pointer, but eventually
the caller of cifs_new_fileinfo will pass a tcon pointer in.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs_new_fileinfo() does not use the 'oplock' value from the callers. Instead,
it sets it to REQ_OPLOCK which seems wrong. We should be using the oplock value
obtained from the Server to set the inode's clientCanCacheAll or
clientCanCacheRead flags. Fix this by passing oplock from the callers to
cifs_new_fileinfo().
This change dates back to commit a6ce4932 (2.6.30-rc3). So, all the affected
versions will need this fix. Please Cc stable once reviewed and accepted.
Cc: Stable <stable@kernel.org>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs has a lot of complicated functions that have to clean up things on
error, but some of them don't have all of the cleanup code
well-consolidated. Clean up and consolidate error handling in several
functions.
This is in preparation of later patches that will need to put references
to the tcon link container.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Get rid of some nesting and add a label we can goto on error.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (55 commits)
workqueue: mark init_workqueues() as early_initcall()
workqueue: explain for_each_*cwq_cpu() iterators
fscache: fix build on !CONFIG_SYSCTL
slow-work: kill it
gfs2: use workqueue instead of slow-work
drm: use workqueue instead of slow-work
cifs: use workqueue instead of slow-work
fscache: drop references to slow-work
fscache: convert operation to use workqueue instead of slow-work
fscache: convert object to use workqueue instead of slow-work
workqueue: fix how cpu number is stored in work->data
workqueue: fix mayday_mask handling on UP
workqueue: fix build problem on !CONFIG_SMP
workqueue: fix locking in retry path of maybe_create_worker()
async: use workqueue for worker pool
workqueue: remove WQ_SINGLE_CPU and use WQ_UNBOUND instead
workqueue: implement unbound workqueue
workqueue: prepare for WQ_UNBOUND implementation
libata: take advantage of cmwq and remove concurrency limitations
workqueue: fix worker management invocation without pending works
...
Fixed up conflicts in fs/cifs/* as per Tejun. Other trivial conflicts in
include/linux/workqueue.h, kernel/trace/Kconfig and kernel/workqueue.c
The recent commit 6ca9f3bae8 modified the code so
that filp is full instantiated whenever the file is created and passed back.
The below comment is no longer true, remove it.
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Workqueue can now handle high concurrency. Use system_nrt_wq
instead of slow-work.
* Updated is_valid_oplock_break() to not call cifs_oplock_break_put()
as advised by Steve French. It might cause deadlock. Instead,
reference is increased after queueing succeeded and
cifs_oplock_break() briefly grabs GlobalSMBSeslock before putting
the cfile to make sure it doesn't put before the matching get is
finished.
* Anton Blanchard reported that cifs conversion was using now gone
system_single_wq. Use system_nrt_wq which provides non-reentrance
guarantee which is enough and much better.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Steve French <sfrench@samba.org>
Cc: Anton Blanchard <anton@samba.org>
The current scheme of sticking open files on a list and assuming that
cifs_open will scoop them off of it is broken and leads to "Busy
inodes after umount..." errors at unmount time.
The problem is that there is no guarantee that cifs_open will always
be called after a ->lookup or ->create operation. If there are
permissions or other problems, then it's quite likely that it *won't*
be called.
Fix this by fully instantiating the filp whenever the file is created
and pass that filp back to the VFS. If there is a problem, the VFS
can clean up the references.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Having cifs_posix_open call cifs_new_fileinfo is problematic and
inconsistent with how "regular" opens work. It's also buggy as
cifs_reopen_file calls this function on a reconnect, which creates a new
struct cifsFileInfo that just gets leaked.
Push it out into the callers. This also allows us to get rid of the
"mnt" arg to cifs_posix_open.
Finally, in the event that a cifsFileInfo isn't or can't be created, we
always want to close the filehandle out on the server as the client
won't have a record of the filehandle and can't actually use it. Make
sure that CIFSSMBClose is called in those cases.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
The uniqueid field sent by the server when unix extensions are enabled
is currently used sometimes when it shouldn't be. The readdir codepath
is correct, but most others are not. Fix it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
..otherwise memory allocation errors go undetected.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The comments make it clear the otherwise subtle behavior of cifs_new_fileinfo().
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
--
fs/cifs/dir.c | 18 ++++++++++++++++--
1 files changed, 16 insertions(+), 2 deletions(-)
Signed-off-by: Steve French <sfrench@us.ibm.com>
While creating a file on a server which supports unix extensions
such as Samba, if a file is being created which does not supply
nameidata (i.e. nd is null), cifs client can oops when calling
cifs_posix_open.
Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Neaten cERROR and cFYI macros, reduce text space
~2.5K
Convert '__FILE__ ": " fmt' to '"%s: " fmt', __FILE__' to save text space
Surround macros with do {} while
Add parentheses to macros
Make statement expression macro from macro with assign
Remove now unnecessary parentheses from cFYI and cERROR uses
defconfig with CIFS support old
$ size fs/cifs/built-in.o
text data bss dec hex filename
156012 1760 148 157920 268e0 fs/cifs/built-in.o
defconfig with CIFS support old
$ size fs/cifs/built-in.o
text data bss dec hex filename
153508 1760 148 155416 25f18 fs/cifs/built-in.o
allyesconfig old:
$ size fs/cifs/built-in.o
text data bss dec hex filename
309138 3864 74824 387826 5eaf2 fs/cifs/built-in.o
allyesconfig new
$ size fs/cifs/built-in.o
text data bss dec hex filename
305655 3864 74824 384343 5dd57 fs/cifs/built-in.o
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs_revalidate is renamed to cifs_revalidate_dentry as a later patch
will add a by-filehandle variant.
Add a new "invalid_mapping" flag to the cifsInodeInfo that indicates
that the pagecache is considered invalid. Add a new routine to check
inode attributes whenever they're updated and set that flag if the inode
has changed on the server.
cifs_revalidate_dentry is then changed to just update the attrcache if
needed and then to zap the pagecache if it's not valid.
There are some other behavior changes in here as well. Open files are
now allowed to have their caches invalidated. I see no reason why we'd
want to keep stale data around just because a file is open. Also,
cifs_revalidate_cache uses the server_eof for revalidating the file
size since that should more closely match the size of the file on the
server.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
While Linux provided an O_SYNC flag basically since day 1, it took until
Linux 2.4.0-test12pre2 to actually get it implemented for filesystems,
since that day we had generic_osync_around with only minor changes and the
great "For now, when the user asks for O_SYNC, we'll actually give
O_DSYNC" comment. This patch intends to actually give us real O_SYNC
semantics in addition to the O_DSYNC semantics. After Jan's O_SYNC
patches which are required before this patch it's actually surprisingly
simple, we just need to figure out when to set the datasync flag to
vfs_fsync_range and when not.
This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's
numerical value to keep binary compatibility, and adds a new real O_SYNC
flag. To guarantee backwards compatiblity it is defined as expanding to
both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
sure we are backwards-compatible when compiled against the new headers.
This also means that all places that don't care about the differences can
just check O_DSYNC and get the right behaviour for O_SYNC, too - only
places that actuall care need to check __O_SYNC in addition. Drivers and
network filesystems have been updated in a fail safe way to always do the
full sync magic if O_DSYNC is set. The few places setting O_SYNC for
lower layers are kept that way for now to stay failsafe.
We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
to make sure we always get these sane options.
Note that parisc really screwed up their headers as they already define a
O_DSYNC that has always been a no-op. We try to repair it by using it for
the new O_DSYNC and redefinining O_SYNC to send both the traditional
O_SYNC numerical value _and_ the O_DSYNC one.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger@sun.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jan Kara <jack@suse.cz>
SMB writes are sent with a starting offset and length. When the server
supports the newer SMB trans2 posix open (rather than using the SMB
NTCreateX) a file can be opened with SMB_O_APPEND flag, and for that
case Samba server assumes that the offset sent in SMBWriteX is unneeded
since the write should go to the end of the file - which can cause
problems if the write was cached (since the beginning part of a
page could be written twice by the client mm). Jeff suggested that
masking the flag on posix open on the client is easiest for the time
being. Note that recent Samba server also had an unrelated problem with
SMB NTCreateX and append (see samba bugzilla bug number 6898) which
should not affect current Linux clients (unless cifs Unix Extensions
are disabled).
The cifs client did not send the O_APPEND flag on posix open
before 2.6.29 so the fix is unneeded on early kernels.
In the future, for the non-cached case (O_DIRECT, and forcedirectio mounts)
it would be possible and useful to send O_APPEND on posix open (for Windows
case: FILE_APPEND_DATA but not FILE_WRITE_DATA on SMB NTCreateX) but for
cached writes although the vfs sets the offset to end of file it
may fragment a write across pages - so we can't send O_APPEND on
open (could result in sending part of a page twice).
CC: Stable <stable@kernel.org>
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Fixes bugzilla.kernel.org bug number 14641
Lookup called during network boot (network root filesystem
for diskless workstation) has case where nd is null in
lookup. This patch fixes that in cifs_lookup.
(Shirish noted that 2.6.30 and 2.6.31 stable need the same check)
Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Vladimir Stavrinov <vs@inist.ru>
CC: Stable <stable@kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
...it does the same thing as cifs_fill_fileinfo, but doesn't handle the
flist ordering correctly. Also rename cifs_fill_fileinfo to a more
descriptive name and have it take an open flags arg instead of just a
write_only flag. That makes the logic in the callers a little simpler.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This is the fourth respin of the patch to convert oplock breaks to
use the slow_work facility.
A customer of ours was testing a backport of one of the earlier
patchsets, and hit a "Busy inodes after umount..." problem. An oplock
break job had raced with a umount, and the superblock got torn down and
its memory reused. When the oplock break job tried to dereference the
inode->i_sb, the kernel oopsed.
This patchset has the oplock break job hold an inode and vfsmount
reference until the oplock break completes. With this, there should be
no need to take a tcon reference (the vfsmount implicitly holds one
already).
Currently, when an oplock break comes in there's a chance that the
oplock break job won't occur if the allocation of the oplock_q_entry
fails. There are also some rather nasty races in the allocation and
handling these structs.
Rather than allocating oplock queue entries when an oplock break comes
in, add a few extra fields to the cifsFileInfo struct. Get rid of the
dedicated cifs_oplock_thread as well and queue the oplock break job to
the slow_work thread pool.
This approach also has the advantage that the oplock break jobs can
potentially run in parallel rather than be serialized like they are
today.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
It's possible that this struct will outlive the filp to which it is
attached. If it does and it needs to do some work on the inode, then
it'll need a reference.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs_posix_open takes a "poplock" argument that's intended to be used in
the actual posix open call to set the "Flags" field. It ignores this
value however and declares an "oplock" parameter on the stack that it
passes uninitialized to the CIFSPOSIXOpen function. Not only does this
mean that the oplock request flags are bogus, but the result that's
expected to be in that variable is unchanged.
Fix this, and also clean up the type of the oplock parameter used. Since
it's expected to be __u32, we should use that everywhere and not
implicitly cast it from a signed type.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, cifs_close() tries to wait until all I/O is complete and then
frees the file private data. If I/O does not completely in a reasonable
amount of time it frees the structure anyway, leaving a potential use-
after-free situation.
This patch changes the wrtPending counter to a complete reference count and
lets the last user free the structure.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: rename CIFSSMBUnixSetInfo to CIFSSMBUnixSetPathInfo
...in preparation of adding a SET_FILE_INFO variant.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: add new cifs_iget function and convert unix codepath to use it
In order to unify some codepaths, introduce a common cifs_fattr struct
for storing inode attributes. The different codepaths (unix, legacy,
normal, etc...) can fill out this struct with inode info. It can then be
passed as an arg to a common set of routines to get and update inodes.
Add a new cifs_iget function that uses iget5_locked to identify inodes.
This will compare inodes based on the uniqueid value in a cifs_fattr
struct.
Rather than filling out an already-created inode, have
cifs_get_inode_info_unix instead fill out cifs_fattr and hand that off
to cifs_iget. cifs_iget can then properly look for hardlinked inodes.
On the readdir side, add a new cifs_readdir_lookup function that spawns
populated dentries. Redefine FILE_UNIX_INFO so that it's basically a
FILE_UNIX_BASIC_INFO that has a few fields wrapped around it. This
allows us to more easily use the same function for filling out the fattr
as the non-readdir codepath.
With this, we should then have proper hardlink detection and can
eventually get rid of some nasty CIFS-specific hacks for handing them.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
FreeXid() along with freeing Xid does add a cifsFYI debug message that
prints rc (return code) as well. In some code paths where we set/return
error code after calling FreeXid(), incorrect error code is being
printed when cifsFYI is enabled.
This could be misleading in few cases. For eg.
In cifs_open() if cifs_fill_filedata() returns a valid pointer to
cifsFileInfo, FreeXid() prints rc=-13 whereas 0 is actually being
returned. Fix this by setting rc before calling FreeXid().
Basically convert
FreeXid(xid); rc = -ERR;
return -ERR; => FreeXid(xid);
return rc;
[Note that Christoph would like to replace the GetXid/FreeXid
calls, which are primarily used for debugging. This seems
like a good longer term goal, but although there is an
alternative tracing facility, there are no examples yet
available that I know of that we can use (yet) to
convert this cifs function entry/exit logging, and for
creating an identifier that we can use to correlate
all dmesg log entries for a particular vfs operation
(ie identify all log entries for a particular vfs
request to cifs: e.g. a particular close or read or write
or byte range lock call ... and just using the thread id
is harder). Eventually when a replacement
for this is available (e.g. when NFS switches over and various
samples to look at in other file systems) we can remove the
GetXid/FreeXid macro but in the meantime multiple people
use this run time configurable logging all the time
for debugging, and Suresh's patch fixes a problem
which made it harder to notice some low
memory problems in the log so it is worthwhile
to fix this problem until a better logging
approach is able to be used]
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Small change (mostly formatting) to limit lookup based open calls to
file create only.
After discussion yesteday on samba-technical about the posix lookup
regression, and looking at a problem with cifs posix open to one
particular Samba version, Jeff and JRA realized that Samba server's
behavior changed in this area (posix open behavior on files vs.
directories). To make this behavior consistent, JRA just made a
fix to Samba server to alter how it handles open of directories (now
returning the equivalent of EISDIR instead of success). Since we don't
know at lookup time whether the inode is a directory or file (and
thus whether posix open will succeed with most current Samba server),
this change avoids the posix open code on lookup open (just issues
posix open on creates). This gets the semantic benefits we want
(atomicity, posix byte range locks, improved write semantics on newly
created files) and file create still is fast, and we avoid the problem
that Jeff noticed yesterday with "openat" (and some open directory
calls) of non-cached directories to one version of Samba server, and
will work with future Samba versions (which include the fix jra just
pushed into Samba server). I confirmed this approach with jra
yesterday and with Shirish today.
Posix open is only called (at lookup time) for file create now.
For opens (rather than creates), because we do not know if it
is a file or directory yet, and current Samba no longer allows
us to do posix open on dirs, we could end up wasting an open call
on what turns out to be a dir. For file opens, we wait to call posix
open till cifs_open. It could be added here (lookup) in the future
but the performance tradeoff of the extra network request when EISDIR
or EACCES is returned would have to be weighed against the 50%
reduction in network traffic in the other paths.
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
CC: Jeremy Allison <jra@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Posix open code was not properly adding the file to the
list of open files. Fix allocating cifsFileInfo
more than once, and adding twice to flist and tlist.
Also fix mode setting to be done in one place in these
paths.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Remove adding open file entry twice to lists in the file
Do not fill file info twice in case of posix opens and creates
Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This patch by utilizing lookup intents, and thus removing a network
roundtrip in the open path, improves performance dramatically on
open (30% or more) to Samba and other servers which support the
cifs posix extensions
Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Jeff made a good point that we should endian convert the UniqueId when we use
it to set i_ino Even though this value is opaque to the client, when comparing
the inode numbers of the same server file from two different clients (one
big endian, one little endian) or when we compare a big endian client's view
of i_ino with what the server thinks - we should get the same value
Signed-off-by: Steve French <sfrench@us.ibm.com>
If the network connection crashes, and we have to reopen files, preferentially
use the newer cifs posix open protocol operation if the server supports it.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Samba server added support for a new posix open/create/mkdir operation
a year or so ago, and we added support to cifs for mkdir to use it,
but had not added the corresponding code to file create.
The following patch helps improve the performance of the cifs create
path (to Samba and servers which support the cifs posix protocol
extensions). Using Connectathon basic test1, with 2000 files, the
performance improved about 15%, and also helped reduce network traffic
(17% fewer SMBs sent over the wire) due to saving a network round trip
for the SetPathInfo on every file create.
It should also help the semantics (and probably the performance) of
write (e.g. when posix byte range locks are on the file) on file
handles opened with posix create, and adds support for a few flags
which would have to be ignored otherwise.
Signed-off-by: Steve French <sfrench@us.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (31 commits)
[CIFS] Remove redundant test
[CIFS] make sure that DFS pathnames are properly formed
Remove an already-checked error condition in SendReceiveBlockingLock
Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition
Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition
[CIFS] Streamline SendReceive[2] by using "goto out:" in an error condition
Slightly streamline SendReceive[2]
Check the return value of cifs_sign_smb[2]
[CIFS] Cleanup: Move the check for too large R/W requests
[CIFS] Slightly simplify wait_for_free_request(), remove an unnecessary "else" branch
Simplify allocate_mid() slightly: Remove some unnecessary "else" branches
[CIFS] In SendReceive, move consistency check out of the mutexed region
cifs: store password in tcon
cifs: have calc_lanman_hash take more granular args
cifs: zero out session password before freeing it
cifs: fix wait_for_response to time out sleeping processes correctly
[CIFS] Can not mount with prefixpath if root directory of share is inaccessible
[CIFS] various minor cleanups pointed out by checkpatch script
[CIFS] fix typo
[CIFS] remove sparse warning
...
Fix trivial conflict in fs/cifs/cifs_fs_sb.h due to comment changes for
the CIFS_MOUNT_xyz bit definitions between cifs updates and security
updates.
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.
Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().
Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Steve French <sfrench@samba.org>
Cc: linux-cifs-client@lists.samba.org
Signed-off-by: James Morris <jmorris@namei.org>
If a server supports unix extensions but does not support POSIX create
routines, then the client will create a new inode with a standard SMB
mkdir or create/open call and then will set the mode. When it does this,
it does not take the setgid bit on the parent directory into account.
This patch has CIFS flip on the setgid bit when the parent directory has
it. If the share is mounted with "setuids" then also change the group
owner to the gid of the parent.
This patch should apply cleanly on top of the setattr cleanup patches
that I sent a few weeks ago.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We'd like to be able to use the unix SET_PATH_INFO_BASIC args to set
file times as well, but that makes the argument list rather long. Bundle
up the args for unix SET_PATH_INFO call into a struct. For now, we don't
actually use the times fields anywhere. That will be done in a follow-on
patch.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When CIFS creates a new inode on a mount without unix extensions, it
temporarily assigns the mode that was passed to it in the create/mkdir
call. Eventually, when the inode is revalidated, it changes to have the
file_mode or dir_mode for the mount. This is confusing to users who
expect that the mode shouldn't change this way. It's also problematic
since only the mode is treated this way, not the uid or gid. Suppose you
have a CIFS mount that's mounted with:
uid=0,gid=0,file_mode=0666,dir_mode=0777
...if an unprivileged user comes along and does this on the mount:
mkdir -m 0700 foo
touch foo/bar
...there is a period of time where the touch will fail, since the dir
will initially be owned by root and have mode 0700. If the user waits
long enough, then "foo" will be revalidated and will get the correct
dir_mode permissions.
This patch changes cifs_mkdir and cifs_create to not overwrite the
mode found by the initial cifs_get_inode_info call after the inode is
created on the server. Legacy behavior can be reenabled with the
new "dynperm" mount option.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
fs/cifs/dir.c: In function 'cifs_ci_compare':
fs/cifs/dir.c:582: warning: passing argument 1 of 'memcpy' discards
qualifiers from pointer target type
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Some versions of Samba (3.2-pre e.g.) are stricter about checking to make sure that
paths in DFS name spaces are sent in the form \\server\share\dir\subdir ...
instead of \dir\subdir
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When creating a directory on a CIFS share without POSIX extensions,
and the given mode has no write bits set, set the ATTR_READONLY bit.
When creating a file, set ATTR_READONLY if the create mode has no write
bits set and we're not using unix extensions.
There are some comments about this being problematic due to the VFS
splitting creates into 2 parts. I'm not sure what that's actually
talking about, but I'm assuming that it has something to do with how
mknod is implemented. In the simple case where we have no unix
extensions and we're just creating a regular file, there's no reason
we can't set ATTR_READONLY.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Shirish Pargaonkar noted:
With cifsacl mount option, when a file is created on the Windows server,
exclusive oplock is broken right away because the get cifs acl code
again opens the file to obtain security descriptor.
The client does not have the newly created file handle or inode in any
of its lists yet so it does not respond to oplock break and server waits for
its duration and then responds to the second open. This slows down file
creation signficantly. The fix is to pass the file descriptor to the get
cifsacl code wherever available so that get cifs acl code does not send
second open (NT Create ANDX) and oplock is not broken.
CC: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Andi Kleen noticed that we were logging access denied errors (which is
noisy in the dmesg log, and not needed to be logged) and that we were
logging path names on that an other errors (e.g. EIO) which we should
not be doing.
CC: Andi Kleen <ak@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
This allows cifs to mount to ipc shares (IPC$)
which will allow user space applications to
layer over authenticated cifs connections
(useful for Wine and others that would want
to put DCE/RPC over CIFS or run CIFS named
pipes)
Acked-by: Rob Shearman <rob@codeweavers.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Previously the only way to do this was to umount all mounts to that server,
turn off a proc setting (/proc/fs/cifs/LinuxExtensionsEnabled).
Fixes Samba bugzilla bug number: 4582 (and also 2008)
Signed-off-by: Steve French <sfrench@us.ibm.com>
nfsd is passing null nameidata (probably the only one doing that)
on call to create - cifs was missing one check for this.
Note that running nfsd over a cifs mount requires specifying fsid on
the nfs exports entry and requires mounting cifs with serverino mount
option.
Signed-off-by: Steve French <sfrench@us.ibm.com>
This patch makes CIFS honour a process' umask like other filesystems.
Of course the server is still free to munge the permissions if it wants
to; but the client will send the "right" permissions to begin with.
A few caveats:
1) It only applies to filesystems that have CAP_UNIX (aka support unix
extensions)
2) It applies the correct mode to the follow up CIFSSMBUnixSetPerms()
after remote creation
When mode to CIFS/NTFS ACL mapping is complete we can do the
same thing for that case for servers which do not
support the Unix Extensions.
Signed-off-by: Matt Keenen <matt@opcode-solutions.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Various coding style problems found by running the new
checkpatch.pl script against fs/cifs. 3 more files
fixed up.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Various coding style problems found by running fs/cifs
against the new checkpatch.pl script. Since there
were too many to fit in one patch. Updated the first
four files.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Originally at http://lkml.org/lkml/2006/9/2/86
The recent change to "allow Windows blocking locks to be cancelled via a
CANCEL_LOCK call" introduced a new semaphore in struct cifsFileInfo,
lock_sem. However, semaphores used as mutexes are deprecated these days,
and there's no reason to add a new one to the kernel. Therefore, convert
lock_sem to a struct mutex (and also fix one indentation glitch on one of
the lines changed anyway).
Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>