Code cleanup.
The code block is for !(*flags & MS_RDONLY). We don't need
to check it again.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We also don't bother to flush free space cache while with free space
tree.
Cc: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
These two BUG_ON()s would never be true, ensured by callers' logic.
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This adds a helper to show directly whether ops require full stripe.
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With this, we can avoid allocating memory for dev replace copies if the
target dev is not available.
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since this part is mostly independent, this moves it to a separate
function.
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As the part of getting extra mirror in __btrfs_map_block is
independent, this puts it into a separate function.
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since DISCARD is not as important as an operation like write, we don't
copy it to target device during replace, and it makes __btrfs_map_block
less complex.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have similar code here and there, this merges them into a helper.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While debugging truncate problems, I found that these tracepoints could
help us quickly know what went wrong.
Two sets of tracepoints are created to track regular/prealloc file item
and inline file item respectively, I put inline as a separate one since
what inline file items cares about are way less than the regular one.
This adds four tracepoints:
- btrfs_get_extent_show_fi_regular
- btrfs_get_extent_show_fi_inline
- btrfs_truncate_show_fi_regular
- btrfs_truncate_show_fi_inline
Cc: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ formatting adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
After 76b42abbf7 ("Btrfs: fix data loss after truncate when using the
no-holes feature"),
For either NO_HOLES or inline extents, we've set last_size to newsize to
avoid data loss after remount or inode got evicted and read again, thus,
we don't need this check anymore.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If your filesystem has, eg, data:raid0 metadata:raid1, and you run "btrfs
balance -dconvert=raid1", the meta.target field will be uninitialized.
That's otherwise ok, as it's unused except for this warning.
Thus, let's use the existing set of raid levels for the comparison.
As a side effect, non-convert balances will now nag about data>metadata.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Otherwise lockdep says:
[ 1337.483798] ================================================
[ 1337.483999] [ BUG: lock held when returning to user space! ]
[ 1337.484252] 4.11.0-rc6 #19 Not tainted
[ 1337.484423] ------------------------------------------------
[ 1337.484626] mount/14766 is leaving the kernel with locks still held!
[ 1337.484841] 1 lock held by mount/14766:
[ 1337.485017] #0: (&type->s_umount_key#33/1){+.+.+.}, at: [<ffffffff8124171f>] sget_userns+0x2af/0x520
Caught by xfstests generic/413 which tried to mount with the unsupported
mount option dax. Then xfstests generic/422 ran sync which deadlocks.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Normal pathname lookup doesn't allow empty pathnames, but using
AT_EMPTY_PATH (with name_to_handle_at() or fstatat(), for example) you
can trigger an empty pathname lookup.
And not only is the RCU lookup in that case entirely unnecessary
(because we'll obviously immediately finalize the end result), it is
actively wrong.
Why? An empth path is a special case that will return the original
'dirfd' dentry - and that dentry may not actually be RCU-free'd,
resulting in a potential use-after-free if we were to initialize the
path lazily under the RCU read lock and depend on complete_walk()
finalizing the dentry.
Found by syzkaller and KASAN.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull btrfs fixes from Chris Mason:
"Dave Sterba collected a few more fixes for the last rc.
These aren't marked for stable, but I'm putting them in with a batch
were testing/sending by hand for this release"
* 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix potential use-after-free for cloned bio
Btrfs: fix segmentation fault when doing dio read
Btrfs: fix invalid dereference in btrfs_retry_endio
btrfs: drop the nossd flag when remounting with -o ssd
Pull more CIFS fixes from Steve French:
"As promised, here is the remaining set of cifs/smb3 fixes for stable
(and a fix for one regression) now that they have had additional
review and testing"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix SMB3 mount without specifying a security mechanism
CIFS: store results of cifs_reopen_file to avoid infinite wait
CIFS: remove bad_network_name flag
CIFS: reconnect thread reschedule itself
CIFS: handle guest access errors to Windows shares
CIFS: Fix null pointer deref during read resp processing
If mmap() maps a file, it can be passed an offset into the file at which
the mapping is to start. Offset could be a negative value when
represented as a loff_t. The offset plus length will be used to update
the file size (i_size) which is also a loff_t.
Validate the value of offset and offset + length to make sure they do
not overflow and appear as negative.
Found by syzcaller with commit ff8c0c53c4 ("mm/hugetlb.c: don't call
region_abort if region_chg fails") applied. Prior to this commit, the
overflow would still occur but we would luckily return ENOMEM.
To reproduce:
mmap(0, 0x2000, 0, 0x40021, 0xffffffffffffffffULL, 0x8000000000000000ULL);
Resulted in,
kernel BUG at mm/hugetlb.c:742!
Call Trace:
hugetlbfs_evict_inode+0x80/0xa0
evict+0x24a/0x620
iput+0x48f/0x8c0
dentry_unlink_inode+0x31f/0x4d0
__dentry_kill+0x292/0x5e0
dput+0x730/0x830
__fput+0x438/0x720
____fput+0x1a/0x20
task_work_run+0xfe/0x180
exit_to_usermode_loop+0x133/0x150
syscall_return_slowpath+0x184/0x1c0
entry_SYSCALL_64_fastpath+0xab/0xad
Fixes: ff8c0c53c4 ("mm/hugetlb.c: don't call region_abort if region_chg fails")
Link: http://lkml.kernel.org/r/1491951118-30678-1-git-send-email-mike.kravetz@oracle.com
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yet another instance of the same race.
Fix is identical to change_huge_pmd().
See "thp: fix MADV_DONTNEED vs. numa balancing race" for more details.
Link: http://lkml.kernel.org/r/20170302151034.27829-5-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit ef65aaede2 ("smb2: Enforce sec= mount option") changed the
behavior of a mount command to enforce a specified security mechanism
during mounting. On another hand according to the spec if SMB3 server
doesn't respond with a security context it implies that it supports
NTLMSSP. The current code doesn't keep it in mind and fails a mount
for such servers if no security mechanism is specified. Fix this by
indicating that a server supports NTLMSSP if a security context isn't
returned during negotiate phase. This allows the code to use NTLMSSP
by default for SMB3 mounts.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
KASAN reports that there is a use-after-free case of bio in btrfs_map_bio.
If we need to submit IOs to several disks at a time, the original bio
would get cloned and mapped to the destination disk, but we really should
use the original bio instead of a cloned bio to do the sanity check
because cloned bios are likely to be freed by its endio.
Reported-by: Diego <diegocg@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 2dabb32484 ("Btrfs: Direct I/O read: Work on sectorsized blocks")
introduced this bug during iterating bio pages in dio read's endio hook,
and it could end up with segment fault of the dio reading task.
So the reason is 'if (nr_sectors--)', and it makes the code assume that
there is one more block in the same page, so page offset is increased and
the bio which is created to repair the bad block then has an incorrect
bvec.bv_offset, and a later access of the page content would throw a
segmentation fault.
This also adds ASSERT to check page offset against page size.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When doing directIO repair, we have this oops:
[ 1458.532816] general protection fault: 0000 [#1] SMP
...
[ 1458.536291] Workqueue: btrfs-endio-repair btrfs_endio_repair_helper [btrfs]
[ 1458.536893] task: ffff88082a42d100 task.stack: ffffc90002b3c000
[ 1458.537499] RIP: 0010:btrfs_retry_endio+0x7e/0x1a0 [btrfs]
...
[ 1458.543261] Call Trace:
[ 1458.543958] ? rcu_read_lock_sched_held+0xc4/0xd0
[ 1458.544374] bio_endio+0xed/0x100
[ 1458.544750] end_workqueue_fn+0x3c/0x40 [btrfs]
[ 1458.545257] normal_work_helper+0x9f/0x900 [btrfs]
[ 1458.545762] btrfs_endio_repair_helper+0x12/0x20 [btrfs]
[ 1458.546224] process_one_work+0x34d/0xb70
[ 1458.546570] ? process_one_work+0x29e/0xb70
[ 1458.546938] worker_thread+0x1cf/0x960
[ 1458.547263] ? process_one_work+0xb70/0xb70
[ 1458.547624] kthread+0x17d/0x180
[ 1458.547909] ? kthread_create_on_node+0x70/0x70
[ 1458.548300] ret_from_fork+0x31/0x40
It turns out that btrfs_retry_endio is trying to get inode from a directIO
page.
This fixes the problem by using the saved inode pointer, done->inode.
btrfs_retry_endio_nocsum has the same problem, and it's fixed as well.
Also cleanup unused @start (which is too trivial for a separate patch).
Cc: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The opposite case was already handled right in the very next switch entry.
And also when turning on nossd, drop ssd_spread.
Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This fixes Continuous Availability when errors during
file reopen are encountered.
cifs_user_readv and cifs_user_writev would wait for ever if
results of cifs_reopen_file are not stored and for later inspection.
In fact, results are checked and, in case of errors, a chain
of function calls leading to reads and writes to be scheduled in
a separate thread is skipped.
These threads will wake up the corresponding waiters once reads
and writes are done.
However, given the return value is not stored, when rc is checked
for errors a previous one (always zero) is inspected instead.
This leads to pending reads/writes added to the list, making
cifs_user_readv and cifs_user_writev wait for ever.
Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
STATUS_BAD_NETWORK_NAME can be received during node failover,
causing the flag to be set and making the reconnect thread
always unsuccessful, thereafter.
Once the only place where it is set is removed, the remaining
bits are rendered moot.
Removing it does not prevent "mount" from failing when a non
existent share is passed.
What happens when the share really ceases to exist while the
share is mounted is undefined now as much as it was before.
Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
In case of error, smb2_reconnect_server reschedule itself
with a delay, to avoid being too aggressive.
Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Commit 1a967d6c9b ("correctly to
anonymous authentication for the NTLM(v2) authentication") introduces
a regression in handling errors related to attempting a guest
connection to a Windows share which requires authentication. This
should result in a permission denied error but actually causes the
kernel module to enter a never-ending loop trying to follow a DFS
referal which doesn't exist.
The base cause of this is the failure now occurs later in the process
during tree connect and not at the session setup setup and all errors
in tree connect are interpreted as needing to follow the DFS paths
which isn't in this case correct. So, check the returned error against
EACCES and fail if this is returned error.
Feedback from Aurelien:
PS> net user guest /activate:no
PS> mkdir C:\guestshare
PS> icacls C:\guestshare /grant 'Everyone:(OI)(CI)F'
PS> new-smbshare -name guestshare -path C:\guestshare -fullaccess Everyone
I've tested v3.10, v4.4, master, master+your patch using default options
(empty or no user "NU") and user=abc (U).
NT_LOGON_FAILURE in session setup: LF
This is what you seem to have in 3.10.
NT_ACCESS_DENIED in tree connect to the share: AD
This is what you get before your infinite loop.
| NU U
--------------------------------
3.10 | LF LF
4.4 | LF LF
master | AD LF
master+patch | AD LF
No infinite DFS loop :(
All these issues result in mount failing very fast with permission denied.
I guess it could be from either the Windows version or the share/folder
ACL. A deeper analysis of the packets might reveal more.
In any case I did not notice any issues for on a basic DFS setup with
the patch so I don't think it introduced any regressions, which is
probably all that matters. It still bothers me a little I couldn't hit
the bug.
I've included kernel output w/ debugging output and network capture of
my tests if anyone want to have a look at it. (master+patch = ml-guestfix).
Signed-off-by: Mark Syms <mark.syms@citrix.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Tested-by: Aurelien Aptel <aaptel@suse.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Currently during receiving a read response mid->resp_buf can be
NULL when it is being passed to cifs_discard_remaining_data() from
cifs_readv_discard(). Fix it by always passing server->smallbuf
instead and initializing mid->resp_buf at the end of read response
processing.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Pull CIFS fixes from Steve French:
"This is a set of CIFS/SMB3 fixes for stable.
There is another set of four SMB3 reconnect fixes for stable in
progress but they are still being reviewed/tested, so didn't want to
wait any longer to send these five below"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
Reset TreeId to zero on SMB2 TREE_CONNECT
CIFS: Fix build failure with smb2
Introduce cifs_copy_file_range()
SMB3: Rename clone_range to copychunk_range
Handle mismatched open calls
Here are 3 small fixes for 4.11-rc6. One resolves a reported issue with
sysfs files that NeilBrown found, one is a documenatation fix for the
stable kernel rules, and the last is a small MAINTAINERS file update for
kernfs.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWOnrMw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk/JQCfQKjOpGDAR9Hs6u4YQ4hJrAHFneYAn1F4MLDW
3b0ZMnlZHkDq834UwKnB
=iiei
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are 3 small fixes for 4.11-rc6.
One resolves a reported issue with sysfs files that NeilBrown found,
one is a documenatation fix for the stable kernel rules, and the last
is a small MAINTAINERS file update for kernfs"
* tag 'driver-core-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
MAINTAINERS: separate out kernfs maintainership
sysfs: be careful of error returns from ops->show()
Documentation: stable-kernel-rules: fix stable-tag format
Pull VFS fixes from Al Viro:
"statx followup fixes and a fix for stack-smashing on alpha"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
alpha: fix stack smashing in old_adjtimex(2)
statx: Include a mask for stx_attributes in struct statx
statx: Reserve the top bit of the mask for future struct expansion
xfs: report crtime and attribute flags to statx
ext4: Add statx support
statx: optimize copy of struct statx to userspace
statx: remove incorrect part of vfs_statx() comment
statx: reject unknown flags when using NULL path
Documentation/filesystems: fix documentation for ->getattr()
ops->show() can return a negative error code.
Commit 65da3484d9 ("sysfs: correctly handle short reads on PREALLOC attrs.")
(in v4.4) caused this to be stored in an unsigned 'size_t' variable, so errors
would look like large numbers.
As a result, if an error is returned, sysfs_kf_read() will return the
value of 'count', typically 4096.
Commit 17d0774f80 ("sysfs: correctly handle read offset on PREALLOC attrs")
(in v4.8) extended this error to use the unsigned large 'len' as a size for
memmove().
Consequently, if ->show returns an error, then the first read() on the
sysfs file will return 4096 and could return uninitialized memory to
user-space.
If the application performs a subsequent read, this will trigger a memmove()
with extremely large count, and is likely to crash the machine is bizarre ways.
This bug can currently only be triggered by reading from an md
sysfs attribute declared with __ATTR_PREALLOC() during the
brief period between when mddev_put() deletes an mddev from
the ->all_mddevs list, and when mddev_delayed_delete() - which is
scheduled on a workqueue - completes.
Before this, an error won't be returned by the ->show()
After this, the ->show() won't be called.
I can reproduce it reliably only by putting delay like
usleep_range(500000,700000);
early in mddev_delayed_delete(). Then after creating an
md device md0 run
echo clear > /sys/block/md0/md/array_state; cat /sys/block/md0/md/array_state
The bug can be triggered without the usleep.
Fixes: 65da3484d9 ("sysfs: correctly handle short reads on PREALLOC attrs.")
Fixes: 17d0774f80 ("sysfs: correctly handle read offset on PREALLOC attrs")
Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge misc fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: move pcp and lru-pcp draining into single wq
mailmap: update Yakir Yang email address
mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()
dax: fix radix tree insertion race
mm, thp: fix setting of defer+madvise thp defrag mode
ptrace: fix PTRACE_LISTEN race corrupting task->state
vmlinux.lds: add missing VMLINUX_SYMBOL macros
mm/page_alloc.c: fix print order in show_free_areas()
userfaultfd: report actual registered features in fdinfo
mm: fix page_vma_mapped_walk() for ksm pages
While running generic/340 in my test setup I hit the following race. It
can happen with kernels that support FS DAX PMDs, so v4.10 thru
v4.11-rc5.
Thread 1 Thread 2
-------- --------
dax_iomap_pmd_fault()
grab_mapping_entry()
spin_lock_irq()
get_unlocked_mapping_entry()
'entry' is NULL, can't call lock_slot()
spin_unlock_irq()
radix_tree_preload()
dax_iomap_pmd_fault()
grab_mapping_entry()
spin_lock_irq()
get_unlocked_mapping_entry()
...
lock_slot()
spin_unlock_irq()
dax_pmd_insert_mapping()
<inserts a PMD mapping>
spin_lock_irq()
__radix_tree_insert() fails with -EEXIST
<fall back to 4k fault, and die horribly
when inserting a 4k entry where a PMD exists>
The issue is that we have to drop mapping->tree_lock while calling
radix_tree_preload(), but since we didn't have a radix tree entry to
lock (unlike in the pmd_downgrade case) we have no protection against
Thread 2 coming along and inserting a PMD at the same index. For 4k
entries we handled this with a special-case response to -EEXIST coming
from the __radix_tree_insert(), but this doesn't save us for PMDs
because the -EEXIST case can also mean that we collided with a 4k entry
in the radix tree at a different index, but one that is covered by our
PMD range.
So, correctly handle both the 4k and 2M collision cases by explicitly
re-checking the radix tree for an entry at our index once we reacquire
mapping->tree_lock.
This patch has made it through a clean xfstests run with the current
v4.11-rc5 based linux/master, and it also ran generic/340 500 times in a
loop. It used to fail within the first 10 iterations.
Link: http://lkml.kernel.org/r/20170406212944.2866-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org> [4.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>