Our gluster boxes were hitting a problem where they'd run out of space when
updating the block group cache and therefore wouldn't be able to update the free
space inode. This is a problem because this is how we invalidate the cache and
protect ourselves from errors further down the stack, so if this fails we have
to abort the transaction so we make sure we don't end up with stale free space
cache. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
We can have multiple fsync operations against the same file during the
same transaction and they can collect the same ordered extents while they
don't complete (still accessible from the inode's ordered tree). If this
happens, those ordered extents will never get their reference counts
decremented to 0, leading to memory leaks and inode leaks (an iput for an
ordered extent's inode is scheduled only when the ordered extent's refcount
drops to 0). The following sequence diagram explains this race:
CPU 1 CPU 2
btrfs_sync_file()
btrfs_sync_file()
mutex_lock(inode->i_mutex)
btrfs_log_inode()
btrfs_get_logged_extents()
--> collects ordered extent X
--> increments ordered
extent X's refcount
btrfs_submit_logged_extents()
mutex_unlock(inode->i_mutex)
mutex_lock(inode->i_mutex)
btrfs_sync_log()
btrfs_wait_logged_extents()
--> list_del_init(&ordered->log_list)
btrfs_log_inode()
btrfs_get_logged_extents()
--> Adds ordered extent X
to logged_list because
at this point:
list_empty(&ordered->log_list)
&& test_bit(BTRFS_ORDERED_LOGGED,
&ordered->flags) == 0
--> Increments ordered extent
X's refcount
--> check if ordered extent's io is
finished or not, start it if
necessary and wait for it to finish
--> sets bit BTRFS_ORDERED_LOGGED
on ordered extent X's flags
and adds it to trans->ordered
btrfs_sync_log() finishes
btrfs_submit_logged_extents()
btrfs_log_inode() finishes
mutex_unlock(inode->i_mutex)
btrfs_sync_file() finishes
btrfs_sync_log()
btrfs_wait_logged_extents()
--> Sees ordered extent X has the
bit BTRFS_ORDERED_LOGGED set in
its flags
--> X's refcount is untouched
btrfs_sync_log() finishes
btrfs_sync_file() finishes
btrfs_commit_transaction()
--> called by transaction kthread for e.g.
btrfs_wait_pending_ordered()
--> waits for ordered extent X to
complete
--> decrements ordered extent X's
refcount by 1 only, corresponding
to the increment done by the fsync
task ran by CPU 1
In the scenario of the above diagram, after the transaction commit,
the ordered extent will remain with a refcount of 1 forever, leaking
the ordered extent structure and preventing the i_count of its inode
from ever decreasing to 0, since the delayed iput is scheduled only
when the ordered extent's refcount drops to 0, preventing the inode
from ever being evicted by the VFS.
Fix this by using the flag BTRFS_ORDERED_LOGGED differently. Use it to
mean that an ordered extent is already being processed by an fsync call,
which will attach it to the current transaction, preventing it from being
collected by subsequent fsync operations against the same inode.
This race was introduced with the following change (added in 3.19 and
backported to stable 3.18 and 3.17):
Btrfs: make sure logged extents complete in the current transaction V3
commit 50d9aa99bd
I ran into this issue while running xfstests/generic/113 in a loop, which
failed about 1 out of 10 runs with the following warning in dmesg:
[ 2612.440038] WARNING: CPU: 4 PID: 22057 at fs/btrfs/disk-io.c:3558 free_fs_root+0x36/0x133 [btrfs]()
[ 2612.442810] Modules linked in: btrfs crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop processor parport_pc parport psmouse therma
l_sys i2c_piix4 serio_raw pcspkr evdev microcode button i2c_core ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom virtio_scsi ata_generic virtio_pci ata_piix virtio_ring libata virtio flo
ppy e1000 scsi_mod [last unloaded: btrfs]
[ 2612.452711] CPU: 4 PID: 22057 Comm: umount Tainted: G W 3.19.0-rc5-btrfs-next-4+ #1
[ 2612.454921] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 2612.457709] 0000000000000009 ffff8801342c3c78 ffffffff8142425e ffff88023ec8f2d8
[ 2612.459829] 0000000000000000 ffff8801342c3cb8 ffffffff81045308 ffff880046460000
[ 2612.461564] ffffffffa036da56 ffff88003d07b000 ffff880046460000 ffff880046460068
[ 2612.463163] Call Trace:
[ 2612.463719] [<ffffffff8142425e>] dump_stack+0x4c/0x65
[ 2612.464789] [<ffffffff81045308>] warn_slowpath_common+0xa1/0xbb
[ 2612.466026] [<ffffffffa036da56>] ? free_fs_root+0x36/0x133 [btrfs]
[ 2612.467247] [<ffffffff810453c5>] warn_slowpath_null+0x1a/0x1c
[ 2612.468416] [<ffffffffa036da56>] free_fs_root+0x36/0x133 [btrfs]
[ 2612.469625] [<ffffffffa036f2a7>] btrfs_drop_and_free_fs_root+0x93/0x9b [btrfs]
[ 2612.471251] [<ffffffffa036f353>] btrfs_free_fs_roots+0xa4/0xd6 [btrfs]
[ 2612.472536] [<ffffffff8142612e>] ? wait_for_completion+0x24/0x26
[ 2612.473742] [<ffffffffa0370bbc>] close_ctree+0x1f3/0x33c [btrfs]
[ 2612.475477] [<ffffffff81059d1d>] ? destroy_workqueue+0x148/0x1ba
[ 2612.476695] [<ffffffffa034e3da>] btrfs_put_super+0x19/0x1b [btrfs]
[ 2612.477911] [<ffffffff81153e53>] generic_shutdown_super+0x73/0xef
[ 2612.479106] [<ffffffff811540e2>] kill_anon_super+0x13/0x1e
[ 2612.480226] [<ffffffffa034e1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
[ 2612.481471] [<ffffffff81154307>] deactivate_locked_super+0x3b/0x50
[ 2612.482686] [<ffffffff811547a7>] deactivate_super+0x3f/0x43
[ 2612.483791] [<ffffffff8116b3ed>] cleanup_mnt+0x59/0x78
[ 2612.484842] [<ffffffff8116b44c>] __cleanup_mnt+0x12/0x14
[ 2612.485900] [<ffffffff8105d019>] task_work_run+0x8f/0xbc
[ 2612.486960] [<ffffffff810028d8>] do_notify_resume+0x5a/0x6b
[ 2612.488083] [<ffffffff81236e5b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 2612.489333] [<ffffffff8142a17f>] int_signal+0x12/0x17
[ 2612.490353] ---[ end trace 54a960a6bdcb8d93 ]---
[ 2612.557253] VFS: Busy inodes after unmount of sdb. Self-destruct in 5 seconds. Have a nice day...
Kmemleak confirmed the ordered extent leak (and btrfs inode specific
structures such as delayed nodes):
$ cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff880154290db0 (size 576):
comm "btrfsck", pid 21980, jiffies 4295542503 (age 1273.412s)
hex dump (first 32 bytes):
01 40 00 00 01 00 00 00 b0 1d f1 4e 01 88 ff ff .@.........N....
00 00 00 00 00 00 00 00 c8 0d 29 54 01 88 ff ff ..........)T....
backtrace:
[<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a
[<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83
[<ffffffff8122fb26>] __radix_tree_create+0x109/0x190
[<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac
[<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs]
[<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs]
[<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs]
[<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs]
[<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs]
[<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed
[<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa
[<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b
[<ffffffff81429e92>] system_call_fastpath+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff88014ef11db0 (size 576):
comm "rm", pid 22009, jiffies 4295542593 (age 1273.052s)
hex dump (first 32 bytes):
02 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 c8 1d f1 4e 01 88 ff ff ...........N....
backtrace:
[<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a
[<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83
[<ffffffff8122fb26>] __radix_tree_create+0x109/0x190
[<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac
[<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs]
[<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs]
[<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs]
[<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs]
[<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs]
[<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed
[<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa
[<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b
[<ffffffff81429e92>] system_call_fastpath+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff8800336feda8 (size 584):
comm "aio-stress", pid 22031, jiffies 4295543006 (age 1271.400s)
hex dump (first 32 bytes):
00 40 3e 00 00 00 00 00 00 00 8f 42 00 00 00 00 .@>........B....
00 00 01 00 00 00 00 00 00 00 01 00 00 00 00 00 ................
backtrace:
[<ffffffff8114eb34>] create_object+0x172/0x29a
[<ffffffff8141d790>] kmemleak_alloc+0x25/0x41
[<ffffffff81141ae6>] kmemleak_alloc_recursive.constprop.52+0x16/0x18
[<ffffffff81145288>] kmem_cache_alloc+0xf7/0x198
[<ffffffffa0389243>] __btrfs_add_ordered_extent+0x43/0x309 [btrfs]
[<ffffffffa038968b>] btrfs_add_ordered_extent_dio+0x12/0x14 [btrfs]
[<ffffffffa03810e2>] btrfs_get_blocks_direct+0x3ef/0x571 [btrfs]
[<ffffffff81181349>] do_blockdev_direct_IO+0x62a/0xb47
[<ffffffff8118189a>] __blockdev_direct_IO+0x34/0x36
[<ffffffffa03776e5>] btrfs_direct_IO+0x16a/0x1e8 [btrfs]
[<ffffffff81100373>] generic_file_direct_write+0xb8/0x12d
[<ffffffffa038615c>] btrfs_file_write_iter+0x24b/0x42f [btrfs]
[<ffffffff8118bb0d>] aio_run_iocb+0x2b7/0x32e
[<ffffffff8118c99a>] do_io_submit+0x26e/0x2ff
[<ffffffff8118ca3b>] SyS_io_submit+0x10/0x12
[<ffffffff81429e92>] system_call_fastpath+0x12/0x17
CC: <stable@vger.kernel.org> # 3.19, 3.18 and 3.17
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
In commit b4eef9b36d, we started to use hwapic_isr_update() != NULL
instead of kvm_apic_vid_enabled(vcpu->kvm). This didn't work because
SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
change apic->isr_count, because it should always be 1. The initial
value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
which is always 0 for SVM, so KVM could have injected interrupts when it
shouldn't.
Fix it by implicitly setting SVM's hwapic_isr_update to NULL and make the
initial isr_count depend on hwapic_isr_update() for good measure.
Fixes: b4eef9b36d ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv")
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
- fix a read-balance problem that was reported 2 years ago, but
that I never noticed the report :-(
- fix for rare RAID6 problem causing incorrect bitmap updates when
two devices fail.
- add __ATTR_PREALLOC annotation now that it is possible.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIVAwUAVPOlKznsnt1WYoG5AQJKlw//TXHI4MFB/3Zy0ncbHMEpKwgyuTYD0kCM
lpsQGowAaqKdUfmXxhtLjSgQXmxpUf/q200EKUr81nV/v+HQTraC91ZmyHNvUPaB
4+blSoEDqF/spo6rlbEXw6ByWAcaO6w3SVDLDci4rXoQoqzmGPzzjD4zqr485j61
xRk4cV0zDVpdzp7OX+bR/fCt3A0ELbAXi22E+8U6NXnYwQPb3vIYNydcjQEPEpKk
nLpQRoinz+XpnidUneuFO2/3Lgax5bsgK3ruxxgTUWrlF2weCD5+3g1S2FQrqZFp
d+FyEgyv5hGgpg6mqGRvERIrzlwkqdaZAhP0haC82ZhOR5VnZFR2KS+1sACDR3jQ
0QSR7IX8opTgvZaepNdjRAp2W4/zYnhIceMwgi9TPHWiTTT3xW7KW99kj5DdxiCg
21i/SHXuTnw//rlNfE663wwtuBnyCEDeTCmjUNBJ0Nset+Cnc4wq6pdvt8Wzxh/a
rGuTkD9eTQ3oR33hfJD2iUAQKYfvdr2u9zun8TzBwe50zTS+MTd3+k1xYNwcUC8z
LfUarTLlv59L8anBhNoBzGMhZa62jqqz1Tvj3EI5u/sXbDqtzZhixhoafpsWmBnA
8h2YyvVU4q3Oxalaqk2gEufscAtD8bAHzbbHKdd9HYLWnyoiWCYydN1QAUWKvfWP
ycs7YftfNDM=
=CaGN
-----END PGP SIGNATURE-----
Merge tag 'md/4.0-fixes' of git://neil.brown.name/md
Pull md fixes from Neil Brown:
"Three md fixes:
- fix a read-balance problem that was reported 2 years ago, but that
I never noticed the report :-(
- fix for rare RAID6 problem causing incorrect bitmap updates when
two devices fail.
- add __ATTR_PREALLOC annotation now that it is possible"
* tag 'md/4.0-fixes' of git://neil.brown.name/md:
md: mark some attributes as pre-alloc
raid5: check faulty flag for array status during recovery.
md/raid1: fix read balance when a drive is write-mostly.
This is just a single patch to fix the KSTK_EIP() and KSTK_ESP() macros
for metag which have always been erronously returning the PC and stack
pointer of the task's kernel context rather than from its user context
saved at entry from userland into the kernel, which affects the contents
of /proc/<pid>/maps and /proc/<pid>/stat.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJU9EnnAAoJEGwLaZPeOHZ67uYQALvIb6R/Webca0hEPuHa95ic
4mfKzj7iD4DZTBLsuYyq9+NIt2g24mCd5vbDLfH63PpEnRwjdt7y+n4xhRVoQS87
ZC2aLx3Ry6NC7ByGhEq0n4aTOaKYffe9y4XE8FZddn/9rZUIE2sEdXGtyrg6rsYf
Eb/e/senPBRPNT8LpurSmYcPsCB2q2yo+0503aj41VjCjPbYe92/QrIDU8Ag3R5y
c5C0btD9NOcB4xt/vIGU7H0OH85Q+OvLHBzu/5aVFyPelPtIE4xpYP1fRyd/P002
Jmm6KH52ILMArgqB3KavKMvCebQBwwf92LLUtQ5ZhdeX9TMYzgG22P3CmZcS49Ha
xwkIgDbeI1BQeMoVgTgVRMDnAOXmF/HdzxlbILHonaptiHDEOj3izdWpfubrIGi7
9/69L/hF3DY5udt8qBQ4fWDJrvBYQpoqyUEiv/eFfyhFpVaCxKQ0YQgWto3UujWG
7ESNkNp3kTTlo4NeUh47x1TE0CBiNHAGU+r72Uysb/u9N3Aya8b/jy8x4wCBemLs
vHL3bfgg7Pee067/O+w9GTQoe7ldzifcSrTGV3s7wpUqKPBGUdq4MtPaXDvJqk/W
uqnjoH1+/juvBpjwNwoavCXAO5CI6j19kKQH9iCc3v3YizSRtCG4VwfnVd2HgSc4
LtUrSZkfkwQvBn1oc4mg
=1+sf
-----END PGP SIGNATURE-----
Merge tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
Pull arch/metag fix from James Hogan:
"This is just a single patch to fix the KSTK_EIP() and KSTK_ESP()
macros for metag which have always been erronously returning the PC
and stack pointer of the task's kernel context rather than from its
user context saved at entry from userland into the kernel, which
affects the contents of /proc/<pid>/maps and /proc/<pid>/stat"
* tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
metag: Fix KSTK_EIP() and KSTK_ESP() macros
Move the fallback code path in cpuidle_idle_call() to the end of the
function to avoid jumping to a label in an if () branch.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Or Gerlitz says:
====================
Mellanox driver fixes
Two small fixes, please apply to net.
Both patches should go to 3.19.y too.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets which are sent from the selftest (ethtool) flow,
should not be passed to GRO stack but rather dropped by
the driver after validation. To achieve that, we disable
GRO for the duration of the selftest.
Fixes: dd65beac48 ("net/mlx4_en: Extend usage of napi_gro_frags")
Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bit mask for currently supported driver features (MLX4_UPDATE_QP_SUPPORTED_ATTRS)
of the update-qp command was defined twice (using enum value and pre-processor
define directive) and wrong.
The return value of the call to mlx4_update_qp() from within the SRIOV
resource-tracker was wrongly voided down.
Fix both issues.
issue: none
Fixes: 09e05c3f78 ('net/mlx4: Set vlan stripping policy by the right command')
Fixes: ce8d9e0d67 ('net/mlx4_core: Add UPDATE_QP SRIOV wrapper support')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During system reboot, the sh-dma-engine device may be runtime-suspended,
causing a crash:
Unhandled fault: imprecise external abort (0x1406) at 0x0002c02c
Internal error: : 1406 [#1] SMP ARM
...
PC is at sh_dmae_ctl_stop+0x28/0x64
LR is at sh_dmae_ctl_stop+0x24/0x64
If the sh-dma-engine is runtime-suspended, its module clock is turned
off, and its registers cannot be accessed.
To fix this, move the call to sh_dmae_ctl_stop(), which touches the
DMAOR register, to the sh_dmae_suspend() and sh_dmae_runtime_suspend()
callbacks. This makes PM operations more symmetric, as both
sh_dmae_resume() and sh_dmae_runtime_resume() already call sh_dmae_rst()
to re-initialize the DMAOR register.
Remove sh_dmae_shutdown(), as it became empty.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Atmel based boards can now only be used with device tree. Drop non DT
initialization.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
and a type bug that could lead to integer overflow - Ivan Khoronzhuk
* Fix boundary checking in efi_high_alloc() which can lead to memory
corruption in the EFI boot stubs - Yinghai Lu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU9FtlAAoJEC84WcCNIz1VjfsP/jnZPtkSapSsFP9c7AfV/vpg
i4PLGk+18QhXpNrCVC1U4sdx3y+zefqImrDNEv72BLX6YDb10RvtydxEy4Kg2aaE
XzCRinHWu3+IEwv4fKAmNj2HORTl+jn79JDZ97jm1PN5sOxVcRG9e3QBg6aTVhHr
MdTXRMAKHYD+ZX5hrCMrbFXi1dboxVsUb1zwMTbJcmPSVPWToqNKCruSwp29LNfP
/2ZsJJSHgFP3tobk37JHDTHxjXaN/GUIwQC9cIWUQMPiwU3+WeOvROBPeKUTFNv7
kS4CXY5Q6eKz+pWYqG+FhbfHM71GTWPyFEJNeLtALg2DSKbgL6lJbtkrPpBVXrcU
TeHlHnYTlqEpcMqHW3JtrVb0Of0/8X/9YfWjpmdxNcNbbp7KvzTtoBcP8MjGdbIq
CztyB4clFsiyy1bEoGHFTVArzch5nn7sRCL3mYhTNQaeyN6TZc0wMXOFF/JU7N5a
GCn9VO6T396L/7WdzG0B/Uo01xw11OS/R0jZVoDvtGfAregO+NU+yLunTEYaRtkC
prxQ62Bu21EjLKJcdr/toFkEG8sT08XJnGTixRJnJlw+hmsK8WaigBrdpirXT5SV
TDJJNyo6A/drfjcPoTI4lCR1CpPV3QXjCTmhh+K6tbvX5/npuWN/i4KJh54WuwT4
BKouS5gjrgYcHH/XJjsQ
=GJnM
-----END PGP SIGNATURE-----
Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent
Pull EFI fixes from Matt Fleming:
" - Fix regression in DMI sysfs code for handling "End of Table" entry
and a type bug that could lead to integer overflow. (Ivan Khoronzhuk)
- Fix boundary checking in efi_high_alloc() which can lead to memory
corruption in the EFI boot stubs. (Yinghai Lu)"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds entry for SAMSUNG THERMAL DRIVER in the MAINTAINERS file.
It has been agreed, that pull request are going to be sent to Eduardo
Valentin.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Commit: e725d26c48 provided possibility to
use device tree to asses if cpu can be used as cooling device. Since the
code was somewhat awkward, simpler approach has been proposed.
Test HW: Exynos 4412 - Odroid U3.
Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
This patch fixes the wrong control of PD_DET_EN (power down detection mode)
for Exynos7 because exynos7_tmu_control() always enables the power down detection
mode regardless 'on' parameter.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Abhilash Kesavan <a.kesavan@samsung.com>
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
The ch341_set_baudrate() function initialize the device baud speed
according to the value on priv->baud_rate. By default the ch341_open() set
it to a hardcoded value (DEFAULT_BAUD_RATE 9600). Unfortunately, the
tty_struct is not initialized with the same default value. (usually 56700)
This means that the tty_struct and the device baud rate generator are not
synchronized after opening the port.
Fixup is done by calling ch341_set_termios() if tty exist.
Remove unnecessary variable priv->baud_rate setup as it's already done by
ch341_port_probe().
Remove unnecessary call to ch341_set_{handshake,baudrate}() in
ch341_open() as there already called in ch341_configure() and
ch341_set_termios()
Signed-off-by: Nicolas PLANEL <nicolas.planel@enovance.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
If the server does not return a valid set of attributes that we can
use to either create a file or refresh the inode, then there is no
value in calling nfs_prime_dcache().
However if we're just refreshing the inode using the attributes that
the server returned, then it shouldn't matter whether or not we have
a filehandle, as long as we check the fsid+fileid combination.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
When we call readdirplus, set the fileid normally returned by readdir
as the mounted-on-fileid, since that is commonly the case if there is
a mountpoint. To ensure that we get it right, we only set the flag if
the readdir fileid differs from the one returned in the readdirplus
attributes.
This again means that we can avoid the issues described in commit
2ef47eb1ae ("NFS: Fix use of nfs_attr_use_mounted_on_fileid()"),
which only fixed NFSv4.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If we're traversing a directory which contains a submounted filesystem,
or one that has a referral, the NFS server that is processing the READDIR
request will often return information for the underlying (mounted-on)
directory. It may, or may not, also return filehandle information.
If this happens, and the lookup in nfs_prime_dcache() returns the
dentry for the submounted directory, the filehandle comparison will
fail, and we call d_invalidate(). Post-commit 8ed936b567
("vfs: Lazily remove mounts on unlinked files and directories."), this
means the entire subtree is unmounted.
The following minimal patch addresses this problem by punting on
the invalidation if there is a submount.
Kudos to Neil Brown <neilb@suse.de> for having tracked down this
issue (see link).
Reported-by: Nix <nix@esperi.org.uk>
Link: http://lkml.kernel.org/r/87iofju9ht.fsf@spindle.srvr.nix
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Ensure that we don't regress the changes that were made to the
directory.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
nfs_post_op_update_inode() is called after a self-induced attribute
update. Ensure that it also sets the barrier.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Prior to this patch, we used to always OK attribute updates that extended
the file size on the assumption that we might be performing writeback.
Now that we have attribute barriers to protect the writeback related updates,
we should remove this hack, as it can cause truncate() operations to
apparently be reverted if/when a readahead or getattr RPC call races
with our on-the-wire SETATTR.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Ensure that other operations that race with delegreturn and layoutcommit
cannot revert the attribute updates that were made on the server.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Ensure that other operations that race with our write RPC calls
cannot revert the file size updates that were made on the server.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Ensure that we update the attribute barrier even if there were no
invalidations, provided that this value is newer than the old one.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Ensure that other operations which raced with our setattr RPC call
cannot revert the file attribute changes that were made on the server.
To do so, we artificially bump the attribute generation counter on
the inode so that all calls to nfs_fattr_init() that precede ours
will be dropped.
The motivation for the patch came from Chuck Lever's reports of readaheads
racing with truncate operations and causing the file size to be reverted.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
The O_DIRECT code will grab the inode->i_mutex and flush out buffered
writes, before scheduling a read or a write. However there is no
equivalent in the buffered write code to wait for O_DIRECT to complete.
Fixes a reported issue in xfstests generic/133, when first performing an
O_DIRECT write followed by a buffered write.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Set the internal device state to to disabled after hardware reset in stop flow.
This will cover cases when driver was not brought to disabled state because of
an error and in stop flow we wish not to retry the reset.
Cc: <stable@vger.kernel.org> #3.10+
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reading of analog input channels by the `INSN_READ` comedi instruction
is broken for all except channel 0. `pci171x_ai_insn_read()` calls
`pci171x_ai_read_sample()` with the wrong value for the third parameter.
It is supposed to be the current index in a channel list (which is
always of length 1 in this case, so the index should be 0), but instead
it is passing the actual channel number. `pci171x_ai_read_sample()`
checks the channel number encoded in the raw sample value read from the
hardware matches the channel number stored in the specified index of the
previously set up channel list and returns `-ENODATA` if it doesn't
match. Since the index should always be 0 in this case, the match will
fail unless the channel number is also 0. Fix it by passing 0 as the
channel index.
Note that when the bug first appeared, it was `pci171x_ai_dropout()`
that was called with the wrong parameter value. `pci171x_ai_dropout()`
got replaced with `pci171x_ai_read_sample()` in commit 7fd2dae250
("staging: comedi: adv_pci1710: introduce pci171x_ai_read_sample()").
Fixes: 16c7eb6047 ("staging: comedi: adv_pci1710: always enable PCI171x_PARANOIDCHECK code")
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Cc: stable <stable@vger.kernel.org> # 3.16+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder_update_page_range() initializes only addr and size
fields in 'struct vm_struct tmp_area;' and passes it to
map_vm_area().
Before 71394fe501 ("mm: vmalloc: add flag preventing guard hole allocation")
this was because map_vm_area() didn't use any other fields
in vm_struct except addr and size.
Now get_vm_area_size() (used in map_vm_area()) reads vm_struct's
flags to determine whether vm area has guard hole or not.
binder_update_page_range() don't initialize flags field, so
this causes following binder mmap failures:
-----------[ cut here ]------------
WARNING: CPU: 0 PID: 1971 at mm/vmalloc.c:130
vmap_page_range_noflush+0x119/0x144()
CPU: 0 PID: 1971 Comm: healthd Not tainted 4.0.0-rc1-00399-g7da3fdc-dirty #157
Hardware name: ARM-Versatile Express
[<c001246d>] (unwind_backtrace) from [<c000f7f9>] (show_stack+0x11/0x14)
[<c000f7f9>] (show_stack) from [<c049a221>] (dump_stack+0x59/0x7c)
[<c049a221>] (dump_stack) from [<c001cf21>] (warn_slowpath_common+0x55/0x84)
[<c001cf21>] (warn_slowpath_common) from [<c001cfe3>]
(warn_slowpath_null+0x17/0x1c)
[<c001cfe3>] (warn_slowpath_null) from [<c00c66c5>]
(vmap_page_range_noflush+0x119/0x144)
[<c00c66c5>] (vmap_page_range_noflush) from [<c00c716b>] (map_vm_area+0x27/0x48)
[<c00c716b>] (map_vm_area) from [<c038ddaf>]
(binder_update_page_range+0x12f/0x27c)
[<c038ddaf>] (binder_update_page_range) from [<c038e857>]
(binder_mmap+0xbf/0x1ac)
[<c038e857>] (binder_mmap) from [<c00c2dc7>] (mmap_region+0x2eb/0x4d4)
[<c00c2dc7>] (mmap_region) from [<c00c3197>] (do_mmap_pgoff+0x1e7/0x250)
[<c00c3197>] (do_mmap_pgoff) from [<c00b35b5>] (vm_mmap_pgoff+0x45/0x60)
[<c00b35b5>] (vm_mmap_pgoff) from [<c00c1f39>] (SyS_mmap_pgoff+0x5d/0x80)
[<c00c1f39>] (SyS_mmap_pgoff) from [<c000ce81>] (ret_fast_syscall+0x1/0x5c)
---[ end trace 48c2c4b9a1349e54 ]---
binder: 1982: binder_alloc_buf failed to map page at f0e00000 in kernel
binder: binder_mmap: 1982 b6bde000-b6cdc000 alloc small buf failed -12
Use map_kernel_range_noflush() instead of map_vm_area() as this is better
API for binder's purposes and it allows to get rid of 'vm_struct tmp_area' at all.
Fixes: 71394fe501 ("mm: vmalloc: add flag preventing guard hole allocation")
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull x86 fixes from Ingo Molnar:
"A CR4-shadow 32-bit init fix, plus two typo fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Init per-cpu shadow copy of CR4 on 32-bit CPUs too
x86/platform/intel-mid: Fix trivial printk message typo in intel_mid_arch_setup()
x86/cpu/intel: Fix trivial typo in intel_tlb_table[]
Pull perf fixes from Ingo Molnar:
"Two kprobes fixes and a handful of tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Make sparc64 arch point to sparc
perf symbols: Define EM_AARCH64 for older OSes
perf top: Fix SIGBUS on sparc64
perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag
perf tools: Fix pthread_attr_setaffinity_np build error
perf tools: Define _GNU_SOURCE on pthread_attr_setaffinity_np feature check
perf bench: Fix order of arguments to memcpy_alloc_mem
kprobes/x86: Check for invalid ftrace location in __recover_probed_insn()
kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace
Pull locking fix from Ingo Molnar:
"An rtmutex deadlock path fixlet"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rtmutex: Set state back to running on error
Florian Fainelli says:
====================
net: bcmgenet and systemport statistics fixes
This two patches fix a similar problem in the GENET and SYSTEMPORT drivers
for software maintained statistics used to track DMA mapping and SKB
re-allocation failures.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 60b4ea1781 ("net: systemport: log RX buffer allocation and RX/TX DMA
failures") added a few software maintained statistics using
BCM_SYSPORT_STAT_MIB_RX and BCM_SYSPORT_STAT_MIB_TX. These statistics are read
from the hardware MIB counters, such that bcm_sysport_update_mib_counters() was
trying to read from a non-existing MIB offset for these counters.
Fix this by introducing a special type: BCM_SYSPORT_STAT_SOFT, similar to
BCM_SYSPORT_STAT_NETDEV, such that bcm_sysport_get_ethtool_stats will read from
the software mib.
Fixes: 60b4ea1781 ("net: systemport: log RX buffer allocation and RX/TX DMA failures")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 44c8bc3ce3 ("net: bcmgenet: log RX buffer allocation and RX/TX dma
failures") added a few software maintained statistics using
BCMGENET_STAT_MIB_RX and BCMGENET_STAT_MIB_TX. These statistics are read from
the hardware MIB counters, such that bcmgenet_update_mib_counters() was trying
to read from a non-existing MIB offset for these counters.
Fix this by introducing a special type: BCMGENET_STAT_SOFT, similar to
BCMGENET_STAT_NETDEV, such that bcmgenet_get_ethtool_stats will read from the
software mib.
Fixes: 44c8bc3ce3 ("net: bcmgenet: log RX buffer allocation and RX/TX dma failures")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rxrpc_resend_timeout has an initial value of 4 * HZ; use it as-is.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Typo, 'stop' is never set to true.
Seems intent is to not attempt to retransmit more packets after sendmsg
returns an error.
This change is based on code inspection only.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
To repeat:
$ sudo ip link del hsr0
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff8187f495>] hsr_del_port+0x15/0xa0
etc...
Bug description:
As part of the hsr master device destruction, hsr_del_port() is called for each of
the hsr ports. At each such call, the master device is updated regarding features
and mtu. When the master device is freed before the slave interfaces, master will
be NULL in hsr_del_port(), which led to a NULL pointer dereference.
Additionally, dev_put() was called on the master device itself in hsr_del_port(),
causing a refcnt error.
A third bug in the same code path was that the rtnl lock was not taken before
hsr_del_port() was called as part of hsr_dev_destroy().
The reporter (Nicolas Dichtel) also said: "hsr_netdev_notify() supposes that the
port will always be available when the notification is for an hsr interface. It's
wrong. For example, netdev_wait_allrefs() may resend NETDEV_UNREGISTER.". As a
precaution against this, a check for port == NULL was added in hsr_dev_notify().
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Fixes: 51f3c60531 ("net/hsr: Move slave init to hsr_slave.c.")
Signed-off-by: Arvid Brodin <arvid.brodin@alten.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.
This is done using Coccinelle and semantic patch used for
this is as follows:
// <smpl>
@@
expression x,y,z,a,b;
@@
-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.
This is done using Coccinelle and semantic patch used for
this is as follows:
// <smpl>
@@
expression x,y,z,a,b;
@@
-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.
This is done using Coccinelle and semantic patch used for
this is as follows:
// <smpl>
@@
expression x,y,z,a,b;
@@
-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.
This is done using Coccinelle and semantic patch used for
this is as follows:
// <smpl>
@@
expression x,y,z,a,b;
@@
-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.
This is done using Coccinelle and semantic patch used for
this is as follows:
// <smpl>
@@
expression x,y,z,a,b;
@@
-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change 'mutliple' to 'multiple'
Change 'Firmare' to 'Firmware'
Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting a dev_pm_ops suspend/resume pair but not a set of
hibernation functions means those pm functions will not be
called upon hibernation.
Fix this by using SIMPLE_DEV_PM_OPS, which appropriately
assigns the suspend and hibernation handlers and move
cpsw_suspend/resume calbacks under CONFIG_PM_SLEEP
to avoid build warnings.
Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>