This reverts commit d353d7587d.
Doing the block layer plug/unplug inside writeback_sb_inodes() is
broken, because that function is actually called with a spinlock held:
wb->list_lock, as pointed out by Chris Mason.
Chris suggested just dropping and re-taking the spinlock around the
blk_finish_plug() call (the plgging itself can happen under the
spinlock), and that would technically work, but is just disgusting.
We do something fairly similar - but not quite as disgusting because we
at least have a better reason for it - in writeback_single_inode(), so
it's not like the caller can depend on the lock being held over the
call, but in this case there just isn't any good reason for that
"release and re-take the lock" pattern.
[ In general, we should really strive to avoid the "release and retake"
pattern for locks, because in the general case it can easily cause
subtle bugs when the caller caches any state around the call that
might be invalidated by dropping the lock even just temporarily. ]
But in this case, the plugging should be easy to just move up to the
callers before the spinlock is taken, which should even improve the
effectiveness of the plug. So there is really no good reason to play
games with locking here.
I'll send off a test-patch so that Dave Chinner can verify that that
plug movement works. In the meantime this just reverts the problematic
commit and adds a comment to the function so that we hopefully don't
make this mistake again.
Reported-by: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull btrfs cleanups and fixes from Chris Mason:
"These are small cleanups, and also some fixes for our async worker
thread initialization.
I was having some trouble testing these, but it ended up being a
combination of changing around my test servers and a shiny new
schedule while atomic from the new start/finish_plug in
writeback_sb_inodes().
That one only hits on btrfs raid5/6 or MD raid10, and if I wasn't
changing a bunch of things in my test setup at once it would have been
really clear. Fix for writeback_sb_inodes() on the way as well"
* 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: cleanup: remove unnecessary check before btrfs_free_path is called
btrfs: async_thread: Fix workqueue 'max_active' value when initializing
btrfs: Add raid56 support for updating num_tolerated_disk_barrier_failures in btrfs_balance
btrfs: Cleanup for btrfs_calc_num_tolerated_disk_barrier_failures
btrfs: Remove noused chunk_tree and chunk_objectid from scrub_enumerate_chunks and scrub_chunk
btrfs: Update out-of-date "skip parity stripe" comment
Pull Ceph update from Sage Weil:
"There are a few fixes for snapshot behavior with CephFS and support
for the new keepalive protocol from Zheng, a libceph fix that affects
both RBD and CephFS, a few bug fixes and cleanups for RBD from Ilya,
and several small fixes and cleanups from Jianpeng and others"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: improve readahead for file holes
ceph: get inode size for each append write
libceph: check data_len in ->alloc_msg()
libceph: use keepalive2 to verify the mon session is alive
rbd: plug rbd_dev->header.object_prefix memory leak
rbd: fix double free on rbd_dev->header_name
libceph: set 'exists' flag for newly up osd
ceph: cleanup use of ceph_msg_get
ceph: no need to get parent inode in ceph_open
ceph: remove the useless judgement
ceph: remove redundant test of head->safe and silence static analysis warnings
ceph: fix queuing inode to mdsdir's snaprealm
libceph: rename con_work() to ceph_con_workfn()
libceph: Avoid holding the zero page on ceph_msgr_slab_init errors
libceph: remove the unused macro AES_KEY_SIZE
ceph: invalidate dirty pages after forced umount
ceph: EIO all operations after forced umount
Here is a list of patches we've accumulated for GFS2 for the current upstream
merge window. This time we've only got six patches, many of which are very small:
- Three cleanups from Andreas Gruenbacher, including a nice cleanup of
the sequence file code for the sbstats debugfs file.
- A patch from Ben Hutchings that changes statistics variables from signed
to unsigned.
- Two patches from me that increase GFS2's glock scalability by switching
from a conventional hash table to rhashtable.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJV8XuJAAoJENeLYdPf93o73+UH/j+c5Iug/FaTGHTtHiTZjjcR
GRYOHL1UqwM5xb3YwAi63JSgt0jvf+Oo4hf9LZ8wLEm69yCTo4kc8zMNnqDd5Evc
Zx4jJT5XUBtpjPhCAQyJuE6TCjAqm/fsnZmDqUWiwByDkaUnW7cKB20KrIERbYiL
qBV5F42XSpXnNSWeMs8Sg2vYiCS9omI/ZenoIsL4YQAtKdPlX1Ce4Apv8EO2c09i
HzNseOQierZE6ghCKRELusqqGzgK3GyqWjOWa8ZGLsD9dRyPLK7FNO7HBIBwV2Wb
G6KKnVCDSCRM1zXMc5+YplvzEsHN1dT+rqroxRrYlVHJ3hcHBqNis0X4pjxwHEo=
=idxz
-----END PGP SIGNATURE-----
Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull GFS2 updates from Bob Peterson:
"Here is a list of patches we've accumulated for GFS2 for the current
upstream merge window. This time we've only got six patches, many of
which are very small:
- three cleanups from Andreas Gruenbacher, including a nice cleanup
of the sequence file code for the sbstats debugfs file.
- a patch from Ben Hutchings that changes statistics variables from
signed to unsigned.
- two patches from me that increase GFS2's glock scalability by
switching from a conventional hash table to rhashtable"
* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: A minor "sbstats" cleanup
gfs2: Fix a typo in a comment
gfs2: Make statistics unsigned, suitable for use with do_div()
GFS2: Use resizable hash table for glocks
GFS2: Move glock superblock pointer to field gl_name
gfs2: Simplify the seq file code for "sbstats"
It looks like the Kconfig check that was meant to fix this (commit
fe9233fb69 [SCSI] scsi_dh: fix kconfig related
build errors) was actually reversed, but no-one noticed until the new set of
patches which separated DM and SCSI_DH).
Fixes: fe9233fb69
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
On old ARM chips, unaligned accesses to memory are not trapped and
fixed. On module load, symbols are relocated, and the relocation of
__bug_table symbols is done on a u32 basis. Yet the section is not
aligned to a multiple of 4 address, but to a multiple of 2.
This triggers an Oops on pxa architecture, where address 0xbf0021ea
is the first relocation in the __bug_table section :
apply_relocate(): pxa3xx_nand: section 13 reloc 0 sym ''
Unable to handle kernel paging request at virtual address bf0021ea
pgd = e1cd0000
[bf0021ea] *pgd=c1cce851, *pte=c1cde04f, *ppte=c1cde01f
Internal error: Oops: 23 [#1] ARM
Modules linked in:
CPU: 0 PID: 606 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #887
Hardware name: CM-X300 module
task: e1c68700 ti: e1c3e000 task.ti: e1c3e000
PC is at apply_relocate+0x2f4/0x3d4
LR is at 0xbf0021ea
pc : [<c000e7c8>] lr : [<bf0021ea>] psr: 80000013
sp : e1c3fe30 ip : 60000013 fp : e49e8c60
r10: e49e8fa8 r9 : 00000000 r8 : e49e7c58
r7 : e49e8c38 r6 : e49e8a58 r5 : e49e8920 r4 : e49e8918
r3 : bf0021ea r2 : bf007034 r1 : 00000000 r0 : bf000000
Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 0000397f Table: c1cd0018 DAC: 00000051
Process insmod (pid: 606, stack limit = 0xe1c3e198)
[<c000e7c8>] (apply_relocate) from [<c005ce5c>] (load_module+0x1248/0x1f5c)
[<c005ce5c>] (load_module) from [<c005dc54>] (SyS_init_module+0xe4/0x170)
[<c005dc54>] (SyS_init_module) from [<c000a420>] (ret_fast_syscall+0x0/0x38)
Fix this by ensuring entries in __bug_table are all aligned to at least
of multiple of 4. This transforms a module section __bug_table as :
- [12] __bug_table PROGBITS 00000000 002232 000018 00 A 0 0 1
+ [12] __bug_table PROGBITS 00000000 002232 000018 00 A 0 0 4
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When Xen is copying data to/from the guest it will check if the kernel
has the right to do the access. If not, the hypercall will return an
error.
After the commit a5e090acbf "ARM:
software-based privileged-no-access support", the kernel can't access
any longer the user space by default. This will result to fail on every
hypercall made by the userspace (i.e via privcmd).
We have to enable the userspace access and then restore the correct
permission every time the privcmd is used to made an hypercall.
I didn't find generic helpers to do a these operations, so the change
is only arm32 specific.
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A collection of small fixes since the last update: the HD-audio
quirks as usual with a USB-audio fix and a trivial fix for the
old sparc driver.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJV8sv8AAoJEGwxgFQ9KSmkxzIQAKsv+qoF0cR5WJdCJSn14WKN
LA0fj4/BZIC6bS3wNuXzP0UqaRAyM5MLJ9Jo7l+leklJK6T2wqsRcPai2+SEiqb9
zVKZ/UxLk/xMZpn1SkkbjMxDWTyJWn7aKUvxDNlUNxxS6JcajMFp8wv4sthrEVXl
JhT77qhTWyjIleoldaOIQNSzXhxluPh2C4kgkmeBKpkUzep/NxWFbhab3JSamWbz
YvoAt66iggQmPQfx6CHYQ+htmsdW5F/qAORQLPxnQ765/AtIKP28bg+iqD5JnvMw
X1IkwcHlLeULMwjPfsYsx1QXzupC6VLm1LlV3QTPcGwrIg1XMOAY/QZeZT8F/APD
f/j14qS5DOwBX/+D+0LYI0sPqfso39W6CfF7xpYxqNdU7CLDuMMd/MFvYDcyraWu
xew6lcZKvqmOyo7AI7DOXWyKekS4gBanzzsXTBp/nrH+8l88X1opR+nlt4ODYc7d
GLwF6DTwDYo7HgCm1toK4pzeE2H0iQXsdHk7IOQsOWWdA0+AACStHoiR0x12jjkB
TRB5YlUGYNojxDcWWpAgmi93dl30C0A0yEd6ClA3UJzHimOhXwwYmDehANbpCmXl
ncxuSzRj1qMhBMHEG0KIKn9b0Uo5KqHs0ebPZS10y/s3I/lLYgzCsbyCdev2Ab0q
QhwpCp3zgiQcswrI5fI8
=bKSi
-----END PGP SIGNATURE-----
Merge tag 'sound-fix-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes since the last update: the HD-audio quirks
as usual with a USB-audio fix and a trivial fix for the old sparc
driver"
* tag 'sound-fix-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Change internal PCM order
ALSA: hda - Fix white noise on Dell M3800
ALSA: hda - Use ALC880_FIXUP_FUJITSU for FSC Amilo M1437
ALSA: hda - Enable headphone jack detect on old Fujitsu laptops
ALSA: sparc: amd7930: Fix module autoload for OF platform driver
ALSA: hda - Add some FIXUP quirks for white noise on Dell laptop.
Pull drm fixes from Dave Airlie:
"Just a bunch of fixes to squeeze in before -rc1:
- three nouveau regression fixes
- one qxl regression fix
- a bunch of i915 fixes
... and some core displayport/atomic fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau/device: enable c800 quirk for tecra w50
drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7x
drm/nouveau/gr/nv04: fix big endian setting on gr context
drm/qxl: validate monitors config modes
drm/i915: Allow DSI dual link to be configured on any pipe
drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS
drm/i915: Fix CSR MMIO address check
drm/i915: Limit the number of loops for reading a split 64bit register
drm/i915: Fix broken mst get_hw_state.
drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4x
uapi/drm/i915_drm.h: fix userspace compilation.
drm/i915: Always mark the object as dirty when used by the GPU
drm/dp: Add dp_aux_i2c_speed_khz module param to set the assume i2c bus speed
drm/dp: Adjust i2c-over-aux retry count based on message size and i2c bus speed
drm/dp: Define AUX_RETRY_INTERVAL as 500 us
drm/atomic: Fix bookkeeping with TEST_ONLY, v3.
We need to have memory dependencies on get_domain/set_domain to avoid
the compiler over-optimising these inline assembly instructions.
Loads/stores must not be reordered across a set_domain(), so introduce
a compiler barrier for that assembly.
The value of get_domain() must not be cached across a set_domain(), but
we still want to allow the compiler to optimise it away. Introduce a
dependency on current_thread_info()->cpu_domain to avoid this; the new
memory clobber in set_domain() should therefore cause the compiler to
re-load this. The other advantage of using this is we should have its
address in the register set already, or very soon after at most call
sites.
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As of 1eef5d2f1b ("ARM: domains: switch to keeping domain value in
register") we no longer need to include asm/domains.h into
asm/thread_info.h. Remove it.
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
While making the root blkg unconditional, ec13b1d6f0 ("blkcg: always
create the blkcg_gq for the root blkcg") removed the part which clears
q->root_blkg and ->root_rl.blkg during q exit. This leaves the two
pointers dangling after blkg_destroy_all(). blk-throttle exit path
performs blkg traversals and dereferences ->root_blkg and can lead to
the following oops.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000558
IP: [<ffffffff81389746>] __blkg_lookup+0x26/0x70
...
task: ffff88001b4e2580 ti: ffff88001ac0c000 task.ti: ffff88001ac0c000
RIP: 0010:[<ffffffff81389746>] [<ffffffff81389746>] __blkg_lookup+0x26/0x70
...
Call Trace:
[<ffffffff8138d14a>] blk_throtl_drain+0x5a/0x110
[<ffffffff8138a108>] blkcg_drain_queue+0x18/0x20
[<ffffffff81369a70>] __blk_drain_queue+0xc0/0x170
[<ffffffff8136a101>] blk_queue_bypass_start+0x61/0x80
[<ffffffff81388c59>] blkcg_deactivate_policy+0x39/0x100
[<ffffffff8138d328>] blk_throtl_exit+0x38/0x50
[<ffffffff8138a14e>] blkcg_exit_queue+0x3e/0x50
[<ffffffff8137016e>] blk_release_queue+0x1e/0xc0
...
While the bug is a straigh-forward use-after-free bug, it is tricky to
reproduce because blkg release is RCU protected and the rest of exit
path usually finishes before RCU grace period.
This patch fixes the bug by updating blkg_destro_all() to clear
q->root_blkg and ->root_rl.blkg.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: "Richard W.M. Jones" <rjones@redhat.com>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Link: http://lkml.kernel.org/g/CA+5PVA5rzQ0s4723n5rHBcxQa9t0cW8BPPBekr_9aMRoWt2aYg@mail.gmail.com
Fixes: ec13b1d6f0 ("blkcg: always create the blkcg_gq for the root blkcg")
Cc: stable@vger.kernel.org # v4.2+
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
For drivers that don't support gaps in the SG lists handed to
them we must bounce (copy the user buffers) and pass a bio that
does not include gaps. This doesn't matter for any current user,
but will help to allow iser which can't handle gaps to use the
block virtual boundary instead of using driver-local bounce
buffering when handling SG_IO commands.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This is only theoretical at the moment given that the only
subsystems that generate integrity payloads are the block layer
itself and the scsi target (which generate well aligned integrity
payloads). But when we will expose integrity meta-data to user-space,
we'll need to refuse appending a page with a gap (if the queue
virtual boundary is set).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
If a driver sets the block queue virtual boundary mask, it means that
it cannot handle gaps so we must not allow those in the integrity
payload as well.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Fixed up by me to have duplicate integrity merge functions, depending
on whether block integrity is enabled or not. Fixes a compilations
issue with CONFIG_BLK_DEV_INTEGRITY unset.
Signed-off-by: Jens Axboe <axboe@fb.com>
This might lead to local privilege escalation (code execution as
kernel) for systems where the following conditions are met:
- CONFIG_CIFS_SMB2 and CONFIG_CIFS_POSIX are enabled
- a cifs filesystem is mounted where:
- the mount option "vers" was used and set to a value >=2.0
- the attacker has write access to at least one file on the filesystem
To attack this, an attacker would have to guess the target_tcon
pointer (but guessing wrong doesn't cause a crash, it just returns an
error code) and win a narrow race.
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Steve French <smfrench@gmail.com>
While the destination buffer 'iv' is MAX_IVLEN size,
the source 'template[i].iv' could be smaller, thus
memcpy may read read invalid memory.
Use crypto_skcipher_ivsize() to get real ivsize
and pass it to memcpy.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* pm-cpu:
kernel/cpu_pm: fix cpu_cluster_pm_exit comment
* pm-cpuidle:
cpuidle/coupled: Add sanity check for safe_state_index
* pm-domains:
staging: board: Migrate away from __pm_genpd_name_add_device()
PM / Domains: Ensure subdomain is not in use before removing
PM / Domains: Try power off masters in error path of __pm_genpd_poweron()
* pm-cpufreq:
intel_pstate: fix PCT_TO_HWP macro
intel_pstate: Fix user input of min/max to legal policy region
cpufreq-dt: add suspend frequency support
cpufreq: allow cpufreq_generic_suspend() to work without suspend frequency
cpufreq: Use __func__ to print function's name
cpufreq: staticize cpufreq_cpu_get_raw()
cpufreq: Add ARM_MT8173_CPUFREQ dependency on THERMAL
cpufreq: dt: Tolerance applies on both sides of target voltage
cpufreq: dt: Print error on failing to mark OPPs as shared
cpufreq: dt: Check OPP count before marking them shared
Since event->hw.itrace_started is now set in pmu::start() to signal the beginning of
the trace, do so also in the intel_bts driver.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1437140050-23363-4-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Calling transport_generic_request_failure() from here causes list
corruption. We should be using target_complete_cmd() instead.
Which we do in all other cases, so the UNKNOWN_OP case can become just
another member of the big else/if chain in tcmu_handle_completion().
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This does nothing, and there are many other places where
transport_cmd_check_stop_to_fabric()'s retval is not checked>, If we
wanted to check it here, we should probably do it those other places too.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Current for-next iscsi target is broken:
commit 109e238174
Author: Roland Dreier <roland@purestorage.com>
Date: Thu Jul 23 14:53:32 2015 -0700
target: Drop iSCSI use of mutex around max_cmd_sn increment
This patch fixes incorrect pr_debug() + atomic_inc_return() usage
within iscsit_increment_maxcmdsn() code.
Also fix funny iscsit_determine_maxcmdsn() usage and update
iscsi_target_do_tx_login_io() code.
Reported-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch is a >= v4.1 regression bug-fix where control CDB
emulation logic in commit 38b57f82 now expects a se_cmd->se_sess
pointer to exist when determining T10-PI support is to be exposed
for initiator host ports.
To address this bug, go ahead and add locally generated se_cmd
descriptors for copy-offload block-copy to it's own stand-alone
se_session nexus, while the parent EXTENDED_COPY se_cmd descriptor
remains associated with it's originating se_cmd->se_sess nexus.
Note a valid se_cmd->se_sess is also required for future support
of WRITE_INSERT and READ_STRIP software emulation when submitting
backend I/O to se_device that exposes T10-PI suport.
Reported-by: Alex Gorbachev <ag@iss-integration.com>
Tested-by: Alex Gorbachev <ag@iss-integration.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: <stable@vger.kernel.org> # v4.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds an optional fabric driver provided SGL limit
that target-core will honor as it's own internal I/O maximum
transfer length limit, as exposed by EVPD=0xb0 block limits
parameters.
This is required for handling cases when host I/O transfer
length exceeds the requested EVPD block limits maximum
transfer length. The initial user of this logic is qla2xxx,
so that we can avoid having to reject I/Os from some legacy
FC hosts where EVPD=0xb0 parameters are not honored.
When se_cmd payload length exceeds the provided limit in
target_check_max_data_sg_nents() code, se_cmd->data_length +
se_cmd->prot_length are reset with se_cmd->residual_count
plus underflow bit for outgoing TFO response callbacks.
It also checks for existing CDB level underflow + overflow
and recalculates final residual_count as necessary.
Note this patch currently assumes 1:1 mapping of PAGE_SIZE
per struct scatterlist entry.
Reported-by: Craig Watson <craig.watson@vanguard-rugged.com>
Cc: Craig Watson <craig.watson@vanguard-rugged.com>
Tested-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Arun Easi <arun.easi@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Kernel testing triggered this warning:
| WARNING: CPU: 0 PID: 13 at kernel/sched/core.c:1156 do_set_cpus_allowed+0x7e/0x80()
| Modules linked in:
| CPU: 0 PID: 13 Comm: migration/0 Not tainted 4.2.0-rc1-00049-g25834c7 #2
| Call Trace:
| dump_stack+0x4b/0x75
| warn_slowpath_common+0x8b/0xc0
| warn_slowpath_null+0x22/0x30
| do_set_cpus_allowed+0x7e/0x80
| cpuset_cpus_allowed_fallback+0x7c/0x170
| select_fallback_rq+0x221/0x280
| migration_call+0xe3/0x250
| notifier_call_chain+0x53/0x70
| __raw_notifier_call_chain+0x1e/0x30
| cpu_notify+0x28/0x50
| take_cpu_down+0x22/0x40
| multi_cpu_stop+0xd5/0x140
| cpu_stopper_thread+0xbc/0x170
| smpboot_thread_fn+0x174/0x2f0
| kthread+0xc4/0xe0
| ret_from_kernel_thread+0x21/0x30
As Peterz pointed out:
| So the normal rules for changing task_struct::cpus_allowed are holding
| both pi_lock and rq->lock, such that holding either stabilizes the mask.
|
| This is so that wakeup can happen without rq->lock and load-balance
| without pi_lock.
|
| From this we already get the relaxation that we can omit acquiring
| rq->lock if the task is not on the rq, because in that case
| load-balancing will not apply to it.
|
| ** these are the rules currently tested in do_set_cpus_allowed() **
|
| Now, since __set_cpus_allowed_ptr() uses task_rq_lock() which
| unconditionally acquires both locks, we could get away with holding just
| rq->lock when on_rq for modification because that'd still exclude
| __set_cpus_allowed_ptr(), it would also work against
| __kthread_bind_mask() because that assumes !on_rq.
|
| That said, this is all somewhat fragile.
|
| Now, I don't think dropping rq->lock is quite as disastrous as it
| usually is because !cpu_active at this point, which means load-balance
| will not interfere, but that too is somewhat fragile.
|
| So we end up with a choice of two fragile..
This patch fixes it by following the rules for changing
task_struct::cpus_allowed with both pi_lock and rq->lock held.
Reported-by: kernel test robot <ying.huang@intel.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[ Modified changelog and patch. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/BLU436-SMTP1660820490DE202E3934ED3806E0@phx.gbl
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Only emit the test-and-set fallback for Hypervisors lacking
PARAVIRT_SPINLOCKS support when building for guests.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 4.2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dave ran into horrible performance on a VM without PARAVIRT_SPINLOCKS
set and Linus noted that the test-and-set implementation was retarded.
One should spin on the variable with a load, not a RMW.
While there, remove 'queued' from the name, as the lock isn't queued
at all, but a simple test-and-set.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Dave Chinner <david@fromorbit.com>
Tested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/20150904152523.GR18673@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
time_in_state in struct devfreq is defined as unsigned long, so
devm_kzalloc should use sizeof(unsigned long) as argument instead
of sizeof(unsigned int), otherwise it will cause unexpected result
in 64bit system.
Signed-off-by: Xiaolong Ye <yexl@marvell.com>
Signed-off-by: Kevin Liu <kliu5@marvell.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
With the introduction of devfreq_update_stats(), governors
are not recommended to use get_dev_status() directly.
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
The thermal infrastructure should use the devfreq cooling device, which
uses the OPP library to disable OPPs as necessary.
Fix a couple of typos in the same comment while we are at it.
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
The return value of get_dev_status() can be reused. Cache it so that
other parts of the kernel can reuse it instead of having to call the
same function again.
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there
is no need to do that again from its callers. Drop it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
This patch updates the documentation to include the information of PPMUv2.
The PPMUv2 is used for Exynos5433 and Exynos7420 to monitor the performance
of each IP in Exynos SoC.
Cc: MyungJoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
This patch adds the support for PPMU (Platform Performance Monitoring Unit)
version 2.0 for Exynos5433 SoC. Exynos5433 SoC must need PPMUv2 which is
quite different from PPMUv1.1. The exynos-ppmu.c driver supports both PPMUv1.1
and PPMUv2.
Cc: MyungJoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
The exynos-ppmu driver is only a clock consumer and not a clock provider
but its Device Tree binding listed #clock-cells as an optional property.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
three nouveau regression fixes.
* 'linux-4.3' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/device: enable c800 quirk for tecra w50
drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7x
drm/nouveau/gr/nv04: fix big endian setting on gr context
Pull blk-cg updates from Jens Axboe:
"A bit later in the cycle, but this has been in the block tree for a a
while. This is basically four patchsets from Tejun, that improve our
buffered cgroup writeback. It was dependent on the other cgroup
changes, but they went in earlier in this cycle.
Series 1 is set of 5 patches that has cgroup writeback updates:
- bdi_writeback iteration fix which could lead to some wb's being
skipped or repeated during e.g. sync under memory pressure.
- Simplification of wb work wait mechanism.
- Writeback tracepoints updated to report cgroup.
Series 2 is is a set of updates for the CFQ cgroup writeback handling:
cfq has always charged all async IOs to the root cgroup. It didn't
have much choice as writeback didn't know about cgroups and there
was no way to tell who to blame for a given writeback IO.
writeback finally grew support for cgroups and now tags each
writeback IO with the appropriate cgroup to charge it against.
This patchset updates cfq so that it follows the blkcg each bio is
tagged with. Async cfq_queues are now shared across cfq_group,
which is per-cgroup, instead of per-request_queue cfq_data. This
makes all IOs follow the weight based IO resource distribution
implemented by cfq.
- Switched from GFP_ATOMIC to GFP_NOWAIT as suggested by Jeff.
- Other misc review points addressed, acks added and rebased.
Series 3 is the blkcg policy cleanup patches:
This patchset contains assorted cleanups for blkcg_policy methods
and blk[c]g_policy_data handling.
- alloc/free added for blkg_policy_data. exit dropped.
- alloc/free added for blkcg_policy_data.
- blk-throttle's async percpu allocation is replaced with direct
allocation.
- all methods now take blk[c]g_policy_data instead of blkcg_gq or
blkcg.
And finally, series 4 is a set of patches cleaning up the blkcg stats
handling:
blkcg's stats have always been somwhat of a mess. This patchset
tries to improve the situation a bit.
- The following patches added to consolidate blkcg entry point and
blkg creation. This is in itself is an improvement and helps
colllecting common stats on bio issue.
- per-blkg stats now accounted on bio issue rather than request
completion so that bio based and request based drivers can behave
the same way. The issue was spotted by Vivek.
- cfq-iosched implements custom recursive stats and blk-throttle
implements custom per-cpu stats. This patchset make blkcg core
support both by default.
- cfq-iosched and blk-throttle keep track of the same stats
multiple times. Unify them"
* 'for-4.3/blkcg' of git://git.kernel.dk/linux-block: (45 commits)
blkcg: use CGROUP_WEIGHT_* scale for io.weight on the unified hierarchy
blkcg: s/CFQ_WEIGHT_*/CFQ_WEIGHT_LEGACY_*/
blkcg: implement interface for the unified hierarchy
blkcg: misc preparations for unified hierarchy interface
blkcg: separate out tg_conf_updated() from tg_set_conf()
blkcg: move body parsing from blkg_conf_prep() to its callers
blkcg: mark existing cftypes as legacy
blkcg: rename subsystem name from blkio to io
blkcg: refine error codes returned during blkcg configuration
blkcg: remove unnecessary NULL checks from __cfqg_set_weight_device()
blkcg: reduce stack usage of blkg_rwstat_recursive_sum()
blkcg: remove cfqg_stats->sectors
blkcg: move io_service_bytes and io_serviced stats into blkcg_gq
blkcg: make blkg_[rw]stat_recursive_sum() to be able to index into blkcg_gq
blkcg: make blkcg_[rw]stat per-cpu
blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it
blkcg: consolidate blkg creation in blkcg_bio_issue_check()
blk-throttle: improve queue bypass handling
blkcg: move root blkg lookup optimization from throtl_lookup_tg() to __blkg_lookup()
blkcg: inline [__]blkg_lookup()
...
Typo that snuck in with commit 6979c6303a
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Broken since "gr: convert user classes to new-style nvkm_object"
Tested on a PPC64 G5 + NV34
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Merge third patch-bomb from Andrew Morton:
- even more of the rest of MM
- lib/ updates
- checkpatch updates
- small changes to a few scruffy filesystems
- kmod fixes/cleanups
- kexec updates
- a dma-mapping cleanup series from hch
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits)
dma-mapping: consolidate dma_set_mask
dma-mapping: consolidate dma_supported
dma-mapping: cosolidate dma_mapping_error
dma-mapping: consolidate dma_{alloc,free}_noncoherent
dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}
mm: use vma_is_anonymous() in create_huge_pmd() and wp_huge_pmd()
mm: make sure all file VMAs have ->vm_ops set
mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff()
mm: mark most vm_operations_struct const
namei: fix warning while make xmldocs caused by namei.c
ipc: convert invalid scenarios to use WARN_ON
zlib_deflate/deftree: remove bi_reverse()
lib/decompress_unlzma: Do a NULL check for pointer
lib/decompressors: use real out buf size for gunzip with kernel
fs/affs: make root lookup from blkdev logical size
sysctl: fix int -> unsigned long assignments in INT_MIN case
kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo
kexec: align crash_notes allocation to make it be inside one physical page
kexec: remove unnecessary test in kimage_alloc_crash_control_pages()
kexec: split kexec_load syscall from kexec core code
...
This is a collection of a few late fixes and other misc. stuff that
had dependencies on things being merged from other trees.
The bulk of the changes are for samsung/exynos SoCs for some changes
that needed a few minor reworks so ended up a bit late. The others
are mainly for qcom SoCs: a couple fixes and some DTS updates.
There's one conflict with drivers/cpufreq/exynos-cpufreq.c because
it's now been completely removed, but there were some fixes that hit
mainline in the meantime.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIbBAABAgAGBQJV8fkAAAoJEFk3GJrT+8Zllf8P9jj3+TnvbJS/8bWoQoB7BRUZ
LZPgi2+sBXylrBV60uQdyodiTHQUMZhbL7GvgEVG0z6yyin7nyijqNkulTbQbWmg
WhumLNCNcs8vlZegA/corbwgcVC7FkjOP97HveTe2mgwZ+GaXj9qMRQzBsMqSXEo
4890ZeP1nWBTP42oXOQHkNyKWFBjuERK0dTw2MXj7WE0/Ag8i7ERp76uJQdQ7V5O
BpNRwxp3vSCky8rxbpD/avWdlspv1yZGBQyLeIreVq2YQFojvT36K8wHcf6iWBT/
pzGGV/uZM7MnrGZdqSfVEMDHl7Z2s7Ls+sv5F6Md7ErnVDerHGRjw/6lJDjbeH7u
trucpsuhz5yhTjpZssGHH2NT8pWxxh8M+AaNOiiH8PuYESAbPAmWLpWkn+648bIn
P++Z90DIyfNEqnNSMHkcQYpVt8zc4g75gZfTIIsXLB+DLgzgSK8ergrfyRN/O0zj
oFY35g3wHdgisnGve+BAW30zTZtP19TMT36OltWjIkjuRZC29PS2vkH8eTETdVXx
01f/qcpbB1L1rXfIBjNkjx81j89XPd68JIBfctTF5QBOAGm/Dix6tj2mU/N7TNu5
TrBmD3CXdOQbCPaoK/qWX7/b11IN/XOlGL06hhYwbSdvCVy7rNccApXvfnrDWziK
Ly923FP1OB7h0Kk1cmo=
=85Ck
-----END PGP SIGNATURE-----
Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull late ARM SoC updates from Kevin Hilman:
"This is a collection of a few late fixes and other misc stuff that had
dependencies on things being merged from other trees.
The bulk of the changes are for samsung/exynos SoCs for some changes
that needed a few minor reworks so ended up a bit late. The others
are mainly for qcom SoCs: a couple fixes and some DTS updates"
* tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (37 commits)
ARM: multi_v7_defconfig: Enable PBIAS regulator
soc: qcom: smd: Correct fBLOCKREADINTR handling
soc: qcom: smd: Use correct remote processor ID
soc: qcom: smem: Fix errant private access
ARM: dts: qcom: msm8974-sony-xperia-honami: Use stdout-path
ARM: dts: qcom: msm8960-cdp: Use stdout-path
ARM: dts: qcom: msm8660-surf: Use stdout-path
ARM: dts: qcom: ipq8064-ap148: Use stdout-path
ARM: dts: qcom: apq8084-mtp: Use stdout-path
ARM: dts: qcom: apq8084-ifc6540: Use stdout-path
ARM: dts: qcom: apq8074-dragonboard: Use stdout-path
ARM: dts: qcom: apq8064-ifc6410: Use stdout-path
ARM: dts: qcom: apq8064-cm-qs600: Use stdout-path
ARM: dts: qcom: Label serial nodes for aliasing and stdout-path
reset: ath79: Fix missing spin_lock_init
reset: Add (devm_)reset_control_get stub functions
ARM: EXYNOS: switch to using generic cpufreq driver for exynos4x12
cpufreq: exynos: Remove unselectable rule for arm-exynos-cpufreq.o
ARM: dts: add iommu property to JPEG device for exynos4
ARM: dts: enable SPI1 for exynos4412-odroidu3
...