If a spinner is present, there is a chance that the load of
rwsem_has_spinner() in rwsem_wake() can be reordered with
respect to decrement of rwsem count in __up_write() leading
to wakeup being missed:
spinning writer up_write caller
--------------- -----------------------
[S] osq_unlock() [L] osq
spin_lock(wait_lock)
sem->count=0xFFFFFFFF00000001
+0xFFFFFFFF00000000
count=sem->count
MB
sem->count=0xFFFFFFFE00000001
-0xFFFFFFFF00000001
spin_trylock(wait_lock)
return
rwsem_try_write_lock(count)
spin_unlock(wait_lock)
schedule()
Reordering of atomic_long_sub_return_release() in __up_write()
and rwsem_has_spinner() in rwsem_wake() can cause missing of
wakeup in up_write() context. In spinning writer, sem->count
and local variable count is 0XFFFFFFFE00000001. It would result
in rwsem_try_write_lock() failing to acquire rwsem and spinning
writer going to sleep in rwsem_down_write_failed().
The smp_rmb() will make sure that the spinner state is
consulted after sem->count is updated in up_write context.
Signed-off-by: Prateek Sood <prsood@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: longman@redhat.com
Cc: parri.andrea@gmail.com
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1504794658-15397-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Convert trace_sched_switch to use the common task-state helpers and
fix the "X" and "Z" order, possibly they ended up in the wrong order
because TASK_REPORT has them in the wrong order too.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bit patterns are easier in hex.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently get_task_state() and task_state_to_char() report different
states, create a number of common helpers and unify the reported state
space.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The new timer_setup() function for struct timer_list collides with a
private um function. Rename it.
Fixes: 686fef928b ("timer: Prepare to change timer callback argument type")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Kees Cook <keescook@chromium.org>
The following commit:
d9a50b0256 ("perf/aux: Ensure aux_wakeup represents most recent wakeup index")
changed the AUX wakeup position calculation to rounddown(), which causes
a division-by-zero in AUX overwrite mode (aka "snapshot mode").
The zero denominator results from the fact that perf record doesn't set
aux_watermark to anything, in which case the kernel will set it to half
the AUX buffer size, but only for non-overwrite mode. In the overwrite
mode aux_watermark stays zero.
The good news is that, AUX overwrite mode, wakeups don't happen and
related bookkeeping is not relevant, so we can simply forego the whole
wakeup updates.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20170906160811.16510-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This fixes an APEI problem that may cause a reported error to be
missed due to a race condition.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZzVkTAAoJEILEb/54YlRxGgcP/1j0kvrkU+/V8D4bZrpH9q/3
ad1W/krYLmam5Q+IsWEMfK+mwmT0CPHG3wM/OeT6VV/mTF9u6CyCM0m/XvnorKg3
Yp6wiKzAq7N9HIB6nqZUPwTgB0vIh3pLYBvqA1Dc6hlNU7lrsBuxUmYrpxM4hk6R
X5BAKFQygFrunjxi22fvJjk2yxxxg6IY4R7JYbJQIJbfKBAfMraMrDVoSE/gHieL
riOe1qJp0x5enI7kyOlGHQr0Sq+tOIrfJbf4O4Y4p1EwaXwk23mrfIpG9PtUpW3z
t3jJZC7Rg7liIS1ZrozZmSbNP2KFdF3nbQYqRBEzfbT4isOSJRXHGB2eqzroIpM5
rEgPjflLb561RWx7pcEQHH9z6cZ6cdbw97XNcdPTsJpxc46FohojdNR4FVY+z90I
KwakMwVUs5qUEhU7LcLbtRCXZyzCnXHdz72zYEIaqTBOhZ3yXFHzy66ld7Fe7Dwk
9Cu2u6P8gnnLPPbW5vRQGYhNdb5tfcOdzjQ0kajX+5kj+xlo5Nlhn8/LMOAqipOu
nwYnJLfu4adMyVCTmepgur32Pwlfp/oDupbe1Fp0dHe6wk6wiqzmo8RCxDT/3uIT
qxJB664AVQ9xpqOHVfRyZZxa07CRaW3aAYBqxkluIuL9lvEpSNWeStY9OxL1HCL3
C1R6WiA48V/1dDkbimJ1
=4NCJ
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"This fixes an APEI problem that may cause a reported error to be
missed due to a race condition"
* tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / APEI: clear error status before acknowledging the error
- Fix a deadlock in the operating performance points (OPP)
framework caused by a notifier callback taking a lock that's
already held by its caller (Viresh Kumar).
- Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
attempting to register conflicting device objects which
triggers a warning from sysfs (Suniel Mahesh).
- Drop a stale reference to a piece of intel_pstate documentation
that's not in the tree any more (Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZzVhfAAoJEILEb/54YlRxG1EQAJT3sG6InKIntxApvNG4o2fM
0zs2At29EgiOcxN0rBs5DYUiNk2yLMmbt3X/PnJxt5BijIANN/HlvEeD5jip4iHU
0F4o1gEDgFRkpEGlnR3tjUpCs/YZRwvmsox9zPqU7Nu4G/x6MjU5tlwT6BHCzq/U
gSG6O8GSy+pK8B+SJu2SpWSNGqdCmv2a1aKgGA+KLFlad+AM7k1cPoX/Wv5fQGZ6
iS20CLel4U4A6mzgYnnBhPSNsFYL4y0AxJ2SQ+O8PEWdP6hcmOvT5bo3TJTiTqqP
vQU9DTzsNxS8NL3ShGVCRAKZVWQav0SQHESTx687bjjPaxg7ppMHpodnRAp3niEI
5uyKGGerbdmJdKqEjEajpRLJWFjU8lcGMqWUUJFWDkIA88soSF1EoelxifH7rxnA
raLPxQ/FJKX/Og36jVgH96+a0sz+emnFj/BBrWxySKED5tBQ6HqKPKZqV/uFJE6h
DJ0qcYIxPdHtOCKYwLBsjJ2au2HUpp5fXzX+EOLmgnxIkHl9tsIbCnCAZldBIKVd
9ENErc1vFXA38SpHSWRf2mT/sGOjnToxik1PRsWOp7zXNkiXyFRqblQe6uJiUCvM
jU7Wl0HUNRGP0xEhdL3Ij7uGyOdVauHRLPEy5c+CJ9nSMCMYwYIW2pBdV8IgyRmD
Y4gxTBrJ8nHTTWLucSFb
=EEyM
-----END PGP SIGNATURE-----
Merge tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a deadlock in the operating performance points (OPP)
framework introduced during the 4.11 cycle, more issues with duplicate
device objects for cpufreq-dt and cpufreq documentation.
Specifics:
- Fix a deadlock in the operating performance points (OPP) framework
caused by a notifier callback taking a lock that's already held by
its caller (Viresh Kumar).
- Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
attempting to register conflicting device objects which triggers a
warning from sysfs (Suniel Mahesh).
- Drop a stale reference to a piece of intel_pstate documentation
that's not in the tree any more (Rafael Wysocki)"
* tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: docs: Drop intel-pstate.txt from index.txt
cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
PM / OPP: Call notifier without holding opp_table->lock
- fix various problems with the copy-on-write extent maps getting freed
at the wrong time
- fix printk format specifier problems
- report zeroing operation outcomes instead of dropping them on the
floor
- fix some crashes when dio operations partially fail
- fix a race condition between unwritten extent conversion & dio read
- fix some incorrect tests in the inode log item processing
- correct the delayed allocation space reservations on rmap filesystems
- fix some problems checking for dax support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJZypYxAAoJEPh/dxk0SrTrJ3YQAJFWUCp194an+yuvgOY+MuyL
PG/vAA3DyJjYbwIsqUE//dlp9nrarccAXcxPITWlLdGZ//qHbXO2MguO3KIQ4iG8
qmsA+tXetVoYZYxYZLQ0KjX/XJTaAXY64xKTFxMMTTKUoxPygJRUF/FPfFFcTtaq
Q/ULikS5mhtW7/mQCfXBvtqM5ZD61A9vQRjDL5jRdrDbz49TQqtskp/7F6SEHLxU
fTCGhN7Ys4MQ4fmtUc+EUh0LPX8oAKIIKiGz3zUqrk/FgNYI2NqnTYvflfN8L9UE
t+k+4CGrON+dzrau4HrvZaYbfIPhRaJUM4QzFcDIPoaBZOt6DpBI0dEKm9FD7Hw/
vUvBs0M9asqYycH3PopFHugF+SxW8g7g+5TD8S9rg3j33PZahSNm3gt5gYb1Kiij
3TZPirst6OeQuEjWX6L5LAruAtqtEXtHL7o4dGn5LdQkJ0EIdKXMd9YGz0F/trTK
Grqf2Mep/Q8nccMTksaj94X5AhmM4znYmbAnbS/+QfYTgLk92GJltxoKTB6roW/N
fJ5azjyzGsr4BWdgakK3aA9glaQWGh3PY8Up2VLeEdjwcy3zyscnpZP2PSvt+l9X
pmMDpMTvQD0E6e5246itB69Il1NXTEoG/t9Hlx/2x9g0R2hjK6CRXXrwPnz9zYkI
7wFz5B5LmJ27vFGTCxo5
=7ptY
-----END PGP SIGNATURE-----
Merge tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
- fix various problems with the copy-on-write extent maps getting freed
at the wrong time
- fix printk format specifier problems
- report zeroing operation outcomes instead of dropping them on the
floor
- fix some crashes when dio operations partially fail
- fix a race condition between unwritten extent conversion & dio read
- fix some incorrect tests in the inode log item processing
- correct the delayed allocation space reservations on rmap filesystems
- fix some problems checking for dax support
* tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: revert "xfs: factor rmap btree size into the indlen calculations"
xfs: Capture state of the right inode in xfs_iflush_done
xfs: perag initialization should only touch m_ag_max_usable for AG 0
xfs: update i_size after unwritten conversion in dio completion
iomap_dio_rw: Allocate AIO completion queue before submitting dio
xfs: validate bdev support for DAX inode flag
xfs: remove redundant re-initialization of total_nr_pages
xfs: Output warning message when discard option was enabled even though the device does not support discard
xfs: report zeroed or not correctly in xfs_zero_range()
xfs: kill meaningless variable 'zero'
fs/xfs: Use %pS printk format for direct addresses
xfs: evict CoW fork extents when performing finsert/fcollapse
xfs: don't unconditionally clear the reflink flag on zero-block files
This reverts commit dbbccdc4ce.
It turns out that the "legacy" users aren't so legacy at all, and that
turning off the legacy ioctl will break the current Qt bluetooth stack
for bluetooth LE devices that were released just a couple of months ago.
So it's simply not true that this was a legacy interface that hasn't
been needed and is only limited to old legacy BT devices. Because I
actually read Kconfig help messages, and actively try to turn off
features that I don't need, I turned the option off.
Then I spent _way_ too much time debugging BLE issues until I realized
that it wasn't the Qt and subsurface development that had broken one of
my dive computer BLE downloads, but simply my broken kernel config.
Maybe in a decade it will be true that this is a legacy interface. And
maybe with a better help-text and correct dependencies, this kind of
legacy removal might be acceptable. But as things are right now both
the commit message and the Kconfig help text were misleading, and the
Kconfig option had the wrong dependenencies.
There's no reason to keep that broken Kconfig option in the tree.
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- a few core fixes
- a few ipoib fixes
- a few mlx5 fixes
- a 7 patch hfi1 related series
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZy8HKAAoJELgmozMOVy/d/3YP/RtJ4I+7dlHAdTrUsLkNIXzj
6e2sc5A7JQRvhbWa6ZfqkbD4DBz2gkz9bXmlYotP1nVfunBie9xQPi+nN39YNnTv
VPYa0G7RD53APw71ETCGh0uBBAjc8lGm0AOPj+HpSP7PvrLdH6B68IcAeXCSOf8D
orzXI0bRpRnLsW4IJ0zN09zShigYuCJVl0Wf59QB0Wrbw4veQD4W7bLSCAUTmuZk
TPb8bPlXY64Bf731HRftxIRl3HwUrpTPv5DuHcASAbVL/KeucWpPmOAj9XqhXTQp
tnqtiwBWYDcsLBwS/IS40B2gfN1BCh6hn03pSVbPj+HD/FLY7x8Gf/Lu0qQNmklz
9nvgMKHL/2h+T4M7DulhS7DTP58bvtkyKG+j77gjEmKX1OI0NXHOntKZDSjGAT2J
zw2dNx4Y/Sgng1HBCbHAAHMrFUdyj7XpQNR8mzdGvDcwtRfrDKmchGtvhVclPsbl
R3U9GN2NcAwg2+bIN96hTzUMB10QOZdvddGFvbxuB7FaWkskPaN52O1ptT3+MyWt
xccZp0iYu40zV80mEm+nF/kZwR8omfE6xM1ujQdIhMHstGe+z29BhqsaQ8Zw1qEG
oaU7+9m2aK57SvcSimR2S4kdK7Gxw9+BIVKdRREJwe9xvWVf96OvJnhnh5t5Fs56
BTN1mBn+7LxlK9eDVler
=HbhA
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Second -rc update for 4.14.
Both Mellanox and Intel had a series of -rc fixes that landed this
week. The Mellanox bunch is spread throughout the stack and not just
in their driver, where as the Intel bunch was mostly in the hfi1
driver. And, several of the fixes in the hfi1 driver were more than
just simple 5 line fixes. As a result, the hfi1 driver fixes has a
sizable LOC count.
Everything else is as one would expect in an RC cycle in terms of LOC
count. One item that might jump out and make you think "That's not an
rc item" is the fix that corrects a typo. But, that change fixes a
typo in a user visible API that was just added in this merge window,
so if we fix it now, we can fix it. If we don't, the typo is in the
API forever. Another that might not appear to be a fix at first glance
is the Simplify mlx5_ib_cont_pages patch, but the simplification
allows them to fix a bug in the existing function whenever the length
of an SGE exceeded page size. We also had to revert one patch from the
merge window that was wrong.
Summary:
- a few core fixes
- a few ipoib fixes
- a few mlx5 fixes
- a 7-patch hfi1 related series"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load
IB/hfi1: On error, fix use after free during user context setup
Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
IB/hfi1: Return correct value in general interrupt handler
IB/hfi1: Check eeprom config partition validity
IB/hfi1: Only reset QSFP after link up and turn off AOC TX
IB/hfi1: Turn off AOC TX after offline substates
IB/mlx5: Fix NULL deference on mlx5_ib_update_xlt failure
IB/mlx5: Simplify mlx5_ib_cont_pages
IB/ipoib: Fix inconsistency with free_netdev and free_rdma_netdev
IB/ipoib: Fix sysfs Pkey create<->remove possible deadlock
IB: Correct MR length field to be 64-bit
IB/core: Fix qp_sec use after free access
IB/core: Fix typo in the name of the tag-matching cap struct
On s390x perf test 1 failed. It turned out that commit cf6383f73c
("perf report: Fix kernel symbol adjustment for s390x") was incorrect.
The previous implementation in dso__load_sym() is also suitable for
s390x.
Therefore this patch undoes commit cf6383f73c
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Fixes: cf6383f73c ("perf report: Fix kernel symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-2-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-v101o8k25vuja2ogosgf15yy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390x perf test 1 failed. It turned out that commit 4a084ecfc8
("perf report: Fix module symbol adjustment for s390x") was incorrect.
The previous implementation in dso__load_sym() is also suitable for
s390x.
Therefore this patch undoes commit 4a084ecfc8.
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Fixes: 4a084ecfc8 ("perf report: Fix module symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-5ani7ly57zji7s0hmzkx416l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The iterator functions pcpu_next_md_free_region and
pcpu_next_fit_region use the block offset to determine if they have
checked the area in the prior iteration. However, this causes an issue
when the block offset is greater than subsequent block contig hints. If
within the iterator it moves to check subsequent blocks, it may fail in
the second predicate due to the block offset not being cleared. Thus,
this causes the allocator to skip over blocks leading to false failures
when allocating from the reserved chunk. While this happens in the
general case as well, it will only fail if it cannot allocate a new
chunk.
This patch resets the block offset to 0 to pass the second predicate
when checking subseqent blocks within the iterator function.
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Reported-and-tested-by: Luis Henriques <lhenriques@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Modern kernel callback systems pass the structure associated with a
given callback to the callback function. The timer callback remains one
of the legacy cases where an arbitrary unsigned long argument continues
to be passed as the callback argument. This has several problems:
- This bloats the timer_list structure with a normally redundant
.data field.
- No type checking is being performed, forcing callbacks to do
explicit type casts of the unsigned long argument into the object
that was passed, rather than using container_of(), as done in most
of the other callback infrastructure.
- Neighboring buffer overflows can overwrite both the .function and
the .data field, providing attackers with a way to elevate from a buffer
overflow into a simplistic ROP-like mechanism that allows calling
arbitrary functions with a controlled first argument.
- For future Control Flow Integrity work, this creates a unique function
prototype for timer callbacks, instead of allowing them to continue to
be clustered with other void functions that take a single unsigned long
argument.
This adds a new timer initialization API, which will ultimately replace
the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
named timer_setup() (to mirror hrtimer_setup(), making instances of its
use much easier to grep for).
In order to support the migration of existing timers into the new
callback arguments, timer_setup() casts its arguments to the existing
legacy types, and explicitly passes the timer pointer as the legacy
data argument. Once all setup_*timer() callers have been replaced with
timer_setup(), the casts can be removed, and the data argument can be
dropped with the timer expiration code changed to just pass the timer
to the callback directly.
Since the regular pattern of using container_of() during local variable
declaration repeats the need for the variable type declaration
to be included, this adds a helper modeled after other from_*()
helpers that wrap container_of(), named from_timer(). This helper uses
typeof(*variable), removing the type redundancy and minimizing the need
for line wraps in forthcoming conversions from "unsigned data long" to
"struct timer_list *" in the timer callbacks:
-void callback(unsigned long data)
+void callback(struct timer_list *t)
{
- struct some_data_structure *local = (struct some_data_structure *)data;
+ struct some_data_structure *local = from_timer(local, t, timer);
Finally, in order to support the handful of timer users that perform
open-coded assignments of the .function (and .data) fields, provide
cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
temporarily. Once conversion has been completed, these can be globally
trivially removed.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20170928133817.GA113410@beast
When bootup a PVM guest with large memory(Ex.240GB), XEN provided initial
mapping overlaps with kernel module virtual space. When mapping in this space
is cleared by xen_cleanhighmap(), in certain case there could be an 2MB mapping
left. This is due to XEN initialize 4MB aligned mapping but xen_cleanhighmap()
finish at 2MB boundary.
When module loading is just on top of the 2MB space, got below warning:
WARNING: at mm/vmalloc.c:106 vmap_pte_range+0x14e/0x190()
Call Trace:
[<ffffffff81117083>] warn_alloc_failed+0xf3/0x160
[<ffffffff81146022>] __vmalloc_area_node+0x182/0x1c0
[<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
[<ffffffff81145df7>] __vmalloc_node_range+0xa7/0x110
[<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
[<ffffffff8103ca54>] module_alloc+0x64/0x70
[<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
[<ffffffff810ac91e>] module_alloc_update_bounds+0x1e/0x80
[<ffffffff810ac9a7>] move_module+0x27/0x150
[<ffffffff810aefa0>] layout_and_allocate+0x120/0x1b0
[<ffffffff810af0a8>] load_module+0x78/0x640
[<ffffffff811ff90b>] ? security_file_permission+0x8b/0x90
[<ffffffff810af6d2>] sys_init_module+0x62/0x1e0
[<ffffffff815154c2>] system_call_fastpath+0x16/0x1b
Then the mapping of 2MB is cleared, finally oops when the page in that space is
accessed.
BUG: unable to handle kernel paging request at ffff880022600000
IP: [<ffffffff81260877>] clear_page_c_e+0x7/0x10
PGD 1788067 PUD 178c067 PMD 22434067 PTE 0
Oops: 0002 [#1] SMP
Call Trace:
[<ffffffff81116ef7>] ? prep_new_page+0x127/0x1c0
[<ffffffff81117d42>] get_page_from_freelist+0x1e2/0x550
[<ffffffff81133010>] ? ii_iovec_copy_to_user+0x90/0x140
[<ffffffff81119c9d>] __alloc_pages_nodemask+0x12d/0x230
[<ffffffff81155516>] alloc_pages_vma+0xc6/0x1a0
[<ffffffff81006ffd>] ? pte_mfn_to_pfn+0x7d/0x100
[<ffffffff81134cfb>] do_anonymous_page+0x16b/0x350
[<ffffffff81139c34>] handle_pte_fault+0x1e4/0x200
[<ffffffff8100712e>] ? xen_pmd_val+0xe/0x10
[<ffffffff810052c9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81139dab>] handle_mm_fault+0x15b/0x270
[<ffffffff81510c10>] do_page_fault+0x140/0x470
[<ffffffff8150d7d5>] page_fault+0x25/0x30
Call xen_cleanhighmap() with 4MB aligned for page tables mapping to fix it.
The unnecessory call of xen_cleanhighmap() in DEBUG mode is also removed.
-v2: add comment about XEN alignment from Juergen.
References: https://lists.xen.org/archives/html/xen-devel/2012-07/msg01562.html
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
[boris: added 'xen/mmu' tag to commit subject]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Just like done in d2bd05d88d ("xen-pciback: return proper values during
BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the
written value directly against ~0, but consider the r/o bits at the
bottom (if any).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
When generic irq chips are allocated for an irq domain the domain name is
set to the irq chip name. That was done to have named domains before the
recent changes which enforce domain naming were done.
Since then the overwrite causes a memory leak when the domain name is
dynamically allocated and even worse it would cause the domain free code to
free the wrong name pointer, which might point to a constant.
Remove the name assignment to prevent this.
Fixes: d59f6617ee ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
Add compatible string to use this generic glue layer to support
Spreadtrum SC9860 platform's dwc3 controller.
Signed-off-by: Baolin Wang <baolin.wang@spreadtrum.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The driver triggers actions on both edges of the vbus signal.
The former PIO controller was triggering IRQs on both falling and rising edges
by default. Newer PIO controller don't, so it's better to set it explicitly to
IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING.
Without this patch we may trigger the connection with host but only on some
bouncing signal conditions and thus lose connecting events.
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: stable <stable@vger.kernel.org> # v4.4+
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
By submitting completed transfers to the system workqueue there is no
guarantee that completion events will be queued up in the correct order,
as in multi-processor systems there is a thread running for each
processor and the work items are not bound to a particular core.
This means that several completions are in the queue at the same time,
they may be processed in parallel and complete out of order, resulting
in data appearing corrupt when read by userspace.
Create a single-threaded workqueue for FunctionFS so that data completed
requests is passed to userspace in the order in which they complete.
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that the usbhsf_fifo_clear() is possible
to cause 10 msec delay if the pipe is RX direction and empty because
the FRDY bit will never be set to 1 in such case.
Fixes: e8d548d549 ("usb: renesas_usbhs: fifo became independent from pipe.")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that the driver sets the BCLR bit of
{C,Dn}FIFOCTR register to 1 even when it's non-DCP pipe and
the FRDY bit of {C,Dn}FIFOCTR register is set to 1.
Fixes: e8d548d549 ("usb: renesas_usbhs: fifo became independent from pipe.")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that this driver cannot go status stage
in control read when the req.zero is set to 1 and the len in
usb3_write_pipe() is set to 0. Otherwise, if we use g_ncm driver,
usb enumeration takes long time (5 seconds or more).
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
According to the datasheet of R-Car Gen3, the Pn_RAMMAP.Pn_MPKT should
be set to one of 8, 16, 32, 64, 512 and 1024. Otherwise, when a gadget
driver uses an interrupt endpoint, unexpected behavior happens. So,
this patch fixes it.
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When bRequestType & USB_DIR_IN is false and req.length is 0 in control
transfer, since it means non-data, this driver should not set the mode
as control write. So, this patch fixes it.
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
A recent change to the synchronization in dummy-hcd was incorrect.
The issue was that dummy_udc_stop() contained no locking and therefore
could race with various gadget driver callbacks, and the fix was to
add locking and issue the callbacks with the private spinlock held.
UDC drivers aren't supposed to do this. Gadget driver callback
routines are allowed to invoke functions in the UDC driver, and these
functions will generally try to acquire the private spinlock. This
would deadlock the driver.
The correct solution is to drop the spinlock before issuing callbacks,
and avoid races by emulating the synchronize_irq() call that all real
UDC drivers must perform in their ->udc_stop() routines after
disabling interrupts. This involves adding a flag to dummy-hcd's
private structure to keep track of whether interrupts are supposed to
be enabled, and adding a counter to keep track of ongoing callbacks so
that dummy_udc_stop() can wait for them all to finish.
A real UDC driver won't receive disconnect, reset, suspend, resume, or
setup events once it has disabled interrupts. dummy-hcd will receive
them but won't try to issue any gadget driver callbacks, which should
be just as good.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Fixes: f16443a034 ("USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks")
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The dummy-hcd HCD/UDC emulator tries not to do too much work during
each timer interrupt. But it doesn't try very hard; currently all
it does is limit the total amount of bulk data transferred. Other
transfer types aren't limited, and URBs that transfer no data (because
of an error, perhaps) don't count toward the limit, even though on a
real USB bus they would consume at least a minimum overhead.
This means it's possible to get the driver stuck in an infinite loop,
for example, if the host class driver resubmits an URB every time it
completes (which is common for interrupt URBs). Each time the URB is
resubmitted it gets added to the end of the pending-URBs list, and
dummy-hcd doesn't stop until that list is empty. Andrey Konovalov was
able to trigger this failure mode using the syzkaller fuzzer.
This patch fixes the infinite-loop problem by restricting the URBs
handled during each timer interrupt to those that were already on the
pending list when the interrupt routine started. Newly added URBs
won't be processed until the next timer interrupt. The problem of
properly accounting for non-bulk bandwidth (as well as packet and
transaction overhead) is not addressed here.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The dummy-hcd UDC driver is not careful about the way it handles
connection speeds. It ignores the module parameter that is supposed
to govern the maximum connection speed and it doesn't set the HCD
flags properly for the case where it ends up running at full speed.
The result is that in many cases, gadget enumeration over dummy-hcd
fails because the bMaxPacketSize byte in the device descriptor is set
incorrectly. For example, the default settings call for a high-speed
connection, but the maxpacket value for ep0 ends up being set for a
Super-Speed connection.
This patch fixes the problem by initializing the gadget's max_speed
and the HCD flags correctly.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end
up using different filters. Once we drop ->siglock it is possible for
task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC.
Fixes: f8e529ed94 ("seccomp, ptrace: add support for dumping seccomp filters")
Reported-by: Chris Salls <chrissalls5@gmail.com>
Cc: stable@vger.kernel.org # needs s/refcount_/atomic_/ for v4.12 and earlier
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[tycho: add __get_seccomp_filter vs. open coding refcount_inc()]
Signed-off-by: Tycho Andersen <tycho@docker.com>
[kees: tweak commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
Arnd Bergmann reported a bunch of warnings like:
crypto/jitterentropy.o: warning: objtool: jent_fold_time()+0x3b: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_stuck()+0x1d: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_unbiased_bit()+0x15: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_read_entropy()+0x32: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_entropy_collector_free()+0x19: call without frame pointer save/setup
and
arch/x86/events/core.o: warning: objtool: collect_events uses BP as a scratch register
arch/x86/events/core.o: warning: objtool: events_ht_sysfs_show()+0x22: call without frame pointer save/setup
With certain rare configurations, GCC sometimes sets up the frame
pointer with:
lea (%rsp),%rbp
instead of:
mov %rsp,%rbp
The instructions are equivalent, so treat the former like the latter.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a468af8b28a69b83fffc6d7668be9b6fcc873699.1506526584.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The kbuild bot occasionally reports warnings like:
drivers/scsi/pcmcia/aha152x_core.o: warning: objtool: seldo_run()+0x130: unreachable instruction
These warnings are always with GCC 4.4. That version of GCC sometimes
places unreachable instructions after calls to noreturn functions.
The unreachable warnings aren't very important anyway. Just ignore them
for old versions of GCC.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bc89b807d965b98ec18a0bb94f96a594bd58f2f2.1506551639.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
static checker reports a potential integer overflow. Cap the worker count to
avoid the overflow.
Reported:-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shaohua Li <shli@fb.com>
raid_map calls pers->make_request, which missed the suspend check. Fix it with
the new md_handle_request API.
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: Heinz Mauelshagen <heinzm@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
md_submit_flush_data calls pers->make_request, which missed the suspend check.
Fix it with the new md_handle_request API.
Reported-by: Nate Dailey <nate.dailey@stratus.com>
Tested-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
With commit cc27b0c78c, pers->make_request could bail out without handling
the bio. If that happens, we should retry. The commit fixes md_make_request
but not other call sites. Separate the request handling part, so other call
sites can use it.
Reported-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
ASC 0x27 is "WRITE PROTECTED". This error code is returned e.g. by
Fujitsu ETERNUS systems under certain conditions for WRITE SAME 16
commands with UNMAP bit set. It should not be treated as a path
error. In general, it makes sense to assume that being write protected
is a target rather than a path property.
Signed-off-by: Martin Wilck <mwilck@suse.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 0e9973ed33 ("scsi: aacraid: Add periodic checks to see IOP reset
status") changed the way driver checks if a reset succeeded. Now, after an
IOP reset, aacraid immediately start polling a register to verify the reset
is complete.
This behavior cause regressions on the reset path in PowerPC (at least).
Since the delay after the IOP reset was removed by the aforementioned patch,
the fact driver just starts to read a register instantly after the reset
was issued (by writing in another register) "corrupts" the reset procedure,
which ends up failing all the time.
The issue highly impacted kdump on PowerPC, since on kdump path we
proactively issue a reset in adapter (through the reset_devices kernel
parameter).
This patch (re-)adds a delay right after IOP reset is issued. Empirically
we measured that 3 seconds is enough, but for safety reasons we delay
for 5s (and since it was 30s before, 5s is still a small amount).
For reference, without this patch we observe the following messages
on kdump kernel boot process:
[ 76.294] aacraid 0003:01:00.0: IOP reset failed
[ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed
[ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff.
[ 86.524] aacraid 0003:01:00.0: Controller reset type is 3
[ 86.524] aacraid 0003:01:00.0: Issuing IOP reset
[146.534] aacraid 0003:01:00.0: IOP reset failed
[146.534] aacraid 0003:01:00.0: ARC Reset attempt failed
Fixes: 0e9973ed33 ("scsi: aacraid: Add periodic checks to see IOP reset status")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 33fc30b470 (cpufreq: intel_pstate: Document the current
behavior and user interface) dropped the intel-pstate.txt file
from Documentation/cpu-freq/, but it did not update the index.txt
file in there accordingly, so do that now.
Fixes: 33fc30b470 (cpufreq: intel_pstate: Document the current behavior and user interface)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
From David Howells:
"There are two sets of patches here:
(1) A bunch of core keyrings bug fixes from Eric Biggers.
(2) Fixing big_key to use safe crypto from Jason A. Donenfeld."
This patch fixes the starting offset used when scanning chunks to
compute the chunk statistics. The value start_offset (and end_offset)
are managed in bytes while the traversal occurs over bits. Thus for the
reserved and dynamic chunk, it may incorrectly skip over the initial
allocations.
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently we acknowledge errors before clearing the error status.
This could cause a new error to be populated by firmware in-between
the error acknowledgment and the error status clearing which would
cause the second error's status to be cleared without being handled.
So, clear the error status before acknowledging the errors.
Also, make sure to acknowledge the error if the error status read
fails.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A few fixes for 4.14. Nothing too major.
* 'drm-fixes-4.14' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: disable hard reset in hibernate for APUs
drm/amdgpu: revert tile table update for oland
Just two small etnaviv fixes, one fixing a list corruption, the other
fixing a NULL ptr deref in an error path.
* 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux:
etnaviv: fix gem object list corruption
etnaviv: fix submit error path