We have a bit more changes than usual in ASoC here, as it was slipped
from the previous update. There are one minr ASoC PCM code fix and
ASoC dmaengine fix, in addition of a collection of small ASoC driver
fixes. The rest are a couple of HD-audio stable fixups, and a
long-standing fix for the paused stream handling.
So, all commits look not scary (and hopefully won't give you
disastrous holiday season).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJSstz0AAoJEGwxgFQ9KSmkc7UP/A5n/mgQlsyyjax0bJSEK0gx
0GpqnCo4mQFypDnPpb6MN3Xa8GDS+MbUFUmZYOpuB3aAoUm0quiBpB9u7tfjr/BO
lMS42dhvDQSe+EGJDoBo6TAp8yb2uU7x70GqnzijWvISqn5rMzgFRWuUPKnXEuaE
On++cPZhfy0f8sWmh2FdBXDeKNa/32n2RU3PcO8vdAG3BRz8eO05cVxGV8hywHHa
BGMwsVUUq+9jT4sF5a8pbqC9J1+EmJk1xkNc3I+i7KDw0MrwUsyt/i/MbcNBcNgH
FyG2VejPHkcpPs/wSZNRZe0nHjAyCYsYMFjmS0V7HqZYdKWUTzBhP63QRrUC5Dgw
lEDpApKIk+6gFgBY7tyqD3MEG5K6YSs7h0jElm32ei649U15IMr1nXB4FYM3zYyY
PBzFpTcO+lpIj/kV7GJeIdCROLUKaTheHEd4TpbNt2lLWwFs05B1b/QhMF50zw3X
7ad+VxZx+3PbvQrUU2IsBth23I9x4a9HwSScTq9xZdMzWT3TQ7O4B/Ii15rI4QTK
oQ3tofuqx3IJFoRcrXMZwF8RfKYopseO208so1v9qHt+5UeYxp8xokpa8QHzY3Yw
UjsaymLdp+uCeEYE2reaE6d4xFPXiEr2vPgBDR7ZAZLVKrlDl5+AND1m6E0FVG2v
FHlJjFh58wngEbSfQbbk
=OXEI
-----END PGP SIGNATURE-----
Merge tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We have a bit more changes than usual in ASoC here, as it was slipped
from the previous update. There are one minr ASoC PCM code fix and
ASoC dmaengine fix, in addition of a collection of small ASoC driver
fixes. The rest are a couple of HD-audio stable fixups, and a
long-standing fix for the paused stream handling.
So, all commits look not scary (and hopefully won't give you
disastrous holiday season)"
* tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add Dell headset detection quirk for one more laptop model
ASoC: wm8904: fix DSP mode B configuration
ASoC: wm_adsp: Add small delay while polling DSP RAM start
ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function
ASoC: kirkwood: Fix the CPU DAI rates
ASoC: wm5110: Correct HPOUT3 DAPM route typo
ALSA: hda - Add Dell headset detection quirk for three laptop models
ALSA: hda - Add enable_msi=0 workaround for four HP machines
ASoC: don't leak on error in snd_dmaengine_pcm_register
ASoC: fsl: imx-wm8962: Don't update bias_level in machine driver
ASoC: tegra: fix uninitialized variables in set_fmt
ASoC: wm8962: Enable SYSCLK provisonally before fetching generated DSPCLK_DIV
ASoC: sam9x5_wm8731: change to work in DSP A mode
ASoC: atmel_ssc_dai: add dai trigger ops
ASoC: soc-pcm: Use valid condition for snd_soc_dai_digital_mute() in hw_free()
Add reg_defaults to regmap and at the same time implement proper power
state handling with using regcache_cache_only(), regcache_sync() and
regcache_mark_dirty().
This will make sure that we do not need to do restore operations in child
drivers anymore.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Let the user know when the number of submission queues are being
ignored.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Simplify the initialization logic of the three block-layers.
- The queue initialization is split into two parts. This allows reuse of
code when initializing the sq-, bio- and mq-based layers.
- Set submit_queues default value to 0 and always set it at init time.
- Simplify the init error code paths.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For NUMA systems, initializing the blk-mq layer and using per node hctx.
We initialize submit queues to 1, while blk-mq nr_hw_queues is
initialized to the number of NUMA nodes.
This makes the null_init_hctx function overwrite memory outside of what
it allocated. In my case it lead to writing garbage into struct
request_queue's mq_map.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark functions skd_skmsg_state_to_str() and skd_skreq_state_to_str() as
static in skd_main.c because they are not used outside this file.
This eliminates the following warnings in skd_main.c:
drivers/block/skd_main.c:5272:13: warning: no previous prototype for ‘skd_skmsg_state_to_str’ [-Wmissing-prototypes]
drivers/block/skd_main.c:5284:13: warning: no previous prototype for ‘skd_skreq_state_to_str’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch adds three main functions for DAI master mode: set_dai_fmt(),
set_dai_sysclk() and set_dai_tdm_slot(), and one essential baud clock
accordingly. After appending this patch, the fsl_ssi driver on i.MX series
has the ability to derive LRCLK and BCLK from baud clock source so as to
support some audio Codecs which can only be used in slave mode.
Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
The kernel as a number of cases of gendered language. The majority of these
refer to objects that don't have gender in English, and so I've replaced
them with "it" and "its". Some refer to people (developers or users), and
I've replaced these with the singular "they" variant. Some are simply
typos that I've fixed up.
I've left cases where gendered language was used to refer to specific
individuals, was a quote or is part of license text.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Commit 97bc386fc1 "ARC: Add guard macro to uapi/asm/unistd.h"
inhibited multiple inclusion of ARCH unistd.h. This however hosed the system
since Generic syscall table generator relies on it being included twice,
and in lack-of an empty table was emitted by C preprocessor.
Fix that by allowing one exception to rule for the special case (just
like Xtensa)
Suggested-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
When a Kconfig of a codec driver doesn't match with the controller
(CONFIG_SND_HDA_INTEL), it'll result in the non-working automatic
probing. Unfortunately kbuild can't give such a restriction, but at
least, it's possible to show a warning if such a condition is found.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
So far, CONFIG_SND_HDA_CODEC_* kconfigs have been booleans due to
historical reasons. The major reason was that the automatic codec
driver probing wouldn't work if user sets a codec driver as a module
while the controller driver as a built-in. And, another reason was to
avoid exporting symbols of the helper codes when all drivers are built
in.
But, this sort of "kindness" rather confuses people in the end,
especially makes the config refinement via localmodconfig unhappy.
Also, a codec module would still work if you re-bind the controller
driver via sysfs (although it's no automatic loading), so there might
be a slight use case.
That said, better to let people fallen into a pitfall than being too
smart and restrict something. Let's make things straightforward: now
all CONFIG_SND_HDA_CODEC_* become tristate, and all symbols exported
unconditionally.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The fixes here are all driver specific ones, none of which particularly
stand out but all of which are useful to users of those drivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSstRwAAoJELSic+t+oim9zPcP/3bbMKIGKrRhnkGrYWrJQqBS
NVj327bouHo8TwvW/PDrApY9QOXMxv9Z5GMeazJJ4Y6rOZ/zYVOpO1kU6vwGZ+so
GMtMGFzV5+aF0Rvud+YAKFFV6uj/hkP08YKt9tDFAXBia/Ff8MVdvCW1xt1a6wO4
En+tmZKenhELKjtbnhpfCZClcFLYZxmtYR94Tr3NvzaIDMG0cdR5hKF8V93Wr66v
mCyBG8AblRoWklKNoB2UXJup9/xDR1yggaCq8ObSRbNfs8Zxso6fN7LrqjWuNz7w
Yvh/UiRNqWjcrmihaqRvt0KayEXF5ZSvTUB0U7InYmeZZArITMIJ2Z/BXzi1SazG
/kCg9qkNigeUYMpJPDOm71SKjvzSB1mL+Eol/5wC/GxmdFMEriKf5ob7D3Kq1zR+
c83bZZqK+663OXTdmASTB5wDAyDqwHBFVdOQlY4s8L11nP/bHDgeiRAcni4wdfOW
xovgyPY+PMGFJBfqqvoXShNtDsp2gMDhWYDrH9vUvwUQAhdikPbfEU9KhlOfEpiv
/HMD3r0VKbsstBhE9CE5dnmketQNHPDBkBCzhoFmKOQPKhR9lBwNkLiLQEYfvWaB
omg0vPUubTb+EN6lechRoPYI/s6+mZ+nBaj3skI3MUcqfsegL7QLl6O4RNlCOFD9
S83sp2o6MK1qHspj1MP4
=pOaJ
-----END PGP SIGNATURE-----
Merge tag 'asoc-v3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v3.13
The fixes here are all driver specific ones, none of which particularly
stand out but all of which are useful to users of those drivers.
Makes the code slightly shorter.
Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Add support for configuring the sample rate on the SYSCLK side of the
ASRC.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Currently, the driver only supports configuration of the lower sample
rate (FSL) on the ISRCs. With the higher rate being fixed a SYSCLK, this
patch adds support for configuring the higher sample rate (FSH).
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Certain use-cases require the DRE to be disabled so expose controls for
the enables.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Certain use-cases require the DRE to be disabled so expose registers
necessary to control the DRE enables.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
The r8a7790.dtsi file has four sdhi nodes which the first two have the wrong
resource size for their register block. This causes the sh_modbile_sdhi driver
to fail to communicate with card at-all.
Change sdhi{0,1} node size from 0x100 to 0x200 to correct these nodes
as per Kuninori Morimoto's response to the original patch where all four
nodes where changed. sdhi{2,3} are the correct size.
This bug has been present since sdhi resources were added to the r8a7790 by
8c9b1aa418 ("ARM: shmobile: r8a7790: add MMCIF and SDHI DT
templates") in v3.11-rc2.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Tested-by: William Towle <william.towle@codethink.co.uk>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
4dcfa60071
(ARM: DMA-API: better handing of DMA masks for coherent allocations)
exchanged DMA mask check method.
Below warning will appear without this patch
asoc-simple-card asoc-simple-card.0: \
Coherent DMA mask 0xffffffffffffffff is larger than dma_addr_t allows
asoc-simple-card asoc-simple-card.0: \
Driver did not use or check the return value from dma_set_coherent_mask()?
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
This patch allows FILEIO to update hw_max_sectors based on the current
max_bytes_per_io. This is required because vfs_[writev,readv]() can accept
a maximum of 2048 iovecs per call, so the enforced hw_max_sectors really
needs to be calculated based on block_size.
This addresses a >= v3.5 bug where block_size=512 was rejecting > 1M
sized I/O requests, because FD_MAX_SECTORS was hardcoded to 2048 for
the block_size=4096 case.
(v2: Use max_bytes_per_io instead of ->update_hw_max_sectors)
Reported-by: Henrik Goldman <hg@x-formation.com>
Cc: <stable@vger.kernel.org> #3.5+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch moves INIT_WORK setup for cq_desc->cq_[rx,tx]_work into
isert_create_device_ib_res(), instead of being done each callback
invocation in isert_cq_[rx,tx]_callback().
This also fixes a 'INFO: trying to register non-static key' warning
when cancel_work_sync() is called before INIT_WORK has setup the
struct work_struct.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: <stable@vger.kernel.org> #3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
When shutting down a target there is a race condition between
iscsit_del_np() and __iscsi_target_login_thread().
The latter sets the thread pointer to NULL, and the former
tries to issue kthread_stop() on that pointer without any
synchronization.
This patch moves the np->np_thread NULL assignment into
iscsit_del_np(), after kthread_stop() has completed. It also
removes the signal_pending() + np_state check, and only
exits when kthread_should_stop() is true.
Reported-by: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org> #3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Commit 22ceeee16e ("pwm-backlight: Add
power supply support") added a mandatory power supply for the PWM
backlight. Add a fixed 5V regulator to board code with a consumer supply
entry for the backlight device.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
(cherry picked from commit ad11cb9a5cf96346f1240995c672cdbb5501785c)
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Merge patches from Andrew Morton:
"23 fixes and a MAINTAINERS update"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
mm/hugetlb: check for pte NULL pointer in __page_check_address()
fix build with make 3.80
mm/mempolicy: fix !vma in new_vma_page()
MAINTAINERS: add Davidlohr as GPT maintainer
mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
mm/compaction: respect ignore_skip_hint in update_pageblock_skip
mm/mempolicy: correct putback method for isolate pages if failed
mm: add missing dependency in Kconfig
sh: always link in helper functions extracted from libgcc
mm: page_alloc: exclude unreclaimable allocations from zone fairness policy
mm: numa: defer TLB flush for THP migration as long as possible
mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates
mm: fix TLB flush race between migration, and change_protection_range
mm: numa: avoid unnecessary disruption of NUMA hinting during migration
mm: numa: clear numa hinting information on mprotect
sched: numa: skip inaccessible VMAs
mm: numa: avoid unnecessary work on the failure path
mm: numa: ensure anon_vma is locked to prevent parallel THP splits
mm: numa: do not clear PTE for pte_numa update
mm: numa: do not clear PMD during PTE update scan
...
In __page_check_address(), if address's pud is not present,
huge_pte_offset() will return NULL, we should check the return value.
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: qiuxishi <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
According to Documentation/Changes, make 3.80 is still being supported
for building the kernel, hence make files must not make (unconditional)
use of features introduced only in newer versions.
Commit 1bf49dd4be ("./Makefile: export initial ramdisk compression
config option") however introduced "else ifeq" constructs which make
3.80 doesn't understand. Replace the logic there with more conventional
(in the kernel build infrastructure) list constructs (except that the
list here is intentionally limited to exactly one element).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: P J P <ppandit@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new entry for the GPT standard. Any future changes will now be
routed through linux-efi.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After a successful hugetlb page migration by soft offline, the source
page will either be freed into hugepage_freelists or buddy(over-commit
page). If page is in buddy, page_hstate(page) will be NULL. It will
hit a NULL pointer dereference in dequeue_hwpoisoned_huge_page().
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: [<ffffffff81163761>] dequeue_hwpoisoned_huge_page+0x131/0x1d0
PGD c23762067 PUD c24be2067 PMD 0
Oops: 0000 [#1] SMP
So check PageHuge(page) after call migrate_pages() successfully.
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
update_pageblock_skip() only fits to compaction which tries to isolate
by pageblock unit. If isolate_migratepages_range() is called by CMA, it
try to isolate regardless of pageblock unit and it don't reference
get_pageblock_skip() by ignore_skip_hint. We should also respect it on
update_pageblock_skip() to prevent from setting the wrong information.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
queue_pages_range() isolates hugetlbfs pages and putback_lru_pages()
can't handle these. We should change it to putback_movable_pages().
Naoya said that it is worth going into stable, because it can break
in-use hugepage list.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.12.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eliminate the following (rand)config warning by adding missing PROC_FS
dependency:
warning: (HWPOISON_INJECT && MEM_SOFT_DIRTY) selects PROC_PAGE_MONITOR which has unmet direct dependencies (PROC_FS && MMU)
Signed-off-by: Sima Baymani <sima.baymani@gmail.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
E.g. landisk_defconfig, which has CONFIG_NTFS_FS=m:
ERROR: "__ashrdi3" [fs/ntfs/ntfs.ko] undefined!
For "lib-y", if no symbols in a compilation unit are referenced by other
units, the compilation unit will not be included in vmlinux. This
breaks modules that do reference those symbols.
Use "obj-y" instead to fix this.
http://kisskb.ellerman.id.au/kisskb/buildresult/8838077/
This doesn't fix all cases. There are others, e.g. udivsi3.
This is also not limited to sh, many architectures handle this in the
same way.
A simple solution is to unconditionally include all helper functions.
A more complex solution is to make the choice of "lib-y" or "obj-y" depend
on CONFIG_MODULES:
obj-$(CONFIG_MODULES) += ...
lib-y($CONFIG_MODULES) += ...
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Tested-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Reviewed-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen noted a regression in a microbenchmark that loops around
open() and close() on an 8-node NUMA machine and bisected it down to
commit 81c0a2bb51 ("mm: page_alloc: fair zone allocator policy").
That change forces the slab allocations of the file descriptor to spread
out to all 8 nodes, causing remote references in the page allocator and
slab.
The round-robin policy is only there to provide fairness among memory
allocations that are reclaimed involuntarily based on pressure in each
zone. It does not make sense to apply it to unreclaimable kernel
allocations that are freed manually, in this case instantly after the
allocation, and incur the remote reference costs twice for no reason.
Only round-robin allocations that are usually freed through page reclaim
or slab shrinking.
Bisected by Dave Hansen.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
THP migration can fail for a variety of reasons. Avoid flushing the TLB
to deal with THP migration races until the copy is ready to start.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
According to documentation on barriers, stores issued before a LOCK can
complete after the lock implying that it's possible tlb_flush_pending
can be visible after a page table update. As per revised documentation,
this patch adds a smp_mb__before_spinlock to guarantee the correct
ordering.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.
The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.
During that time, a CPU may continue writing to the page.
This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.
When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.
This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).
The basic race looks like this:
CPU A CPU B CPU C
load TLB entry
make entry PTE/PMD_NUMA
fault on entry
read/write old page
start migrating page
change PTE/PMD to new page
read/write old page [*]
flush TLB
reload TLB from new entry
read/write new page
lose data
[*] the old page may belong to a new user at this point!
The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.
This should fix both NUMA migration and compaction.
[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
do_huge_pmd_numa_page() handles the case where there is parallel THP
migration. However, by the time it is checked the NUMA hinting
information has already been disrupted. This patch adds an earlier
check with some helpers.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On a protection change it is no longer clear if the page should be still
accessible. This patch clears the NUMA hinting fault bits on a
protection change.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Inaccessible VMA should not be trapping NUMA hint faults. Skip them.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a PMD changes during a THP migration then migration aborts but the
failure path is doing more work than is necessary.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The anon_vma lock prevents parallel THP splits and any associated
complexity that arises when handling splits during THP migration. This
patch checks if the lock was successfully acquired and bails from THP
migration if it failed for any reason.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The TLB must be flushed if the PTE is updated but change_pte_range is
clearing the PTE while marking PTEs pte_numa without necessarily
flushing the TLB if it reinserts the same entry. Without the flush,
it's conceivable that two processors have different TLBs for the same
virtual address and at the very least it would generate spurious faults.
This patch only unmaps the pages in change_pte_range for a full
protection change.
[riel@redhat.com: write pte_numa pte back to the page tables]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Chegu Vinod <chegu_vinod@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the PMD is flushed then a parallel fault in handle_mm_fault() will
enter the pmd_none and do_huge_pmd_anonymous_page() path where it'll
attempt to insert a huge zero page. This is wasteful so the patch
avoids clearing the PMD when setting pmd_numa.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On x86, PMD entries are similar to _PAGE_PROTNONE protection and are
handled as NUMA hinting faults. The following two page table protection
bits are what defines them
_PAGE_NUMA:set _PAGE_PRESENT:clear
A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE,
_PAGE_PSE or _PAGE_NUMA bits are set. If pmdp_invalidate encounters a
pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be
considered not present by the CPU but present by pmd_present. The
existing caller of pmdp_invalidate should handle it but it's an
inconsistent state for a PMD. This patch keeps the state consistent
when calling pmdp_invalidate.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>