OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Thomas Petazzoni	9ff0bb5ba6	ARM: 8180/1: mm: implement no-highmem fast path in kmap_atomic_pfn() Since CONFIG_HIGHMEM got enabled on ARMv5 Kirkwood, we have noticed a very significant drop in networking performance. The test were conducted on an OpenBlocks A7 board. Without this patch, the outgoing performance measured with iperf are: - highmem OFF, TSO OFF 544 Mbit/s - highmem OFF, TSO ON 942 Mbit/s - highmem ON, TSO OFF 306 Mbit/s - highmem ON, TSO ON 246 Mbit/s On this Kirkwood platform, the L2 cache is a Feroceon cache, and with this cache, all the range operations have to be done on virtual addresses and not physical addresses. Therefore, whenever CONFIG_HIGHMEM is enabled, the cache maintenance operations call kmap_atomic_pfn() and kunmap_atomic(). However, kmap_atomic_pfn() does not implement the same fast path for non-highmem pages as the one implemented in kmap_atomic(), and this is one of the reason for the performance drop. While this patch does not fully restore the performances, it clearly improves them a lot: without patch with patch - highmem ON, TSO OFF 306 Mbit/s 387 Mbit/s - highmem ON, TSO ON 246 Mbit/s 434 Mbit/s We're still far from the !CONFIG_HIGHMEM performances, but it does improve a bit the situation. Thanks a lot to Ezequiel Garcia and Gregory Clement for all the testing work around this topic. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-10-29 17:22:07 +00:00
Liu Hua	a05e54c103	ARM: 8031/2: change fixmap mapping region to support 32 CPUs In 32-bit ARM systems, the fixmap mapping region can support no more than 14 CPUs(total: 896k; one CPU: 64K). And we can configure NR_CPUS up to 32. So there is a mismatch. This patch moves fixmapping region downwards to region 0xffc00000- 0xffe00000. Then the fixmap mapping region can support up to 32 CPUs. Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Liu Hua <sdu.liu@huawei.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-04-23 11:09:42 +01:00
Liu Hua	4221e2e6b3	ARM: 8031/1: fixmap: remove FIX_KMAP_BEGIN and FIX_KMAP_END It seems that these two macros are not used by non architecture specific code. And on ARM FIX_KMAP_BEGIN equals zero. This patch removes these two macros. Instead, using FIX_KMAP_NR_PTES to tell the pte number belonged to fixmap mapping region. The code will become clearer when I introduce a bugfix on fixmap mapping region. Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Liu Hua <sdu.liu@huawei.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-04-23 11:09:27 +01:00
Linus Torvalds	12679a2d7e	Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm Pull more ARM updates from Russell King. This got a fair number of conflicts with the <asm/system.h> split, but also with some other sparse-irq and header file include cleanups. They all looked pretty trivial, though. * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (59 commits) ARM: fix Kconfig warning for HAVE_BPF_JIT ARM: 7361/1: provide XIP_VIRT_ADDR for no-MMU builds ARM: 7349/1: integrator: convert to sparse irqs ARM: 7259/3: net: JIT compiler for packet filters ARM: 7334/1: add jump label support ARM: 7333/2: jump label: detect %c support for ARM ARM: 7338/1: add support for early console output via semihosting ARM: use set_current_blocked() and block_sigmask() ARM: exec: remove redundant set_fs(USER_DS) ARM: 7332/1: extract out code patch function from kprobes ARM: 7331/1: extract out insn generation code from ftrace ARM: 7330/1: ftrace: use canonical Thumb-2 wide instruction format ARM: 7351/1: ftrace: remove useless memory checks ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path ARM: Versatile Express: add NO_IOPORT ARM: get rid of asm/irq.h in asm/prom.h ARM: 7319/1: Print debug info for SIGBUS in user faults ARM: 7318/1: gic: refactor irq_start assignment ARM: 7317/1: irq: avoid NULL check in for_each_irq_desc loop ARM: 7315/1: perf: add support for the Cortex-A7 PMU ...	2012-03-29 16:53:48 -07:00
Cong Wang	a24401bcf4	highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename] Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:30 +08:00
Russell King	0d31fe47b0	ARM: pgtable: provide get_top_pte() to complement set_top_pte() Provide get_top_pte() to complement set_top_pte(), moving the only users of TOP_PTE to arch/arm/mm/mm.h. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-01-26 20:07:41 +00:00
Russell King	67ece14431	ARM: pgtable: consolidate set_pte_ext(TOP_PTE,...) + tlb flush A number of places establish a PTE in our top page table and immediately flush the TLB. Rather than having this at every callsite, provide an inline function for this purpose. This changes some global tlb flushes to be local; each time we setup one of these mappings, we always do it with preemption disabled which would prevent us migrating to another CPU. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-01-26 20:06:28 +00:00
Nicolas Pitre	39af22a792	ARM: get rid of kmap_high_l1_vipt() Since commit `3e4d3af501` "mm: stack based kmap_atomic()", it is no longer necessary to carry an ad hoc version of kmap_atomic() added in commit `7e5a69e83b` "ARM: 6007/1: fix highmem with VIPT cache and DMA" to cope with reentrancy. In fact, it is now actively wrong to rely on fixed kmap type indices (namely KM_L1_CACHE) as kmap_atomic() totally ignores them now and a concurrent instance of it may reuse any slot for any purpose. Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>	2010-12-19 12:56:46 -05:00
Peter Zijlstra	20273941f2	mm: fix race in kunmap_atomic() Christoph reported a nice splat which illustrated a race in the new stack based kmap_atomic implementation. The problem is that we pop our stack slot before we're completely done resetting its state -- in particular clearing the PTE (sometimes that's CONFIG_DEBUG_HIGHMEM). If an interrupt happens before we actually clear the PTE used for the last slot, that interrupt can reuse the slot in a dirty state, which triggers a BUG in kmap_atomic(). Fix this by introducing kmap_atomic_idx() which reports the current slot index without actually releasing it and use that to find the PTE and delay the _pop() until after we're completely done. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reported-by: Christoph Hellwig <hch@infradead.org> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:05 -07:00
Peter Zijlstra	3e4d3af501	mm: stack based kmap_atomic() Keep the current interface but ignore the KM_type and use a stack based approach. The advantage is that we get rid of crappy code like: #define __KM_PTE \ (in_nmi() ? KM_NMI_PTE : \ in_irq() ? KM_IRQ_PTE : \ KM_PTE0) and in general can stop worrying about what context we're in and what kmap slots might be appropriate for that. The downside is that FRV kmap_atomic() gets more expensive. For now we use a CPP trick suggested by Andrew: #define kmap_atomic(page, args...) __kmap_atomic(page) to avoid having to touch all kmap_atomic() users in a single patch. [ not compiled on: - mn10300: the arch doesn't actually build with highmem to begin with ] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:08 -07:00
Cesar Eduardo Barros	597781f3e5	kmap_atomic: make kunmap_atomic() harder to misuse kunmap_atomic() is currently at level -4 on Rusty's "Hard To Misuse" list[1] ("Follow common convention and you'll get it wrong"), except in some architectures when CONFIG_DEBUG_HIGHMEM is set[2][3]. kunmap() takes a pointer to a struct page; kunmap_atomic(), however, takes takes a pointer to within the page itself. This seems to once in a while trip people up (the convention they are following is the one from kunmap()). Make it much harder to misuse, by moving it to level 9 on Rusty's list[4] ("The compiler/linker won't let you get it wrong"). This is done by refusing to build if the type of its first argument is a pointer to a struct page. The real kunmap_atomic() is renamed to kunmap_atomic_notypecheck() (which is what you would call in case for some strange reason calling it with a pointer to a struct page is not incorrect in your code). The previous version of this patch was compile tested on x86-64. [1] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html [2] In these cases, it is at level 5, "Do it right or it will always break at runtime." [3] At least mips and powerpc look very similar, and sparc also seems to share a common ancestor with both; there seems to be quite some degree of copy-and-paste coding here. The include/asm/highmem.h file for these three archs mention x86 CPUs at its top. [4] http://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html [5] As an aside, could someone tell me why mn10300 uses unsigned long as the first parameter of kunmap_atomic() instead of void *? Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net> Cc: Russell King <linux@arm.linux.org.uk> (arch/arm) Cc: Ralf Baechle <ralf@linux-mips.org> (arch/mips) Cc: David Howells <dhowells@redhat.com> (arch/frv, arch/mn10300) Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> (arch/mn10300) Cc: Kyle McMartin <kyle@mcmartin.ca> (arch/parisc) Cc: Helge Deller <deller@gmx.de> (arch/parisc) Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> (arch/parisc) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> (arch/powerpc) Cc: Paul Mackerras <paulus@samba.org> (arch/powerpc) Cc: "David S. Miller" <davem@davemloft.net> (arch/sparc) Cc: Thomas Gleixner <tglx@linutronix.de> (arch/x86) Cc: Ingo Molnar <mingo@redhat.com> (arch/x86) Cc: "H. Peter Anvin" <hpa@zytor.com> (arch/x86) Cc: Arnd Bergmann <arnd@arndb.de> (include/asm-generic) Cc: Rusty Russell <rusty@rustcorp.com.au> ("Hard To Misuse" list) Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-09 20:44:54 -07:00
Gary King	831e8047eb	ARM: 6279/1: highmem: fix SMP preemption bug in kmap_high_l1_vipt smp_processor_id() must not be called from a preemptible context (this is checked by CONFIG_DEBUG_PREEMPT). kmap_high_l1_vipt() was doing so. This lead to a problem where the wrong per_cpu kmap_high_l1_vipt_depth could be incremented, causing a BUG_ON(*depth <= 0); in kunmap_high_l1_vipt(). The solution is to move the call to smp_processor_id() after the call to preempt_disable(). Originally by: Andrew Howe <ahowe@nvidia.com> Signed-off-by: Gary King <gking@nvidia.com> Acked-by: Nicolas Pitre <nico.as.pitre@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2010-07-30 23:16:07 +01:00
Nicolas Pitre	17ebba1fe4	ARM: 6165/1: trap overflows on highmem pages from kmap_atomic when debugging When CONFIG_DEBUG_HIGHMEM is used, the fixmap entry used for a highmem page by kmap_atomic() is always cleared by kunmap_atomic(). This helps find bad usages such as dereferences after the unmap, or overflow into the adjacent fixmap areas. But this debugging aid is completely bypassed when a kmap for the same page already exists as the kmap is reused instead. ON VIVT systems we have no choice but to reuse that kmap due to cache coherency issues, but on non VIVT systems we should always force the fixmap usage when debugging is active. Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2010-06-08 19:25:50 +01:00
Nicolas Pitre	7e5a69e83b	ARM: 6007/1: fix highmem with VIPT cache and DMA The VIVT cache of a highmem page is always flushed before the page is unmapped. This cache flush is explicit through flush_cache_kmaps() in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in kunmap_atomic(). There is also an implicit flush of those highmem pages that were part of a process that just terminated making those pages free as the whole VIVT cache has to be flushed on every task switch. Hence unmapped highmem pages need no cache maintenance in that case. However unmapped pages may still be cached with a VIPT cache because the cache is tagged with physical addresses. There is no need for a whole cache flush during task switching for that reason, and despite the explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(), some highmem pages that were mapped in user space end up still cached even when they become unmapped. So, we do have to perform cache maintenance on those unmapped highmem pages in the context of DMA when using a VIPT cache. Unfortunately, it is not possible to perform that cache maintenance using physical addresses as all the L1 cache maintenance coprocessor functions accept virtual addresses only. Therefore we have no choice but to set up a temporary virtual mapping for that purpose. And of course the explicit cache flushing when unmapping a highmem page on a system with a VIPT cache now can go, which should increase performance. While at it, because the code in __flush_dcache_page() has to be modified anyway, let's also make sure the mapped highmem pages are pinned with kmap_high_get() for the duration of the cache maintenance operation. Because kunmap() does unmap highmem pages lazily, it was reported by Gary King <GKing@nvidia.com> that those pages ended up being unmapped during cache maintenance on SMP causing segmentation faults. Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2010-04-14 11:11:27 +01:00
Russell King	2c9b9c8490	ARM: add size argument to __cpuc_flush_dcache_page ... and rename the function since it no longer operates on just pages. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-12-14 14:53:22 +00:00
Russell King	6a5e293f1b	ARM: Add kmap_atomic type debugging Seemingly this support was missed when highmem was added, so DEBUG_HIGHMEM wouldn't have checked the kmap_atomic type. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-10-11 16:29:48 +01:00
Nicolas Pitre	7929eb9cf6	ARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem Let's suppose a highmem page is kmap'd with kmap(). A pkmap entry is used, the page mapped to it, and the virtual cache is dirtied. Then kunmap() is used which does virtually nothing except for decrementing a usage count. Then, let's suppose the _same_ page gets mapped using kmap_atomic(). It is therefore mapped onto a fixmap entry instead, which has a different virtual address unaware of the dirty cache data for that page sitting in the pkmap mapping. Fortunately it is easy to know if a pkmap mapping still exists for that page and use it directly with kmap_atomic(), thanks to kmap_high_get(). And actual testing with a printk in the added code path shows that this condition is actually met extremely frequently. Seems that we've been quite lucky that things have worked so well with highmem so far. Cc: stable@kernel.org Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-09-04 19:20:07 +01:00
Nicolas Pitre	d73cd42893	[ARM] kmap support The kmap virtual area borrows a 2MB range at the top of the 16MB area below PAGE_OFFSET currently reserved for kernel modules and/or the XIP kernel. This 2MB corresponds to the range covered by 2 consecutive second-level page tables, or a single pmd entry as seen by the Linux page table abstraction. Because XIP kernels are unlikely to be seen on systems needing highmem support, there shouldn't be any shortage of VM space for modules (14 MB for modules is still way more than twice the typical usage). Because the virtual mapping of highmem pages can go away at any moment after kunmap() is called on them, we need to bypass the delayed cache flushing provided by flush_dcache_page() in that case. The atomic kmap versions are based on fixmaps, and __cpuc_flush_dcache_page() is used directly in that case. Signed-off-by: Nicolas Pitre <nico@marvell.com>	2009-03-15 21:01:20 -04:00

18 Commits