linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Nick Piggin	db64fe0225	mm: rewrite vmap layer Rewrite the vmap allocator to use rbtrees and lazy tlb flushing, and provide a fast, scalable percpu frontend for small vmaps (requires a slightly different API, though). The biggest problem with vmap is actually vunmap. Presently this requires a global kernel TLB flush, which on most architectures is a broadcast IPI to all CPUs to flush the cache. This is all done under a global lock. As the number of CPUs increases, so will the number of vunmaps a scaled workload will want to perform, and so will the cost of a global TLB flush. This gives terrible quadratic scalability characteristics. Another problem is that the entire vmap subsystem works under a single lock. It is a rwlock, but it is actually taken for write in all the fast paths, and the read locking would likely never be run concurrently anyway, so it's just pointless. This is a rewrite of vmap subsystem to solve those problems. The existing vmalloc API is implemented on top of the rewritten subsystem. The TLB flushing problem is solved by using lazy TLB unmapping. vmap addresses do not have to be flushed immediately when they are vunmapped, because the kernel will not reuse them again (would be a use-after-free) until they are reallocated. So the addresses aren't allocated again until a subsequent TLB flush. A single TLB flush then can flush multiple vunmaps from each CPU. XEN and PAT and such do not like deferred TLB flushing because they can't always handle multiple aliasing virtual addresses to a physical address. They now call vm_unmap_aliases() in order to flush any deferred mappings. That call is very expensive (well, actually not a lot more expensive than a single vunmap under the old scheme), however it should be OK if not called too often. The virtual memory extent information is stored in an rbtree rather than a linked list to improve the algorithmic scalability. There is a per-CPU allocator for small vmaps, which amortizes or avoids global locking. To use the per-CPU interface, the vm_map_ram / vm_unmap_ram interfaces must be used in place of vmap and vunmap. Vmalloc does not use these interfaces at the moment, so it will not be quite so scalable (although it will use lazy TLB flushing). As a quick test of performance, I ran a test that loops in the kernel, linearly mapping then touching then unmapping 4 pages. Different numbers of tests were run in parallel on an 4 core, 2 socket opteron. Results are in nanoseconds per map+touch+unmap. threads vanilla vmap rewrite 1 14700 2900 2 33600 3000 4 49500 2800 8 70631 2900 So with a 8 cores, the rewritten version is already 25x faster. In a slightly more realistic test (although with an older and less scalable version of the patch), I ripped the not-very-good vunmap batching code out of XFS, and implemented the large buffer mapping with vm_map_ram and vm_unmap_ram... along with a couple of other tricks, I was able to speed up a large directory workload by 20x on a 64 CPU system. I believe vmap/vunmap is actually sped up a lot more than 20x on such a system, but I'm running into other locks now. vmap is pretty well blown off the profiles. Before: 1352059 total 0.1401 798784 _write_lock 8320.6667 <- vmlist_lock 529313 default_idle 1181.5022 15242 smp_call_function 15.8771 <- vmap tlb flushing 2472 __get_vm_area_node 1.9312 <- vmap 1762 remove_vm_area 4.5885 <- vunmap 316 map_vm_area 0.2297 <- vmap 312 kfree 0.1950 300 _spin_lock 3.1250 252 sn_send_IPI_phys 0.4375 <- tlb flushing 238 vmap 0.8264 <- vmap 216 find_lock_page 0.5192 196 find_next_bit 0.3603 136 sn2_send_IPI 0.2024 130 pio_phys_write_mmr 2.0312 118 unmap_kernel_range 0.1229 After: 78406 total 0.0081 40053 default_idle 89.4040 33576 ia64_spinlock_contention 349.7500 1650 _spin_lock 17.1875 319 __reg_op 0.5538 281 _atomic_dec_and_lock 1.0977 153 mutex_unlock 1.5938 123 iget_locked 0.1671 117 xfs_dir_lookup 0.1662 117 dput 0.1406 114 xfs_iget_core 0.0268 92 xfs_da_hashname 0.1917 75 d_alloc 0.0670 68 vmap_page_range 0.0462 <- vmap 58 kmem_cache_alloc 0.0604 57 memset 0.0540 52 rb_next 0.1625 50 __copy_user 0.0208 49 bitmap_find_free_region 0.2188 <- vmap 46 ia64_sn_udelay 0.1106 45 find_inode_fast 0.1406 42 memcmp 0.2188 42 finish_task_switch 0.1094 42 __d_lookup 0.0410 40 radix_tree_lookup_slot 0.1250 37 _spin_unlock_irqrestore 0.3854 36 xfs_bmapi 0.0050 36 kmem_cache_free 0.0256 35 xfs_vn_getattr 0.0322 34 radix_tree_lookup 0.1062 33 __link_path_walk 0.0035 31 xfs_da_do_buf 0.0091 30 _xfs_buf_find 0.0204 28 find_get_page 0.0875 27 xfs_iread 0.0241 27 __strncpy_from_user 0.2812 26 _xfs_buf_initialize 0.0406 24 _xfs_buf_lookup_pages 0.0179 24 vunmap_page_range 0.0250 <- vunmap 23 find_lock_page 0.0799 22 vm_map_ram 0.0087 <- vmap 20 kfree 0.0125 19 put_page 0.0330 18 __kmalloc 0.0176 17 xfs_da_node_lookup_int 0.0086 17 _read_lock 0.0885 17 page_waitqueue 0.0664 vmap has gone from being the top 5 on the profiles and flushing the crap out of all TLBs, to using less than 1% of kernel time. [akpm@linux-foundation.org: cleanups, section fix] [akpm@linux-foundation.org: fix build on alpha] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:32 -07:00
Ingo Molnar	3dd392a407	Merge branch 'linus' into x86/pat2 Conflicts: arch/x86/mm/init_64.c	2008-10-10 19:30:08 +02:00
Suresh Siddha	ad5ca55f6b	x86, cpa: srlz cpa(), global flush tlb after splitting big page and before doing cpa Do a global flush tlb after splitting the large page and before we do the actual change page attribute in the PTE. With out this, we violate the TLB application note, which says "The TLBs may contain both ordinary and large-page translations for a 4-KByte range of linear addresses. This may occur if software modifies the paging structures so that the page size used for the address range changes. If the two translations differ with respect to page frame or attributes (e.g., permissions), processor behavior is undefined and may be implementation-specific." And also serialize cpa() (for !DEBUG_PAGEALLOC which uses large identity mappings) using cpa_lock. So that we don't allow any other cpu, with stale large tlb entries change the page attribute in parallel to some other cpu splitting a large page entry along with changing the attribute. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:17 +02:00
Suresh Siddha	8311eb84bf	x86, cpa: remove cpa pool code Interrupt context no longer splits large page in cpa(). So we can do away with cpa memory pool code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:16 +02:00
Suresh Siddha	55121b4369	x86, cpa: no need to check alias for __set_pages_p/__set_pages_np No alias checking needed for setting present/not-present mapping. Otherwise, we may need to break large pages for 64-bit kernel text mappings (this adds to complexity if we want to do this from atomic context especially, for ex: with CONFIG_DEBUG_PAGEALLOC). Let's keep it simple! Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:15 +02:00
Ingo Molnar	e496e3d645	Merge branches 'x86/alternatives', 'x86/cleanups', 'x86/commandline', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/exports', 'x86/fpu', 'x86/gart', 'x86/idle', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/oprofile', 'x86/paravirt', 'x86/reboot', 'x86/sparse-fixes', 'x86/tsc', 'x86/urgent' and 'x86/vmalloc' into x86-v28-for-linus-phase1	2008-10-06 18:17:07 +02:00
Bruce Allan	a03352d2c1	x86: export set_memory_ro and set_memory_rw Export set_memory_ro() and set_memory_rw() calls for use by drivers that need to have more debug information about who might be writing to memory space. this was initially developed for use while debugging a memory corruption problem with e1000e. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-09-30 09:06:41 +02:00
Ingo Molnar	ea1c9de45e	Merge branch 'x86/urgent' into x86/cleanups	2008-08-25 11:10:42 +02:00
Venki Pallipadi	01de05af94	x86: have set_memory_array_{uc,wb} coalesce memtypes, fix Fix the start addr for free_memtype calls in the error path. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Rene Herman <rene.herman@keyaccess.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-23 17:33:12 +02:00
Rene Herman	c5e147cf5a	x86: have set_memory_array_{uc,wb} coalesce memtypes. Actually, might as well simply reconstruct the memtype list at free time I guess. How is this for a coalescing version of the array functions? Compiles, boots and provides me with: root@7ixe4:~# wc -l /debug/x86/pat_memtype_list 53 /debug/x86/pat_memtype_list otherwise (down from 16384+). Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-22 06:07:32 +02:00
Rene Herman	9a79f4f491	x86: {reverve,free}_memtype() take a physical address The new set_memory_array_{uc,wb}() pass virtual addresses to {reserve,free}_memtype() it seems. Signed-off-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-22 06:07:32 +02:00
Ingo Molnar	8b53b57576	Merge branch 'x86/urgent' into x86/pat Conflicts: arch/x86/mm/pageattr.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-22 06:06:51 +02:00
Shaohua Li	d75586ad01	x86, pageattr: introduce APIs to change pageattr of a page array Add array interface APIs of pageattr. page based cache flush is quite slow for a lot of pages. If pages are more than 1024 (4M), the patch will use a wbinvd(). We have a simple test here (run a 3d game - open arena), nearly all agp memory allocation are small (< 1M), so suppose this will not impact runtime performance. Signed-off-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-21 13:47:45 +02:00
Ingo Molnar	cacf890694	Revert "introduce two APIs for page attribute" This reverts commit `1ac2f7d55b`.	2008-08-21 13:46:33 +02:00
venkatesh.pallipadi@intel.com	c15238df3b	x86: PAT proper tracking of set_memory_uc and friends Big thinko in pat memtype tracking code. reserve_memtype should be called with physical address and not virtual address. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-21 13:27:32 +02:00
Ingo Molnar	7393423dd9	Merge branch 'linus' into x86/cleanups	2008-08-20 11:52:15 +02:00
Nick Piggin	5843d9a4d0	x86, pat: avoid highmem cache attribute aliasing Highmem code can leave ptes and tlb entries around for a given page even after kunmap, and after it has been freed. >From what I can gather, the PAT code may change the cache attributes of arbitrary physical addresses (ie. including highmem pages), which would result in aliases in the case that it operates on one of these lazy tlb highmem pages. Flushing kmaps should solve the problem. I've also just added code for conditional flushing if we haven't got any dangling highmem aliases -- this should help performance if we change page attributes frequently or systems that aren't using much highmem pages (eg. if < 4G RAM). Should be turned into 2 patches, but just for RFC... Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 17:22:57 +02:00
Shaohua Li	1ac2f7d55b	introduce two APIs for page attribute Introduce two APIs for page attribute. flushing tlb/cache in every page attribute is expensive. AGP gart usually will do a lot of operations to change a page to uc, new APIs can reduce flush. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Cc: airlied@linux.ie Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 16:30:45 +02:00
Hugh Dickins	a06de63000	x86: fix /proc/meminfo DirectMap Do we actually want these DirectMap lines in the x86 /proc/meminfo? I can see they're interesting to CPA developers and TLB optimizers, but they don't fit its usual "where has all my memory gone?" usage. If they are to stay, here are some fixes. 1. On x86_32 without PAE, they're not 2M but 4M pages: no need to mess with the internal enum, but show the right name to users. 2. Many machines can never show anything but 0 for DirectMap1G, so suppress that line unless direct_gbpages are really enabled. 3. The unit in /proc/meminfo is kB not number of pages: HugePages messed that up, but they're an example to regret not to follow. 4. Once we use kB, it's easy to see that 1GB has gone missing (which explains why CONFIG_CPA_DEBUG=y soon wraps DirectMap2M negative): because head_64.S's level2_ident_pgt entries were not counted. My fix is not ideal, but works for more and for less than 1G, and avoids interfering with early bootup pagetable contortions. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 15:27:55 +02:00
Arjan van de Ven	875e40b975	x86: use WARN() in arch/x86/mm/pageattr.c Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes part of the warning section for better reporting/collection. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-13 19:05:39 +02:00
Joerg Roedel	15ae2d76ce	x86: convert pageattr.c from round_up to roundup Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:39:20 +02:00
Ingo Molnar	1a781a777b	Merge branch 'generic-ipi' into generic-ipi-for-linus Conflicts: arch/powerpc/Kconfig arch/s390/kernel/time.c arch/x86/kernel/apic_32.c arch/x86/kernel/cpu/perfctr-watchdog.c arch/x86/kernel/i8259_64.c arch/x86/kernel/ldt.c arch/x86/kernel/nmi_64.c arch/x86/kernel/smpboot.c arch/x86/xen/smp.c include/asm-x86/hw_irq_32.h include/asm-x86/hw_irq_64.h include/asm-x86/mach-default/irq_vectors.h include/asm-x86/mach-voyager/irq_vectors.h include/asm-x86/smp.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-15 21:55:59 +02:00
Ingo Molnar	5806b81ac1	Merge branch 'auto-ftrace-next' into tracing/for-linus Conflicts: arch/x86/kernel/entry_32.S arch/x86/kernel/process_32.c arch/x86/kernel/process_64.c arch/x86/lib/Makefile include/asm-x86/irqflags.h kernel/Makefile kernel/sched.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-14 16:11:52 +02:00
Yinghai Lu	965194c15d	x86: max_low_pfn_mapped fix, #2 tighten the boundary checks around max_low_pfn_mapped - dont overmap nor undermap into holes. also print out tseg for AMD cpus, for diagnostic purposes. (this is an SMM area, and we split up any big mappings around that area) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-13 08:19:16 +02:00
Yinghai Lu	f361a450bf	x86: introduce max_low_pfn_mapped for 64-bit when more than 4g memory is installed, don't map the big hole below 4g. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-11 10:24:04 +02:00
Ingo Molnar	6924d1ab8b	Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel	2008-07-08 09:16:56 +02:00
Thomas Gleixner	65280e613f	x86: janitor CPA statistics patch 1) Remove __meminit from update_pages_count. It is used inside split_pages() 2) Make the code depend on PROC_FS. Doing statistics for nothing is useless and not adding useless code is nice to the Linux tiny folks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 08:12:05 +02:00
Andi Kleen	ce0c0e50f9	x86, generic: CPA add statistics about state of direct mapping v4 Add information about the mapping state of the direct mapping to /proc/meminfo. I chose /proc/meminfo because that is where all the other memory statistics are too and it is a generally useful metric even outside debugging situations. A lot of split kernel pages means the kernel will run slower. This way we can see how many large pages are really used for it and how many are split. Useful for general insight into the kernel. v2: Add hotplug locking to 64bit to plug a very obscure theoretical race. 32bit doesn't need it because it doesn't support hotadd for lowmem. Fix some typos v3: Rename dpages_cnt Add CONFIG ifdef for count update as requested by tglx Expand description v4: Fix stupid bugs added in v3 Move update_page_count to pageattr.c Signed-off-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 08:11:45 +02:00
Jens Axboe	15c8b6c1aa	on_each_cpu(): kill unused 'retry' parameter It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:24:38 +02:00
Andreas Herrmann	499f8f84b8	x86: rename pat_wc_enabled to pat_enabled BTW, what does pat_wc_enabled stand for? Does it mean "write-combining"? Currently it is used to globally switch on or off PAT support. Thus I renamed it to pat_enabled. I think this increases readability (and hope that I didn't miss something). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-12 10:14:27 +02:00
Pekka Paalanen	75bb88350e	x86 mmiotrace: use lookup_address() Use lookup_address() from pageattr.c instead of doing the same manually. Also had to EXPORT_SYMBOL_GPL(lookup_address) to make this work for modules. This also fixes "undefined symbol 'init_mm'" compile error for x86_32. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-24 11:21:32 +02:00
Suresh Siddha	de33c442ed	x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range() Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC. Once all the X drivers move to ioremap_wc(), we can go back to strong UC semantics for ioremap() and ioremap_nocache(). To avoid attribute aliasing issues, pci_mmap_page_range() will also use UC_MINUS for default non write-combining mapping request. Next steps: a) change all the video drivers using ioremap() or ioremap_nocache() and adding WC MTTR using mttr_add() to ioremap_wc() b) for strict usage, we can go back to strong uc semantics for ioremap() and ioremap_nocache() after some grace period for completing step-a. c) user level X server needs to use the appropriate method for setting up WC mapping (like using resourceX_wc sysfs file instead of adding MTRR for WC and using /dev/mem or resourceX under /sys) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:35 +02:00
Linus Torvalds	4b7227ca32	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-xen-next * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-xen-next: (52 commits) xen: add balloon driver xen: allow compilation with non-flat memory xen: fold xen_sysexit into xen_iret xen: allow set_pte_at on init_mm to be lockless xen: disable preemption during tlb flush xen pvfb: Para-virtual framebuffer, keyboard and pointer driver xen: Add compatibility aliases for frontend drivers xen: Module autoprobing support for frontend drivers xen blkfront: Delay wait for block devices until after the disk is added xen/blkfront: use bdget_disk xen: Make xen-blkfront write its protocol ABI to xenstore xen: import arch generic part of xencomm xen: make grant table arch portable xen: replace callers of alloc_vm_area()/free_vm_area() with xen_ prefixed one xen: make include/xen/page.h portable moving those definitions under asm dir xen: add resend_irq_on_evtchn() definition into events.c Xen: make events.c portable for ia64/xen support xen: move events.c to drivers/xen for IA64/Xen support xen: move features.c from arch/x86/xen/features.c to drivers/xen xen: add missing definitions in include/xen/interface/vcpu.h which ia64/xen needs ...	2008-04-25 12:32:10 -07:00
Jeremy Fitzhardinge	6944a9c894	x86: rename paravirt_alloc_pt etc after the pagetable structure Rename (alloc\|release)_(pt\|pd) to pte/pmd to explicitly match the name of the appropriate pagetable level structure. [ x86.git merge work by Mark McLoughlin <markmc@redhat.com> ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	1d262d3a49	x86: put paravirt stubs into common asm/pgalloc.h Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Ingo Molnar	a4928cffe6	"make namespacecheck" fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:15:44 +02:00
Ingo Molnar	d1a4be630f	x86 PAT: fix mmap() of holes do not return a -EINVAL when mmap()-ing PCI holes. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com>	2008-04-18 23:40:49 +02:00
Andi Kleen	c9caa02c52	x86: add set_memory_4k to pageattr.c Add a new function to force split large pages into 4k pages. This is needed for some followup optimizations. I had to add a new field to cpa_data to pass down the information that try_preserve_large_page should not run. Right now no set_page_4k() because I didn't need it and all the specialized users I have in mind would be more comfortable with pure addresses. I also didn't export it because it's unlikely external code needs it. Signed-off-by: Andi Kleen <ak@suse.de> Cc: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
venkatesh.pallipadi@intel.com	ef354af462	x86: PAT add set_memory_wc() interface Add a set_memory_wc interface(), similar to set_memory_uc interface. Callers has to call set_memory_uc, set_memory_wb and set_memory_wc, set_memory_wb as pairs. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:20 +02:00
venkatesh.pallipadi@intel.com	1219333dfd	x86: PAT use reserve free memtype in set_memory_uc Use reserve_memtype and free_memtype interfaces in set_memory_uc/set_memory_wb interfaces to avoid aliasing. Usage model of set_memory_uc and set_memory_wb is for RAM memory and users will first call set_memory_uc and call set_memory_wb after use to reset the attribute. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
venkatesh.pallipadi@intel.com	2e5d9c857d	x86: PAT infrastructure patch Sets up pat_init() infrastructure. PAT MSR has following setting. PAT \|PCD \|\|PWT \|\|\| 000 WB _PAGE_CACHE_WB 001 WC _PAGE_CACHE_WC 010 UC- _PAGE_CACHE_UC_MINUS 011 UC _PAGE_CACHE_UC We are effectively changing WT from boot time setting to WC. UC_MINUS is used to provide backward compatibility to existing /dev/mem users(X). reserve_memtype and free_memtype are new interfaces for maintaining alias-free mapping. It is currently implemented in a simple way with a linked list and not optimized. reserve and free tracks the effective memory type, as a result of PAT and MTRR setting rather than what is actually requested in PAT. pat_init piggy backs on mtrr_init as the rules for setting both pat and mtrr are same. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Thomas Gleixner	ee7ae7a198	x86: add debug info to DEBUG_PAGEALLOC Add debug information for DEBUG_PAGEALLOC to get some statistics about the pool usage and split status. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Suresh Siddha	d546b67a94	x86: fix performance drop for glx fix the 3D performance drop reported at: http://bugzilla.kernel.org/show_bug.cgi?id=10328 fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with WC attribute. Recent changes in page attribute code made both ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This breaks the graphics performance, as the effective memory type is UC instead of expected WC. The correct way to fix this is to add ioremap_wc() (which uses UC- in the absence of PAT kernel support and WC with PAT) and change all the fb drivers to use this new ioremap_wc() API. We can take this correct and longer route for post 2.6.25. For now, revert back to the UC- behavior for ioremap/ioremap_nocache. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-26 22:23:41 +01:00
Rafael J. Wysocki	9b5cf48b06	x86: revert "x86: CPA: avoid split of alias mappings" Revert: commit `8be8f54bae` Author: Thomas Gleixner <tglx@linutronix.de> Date: Sat Feb 23 20:43:21 2008 +0100 x86: CPA: avoid split of alias mappings because it clearly mishandles the case when __change_page_attr(), called from __change_page_attr_set_clr(), changes cpa->processed to 1 and cpa_process_alias(cpa) is executed right after that. This crashes my x86-64 test box early in the boot process (ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-03 14:18:27 +01:00
Thomas Gleixner	8be8f54bae	x86: CPA: avoid split of alias mappings avoid over-eager large page splitup. When the target area needs to be split or is split already (ioremap) then the current code enforces the split of large mappings in the alias regions even if we could avoid it. Use a separate variable processed in the cpa_data structure to carry the number of pages which have been processed instead of reusing the numpages variable. This keeps numpages intact and gives the alias code a chance to keep large mappings intact. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:55:42 +01:00
Ingo Molnar	92cb54a37a	x86: make DEBUG_PAGEALLOC and CPA more robust Use PF_MEMALLOC to prevent recursive calls in the DBEUG_PAGEALLOC case. This makes the code simpler and more robust against allocation failures. This fixes the following fallback to non-mmconfig: http://lkml.org/lkml/2008/2/20/551 http://bugzilla.kernel.org/show_bug.cgi?id=10083 Also, for DEBUG_PAGEALLOC=n reduce the pool size to one page. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-26 12:55:50 +01:00
Rafael J. Wysocki	8a235efad5	Hibernation: Handle DEBUG_PAGEALLOC on x86 Make hibernation work with CONFIG_DEBUG_PAGEALLOC set on x86, by checking if the pages to be copied are marked as present in the kernel mapping and temporarily marking them as present if that's not the case. No functional modifications are introduced if CONFIG_DEBUG_PAGEALLOC is unset. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>	2008-02-21 02:15:28 -05:00
Andi Kleen	8e31c2ac11	x86: CPA: remove BUG_ON for LRU/Compound pages New implementation does not use lru for anything so there is no need to reject pages that are in the LRU. Similar for compound pages (which were checked because they also use page->lru) [ tglx@linutronix.de: removed unused variable ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:29 +01:00
Thomas Gleixner	f34b439f34	x86: CPA: avoid double checking of alias ranges When the CPA code is called with an virtual address in the range of the direct mapping or the high alias then we do not need to run through the alias check for this range. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	af96e4438a	x86: CPA no alias checking for _NX NX settings are not required to be consistent across alias mappings. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	c31c7d4844	x86: CPA, fix alias checks c_p_a() did not discover all aliases correctly. (such as when called on vmalloc()-ed areas or ioremap()-ed areas) Push the alias checks to the lower, physical level and consistently discover all aliases that might exist: the low direct mappings and the high linear kernel-text mappings (on 64-bit). Thanks to Andi Kleen for pointing out that this was buggy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Ingo Molnar	f8d8406bcb	x86: cpa, fix out of date comment Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:21 +01:00
Thomas Gleixner	69b1415e93	x86: cpa: ensure page alignment the cpa API is page aligned - warn about any weird alignments. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:20 +01:00
Andi Kleen	5d3c8b21e2	x86: CPA: fix gbpages support in try_preserve_large_page [ mingo@elte.hu: while gbpages cannot be enabled on mainline currently, keep the code uptodate and this fix is easy enough. ] Use correct page sizes and masks for GB pages in try_preserve_large_page() This prevents a boot hang on a GB capable system with CONFIG_DIRECT_GBPAGES enabled. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-13 16:20:35 +01:00
Thomas Gleixner	fac8493960	x86: cpa, strict range check in try_preserve_large_page() Right now, we check only the first 4k page for static required protections. This does not take overlapping regions into account. So we might end up setting the wrong permissions/protections for other parts of this large page. This can be optimized further, but correctness is the important part. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	eb5b5f024c	x86: cpa, use page pool Switch the split page code to use the page pool. We do this unconditionally to avoid different behaviour with and without DEBUG_PAGEALLOC enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	76ebd0548d	x86: introduce page pool in cpa DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup hardcoded reliance on PSE pages, and the unrobustness of the runtime splitup of large pages. The splitup ended in recursive calls to alloc_pages() when a page for a pte split was requested. Avoid the recursion with a preallocated page pool, which is used to split up large mappings and gets refilled in the return path of kernel_map_pages after the split has been done. The size of the page pool is adjusted to the available memory. This part just implements the page pool and the initialization w/o using it yet. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Harvey Harrison	da7bfc50f5	x86: sparse warnings in pageattr.c Adjust the definition of lookup_address to take an unsigned long level argument. Adjust callers in xen/mmu.c that pass in a dummy variable. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:08 +01:00
Arjan van de Ven	cc842b82cc	x86: remove suprious ifdefs from pageattr.c The .rodata section really should just be read only; the config option is there to make breaking up the 2Mb page an option (so people whos machines give more performance for the 2Mb case can opt to do so). But when the page gets split anyway, this is no longer an issue, so clean up the code and remove the ifdefs Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Ingo Molnar	2d684cd6d9	x86: remove X2 workaround With the spurious handler fix, the X2 does not lock up anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:44 +01:00
Hugh Dickins	8cb2a7c1e9	stop c_p_a corrupting the pds When change_page_attr splits a large page on x86_32 (without PAE), it is currently corrupting every process's page directory: fix that by removing the thinko which passes down a physical instead of a virtual address. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 14:37:14 -08:00
Thomas Gleixner	7b610eec7a	x86: cpa, micro-optimization Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:10 +01:00
Ingo Molnar	87f7f8fe32	x86: cpa, clean up code flow Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:10 +01:00
Ingo Molnar	beaff6333b	x86: cpa, eliminate CPA_ enum Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Ingo Molnar	9df84993cb	x86: cpa, cleanups Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	f07333fd14	x86: implement gbpages support in change_page_attr() Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	c2f71ee214	x86: add gbpages support to lookup_address [ tglx@linutronix.de: fix bootup crash on sparse mappings. ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Thomas Gleixner	7bfb72e847	x86: fix page-present check in cpa_flush_range pte_present() might return true for PROT_NONE mappings. Explicitely check the present bit. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Ingo Molnar	6ce9fc17d9	x86: remove cpa warning this race is legit and can happen on SMP systems. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	07cf89c05f	x86: CPA fix pagetable split Move the readout of the large entry into the spinlock section to prevent an unlikely but possible race. Mark the pmd/pud entry present after the split. We preserved the non present bit in the new split mapping. Remove the stale gfp_flags double initialization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Andi Kleen	31422c51e0	x86: rename LARGE_PAGE_SIZE to PMD_PAGE_SIZE Fix up all users. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	9a14aefc1d	x86: cpa, fix lookup_address lookup_address() returns a wrong level and a wrong pointer to a non existing pte, when pmd or pud entries are marked !present. This happens for example due to boot time mapping of GART into the low memory space. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Ingo Molnar	34508f66b6	x86: AMD Athlon X2 hard hang fix An Athlon 64 X2 test system showed hard hangs shortly after marking the kernel text read-only, if we tried to preserve largepages and changed the PSE entry from RW to RO. The pagetable code itself is correct, it's the CPU that locked up hard (and not even the NMI watchdog could punch through that hard hang). So be conservative and always do splitups - like we did in the past. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	65e074dffa	x86: cpa, preserve large pages if possible When CPA is called on a range which fits into a large page mapping, avoid to split the page when: 1) There is no change of attributes 2) The range to change is a complete large mapping Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	f4ae5da0e8	x86: cpa, check if we changed anything and tlb flushing is necessary Flush tlbs only when there was a real change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	72e458dfa6	x86: introduce struct cpa_data The number of arguments which need to be transported is increasing and we want to add flush optimizations and large page preserving. Create struct cpa data and pass a pointer instead of increasing the number of arguments further. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Andi Kleen	6bb8383beb	x86: cpa, only flush the cache if the caching attributes have changed We only need to flush the caches in cpa() if the the caching attributes have changed. Otherwise only flush the TLBs. This checks the PAT bits too although they are currently not used by the kernel. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:06 +01:00
Thomas Gleixner	331e406588	x86: CPA return early when requested feature is not available Mask out the not supported bits (e.g. NX). If the clr/set masks are empty after the mask return without changing anything. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:06 +01:00
Thomas Gleixner	63c1dcf4bc	x86: CPA use the existing pfn in split as well When splitting large pages, we ge the pfn from the existing entry instead of calculating it ourself. This removes the last remaining range restriction of the cpa code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Arjan van de Ven	626c2c9d06	x86: use the pfn from the page when change its attributes When changing the attributes of a pte, we should use the PFN from the existing PTE rather than going through hoops calculating what we think it might have been; this is both fragile and totally unneeded. It also makes it more hairy to call any of these functions on non-direct maps for no good reason whatsover. With this change, __change_page_attr() no longer takes a pfn as argument, which simplifies all the callers. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@tglx.de>	2008-02-04 16:48:05 +01:00
Arjan van de Ven	cc0f21bbc1	x86: teach the static_protection function about high mappings Right now, enforcing that the high mapping of the kernel text doesn't get the NX bit is done deep in the guts of CPA, rather than in the static_protection() function that enforces all other per-arch sanity checks. This patch moves this sanity check into the central static_protection() function instead, and makes it apply ONLY to the kernel text, not to all other areas in the high mapping. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:05 +01:00
Thomas Gleixner	b50516fc20	x86: CPA remove bogus NX clear In split_large_page we clear the NX bit for the new split ptes, but we need to preserve the original setting of it for the split ptes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:47:55 +01:00
Huang, Ying	5827040df0	x86: change_page_attr_clear fix This patch replaces __change_page_attr_set_clr() with change_page_attr_set_clr() in change_page_attr_clear() to flush the TLB/cache properly. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:43 +01:00
Jeremy Fitzhardinge	e3ed910db2	x86: use the same pgd_list for PAE and 64-bit Use a standard list threaded through page->lru for maintaining the pgd list on PAE. This is the same as 64-bit, and seems saner than using a non-standard list via page->index. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Thomas Gleixner	0879750f5d	x86: cpa cleanup the 64-bit alias math Cleanup the address calculations, which are necessary to identify the high/low alias mappings of the kernel on 64 bit machines. Instead of calling __pa/__va back and forth, calculate the physical address once and base the other calculations on it. Add understandable constants so we can use the already available within() helper. Also add comments, which help mere mortals to understand what this code does. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:09 +01:00
Ingo Molnar	86f03989d9	x86: cpa: fix the self-test Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Ingo Molnar	4c61afcdb2	x86: fix clflush_page_range logic only present ptes must be flushed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:09 +01:00
Thomas Gleixner	3b233e52f7	x86: optimize clflush clflush is sufficient to be issued on one CPU. The invalidation is broadcast throughout the coherence domain. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	cd8ddf1a28	x86: clflush_page_range needs mfence clflush is an unordered operation with respect to other memory traffic, including other CLFLUSH instructions. This needs proper fencing with mfence. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	af1e6844d6	x86: cpa: rename global_flush_tlb() to cpa_flush_all() The function name global_flush_tlb() suggests something different from what the function really does. Rename it to cpa_flush_all(), which is an understandable counterpart to cpa_flush_range(). no global visibility of the old API anymore. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	57a6a46aa2	x86: cpa: implement clflush optimization Use clflush on CPUs which support this. clflush is only used when the page attribute operation has been successful. On CPUs which do not support clflush and in the case of error the old fashioned global_flush_tlb() is called. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	56744546b3	x86: cpa use the new set_clr function Convert cpa_set and cpa_clear to call the new set_clr function. Seperate out the debug helpers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	ff31452b6e	x86: cpa create set_and_clr function Create a set_and_clr function to avoid the duplicate loops. Allows also to do combined operations for optimization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	72932c7ad2	x86: cpa move the flush into set and clear functions To avoid the modification of the flush code for the clflush implementation, move the flush into the set and clear functions and provide helper functions for the debugging code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	6eade8ff46	x86: cpa: clean up change_page_attr_set/clear() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Ingo Molnar	4692a1450b	x86: cpa: fix loop Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	a72a08a4b6	x86: cpa: fix split thinko Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	488fd99588	x86: fix pageattr-selftest In Ingo's testing, he found a bug in the CPA selftest code. What would happen is that the test would call change_page_attr_addr on a range of memory, part of which was read only, part of which was writable. The only thing the test wanted to change was the global bit... What actually happened was that the selftest would take the permissions of the first page, and then the change_page_attr_addr call would then set the permissions of the entire range to this first page. In the rodata section case, this resulted in pages after the .rodata becoming read only... which made the kernel rather unhappy in many interesting ways. This is just another example of how dangerous the cpa API is (was); this patch changes the test to use the incremental clear/set APIs instead, and it changes the clear/set implementation to work on a 1 page at a time basis. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	d7c8f21a8c	x86: cpa: move flush to cpa The set_memory_* and set_pages_* family of API's currently requires the callers to do a global tlb flush after the function call; forgetting this is a very nasty deathtrap. This patch moves the global tlb flush into each of the callers Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	d1028a154c	x86: make various pageattr.c functions static change_page_attr_add is only used in pageattr.c now, so we can make this function static. change_page_attr() isn't used anywere at all anymore; this function is a really bad API anyway so just remove the bloat entirely. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00

1 2 3 4

163 Commits