OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Jon Tollefson	658013e93e	powerpc: scan device tree for gigantic pages The 16G huge pages have to be reserved in the HMC prior to boot. The location of the pages are placed in the device tree. This patch adds code to scan the device tree during very early boot and save these page locations until hugetlbfs is ready for them. Acked-by: Adam Litke <agl@us.ibm.com> Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:19 -07:00
Jon Tollefson	ec4b2c0c83	powerpc: function to allocate gigantic hugepages The 16G page locations have been saved during early boot in an array. The alloc_bootmem_huge_page() function adds a page from here to the huge_boot_pages list. Acked-by: Adam Litke <agl@us.ibm.com> Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:19 -07:00
Andi Kleen	ceb8687961	hugetlb: introduce pud_huge Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:18 -07:00
Andi Kleen	a551643895	hugetlb: modular state for hugetlb page size The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Jan Beulich	42b7772812	mm: remove double indirection on tlb parameter to free_pgd_range() & Co The double indirection here is not needed anywhere and hence (at least) confusing. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Benjamin Herrenschmidt	a1f242ff46	powerpc ioremap_prot This adds ioremap_prot and pte_pgprot() so that one can extract protection bits from a PTE and use them to ioremap_prot() (in order to support ptrace of VM_IO \| VM_PFNMAP as per Rik's patch). This moves a couple of flag checks around in the ioremap implementations of arch/powerpc. There's a side effect of allowing non-cacheable and non-guarded mappings on ppc32 which before would always have _PAGE_GUARDED set whenever _PAGE_NO_CACHE is. (standard ioremap will still set _PAGE_GUARDED, but ioremap_prot will be capable of setting such a non guarded mapping). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Dave Airlie <airlied@linux.ie> Cc: Hugh Dickins <hugh@veritas.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Johannes Weiner	b61bfa3c46	mm: move bootmem descriptors definition to a single place There are a lot of places that define either a single bootmem descriptor or an array of them. Use only one central array with MAX_NUMNODES items instead. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
Benjamin Herrenschmidt	84c3d4aaec	Merge commit 'origin/master' Manual merge of: arch/powerpc/Kconfig arch/powerpc/kernel/stacktrace.c arch/powerpc/mm/slice.c arch/ppc/kernel/smp.c	2008-07-16 11:07:59 +10:00
Stefan Roese	2bf3016f89	powerpc: Fix problems with 32bit PPC's running with >= 4GB of RAM This patch enables 32bit PPC's (with 36bit physical address space, e.g. IBM/AMCC PPC44x) to run with >= 4GB of RAM. Mostly its just replacing types (unsigned long -> phys_addr_t). Tested on an AMCC Katmai with 4GB of DDR2. Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2008-07-09 14:13:01 -04:00
Benjamin Herrenschmidt	1bc54c0311	powerpc: rework 4xx PTE access and TLB miss This is some preliminary work to improve TLB management on SW loaded TLB powerpc platforms. This introduce support for non-atomic PTE operations in pgtable-ppc32.h and removes write back to the PTE from the TLB miss handlers. In addition, the DSI interrupt code no longer tries to fixup write permission, this is left to generic code, and _PAGE_HWWRITE is gone. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2008-07-09 13:36:17 -04:00
Stephen Rothwell	392096e98f	generic-ipi: fix linux-next tree build failure Today's linux-next build (powerpc ppc64_defconfig) failed like this: arch/powerpc/mm/tlb_64.c: In function 'pgtable_free_now': arch/powerpc/mm/tlb_64.c:66: error: too many arguments to function 'smp_call_function' arch/powerpc/kernel/machine_kexec_64.c: In function 'kexec_prepare_cpus': arch/powerpc/kernel/machine_kexec_64.c:175: error: too many arguments to function 'smp_call_function' Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: <linuxppc-dev@ozlabs.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-03 09:25:42 +02:00
Nathan Fontenot	0db9360aaa	powerpc/pseries: Update numa association of hotplug memory add for drconf memory Update the association of a memory section with a numa node that occurs during hotplug add of a memory section. This adds a check in the hot_add_scn_to_nid() routine for the ibm,dynamic-reconfiguration-memory node in the device tree. If present the new hot_add_drconf_scn_to_nid() routine is invoked, which can properly parse the ibm,dynamic-reconfiguration-memory node of the device tree and make the proper numa node associations. This also introduces the valid_hot_add_scn() routine as a helper function for code that is common to the hot_add_scn_to_nid() and hot_add_drconf_scn_to_nid() routines. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-07-03 16:58:18 +10:00
Nathan Fontenot	8342681d3e	powerpc/pseries: Split code into helper routines for drconf memory This splits off several pieces of code that parse the ibm,dynamic-reconfiguration-memory node of the device tree into separate helper routines. This is in preparation for the next commit that will use these helper routines. There are no functional changes in this patch. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-07-03 16:58:17 +10:00
Tony Breeds	db7f37de2c	powerpc: Fix building of arch/powerpc/mm/mem.o when MEMORY_HOTPLUG=y and SPARSEMEM=n Currently the kernel fails to build with the above config options with: CC arch/powerpc/mm/mem.o arch/powerpc/mm/mem.c: In function 'arch_add_memory': arch/powerpc/mm/mem.c:130: error: implicit declaration of function 'create_section_mapping' This explicitly includes asm/sparsemem.h in arch/powerpc/mm/mem.c and moves the guards in include/asm-powerpc/sparsemem.h to protect the SPARSEMEM specific portions only. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-07-03 16:58:07 +10:00
Dave Kleikamp	87e9ab13c3	powerpc: hash_huge_page() should get the WIMG bits from the lpte Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Adam Litke <agl@us.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-07-01 11:28:02 +10:00
Paul Mackerras	3a8247cc2c	powerpc: Only demote individual slices rather than whole process At present, if we have a kernel with a 64kB page size, and some process maps something that has to be mapped with 4kB pages (such as a cache-inhibited mapping on POWER5+, or the eHCA infiniband queue-pair pages), we change the process to use 4kB pages everywhere. This hurts the performance of HPC programs that access eHCA from userspace. With this patch, the kernel will only demote the slice(s) containing the eHCA or cache-inhibited mappings, leaving the remaining slices able to use 64kB hardware pages. This also changes the slice_get_unmapped_area code so that it is willing to place a 64k-page mapping into (or across) a 4k-page slice if there is no better alternative, i.e. if the program specified MAP_FIXED or if there is not sufficient space available in slices that are either empty or already have 64k-page mappings in them. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-01 11:27:57 +10:00
Becky Bruce	316a405841	powerpc: Get rid of bitfields in ppc_bat struct While working on the 36-bit physical support, I noticed that there was exactly one line of code that actually referenced the bitfields. So I got rid of them and redefined ppc_bat as a struct of 2 u32's: batu and batl. I also got rid of the previous union that held the bitfield structs and a word representation of the batu/l values. This seems like a nicer solution than adding in a bunch of new bitfields to support extended bat addressing that would never get used, and just leaving the struct as-is would have been incomplete in the face of large physical addressing. Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-06-30 22:31:05 +10:00
Becky Bruce	7c5c4325d2	powerpc: Change BAT code to use phys_addr_t Currently, the physical address is an unsigned long, but it should be phys_addr_t in set_bat, [v/p]_mapped_by_bat. Also, create a macro that can convert a large physical address into the correct format for programming the BAT registers. Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-06-30 22:31:03 +10:00
Benjamin Herrenschmidt	41743a4e34	powerpc: Free a PTE bit on ppc64 with 64K pages This frees a PTE bit when using 64K pages on ppc64. This is done by getting rid of the separate _PAGE_HASHPTE bit. Instead, we just test if any of the 16 sub-page bits is set. For non-combo pages (ie. real 64K pages), we set SUB0 and the location encoding in that field. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-06-30 22:30:53 +10:00
Paul Mackerras	e9a4b6a3f6	Merge branch 'linux-2.6'	2008-06-30 10:16:50 +10:00
Jens Axboe	15c8b6c1aa	on_each_cpu(): kill unused 'retry' parameter It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:24:38 +02:00
Paul Mackerras	65ba6cdc83	[POWERPC] Clear sub-page HPTE present bits when demoting page size When we demote a slice from 64k to 4k, and we are about to insert an HPTE for a 4k subpage and we notice that there is an existing 64k HPTE, we first invalidate that HPTE before inserting the new 4k subpage HPTE. Since the bits that encode which hash bucket the old HPTE was in overlap with the bits that encode which of the 16 subpages have HPTEs, we need to clear out the subpage HPTE-present bits before starting to insert HPTEs for the 4k subpages. If we don't do that, we can erroneously think that a subpage already has an HPTE when it doesn't. That in itself wouldn't be such a problem except that when we go to update the HPTE that we think is present on machines with a hypervisor, the hypervisor can tell us that the HPTE we think is there is actually there even though it isn't, which can lead to a process getting stuck in a loop, continually faulting. The reason for the confusion is that the AVPN (abbreviated virtual page number) we are looking for in the HPTE for a 4k subpage can actually match the AVPN in a stale HPTE for another 64k page. For example, the HPTE for the 4k subpage at 0x84000f000 will be in the same hash bucket and have the same AVPN as the HPTE for the 64k page at 0x8400f0000. This fixes the code to clear out the subpage HPTE-present bits. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-06-18 21:40:43 +10:00
Paul Mackerras	8a3e1c670e	Merge branch 'merge' Conflicts: arch/powerpc/sysdev/fsl_soc.c	2008-06-09 12:19:41 +10:00
Nathan Lynch	0d5799449f	[POWERPC] Make walk_memory_resource available with MEMORY_HOTPLUG=n The ehea driver was recently changed[1] to use walk_memory_resource() to detect the system's memory layout. However, walk_memory_resource() is available only when memory hotplug is enabled. So CONFIG_EHEA was made to depend on MEMORY_HOTPLUG [2], but it is inappropriate for a network driver to have such a dependency. Make the declaration of walk_memory_resource() and its powerpc implementation (ehea is powerpc-specific) unconditionally available. [1] `48cfb14f8b` "ehea: Add DLPAR memory remove support" [2] `fb7b6ca2b6` "ehea: Add dependency to Kconfig" Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-06-09 11:32:41 +10:00
Paul Mackerras	acf464817d	Merge branch 'merge' into powerpc-next	2008-05-23 16:53:23 +10:00
David Gibson	46a7417963	[POWERPC] Fix __set_fixmap() for STRICT_MM_TYPECHECKS __set_fixmap() in pgtable_32.c currently fails to compile if STRICT_MM_TYPECHECKS is defined. This fixes it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-23 16:15:32 +10:00
Adrian Bunk	d3d3d3cdb1	[POWERPC] powerpc/mm/hash_low_32.S: Remove CVS keyword This removes a CVS keyword that wasn't updated for a long time from a comment. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-20 09:34:18 +10:00
Paul Mackerras	fcff474ea5	Merge branch 'linux-2.6' into powerpc-next	2008-05-16 23:13:42 +10:00
Benjamin Herrenschmidt	cec08e7a94	[POWERPC] vmemmap fixes to use smaller pages This changes vmemmap to use a different region (region 0xf) of the address space, and to configure the page size of that region dynamically at boot. The problem with the current approach of always using 16M pages is that it's not well suited to machines that have small amounts of memory such as small partitions on pseries, or PS3's. In fact, on the PS3, failure to allocate the 16M page backing vmmemmap tends to prevent hotplugging the HV's "additional" memory, thus limiting the available memory even more, from my experience down to something like 80M total, which makes it really not very useable. The logic used by my match to choose the vmemmap page size is: - If 16M pages are available and there's 1G or more RAM at boot, use that size. - Else if 64K pages are available, use that - Else use 4K pages I've tested on a POWER6 (16M pages) and on an iSeries POWER3 (4K pages) and it seems to work fine. Note that I intend to change the way we organize the kernel regions & SLBs so the actual region will change from 0xf back to something else at one point, as I simplify the SLB miss handler, but that will be for a later patch. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-15 20:49:25 +10:00
Michael Ellerman	c884116ac3	[POWERPC] Remove duplicate variable definitions in mm/tlb_64.c Somewhere along the way (`e28f7faf05`, "Four level pagetables for ppc64") we ended up with duplicate definitions for pte_freelist_cur and pte_freelist_force_free. Somehow this compiles, but it would be better to just have one definition for each. The two definitions we end up with can be static too! Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-14 22:31:49 +10:00
Michael Ellerman	572fb578de	[POWERPC] Move declaration of tce variables into mmu-hash64.h ... instead of having extern declarations in a .c file. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-14 22:31:47 +10:00
Michael Ellerman	09de9ff872	[POWERPC] Fix sparse warnings in arch/powerpc/mm Make two vmemmap helpers static in init_64.c Make stab variables static in stab.c Make psize defs static in hash_utils_64.c Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-14 22:31:46 +10:00
Michael Ellerman	5f25f06529	[POWERPC] Move declaration of init_bootmem_done into system.h ... instead of having an extern declaration in a .c file. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-14 22:31:44 +10:00
Paul Mackerras	3b5750644b	[POWERPC] Bolt in SLB entry for kernel stack on secondary cpus This fixes a regression reported by Kamalesh Bulabel where a POWER4 machine would crash because of an SLB miss at a point where the SLB miss exception was unrecoverable. This regression is tracked at: http://bugzilla.kernel.org/show_bug.cgi?id=10082 SLB misses at such points shouldn't happen because the kernel stack is the only memory accessed other than things in the first segment of the linear mapping (which is mapped at all times by entry 0 of the SLB). The context switch code ensures that SLB entry 2 covers the kernel stack, if it is not already covered by entry 0. None of entries 0 to 2 are ever replaced by the SLB miss handler. Where this went wrong is that the context switch code assumes it doesn't have to write to SLB entry 2 if the new kernel stack is in the same segment as the old kernel stack, since entry 2 should already be correct. However, when we start up a secondary cpu, it calls slb_initialize, which doesn't set up entry 2. This is correct for the boot cpu, where we will be using a stack in the kernel BSS at this point (i.e. init_thread_union), but not necessarily for secondary cpus, whose initial stack can be allocated anywhere. This doesn't cause any immediate problem since the SLB miss handler will just create an SLB entry somewhere else to cover the initial stack. In fact it's possible for the cpu to go quite a long time without SLB entry 2 being valid. Eventually, though, the entry created by the SLB miss handler will get overwritten by some other entry, and if the next access to the stack is at an unrecoverable point, we get the crash. This fixes the problem by making slb_initialize create a suitable entry for the kernel stack, if we are on a secondary cpu and the stack isn't covered by SLB entry 0. This requires initializing the get_paca()->kstack field earlier, so I do that in smp_create_idle where the current field is initialized. This also abstracts a bit of the computation that mk_esid_data in slb.c does so that it can be used in slb_initialize. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-02 15:00:45 +10:00
Geoff Levand	bbea346062	[POWERPC] Fix slb.c compile warnings Arrange for a syntax check to always be done on the powerpc/mm/slb.c DBG() macro by defining it to pr_debug() for non-debug builds. Also, fix these related compile warnings: slb.c:273: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'long unsigned int slb.c:274: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'long unsigned int' Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-05-02 15:00:44 +10:00
Badari Pulavarty	9d88a2eb6e	[POWERPC] Provide walk_memory_resource() for powerpc Provide walk_memory_resource() for 64-bit powerpc. PowerPC maintains logical memory region mapping in the lmb.memory structure. Walk through these structures and do the callbacks for the contiguous chunks. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-29 15:57:53 +10:00
Jeremy Fitzhardinge	180c06efce	hotplug-memory: make online_page() common All architectures use an effectively identical definition of online_page(), so just make it common code. x86-64, ia64, powerpc and sh are actually identical; x86-32 is slightly different. x86-32's differences arise because it puts its hotplug pages in the highmem zone. We can handle this in the generic code by inspecting the page to see if its in highmem, and update the totalhigh_pages count appropriately. This leaves init_32.c:free_new_highpage with a single caller, so I folded it into add_one_highpage_init. I also removed an incorrect comment referring to the NUMA case; any NUMA details have already been dealt with by the time online_page() is called. [akpm@linux-foundation.org: fix indenting] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:17 -07:00
Kumar Gala	f608600e74	[POWERPC] Clean up access to thread_info in assembly Use (31-THREAD_SHIFT) to get to thread_info from stack pointer. This makes the code a bit easier to read and more robust if we ever change THREAD_SHIFT. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-24 20:58:02 +10:00
Kumar Gala	2c419bdeca	[POWERPC] Port fixmap from x86 and use for kmap_atomic The fixmap code from x86 allows us to have compile time virtual addresses that we change the physical addresses of at run time. This is useful for applications like kmap_atomic, PCI config that is done via direct memory map, kexec/kdump. We got ride of CONFIG_HIGHMEM_START as we can now determine a more optimal location for PKMAP_BASE based on where the fixmap addresses start and working back from there. Additionally, the kmap code in asm-powerpc/highmem.h always had debug enabled. Moved to using CONFIG_DEBUG_HIGHMEM to determine if we should have the extra debug checking. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-24 20:58:02 +10:00
Kumar Gala	37dd2badcf	[POWERPC] 85xx: Add support for relocatable kernel (and booting at non-zero) Added support to allow an 85xx kernel to be run from a non-zero physical address (useful for cooperative asymmetric multiprocessing situations and kdump). The support can be configured at compile time by setting CONFIG_PAGE_OFFSET, CONFIG_KERNEL_START, and CONFIG_PHYSICAL_START as desired. Alternatively, the kernel build can set CONFIG_RELOCATABLE. Setting this config option causes the kernel to determine at runtime the physical addresses of CONFIG_PAGE_OFFSET and CONFIG_KERNEL_START. If CONFIG_RELOCATABLE is set, then CONFIG_PHYSICAL_START has no meaning. However, CONFIG_PHYSICAL_START will always be used to set the LOAD program header physical address field in the resulting ELF image. Currently we are limited to running at a physical address that is a multiple of 256M. This is due to how we map TLBs to cover lowmem. This should be fixed to allow 64M or maybe even 16M alignment in the future. It is considered an error to try and run a kernel at a non-aligned physical address. All the magic for this support is accomplished by proper initialization of the kernel memory subsystem and use of ARCH_PFN_OFFSET. The use of ARCH_PFN_OFFSET only affects normal memory and not IO mappings. ioremap uses map_page and isn't affected by ARCH_PFN_OFFSET. /dev/mem continues to allow access to any physical address in the system regardless of how CONFIG_PHYSICAL_START is set. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-24 20:58:01 +10:00
Michael Ellerman	6df1646e31	[POWERPC] Add include of linux/of.h to numa.c numa.c requires routines declared in linux/of.h, so should include it. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-24 20:57:32 +10:00
Olof Johansson	49a9997884	[POWERPC] Remove unused __max_memory variable Remove the __max_memory variable, as it is not referenced anywhere in the tree besides some code in arch/ppc. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-18 15:37:11 +10:00
Kumar Gala	7711684947	[POWERPC] Remove unused machine call outs When we moved to arch/powerpc we actively tried to avoid using the ppc_md.setup_io_mappings(). Currently no board ports use it so let's remove it to avoid any new boards using it. Also, remove early_serial_map() since we don't even have a call out for it in arch/powerpc. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-17 10:01:00 +10:00
Kumar Gala	09b5e63f82	[POWERPC] Rename __initial_memory_limit to __initial_memory_limit_addr We always use __initial_memory_limit as an address so rename it to be clear. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-17 07:46:13 +10:00
Kumar Gala	0aef996b37	[POWERPC] 85xx: Cleanup TLB initialization * Determine the RPN we are running the kernel at runtime rather than using compile time constant for initial TLB * Cleanup adjust_total_lowmem() to respect memstart_addr and be a bit more clear on variables that are sizes vs addresses. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-17 07:46:13 +10:00
Kumar Gala	d7917ba705	[POWERPC] Introduce lowmem_end_addr to distinguish from total_lowmem total_lowmem represents the amount of low memory, not the physical address that low memory ends at. If the start of memory is at 0 it happens that total_lowmem can be used as both the size and the address that lowmem ends at (or more specifically one byte beyond the end). To make the code a bit more clear and deal with the case when the start of memory isn't at physical 0, we introduce lowmem_end_addr that represents one byte beyond the last physical address in the lowmem region. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-17 07:46:13 +10:00
Kumar Gala	99c62dd773	[POWERPC] Remove and replace uses of PPC_MEMSTART with memstart_addr A number of users of PPC_MEMSTART (40x, ppc_mmu_32) can just always use 0 as we don't support booting these kernels at non-zero physical addresses since their exception vectors must be at 0 (or 0xfffx_xxxx). For the sub-arches that support relocatable interrupt vectors (book-e), it's reasonable to have memory start at a non-zero physical address. For those cases use the variable memstart_addr instead of the #define PPC_MEMSTART since the only uses of PPC_MEMSTART are for initialization and in the future we can set memstart_addr at runtime to have a relocatable kernel. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-17 07:46:12 +10:00
Paul Mackerras	ac7c5353b1	Merge branch 'linux-2.6'	2008-04-14 21:11:02 +10:00
Stephen Rothwell	ae86f0088d	[POWERPC] htab_remove_mapping is only used by MEMORY_HOTPLUG This eliminates a warning in builds that don't define CONFIG_MEMORY_HOTPLUG. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-07 13:49:25 +10:00
Benjamin Herrenschmidt	b991f05f13	[POWERPC] Fix deadlock with mmu_hash_lock in hash_page_sync hash_page_sync() takes and releases the low level mmu hash lock in order to sync with other processors disposing of page tables. Because that lock can be needed to service hash misses triggered by interrupt handlers, taking it must be done with interrupts off. However, hash_page_sync() appears to be called with interrupts enabled, thus causing occasional deadlocks. We fix it by making sure hash_page_sync() masks interrupts while holding the lock. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-03 22:11:11 +11:00
Johannes Weiner	745681a524	[POWERPC] Remove redundant display of free swap space in show_mem() show_mem() has no need to print the amount of free swap space manually because show_free_areas() does this already and is called by the former. The two outputs only differ in text formatting: printk("Free swap = %lukB\n", ...); printk("Free swap: %6ldkB\n", ...); Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-01 20:43:10 +11:00
Harvey Harrison	e48b1b452f	[POWERPC] Replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-01 20:43:09 +11:00
Geert Uytterhoeven	fa90f70a8e	[POWERPC] arch_add_memory() cannot be __devinit WARNING: vmlinux.o(.text+0xb41b0): Section mismatch in reference from the function .add_memory() to the function .devinit.text:.arch_add_memory() The function .add_memory() references the function __devinit .arch_add_memory(). This is often because .add_memory lacks a __devinit annotation or the annotation of .arch_add_memory is wrong. arch_add_memory() is also not __devinit on other architectures Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-01 20:43:08 +11:00
Badari Pulavarty	52db9b4426	[POWERPC] Add error return from htab_remove_mapping() If the platform doesn't support hpte_removebolted(), gracefully return failure rather than success. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-01 20:43:08 +11:00
Paul Mackerras	54f53f2b94	Merge branch 'linux-2.6'	2008-03-26 08:44:18 +11:00
Paul Mackerras	cfe666b145	[POWERPC] Don't use 64k pages for ioremap on pSeries On pSeries, the hypervisor doesn't let us map in the eHEA ethernet adapter using 64k pages, and thus the ehea driver will fail if 64k pages are configured. This works around the problem by always using 4k pages for ioremap on pSeries (but not on other platforms). A better fix would be to check whether the partition could ever have an eHEA adapter, and only force 4k pages if it could, but this will do for 2.6.25. This is based on an earlier patch by Tony Breeds. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-03-24 17:41:22 +11:00
Anton Blanchard	44387e9ff2	[POWERPC] Fix PMU + soft interrupt disable bug Since the PMU is an NMI now, it can come at any time we are only soft disabled. We must hard disable around the two places we allow the kernel stack SLB and r1 to go out of sync. Otherwise the PMU exception can force a kernel stack SLB into another slot, which can lead to it getting evicted, which can lead to a nasty unrecoverable SLB miss in the exception entry code. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-03-20 10:14:55 +11:00
Paul Mackerras	bed04a4413	Merge branch 'linux-2.6'	2008-03-13 15:26:33 +11:00
Michael Ellerman	31bf111944	[POWERPC] Fix large hash table allocation on Cell blades My recent hack to allocate the hash table under 1GB on cell was poorly tested, cough. It turns out on blades with large amounts of memory we fail to allocate the hash table at all. This is because RTAS has been instantiated just below 768MB, and 0-x MB are used by the kernel, leaving no areas that are both large enough and also naturally-aligned. For the cell IOMMU hack the page tables must be under 2GB, so use that as the limit instead. This has been tested on real hardware and boots happily. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-03-13 10:10:26 +11:00
Badari Pulavarty	f8c8803bda	[POWERPC] Add code for removing HPTEs for parts of the linear mapping For memory remove, we need to clean up htab mappings for the section of the memory we are removing. This implements support for removing htab bolted mappings for pSeries logical partitions. Other sub-archs may need to implement similar functionality for hotplug memory remove to work on them. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-02-26 22:17:03 +11:00
David S. Miller	d9b2b2a277	[LIB]: Make PowerPC LMB code generic so sparc64 can use it too. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-13 16:56:49 -08:00
Linus Torvalds	dde0013782	Merge branch 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Add arch-specific walk_memory_remove() for 64-bit powerpc [POWERPC] Enable hotplug memory remove for 64-bit powerpc [POWERPC] Add remove_memory() for 64-bit powerpc [POWERPC] Make cell IOMMU fixed mapping printk more useful [POWERPC] Fix potential cell IOMMU bug when switching back to default DMA ops [POWERPC] Don't enable cell IOMMU fixed mapping if there are no dma-ranges [POWERPC] Fix cell IOMMU null pointer explosion on old firmwares [POWERPC] spufs: Fix timing dependent false return from spufs_run_spu [POWERPC] spufs: No need to have a runnable SPU for libassist update [POWERPC] spufs: Update SPU_Status[CISHP] in backing runcntl write [POWERPC] spufs: Fix state_mutex leaks [POWERPC] Disable G5 NAP mode during SMU commands on U3	2008-02-08 09:31:42 -08:00
Martin Schwidefsky	2f569afd9c	CONFIG_HIGHPTE vs. sub-page page tables. Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte ) - to be introduced with a later patch. For everybody else it will be a (struct page ). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:42 -08:00
Badari Pulavarty	a99824f327	[POWERPC] Add arch-specific walk_memory_remove() for 64-bit powerpc walk_memory_resource() verifies if there are holes in a given memory range, by checking against /proc/iomem. On x86/ia64 system memory is represented in /proc/iomem. On powerpc, we don't show system memory as IO resource in /proc/iomem - instead it's maintained in /proc/device-tree. This provides a way for an architecture to provide its own walk_memory_resource() function. On powerpc, the memory region is small (16MB), contiguous and non-overlapping. So extra checking against the device-tree is not needed. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-02-08 19:52:48 +11:00
Badari Pulavarty	aa620abe75	[POWERPC] Add remove_memory() for 64-bit powerpc Supply remove_memory() function for 64-bit powerpc. This is still not quite complete as it needs to do some more arch-specific stuff, which will be added in a later patch. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-02-08 19:52:47 +11:00
Linus Torvalds	3796958130	Merge branch 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (69 commits) [POWERPC] Add SPE registers to core dumps [POWERPC] Use regset code for compat PTRACE_REGS calls [POWERPC] Use generic compat_sys_ptrace [POWERPC] Use generic compat_ptrace_request [POWERPC] Use generic ptrace peekdata/pokedata [POWERPC] Use regset code for PTRACE_REGS requests [POWERPC] Switch to generic compat_binfmt_elf code [POWERPC] Switch to using user_regset-based core dumps [POWERPC] Add user_regset compat support [POWERPC] Add user_regset_view definitions [POWERPC] Use user_regset accessors for GPRs [POWERPC] ptrace accessors for special regs MSR and TRAP [POWERPC] Use user_regset accessors for SPE regs [POWERPC] Use user_regset accessors for altivec regs [POWERPC] Use user_regset accessors for FP regs [POWERPC] mpc52xx: fix compile error introduce when rebasing patch [POWERPC] 4xx: PCIe indirect DCR spinlock fix. [POWERPC] Add missing native dcr dcr_ind_lock spinlock [POWERPC] 4xx: Fix offset value on Warp board [POWERPC] 4xx: Add 440EPx Sequoia ehci dts entry ...	2008-02-07 09:02:26 -08:00
Bernhard Walle	72a7fe3967	Introduce flags for reserve_bootmem() This patchset adds a flags variable to reserve_bootmem() and uses the BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions between crashkernel area and already used memory. This patch: Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with -EBUSY if the memory already has been reserved in the past. This is to avoid conflicts. Because that code runs before SMP initialisation, there's no race condition inside reserve_bootmem_core(). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: <linux-arch@vger.kernel.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:25 -08:00
Balbir Singh	1daa6d08d1	[POWERPC] Fake NUMA emulation for PowerPC Here's a dumb simple implementation of fake NUMA nodes for PowerPC. Fake NUMA nodes can be specified using the following command line option numa=fake=<node range> node range is of the format <range1>,<range2>,...<rangeN> Each of the rangeX parameters is passed using memparse(). I find the patch useful for fake NUMA emulation on my simple PowerPC machine. I've tested it on a numa box with the following arguments numa=fake=512M numa=fake=512M,768M numa=fake=256M,512M mem=512M numa=fake=1G mem=768M numa=fake= without any numa= argument The other side-effect introduced by this patch is that; in the case where we don't have NUMA information, we now set a node online after adding each LMB. This node could very well be node 0, but in the case that we enable fake NUMA nodes, when we cross node boundaries, we need to set the new node online. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-02-07 11:40:19 +11:00
Scott Wood	551ed332da	[POWERPC] update_mmu_cache: Don't cache-flush non-readable pages Currently, update_mmu_cache will crash if given a no-access PTE. There's no need to synchronize dcache/icache unless it's an exec mapping -- however, due to the existence of older glibc versions that execute out of a read-but-no-exec page, readability is tested instead. This assumes no exec-only mappings; if such mappings become supported, they will need to go through the kmap_atomic() version of dcache/icache synchronization. This fixes a bug reported by some users where the kernel would crash while dumping core on a threaded program. Signed-off-by: Scott Wood <scottwood@freescale.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-02-06 16:30:01 +11:00
Benjamin Herrenschmidt	5e5419734c	add mm argument to pte/pmd/pud/pgd_free (with Martin Schwidefsky <schwidefsky@de.ibm.com>) The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as first argument. The free functions do not get the mm_struct argument. This is 1) asymmetrical and 2) to do mm related page table allocations the mm argument is needed on the free function as well. [kamalesh@linux.vnet.ibm.com: i386 fix] [akpm@linux-foundation.org: coding-syle fixes] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 09:44:18 -08:00
Michael Ellerman	41d824bf61	[POWERPC] Allocate the hash table under 1G on cell In order to support the fixed IOMMU mapping (in a subsequent patch), we need the hash table to be inside the IOMMUs DMA window. This is usually 2G, but let's make sure the hash table is under 1G as that will satisfy the IOMMU requirements and also means the hash table will be on node 0. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-31 12:11:09 +11:00
Paul Mackerras	55852bed57	Revert "[POWERPC] Fake NUMA emulation for PowerPC" This reverts commit `5c3f5892a2`, basically because it changes behaviour even when no fake NUMA information is specified on the kernel command line. Firstly, it changes the nid, thus destroying the real NUMA information. Secondly, it also changes behaviour in that if a node ends up with no memory in it because of the memory limit, we used to set it online and now we don't. Also, in the non-NUMA case with no fake NUMA information, we do node_set_online once for each LMB now, whereas previously we only did it once. I don't know if that is actually a problem, but it does seem unnecessary. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-26 16:40:33 +11:00
Michael Neuling	c3b75bd7bb	[POWERPC] Make setjmp/longjmp code usable outside of xmon This makes the setjmp/longjmp code used by xmon, generically available to other code. It also removes the requirement for debugger hooks to be only called on 0x300 (data storage) exception. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-25 22:52:50 +11:00
Paul Mackerras	dcb571be20	Merge branch 'for-2.6.25' of master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc into for-2.6.25	2008-01-24 15:29:14 +11:00
Dale Farnsworth	e8b6376155	[POWERPC] 85xx: Respect KERNELBASE, PAGE_OFFSET, and PHYSICAL_START on e500 The e500 MMU init code previously assumed KERNELBASE always equaled PAGE_OFFSET and PHYSICAL_START was 0. This is useful for kdump support as well as asymetric multicore. For the initial kdump support the secondary kernel will run at 32M but need access to all of memory so we bump the initial TLB up to 64M. This also matches with the forth coming ePAPR spec. Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-01-23 19:34:36 -06:00
Kumar Gala	f98eeb4eb1	[POWERPC] Fix handling of memreserve if the range lands in highmem There were several issues if a memreserve range existed and happened to be in highmem: * The bootmem allocator is only aware of lowmem so calling reserve_bootmem with a highmem address would cause a BUG_ON * All highmem pages were provided to the buddy allocator Added a lmb_is_reserved() api that we now use to determine if a highem page should continue to be PageReserved or provided to the buddy allocator. Also, we incorrectly reported the amount of pages reserved since all highmem pages are initally marked reserved and we clear the PageReserved flag as we "free" up the highmem pages. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-01-23 19:29:08 -06:00
Paul Mackerras	9156ad4833	Merge branch 'linux-2.6'	2008-01-24 10:07:21 +11:00
Paul Mackerras	fa28237cfc	[POWERPC] Provide a way to protect 4k subpages when using 64k pages Using 64k pages on 64-bit PowerPC systems makes life difficult for emulators that are trying to emulate an ISA, such as x86, which use a smaller page size, since the emulator can no longer use the MMU and the normal system calls for controlling page protections. Of course, the emulator can emulate the MMU by checking and possibly remapping the address for each memory access in software, but that is pretty slow. This provides a facility for such programs to control the access permissions on individual 4k sub-pages of 64k pages. The idea is that the emulator supplies an array of protection masks to apply to a specified range of virtual addresses. These masks are applied at the level where hardware PTEs are inserted into the hardware page table based on the Linux PTEs, so the Linux PTEs are not affected. Note that this new mechanism does not allow any access that would otherwise be prohibited; it can only prohibit accesses that would otherwise be allowed. This new facility is only available on 64-bit PowerPC and only when the kernel is configured for 64k pages. The masks are supplied using a new subpage_prot system call, which takes a starting virtual address and length, and a pointer to an array of protection masks in memory. The array has a 32-bit word per 64k page to be protected; each 32-bit word consists of 16 2-bit fields, for which 0 allows any access (that is otherwise allowed), 1 prevents write accesses, and 2 or 3 prevent any access. Implicit in this is that the regions of the address space that are protected are switched to use 4k hardware pages rather than 64k hardware pages (on machines with hardware 64k page support). In fact the whole process is switched to use 4k hardware pages when the subpage_prot system call is used, but this could be improved in future to switch only the affected segments. The subpage protection bits are stored in a 3 level tree akin to the page table tree. The top level of this tree is stored in a structure that is appended to the top level of the page table tree, i.e., the pgd array. Since it will often only be 32-bit addresses (below 4GB) that are protected, the pointers to the first four bottom level pages are also stored in this structure (each bottom level page contains the protection bits for 1GB of address space), so the protection bits for addresses below 4GB can be accessed with one fewer loads than those for higher addresses. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-24 10:06:01 +11:00
Jon Tollefson	4ec161cf73	[POWERPC] Add hugepagesz boot-time parameter This adds the hugepagesz boot-time parameter for ppc64. It lets one pick the size for huge pages. The choices available are 64K and 16M when the base page size is 4k. It defaults to 16M (previously the only only choice) if nothing or an invalid choice is specified. Tested 64K huge pages successfully with the libhugetlbfs 1.2. Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-17 14:57:36 +11:00
Paul Mackerras	dfbe0d3b6b	[POWERPC] Fix boot failure on POWER6 Commit `473980a993` added a call to clear the SLB shadow buffer before registering it. Unfortunately this means that we clear out the entries that slb_initialize has previously set in there. On POWER6, the hypervisor uses the SLB shadow buffer when doing partition switches, and that means that after the next partition switch, each non-boot CPU has no SLB entries to map the kernel text and data, which causes it to crash. This fixes it by reverting most of `473980a9` and instead clearing the 3rd entry explicitly in slb_initialize. This fixes the problem that `473980a9` was trying to solve, but without breaking POWER6. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-15 17:30:58 +11:00
Michael Neuling	473980a993	[POWERPC] Fix CPU hotplug when using the SLB shadow buffer Before we register the SLB shadow buffer, we need to invalidate the entries in the buffer, otherwise we can end up stale entries from when we previously offlined the CPU. This does this invalidate as well as unregistering the buffer with PHYP before we offline the cpu. Tested and fixes crashes seen on 970MP (thanks to tonyb) and POWER5. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-11 16:33:55 +11:00
Balbir Singh	5c3f5892a2	[POWERPC] Fake NUMA emulation for PowerPC Here's a dumb simple implementation of fake NUMA nodes for PowerPC. Fake NUMA nodes can be specified using the following command line option numa=fake=<node range> node range is of the format <range1>,<range2>,...<rangeN> Each of the rangeX parameters is passed using memparse(). I find this useful for fake NUMA emulation on my simple PowerPC machine. I've tested it on a non-numa box with the following arguments: numa=fake=1G numa=fake=1G,2G name=fake=1G,512M,2G numa=fake=1500M,2800M mem=3500M numa=fake=1G mem=512M numa=fake=1G mem=1G Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-20 16:11:46 +11:00
Michael Neuling	584f8b71a2	[POWERPC] Use SLB size from the device tree Currently we hardwire the number of SLBs to 64, but PAPR says we should use the ibm,slb-size property to obtain the number of SLB entries. This uses this property instead of assuming 64. If no property is found, we assume 64 entries as before. This soft patches the SLB handler, so it shouldn't change performance at all. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-11 13:45:56 +11:00
joe@perches.com	df3c9019ed	[POWERPC] Add missing spaces in printk formats Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-03 13:56:27 +11:00
Benjamin Herrenschmidt	0b47759db5	[POWERPC] Fix 8xx build breakage due to _tlbie changes My changes to _tlbie to fix 4xx unfortunately broke 8xx build in a couple of places. This fixes it. Spotted by Olof Johansson. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-11-20 18:42:00 +11:00
Kamalesh Babulal	f9b6c1de69	[POWERPC] Fix build failure on legacy iSeries Include <asm/iseries/hv_call.h> in arch/powerpc/mm/stab.c to fix the following compile error (found with randconfig): CC arch/powerpc/mm/stab.o arch/powerpc/mm/stab.c: In function "stab_initialize": arch/powerpc/mm/stab.c:282: error: implicit declaration of function "HvCall1" arch/powerpc/mm/stab.c:282: error: "HvCallBaseSetASR" undeclared (first use in this function) arch/powerpc/mm/stab.c:282: error: (Each undeclared identifier is reported only once arch/powerpc/mm/stab.c:282: error: for each function it appears in.) make[1]: * [arch/powerpc/mm/stab.o] Error 1 make: * [arch/powerpc/mm] Error 2 Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-11-20 11:37:39 +11:00
Stephen Rothwell	6548d83a37	[POWERPC] Silence an annoying boot message vmemmap_populate will printk (with KERN_WARNING) for a lot of pages if CONFIG_SPARSEMEM_VMEMMAP is enabled (at least it does on iSeries). Use pr_debug for it instead. Replace the only other use of DBG in this file with pr_debug as well. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-11-13 16:23:47 +11:00
Olof Johansson	9bafbb0c4d	[POWERPC] Fix CONFIG_SMP=n build error on ppc64 The patch "KVM: fix !SMP build error" change the way smp_call_function() actually uses the passed in function names on non-SMP builds. So previously it was never caught that the function passed in was never actually defined. This causes a build error on ppc64_defconfig + CONFIG_SMP=n: arch/powerpc/mm/tlb_64.c: In function 'pgtable_free_now': arch/powerpc/mm/tlb_64.c:71: error: 'pte_free_smp_sync' undeclared (first use in this function) arch/powerpc/mm/tlb_64.c:71: error: (Each undeclared identifier is reported only once arch/powerpc/mm/tlb_64.c:71: error: for each function it appears in.) So we need to define it even if CONFIG_SMP is off. Either that or ifdef out the smp_call_function() call, but that's ugly. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-11-13 16:22:44 +11:00
Paul Mackerras	688016f4e2	Merge branch 'for-2.6.24' of master.kernel.org:/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx into merge	2007-11-08 14:28:14 +11:00
will schmidt	465ccab9eb	[POWERPC] Fix switch_slb handling of 1T ESID values Now that we have 1TB segment size support, we need to be using the GET_ESID_1T macro when comparing ESID values for pc, stack, and unmapped_base within switch_slb(). A new helper function called esids_match() contains the logic for deciding when to call GET_ESID and GET_ESID_1T. This fixes a duplicate-slb-entry inspired machine-check exception I was seeing when trying to run java on a power6 partition. Tested on power6 and power5. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-11-08 14:15:31 +11:00
will schmidt	aa39be09df	[POWERPC] Include udbg.h when using udbg_printf This fixes the error error: implicit declaration of function "udbg_printf" We have a few spots where we reference udbg_printf() without #including udbg.h. These are within #ifdef DEBUG blocks, so unnoticed until we do a #define DEBUG or #define DEBUG_LOW nearby. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-11-08 14:15:31 +11:00
Grant Likely	bd942ba3db	[POWERPC] ppc405 Fix arithmatic rollover bug when memory size under 16M mmu_mapin_ram() loops over total_lowmem to setup page tables. However, if total_lowmem is less that 16M, the subtraction rolls over and results in a number just under 4G (because total_lowmem is an unsigned value). This patch rejigs the loop from countup to countdown to eliminate the bug. Special thanks to Magnus Hjorth who wrote the original patch to fix this bug. This patch improves on his by making the loop code simpler (which also eliminates the possibility of another rollover at the high end) and also applies the change to arch/powerpc. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2007-11-01 07:15:59 -05:00
Benjamin Herrenschmidt	b98ac05d5e	[POWERPC] 4xx: Deal with 44x virtually tagged icache The 44x family has an interesting "feature" which is a virtually tagged instruction cache (yuck !). So far, we haven't dealt with it properly, which means we've been mostly lucky or people didn't report the problems, unless people have been running custom patches in their distro... This is an attempt at fixing it properly. I chose to do it by setting a global flag whenever we change a PTE that was previously marked executable, and flush the entire instruction cache upon return to user space when that happens. This is a bit heavy handed, but it's hard to do more fine grained flushes as the icbi instruction, on those processor, for some very strange reasons (since the cache is virtually mapped) still requires a valid TLB entry for reading in the target address space, which isn't something I want to deal with. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2007-11-01 07:15:30 -05:00
Benjamin Herrenschmidt	e701d269aa	[POWERPC] 4xx: Fix 4xx flush_tlb_page() On 4xx CPUs, the current implementation of flush_tlb_page() uses a low level _tlbie() assembly function that only works for the current PID. Thus, invalidations caused by, for example, a COW fault triggered by get_user_pages() from a different context will not work properly, causing among other things, gdb breakpoints to fail. This patch adds a "pid" argument to _tlbie() on 4xx processors, and uses it to flush entries in the right context. FSL BookE also gets the argument but it seems they don't need it (their tlbivax form ignores the PID when invalidating according to the document I have). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2007-11-01 07:15:09 -05:00
Benjamin Herrenschmidt	f6ab0b922c	[POWERPC] powerpc: Fix demotion of segments to 4K pages When demoting a process to use 4K HW pages (instead of 64K), which happens under various circumstances such as doing cache inhibited mappings on machines that do not support 64K CI pages, the assembly hash code calls back into the C function flush_hash_page(). This function prototype was recently changed to accomodate for 1T segments but the assembly call site was not updated, causing applications that do demotion to hang. In addition, when updating the per-CPU PACA for the new sizes, we didn't properly update the slice "map", thus causing the SLB miss code to re-insert segments for the wrong size. This fixes both and adds a warning comment next to the C implementation to try to avoid problems next time someone changes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-29 14:34:14 +11:00
Serge E. Hallyn	b460cbc581	pid namespaces: define is_global_init() and is_container_init() is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Linus Torvalds	c548f08a4f	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits) [POWERPC] Fix vmemmap warning in init_64.c [POWERPC] Fix 64 bits vDSO DWARF info for CR register [POWERPC] Add 1TB workaround for PA6T [POWERPC] Enable NO_HZ and high res timers for pseries and ppc64 configs [POWERPC] Quieten cache information at boot [POWERPC] Quieten clockevent printk [POWERPC] Enable SLUB in *_defconfig [POWERPC] Fix 1TB segment detection [POWERPC] Fix iSeries_hpte_insert prototype [POWERPC] Fix copyright symbol [POWERPC] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers [POWERPC] ibmebus: Add device creation and bus probing based on of_device [POWERPC] ibmebus: Remove bus match/probe/remove functions [POWERPC] Move of_device allocation into of_device.[ch] [POWERPC] mpc52xx: device tree changes for FEC and MDIO [POWERPC] bestcomm: GenBD task support [POWERPC] bestcomm: FEC task support [POWERPC] bestcomm: ATA task support [POWERPC] bestcomm: core bestcomm support for Freescale MPC5200 [POWERPC] mpc52xx: Update mpc52xx_psc structure with B revision changes ...	2007-10-17 09:05:55 -07:00
Roel Kluin	f7a75f0a40	spin_lock_unlocked cleanups Replace some SPIN_LOCK_UNLOCKED with DEFINE_SPINLOCK Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Christoph Lameter	4ba9b9d0ba	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void object, struct kmem_cache s, unsigned long flags) to ctor(struct kmem_cache s, void object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Tony Breeds	f6b8076910	[POWERPC] Fix vmemmap warning in init_64.c Use the right printk format to silence the following warning. CC arch/powerpc/mm/init_64.o arch/powerpc/mm/init_64.c: In function 'vmemmap_populate': arch/powerpc/mm/init_64.c:243: warning: format '%p' expects type 'void *', but argument 4 has type 'long unsigned int' Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-17 22:30:09 +10:00
Olof Johansson	f66bce5e6a	[POWERPC] Add 1TB workaround for PA6T PA6T has a bug where the slbie instruction does not honor the large segment bit. As a result, we have to always use slbia when switching context. We don't have to worry about changing the slbie's during fault processing, since they should never be replacing one VSID with another using the same ESID. I.e. there's no risk for inserting duplicate entries due to a failed slbie of the old entry. So as long as we clear it out on context switch we should be fine. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-17 22:30:09 +10:00
Olof Johansson	f5534004e5	[POWERPC] Fix 1TB segment detection Buglet in the 1TB detection makes it return after checking the first property word, even if it's not a match. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-17 22:30:08 +10:00
Anton Blanchard	e95206ab2c	Update PowerPC vmemmap code for 1TB segments htab_bolt_mapping takes another argument now the 1TB code has been merged. Update vmemmap_populate to match. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 13:10:58 -07:00
KAMEZAWA Hiroyuki	48e94196a5	fix memory hot remove not configured case. Now, arch dependent code around CONFIG_MEMORY_HOTREMOVE is a mess. This patch cleans up them. This is against 2.6.23-rc6-mm1. - fix compile failure on ia64/ CONFIG_MEMORY_HOTPLUG && !CONFIG_MEMORY_HOTREMOVE case. - For !CONFIG_MEMORY_HOTREMOVE, add generic no-op remove_memory(), which returns -EINVAL. - removed remove_pages() only used in powerpc. - removed no-op remove_memory() in i386, sh, sparc64, x86_64. - only powerpc returns -ENOSYS at memory hot remove(no-op). changes it to return -EINVAL. Note: Currently, only ia64 supports CONFIG_MEMORY_HOTREMOVE. I welcome other archs if there are requirements and testers. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:43:02 -07:00
Andy Whitcroft	d29eff7bca	ppc64: SPARSEMEM_VMEMMAP support Enable virtual memmap support for SPARSEMEM on PPC64 systems. Slice a 16th off the end of the linear mapping space and use that to hold the vmemmap. Uses the same size mapping as uses in the linear 1:1 kernel mapping. [pbadari@gmail.com: fix warning] Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <clameter@sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:51 -07:00
Paul Mackerras	1189be6508	[POWERPC] Use 1TB segments This makes the kernel use 1TB segments for all kernel mappings and for user addresses of 1TB and above, on machines which support them (currently POWER5+, POWER6 and PA6T). We detect that the machine supports 1TB segments by looking at the ibm,processor-segment-sizes property in the device tree. We don't currently use 1TB segments for user addresses < 1T, since that would effectively prevent 32-bit processes from using huge pages unless we also had a way to revert to using 256MB segments. That would be possible but would involve extra complications (such as keeping track of which segment size was used when HPTEs were inserted) and is not addressed here. Parts of this patch were originally written by Ben Herrenschmidt. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-12 14:05:17 +10:00
Dale Farnsworth	873553b3d6	[POWERPC] 85xx: Failure with odd memory sizes and CONFIG_HIGHMEM The CONFIG_FSL_BOOKE mmu setup code fails when CONFIG_HIGHMEM=y and the 3 fixed TLB entries cannot exactly map the lowmem size. Each TLB entry can map 4MB, 16MB, 64MB or 256MB, so the failure is observed when the kernel lowmem size is not equal to the sum of up to 3 of those values. Normally, memory is sized in nice numbers, but I observed this problem while testing a crash dump kernel. The failure can also be observed by artificially reducing the kernel's main memory via the mem= kernel command line parameter. This commit fixes the problem by setting __initial_memory_limit in adjust_total_lowmem(). Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-10-08 08:38:34 -05:00
John Traill	544cdabe64	[POWERPC] 8xx: Set initial memory limit. The 8xx can only support a max of 8M during early boot (it seems a lot of 8xx boards only have 8M so the bug was never triggered), but the early allocator isn't aware of this. The following change makes it able to run with larger memory. Signed-off-by: John Traill <john.traill@freescale.com> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-10-03 20:36:36 -05:00
Ed Swarthout	df174e3be8	[POWERPC] Add memory regions to the kcore list for 32-bit machines The entries are only 32-bit, so restrict the virtual address to stay below 0xffff_ffff. With KERNELBASE set to 0xc000_0000, this in effect restricts access to the first 1GB of real memory. Make setup_kcore conditional on CONFIG_PROC_KCORE for both 32/64. Signed-off-by: Ed Swarthout <Ed.Swarthout@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-03 09:12:06 +10:00
Stephen Rothwell	2578bfae84	[POWERPC] Create and use CONFIG_WORD_SIZE Linus made this suggestion for the x86 merge and this starts the process for powerpc. We assume that CONFIG_PPC64 implies CONFIG_PPC_MERGE and CONFIG_PPC_STD_MMU_32 implies CONFIG_PPC_STD_MMU. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-03 09:12:02 +10:00
Michael Neuling	00efee7d5d	[POWERPC] Remove barriers from the SLB shadow buffer update After talking to an IBM POWER hypervisor (PHYP) design and development guy, there seems to be no need for memory barriers when updating the SLB shadow buffer provided we only update it from the current CPU, which we do. Also, these guys see no need in the future for these barriers. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-09-19 14:40:54 +10:00
Olof Johansson	a302cb9d95	[POWERPC] Export new __io{re,un}map_at() symbols Export new __io{re,un}map_at() symbols so modules can use them. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-09-14 01:33:21 +10:00
Paul Mackerras	35438c4327	Merge branch 'linux-2.6' into for-2.6.24	2007-08-28 15:56:11 +10:00
Paul Mackerras	175587cca7	[POWERPC] Fix SLB initialization at boot time This partially reverts `edd0622bd2`. It turns out that the part of that commit that aimed to ensure that we created an SLB entry for the kernel stack on secondary CPUs when starting the CPU didn't achieve its aim, and in fact caused a regression, because get_paca()->kstack is not initialized at the point where slb_initialize is called. This therefore just reverts that part of that commit, while keeping the change to slb_flush_and_rebolt, which is correct and necessary. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-25 16:58:43 +10:00
Josh Boyer	4d922c8dc3	[POWERPC] 40x MMU Add MMU definitions for 40x platforms. Also fixes two warnings in 40x_mmu.c. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>	2007-08-20 07:28:48 -05:00
Josh Boyer	15f6527e8e	[POWERPC] Rename 4xx paths to 40x 4xx is a bit of a misnomer for certain things, as they really apply to PowerPC 40x only. Rename some of the files to clean this up. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>	2007-08-20 07:27:07 -05:00
Stephen Rothwell	e8ff0646e5	[POWERPC] Tidy up CONFIG_PPC_MM_SLICES code This removes some of the #ifdefs from .c files. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-17 11:01:59 +10:00
Stephen Rothwell	9dfe5c53d0	[POWERPC] Fix non HUGETLB_PAGE build warning arch/powerpc/mm/mmu_context_64.c: In function 'init_new_context': arch/powerpc/mm/mmu_context_64.c:31: warning: unused variable 'new_context' Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-17 11:01:58 +10:00
Jesper Juhl	9420dc65ff	[POWERPC] Clean out a bunch of duplicate includes This removes several duplicate includes from arch/powerpc/. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-17 11:01:51 +10:00
Ilpo JÃ¤rvinen	2b02d13996	[POWERPC] Fix invalid semicolon after if statement A similar fix to netfilter from Eric Dumazet inspired me to look around a bit by using some grep/sed stuff as looking for this kind of bugs seemed easy to automate. This is one of them I found where it looks like this semicolon is not valid. Signed-off-by: Ilpo JÃ¤rvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-17 10:48:52 +10:00
Benjamin Herrenschmidt	d1f5a77f2c	[POWERPC] Fix size check for hugetlbfs My "slices" address space management code that was added in the 2.6.22 implementation of get_unmapped_area() doesn't properly check that the size is a multiple of the requested page size. This allows userland to create VMAs that aren't a multiple of the huge page size with hugetlbfs (since hugetlbfs entirely relies on get_unmapped_area() to do that checking) which leads to a kernel BUG() when such areas are torn down. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-10 21:04:42 +10:00
Paul Mackerras	edd0622bd2	[POWERPC] Fix potential duplicate entry in SLB shadow buffer We were getting a duplicate entry in the SLB shadow buffer in slb_flush_and_rebolt() if the kernel stack was in the same segment as PAGE_OFFSET, which on POWER6 causes the hypervisor to terminate the partition with an error. This fixes it. Also we were not creating an SLB entry (or an SLB shadow buffer entry) for the kernel stack on secondary CPUs when starting the CPU. This isn't a major problem, since an appropriate entry will be created on demand, but this fixes that also for consistency. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-10 21:04:07 +10:00
Michael Neuling	67439b76f2	[POWERPC] Fixes for the SLB shadow buffer code On a machine with hardware 64kB pages and a kernel configured for a 64kB base page size, we need to change the vmalloc segment from 64kB pages to 4kB pages if some driver creates a non-cacheable mapping in the vmalloc area. However, we never updated with SLB shadow buffer. This fixes it. Thanks to paulus for finding this. Also added some write barriers to ensure the shadow buffer contents are always consistent. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-03 19:36:01 +10:00
Michael Ellerman	b9c3fdb0f0	[POWERPC] Fix parse_drconf_memory() for 64-bit start addresses Some new machines use the "ibm,dynamic-reconfiguration-memory" property to provide memory layout information, rather than via memory nodes. There is a bug in the code to parse this property for start addresses over 4GB; we store the start address in an unsigned int, which means we throw away the high bits and add apparently duplicate regions. This results in a BUG() in free_bootmem_core(). This fixes it by using an unsigned long instead. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-03 19:36:00 +10:00
Paul Mackerras	430404ed9c	[POWERPC] Fix special PTE code for secondary hash bucket The code for mapping special 4k pages on kernels using a 64kB base page size was missing the code for doing the RPN (real page number) manipulation when inserting the hardware PTE in the secondary hash bucket. It needs the same code as has already been added to the code that inserts the HPTE in the primary hash bucket. This adds it. Spotted by Ben Herrenschmidt. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-03 19:16:11 +10:00
Manish Ahuja	56d6d1a73d	[POWERPC] Fix loop with unsigned long counter variable This fixes a possible infinite loop when the unsigned long counter "i" is used in lmb_add_region() in the following for loop: for (i = rgn->cnt-1; i >= 0; i--) by making the loop counter "i" be signed. Signed-off-by: Manish Ahuja <ahuja@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-07-26 16:12:17 +10:00
Linus Torvalds	dc79747019	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Clean up duplicate includes in drivers/macintosh/ [POWERPC] Quiet section mismatch warning on pcibios_setup [POWERPC] init and exit markings for hvc_iseries [POWERPC] Quiet section mismatch in hvc_rtas.c [POWERPC] Constify of_platform_driver match_table [POWERPC] hvcs: Make some things static and const [POWERPC] Constify of_platform_driver name [POWERPC] MPIC protected sources [POWERPC] of_detach_node()'s device node argument cannot be const [POWERPC] Fix ARCH=ppc builds [POWERPC] mv64x60: Use mutex instead of semaphore [POWERPC] Allow smp_call_function_single() to current cpu [POWERPC] Allow exec faults on readable areas on classic 32-bit PowerPC [POWERPC] Fix future firmware feature fixups function failure [POWERPC] fix showing xmon help [POWERPC] Make xmon_write accept a const buffer [POWERPC] Fix misspelled "CONFIG_CHECK_CACHE_COHERENCY" Kconfig option. [POWERPC] cell: CONFIG_SPE_BASE is a typo	2007-07-22 11:17:35 -07:00
Paul Mackerras	08ae6cc15d	[POWERPC] Allow exec faults on readable areas on classic 32-bit PowerPC Classic 32-bit PowerPC CPUs, and the early 64-bit PowerPC CPUs, don't provide a way to prevent execution from readable pages, that is, the MMU doesn't distinguish between data reads and instruction reads, although a different exception is taken for faults in data accesses and instruction accesses. Commit `9ba4ace39f`, in the course of fixing another bug, added a check that meant that a page fault due to an instruction access would fail if the vma did not have the VM_EXEC flag set. This gives an inconsistent enforcement on these CPUs of the no-execute status of the vma (since reading from the page is sufficient to allow subsequent execution from it), and causes old versions of ppc32 glibc (2.2 and earlier) to fail, since they rely on executing the word before the GOT but don't have it marked executable. This fixes the problem by allowing execution from readable (or writable) areas on CPUs which do not provide separate control over data and instruction reads. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Jon Loeliger <jdl@freescale.com>	2007-07-22 21:30:58 +10:00
Geert Uytterhoeven	1e57ba8ddd	[POWERPC] cell: CONFIG_SPE_BASE is a typo The config symbol for SPE support is called CONFIG_SPU_BASE, not CONFIG_SPE_BASE. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-07-22 21:30:57 +10:00
Mariusz Kozlowski	97d22d26b4	powerpc: tlb_32.c build fix allnoconfig results in this: CC arch/powerpc/mm/tlb_32.o In file included from include/asm/tlb.h:60, from arch/powerpc/mm/tlb_32.c:30: include/asm-generic/tlb.h: In function 'tlb_flush_mmu': include/asm-generic/tlb.h:76: error: implicit declaration of function 'release_pages' include/asm-generic/tlb.h: In function 'tlb_remove_page': include/asm-generic/tlb.h:105: error: implicit declaration of function 'page_cache_release' Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:16 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Nick Piggin	83c54070ee	mm: fault feedback #2 This patch completes Linus's wish that the fault return codes be made into bit flags, which I agree makes everything nicer. This requires requires all handle_mm_fault callers to be modified (possibly the modifications should go further and do things like fault accounting in handle_mm_fault -- however that would be for another patch). [akpm@linux-foundation.org: fix alpha build] [akpm@linux-foundation.org: fix s390 build] [akpm@linux-foundation.org: fix sparc build] [akpm@linux-foundation.org: fix sparc64 build] [akpm@linux-foundation.org: fix ia64 build] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Still apparently needs some ARM and PPC loving - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Paul Mackerras	bf22f6fe2d	Merge branch 'for-2.6.23' into merge	2007-07-11 13:28:26 +10:00
Manish Ahuja	b3e998ee05	[POWERPC] Remove extra return statement Found 2 instances of return one right after each other in arch_add_memory(). This removes the superfluous one. Signed-off-by: Manish Ahuja <mahuja@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-07-10 22:01:01 +10:00
Kumar Gala	74a0ba61b1	[POWERPC] Move inline asm eieio to using eieio inline function Use the eieio function so we can redefine what eieio does rather than direct inline asm. This is part code clean up and partially because not all PPCs have eieio (book-e has mbar that maps to eieio). Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-07-10 00:33:14 -05:00
Segher Boessenkool	9ba4ace39f	[POWERPC] PowerPC: Prevent data exception in kernel space (32-bit) The "is_exec" branch of the protection check in do_page_fault() didn't do anything on 32-bit PowerPC. So if a userland program jumps to a page with Linux protection flags "---p", all the tests happily fall through, and handle_mm_fault() is called, which in turn calls handle_pte_fault(), which calls update_mmu_cache(), which goes flush the dcache to a page with no access rights. Boom. This fixes it. Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-20 22:07:38 +10:00
David Gibson	8e561e7eda	[POWERPC] Kill typedef-ed structs for hash PTEs and BATs Using typedefs to rename structure types if frowned on by CodingStyle. However, we do so for the hash PTE structure on both ppc32 (where it's called "PTE") and ppc64 (where it's called "hpte_t"). On ppc32 we also have such a typedef for the BATs ("BAT"). This removes this unhelpful use of typedefs, in the process bringing ppc32 and ppc64 closer together, by using the name "struct hash_pte" in both cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:16 +10:00
David Gibson	c0770f686c	[POWERPC] Remove a couple of unused definitions from pgtable_32.c In arch/powerpc/mm/pgtable_32.c, the variable io_bat_index and the macro is_power_of_4() no longer have any users. This removes them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:15 +10:00
David Gibson	f21f49ea63	[POWERPC] Remove the dregs of APUS support from arch/powerpc APUS (the Amiga Power-Up System) is not supported under arch/powerpc and it's unlikely it ever will be. Therefore, this patch removes the fragments of APUS support code from arch/powerpc which have been copied from arch/ppc. A few APUS references are left in asm-powerpc in .h files which are still used from arch/ppc. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:15 +10:00
David Gibson	90ac19a8b2	[POWERPC] Abolish iopa(), mm_ptov(), io_block_mapping() from arch/powerpc These old-fashioned IO mapping functions no longer have any callers in code which remains relevant on arch/powerpc. Therefore, this removes them from arch/powerpc. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:15 +10:00
will schmidt	effe24bdd4	[POWERPC] During VM oom condition, kill all threads in process group We have had complaints where a threaded application is left in a bad state after one of it's threads is killed when we hit a VM: out_of_memory condition. Killing just one of the process threads can leave the application in a bad state, whereas killing the entire process group would allow for the application to restart, or be otherwise handled, and makes it very obvious that something has gone wrong. This change allows the entire process group to be taken down, rather than just the one thread. lightly tested on powerpc Signed-off-by: Will <will_schmidt@vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:59 +10:00
Benjamin Herrenschmidt	3d5134ee83	[POWERPC] Rewrite IO allocation & mapping on powerpc64 This rewrites pretty much from scratch the handling of MMIO and PIO space allocations on powerpc64. The main goals are: - Get rid of imalloc and use more common code where possible - Simplify the current mess so that PIO space is allocated and mapped in a single place for PCI bridges - Handle allocation constraints of PIO for all bridges including hot plugged ones within the 2GB space reserved for IO ports, so that devices on hotplugged busses will now work with drivers that assume IO ports fit in an int. - Cleanup and separate tracking of the ISA space in the reserved low 64K of IO space. No ISA -> Nothing mapped there. I booted a cell blade with IDE on PIO and MMIO and a dual G5 so far, that's it :-) With this patch, all allocations are done using the code in mm/vmalloc.c, though we use the low level __get_vm_area with explicit start/stop constraints in order to manage separate areas for vmalloc/vmap, ioremap, and PCI IOs. This greatly simplifies a lot of things, as you can see in the diffstat of that patch :-) A new pair of functions pcibios_map/unmap_io_space() now replace all of the previous code that used to manipulate PCI IOs space. The allocation is done at mapping time, which is now called from scan_phb's, just before the devices are probed (instead of after, which is by itself a bug fix). The only other caller is the PCI hotplug code for hot adding PCI-PCI bridges (slots). imalloc is gone, as is the "sub-allocation" thing, but I do beleive that hotplug should still work in the sense that the space allocation is always done by the PHB, but if you unmap a child bus of this PHB (which seems to be possible), then the code should properly tear down all the HPTE mappings for that area of the PHB allocated IO space. I now always reserve the first 64K of IO space for the bridge with the ISA bus on it. I have moved the code for tracking ISA in a separate file which should also make it smarter if we ever are capable of hot unplugging or re-plugging an ISA bridge. This should have a side effect on platforms like powermac where VGA IOs will no longer work. This is done on purpose though as they would have worked semi-randomly before. The idea at this point is to isolate drivers that might need to access those and fix them by providing a proper function to obtain an offset to the legacy IOs of a given bus. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:56 +10:00
Benjamin Herrenschmidt	c19c03fc74	[POWERPC] unmap_vm_area becomes unmap_kernel_range for the public This makes unmap_vm_area static and a wrapper around a new exported unmap_kernel_range that takes an explicit range instead of a vm_area struct. This makes it more versatile for code that wants to play with kernel page tables outside of the standard vmalloc area. (One example is some rework of the PowerPC PCI IO space mapping code that depends on that patch and removes some code duplication and horrible abuse of forged struct vm_struct). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:56 +10:00
Jon Tollefson	3f1df7a260	[POWERPC] Move common code out of if/else Move common code out of if/else. Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> ---- hash_native_64.c \| 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:55 +10:00
Kumar Gala	f1aed92464	[POWERPC] Fix modpost warning Mark pte_alloc_one_kernel as __init_refok to fix the following warning: WARNING: arch/powerpc/mm/built-in.o(.text+0x1068): Section mismatch: reference to .init.text:early_get_page (between 'pte_alloc_one_kernel' and 'steal_context') Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-05-23 07:49:37 -05:00
Benjamin Herrenschmidt	5453e7723b	[POWERPC] Fix warning in 32-bit builds with CONFIG_HIGHMEM Some missing fixup for the removal of 4 level fixup header. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-22 20:20:57 +10:00
Alexey Dobriyan	e8edc6e03a	Detach sched.h from mm.h First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-21 09:18:19 -07:00
Jon Tollefson	5b82583185	[POWERPC] Correct #endif comment Fix up comment on two #endifs to match their #ifs. Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> ---- hash_utils_64.c \| 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-17 21:11:19 +10:00
Benjamin Herrenschmidt	017e3c53f1	[POWERPC] Add spinlock to request_phb_iospace() request_phb_iospace() can be called from different CPUs at init time (at least with my next patch) and thus needs a spinlock. As for the next patch, this is a temporary workaround for 2.6.22 issues until my rewrite of IO mappings is ready (for 2.6.23) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-17 21:11:14 +10:00
Kumar Gala	991eb43af9	[POWERPC] Fix COMMON symbol warnings We get the following warnings in various ARCH=powerpc builds: WARNING: "ee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol WARNING: "fee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol WARNING: "htab_hash_searches" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "next_slot" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "mmu_hash_lock" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "primary_pteg_full" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "global_dbcr0" [arch/powerpc/kernel/built-in] is COMMON symbol Switch to moving local symbols (except mmu_hash_lock which is global) and space directive instead. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-05-17 21:10:15 +10:00
Stephen Rothwell	8980ae8677	[POWERPC] Remove unused variable in hpte_decode() Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-12 11:32:47 +10:00
Stephen Rothwell	0c12fe5697	[POWERPC] Assign correct variable in hpte_decode() This case will never be hit, but it should be corrected anyway. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-12 11:32:47 +10:00
Paul Mackerras	2454c7e98c	[POWERPC] Fix warning in hpte_decode(), and generalize it This adds the necessary support to hpte_decode() to handle 1TB segments and 16GB pages, and removes an uninitialized value warning on avpn. We don't have any code to generate HPTEs for 1TB segments or 16GB pages yet, so this is mostly for completeness, and to fix the warning. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2007-05-10 21:28:13 +10:00
Linus Torvalds	aabded9c3a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32 [POWERPC] EEH: log all PCI-X and PCI-E AER registers [POWERPC] EEH: capture and log pci state on error [POWERPC] EEH: Split up long error msg [POWERPC] EEH: log error only after driver notification. [POWERPC] fsl_soc: Make mac_addr const in fs_enet_of_init(). [POWERPC] Don't use SLAB/SLUB for PTE pages [POWERPC] Spufs support for 64K LS mappings on 4K kernels [POWERPC] Add ability to 4K kernel to hash in 64K pages [POWERPC] Introduce address space "slices" [POWERPC] Small fixes & cleanups in segment page size demotion [POWERPC] iSeries: Make HVC_ISERIES the default [POWERPC] iSeries: suppress build warning in lparmap.c [POWERPC] Mark pages that don't exist as nosave [POWERPC] swsusp: Introduce register_nosave_region_late	2007-05-09 12:56:01 -07:00
Rafael J. Wysocki	8bb7844286	Add suspend-related notifications for CPU hotplug Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
David Gibson	f1a1eb299a	[POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32 Commit `d1953c8888` removed the use of 4level-fixup.h for 32-bit systems under arch/powerpc. However, I missed a few things activated on some configurations, resulting in some warnings (at least with STRICT_MM_TYPECHECKS enabled) and build errors in some circumstances. This fixes it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:01 +10:00
Hugh Dickins	517e22638c	[POWERPC] Don't use SLAB/SLUB for PTE pages The SLUB allocator relies on struct page fields first_page and slab, overwritten by ptl when SPLIT_PTLOCK: so the SLUB allocator cannot then be used for the lowest level of pagetable pages. This was obstructing SLUB on PowerPC, which uses kmem_caches for its pagetables. So convert its pte level to use normal gfp pages (whereas pmd, pud and 64k-page pgd want partpages, so continue to use kmem_caches for pmd, pud and pgd). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt	16c2d47623	[POWERPC] Add ability to 4K kernel to hash in 64K pages This adds the ability for a kernel compiled with 4K page size to have special slices containing 64K pages and hash the right type of hash PTEs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt	d0f13e3c20	[POWERPC] Introduce address space "slices" The basic issue is to be able to do what hugetlbfs does but with different page sizes for some other special filesystems; more specifically, my need is: - Huge pages - SPE local store mappings using 64K pages on a 4K base page size kernel on Cell - Some special 4K segments in 64K-page kernels for mapping a dodgy type of powerpc-specific infiniband hardware that requires 4K MMU mappings for various reasons I won't explain here. The main issues are: - To maintain/keep track of the page size per "segment" (as we can only have one page size per segment on powerpc, which are 256MB divisions of the address space). - To make sure special mappings stay within their allotted "segments" (including MAP_FIXED crap) - To make sure everybody else doesn't mmap/brk/grow_stack into a "segment" that is used for a special mapping Some of the necessary mechanisms to handle that were present in the hugetlbfs code, but mostly in ways not suitable for anything else. The patch relies on some changes to the generic get_unmapped_area() that just got merged. It still hijacks hugetlb callbacks here or there as the generic code hasn't been entirely cleaned up yet but that shouldn't be a problem. So what is a slice ? Well, I re-used the mechanism used formerly by our hugetlbfs implementation which divides the address space in "meta-segments" which I called "slices". The division is done using 256MB slices below 4G, and 1T slices above. Thus the address space is divided currently into 16 "low" slices and 16 "high" slices. (Special case: high slice 0 is the area between 4G and 1T). Doing so simplifies significantly the tracking of segments and avoids having to keep track of all the 256MB segments in the address space. While I used the "concepts" of hugetlbfs, I mostly re-implemented everything in a more generic way and "ported" hugetlbfs to it. Slices can have an associated page size, which is encoded in the mmu context and used by the SLB miss handler to set the segment sizes. The hash code currently doesn't care, it has a specific check for hugepages, though I might add a mechanism to provide per-slice hash mapping functions in the future. The slice code provide a pair of "generic" get_unmapped_area() (bottomup and topdown) functions that should work with any slice size. There is some trickiness here so I would appreciate people to have a look at the implementation of these and let me know if I got something wrong. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt	16f1c74675	[POWERPC] Small fixes & cleanups in segment page size demotion The code for demoting segments to 4K had some issues, like for example, when using _PAGE_4K_PFN flag, the first CPU to hit it would do the demotion, but other CPUs hitting the same page wouldn't properly flush their SLBs if mmu_ci_restriction isn't set. There are also potential issues with hash_preload not handling _PAGE_4K_PFN. All of these are non issues on current hardware but might bite us in the future. This patch thus fixes it by: - Taking the test comparing the mm and current CPU context page sizes to decide to flush SLBs out of the mmu_ci_restrictions test since that can also be triggered by _PAGE_4K_PFN pages - Due to the above being done all the time, demote_segment_4k doesn't need update the context and flush the SLB - demote_segment_4k can be static and doesn't need an EXPORT_SYMBOL - Making hash_preload ignore anything that has either _PAGE_4K_PFN or _PAGE_NO_CACHE set, thus avoiding duplication of the complicated logic in hash_page() (and possibly making hash_preload a little bit faster for the normal case). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Johannes Berg	4e8ad3e816	[POWERPC] Mark pages that don't exist as nosave On some powerpc architectures (notably 64-bit powermac) there is a memory hole, for example on powermacs between 2G and 4G. Since we use the flat memory model regardless, these pages must be marked as nosave (for suspend to disk.) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:34:56 +10:00
Linus Torvalds	df6d3916f3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (77 commits) [POWERPC] Abolish powerpc_flash_init() [POWERPC] Early serial debug support for PPC44x [POWERPC] Support for the Ebony 440GP reference board in arch/powerpc [POWERPC] Add device tree for Ebony [POWERPC] Add powerpc/platforms/44x, disable platforms/4xx for now [POWERPC] MPIC U3/U4 MSI backend [POWERPC] MPIC MSI allocator [POWERPC] Enable MSI mappings for MPIC [POWERPC] Tell Phyp we support MSI [POWERPC] RTAS MSI implementation [POWERPC] PowerPC MSI infrastructure [POWERPC] Rip out the existing powerpc msi stubs [POWERPC] Remove use of 4level-fixup.h for ppc32 [POWERPC] Add powerpc PCI-E reset API implementation [POWERPC] Holly bootwrapper [POWERPC] Holly DTS [POWERPC] Holly defconfig [POWERPC] Add support for 750CL Holly board [POWERPC] Generalize tsi108 PCI setup [POWERPC] Generalize tsi108 PHY types ... Fixed conflict in include/asm-powerpc/kdebug.h manually Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:50:19 -07:00
Randy Dunlap	e63340ae6b	header cleaning: don't include smp_lock.h when not used Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:07 -07:00
Christoph Hellwig	1eeb66a1bb	move die notifier handling to common code This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Akinobu Mita	0e6b9c98be	use SLAB_PANIC flag cleanup Use SLAB_PANIC and delete duplicated panic(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Ian Molton <spyro@f2s.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:57 -07:00
David Gibson	d1953c8888	[POWERPC] Remove use of 4level-fixup.h for ppc32 For 32-bit systems, powerpc still relies on the 4level-fixup.h hack, to pretend that the generic pagetable handling stuff is 3-levels rather than 4. This patch removes this, instead using the newer pgtable-nopmd.h to handle the elision of both the pud and pmd pagetable levels (ppc32 pagetables are actually 2 levels). This removes a little extraneous code, and makes it more easily compared to the 64-bit pagetable code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-08 13:40:31 +10:00
Paul Mackerras	02bbc0f09c	Merge branch 'linux-2.6'	2007-05-08 13:37:51 +10:00
Michael Ellerman	0108d3fe3c	[POWERPC] Add __init annotations to reserve_mem() and stabs_alloc() reserve_mem() and stabs_alloc() are both called only from other __init routines, so can be marked __init. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-08 11:54:19 +10:00
Benjamin Herrenschmidt	d506a77251	get_unmapped_area handles MAP_FIXED on powerpc The current get_unmapped_area code calls the f_ops->get_unmapped_area or the arch one (via the mm) only when MAP_FIXED is not passed. That makes it impossible for archs to impose proper constraints on regions of the virtual address space. To work around that, get_unmapped_area() then calls some hugetlbfs specific hacks. This cause several problems, among others: - It makes it impossible for a driver or filesystem to do the same thing that hugetlbfs does (for example, to allow a driver to use larger page sizes to map external hardware) if that requires applying a constraint on the addresses (constraining that mapping in certain regions and other mappings out of those regions). - Some archs like arm, mips, sparc, sparc64, sh and sh64 already want MAP_FIXED to be passed down in order to deal with aliasing issues. The code is there to handle it... but is never called. This series of patches moves the logic to handle MAP_FIXED down to the various arch/driver get_unmapped_area() implementations, and then changes the generic code to always call them. The hugetlbfs hacks then disappear from the generic code. Since I need to do some special 64K pages mappings for SPEs on cell, I need to work around the first problem at least. I have further patches thus implementing a "slices" layer that handles multiple page sizes through slices of the address space for use by hugetlbfs, the SPE code, and possibly others, but it requires that serie of patches first/ There is still a potential (but not practical) issue due to the fact that filesystems/drivers implemeting g_u_a will effectively bypass all arch checks. This is not an issue in practice as the only filesystems/drivers using that hook are doing so for arch specific purposes in the first place. There is also a problem with mremap that will completely bypass all arch checks. I'll try to address that separately, I'm not 100% certain yet how, possibly by making it not work when the vma has a file whose f_ops has a get_unmapped_area callback, and by making it use is_hugepage_only_range() before expanding into a new area. Also, I want to turn is_hugepage_only_range() into a more generic is_normal_page_range() as that's really what it will end up meaning when used in stack grow, brk grow and mremap. None of the above "issues" however are introduced by this patch, they are already there, so I think the patch can go ini for 2.6.22. This patch: Handle MAP_FIXED in powerpc's arch_get_unmapped_area() in all 3 implementations of it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: William Irwin <bill.irwin@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Christoph Lameter	f0f3980b21	slab allocators: remove multiple alignment specifications It is not necessary to tell the slab allocators to align to a cacheline if an explicit alignment was already specified. It is rather confusing to specify multiple alignments. Make sure that the call sites only use one form of alignment. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Christoph Lameter	5af6083990	slab allocators: Remove obsolete SLAB_MUST_HWCACHE_ALIGN This patch was recently posted to lkml and acked by Pekka. The flag SLAB_MUST_HWCACHE_ALIGN is 1. Never checked by SLAB at all. 2. A duplicate of SLAB_HWCACHE_ALIGN for SLUB 3. Fulfills the role of SLAB_HWCACHE_ALIGN for SLOB. The only remaining use is in sparc64 and ppc64 and their use there reflects some earlier role that the slab flag once may have had. If its specified then SLAB_HWCACHE_ALIGN is also specified. The flag is confusing, inconsistent and has no purpose. Remove it. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Luke Browning	71bf08b6c0	[POWERPC] 64K page support for kexec This fixes a couple of kexec problems related to 64K page support in the kernel. kexec issues a tlbie for each pte. The parameters for the tlbie are the page size and the virtual address. Support was missing for the computation of these two parameters for 64K pages. This adds that support. Signed-off-by: Luke Browning <lukebrowning@us.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-07 20:31:12 +10:00
Christoph Hellwig	9f90b997de	[POWERPC] Minor fault path optimization Call the kprobes pagefault handler directly instead of going through the complex notifier chain. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 20:57:39 +10:00
David Gibson	57d7909e0d	[POWERPC] Revise PPC44x MMU code for arch/powerpc This patch takes the definitions for the PPC44x MMU (a software loaded TLB) from asm-ppc/mmu.h, cleans them up of things no longer necessary in arch/powerpc and puts them in a new asm-powerpc/mmu_44x.h file. It also substantially simplifies arch/powerpc/mm/44x_mmu.c and makes a couple of small fixes necessary for the 44x MMU code to build and work properly in arch/powerpc. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 20:04:29 +10:00
Michael Ellerman	ed16669298	[POWERPC] Initialise spinlock in the DEBUG_PAGEALLOC code Fixes: BUG: spinlock bad magic on CPU#0, swapper/0 lock: c00000000064ec30, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 Call Trace: [c00000000062b980] [c00000000000f920] .show_stack+0x6c/0x1a0 (unreliable) [c00000000062ba20] [c0000000001c2b40] .spin_bug+0xb0/0xd4 [c00000000062bab0] [c0000000001c2ed0] ._raw_spin_lock+0x44/0x184 [c00000000062bb50] [c0000000003a42b4] ._spin_lock+0x10/0x24 [c00000000062bbd0] [c00000000002b4dc] .kernel_map_pages+0x198/0x278 [c00000000062bc90] [c000000000079720] .free_hot_cold_page+0x124/0x418 [c00000000062bd70] [c000000000530278] .free_all_bootmem_core+0x14c/0x224 [c00000000062be50] [c00000000052a178] .mem_init+0x68/0x170 [c00000000062bee0] [c00000000051d874] .start_kernel+0x2a0/0x37c [c00000000062bf90] [c0000000000084c8] .start_here_common+0x54/0x8c Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 20:04:29 +10:00
Johannes Berg	a3cf4bdef0	[POWERPC] Remove unneeded page_is_ram export arch/powerpc/mm/mem.c exports page_is_ram, which is not used anywhere that could be modular. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 16:40:57 +10:00
David Gibson	37f01d64d8	[POWERPC] Abolish PHYS_FMT macro from arch/powerpc 32-bit powerpc systems define a macro, PHYS_FMT, giving a printf format string fragment for displaying physical addresses, since most 32-bit powerpc platforms use 32-bit physical addresses but a few use 64-bit physical addresses. This macro is used in exactly one place, a rare error message, where we can solve the problem more simply by just unconditionally casting the address up to 64-bit quantity before formatting it. This patch does so, meaning that as we bring MMU definitions from asm-ppc over to asm-powerpc, cleaning them up in the process, we don't need to implement this ugly macro (which additionally has a very bad name for something global). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-24 22:11:16 +10:00
David Gibson	6210230725	[POWERPC] Cleanup and fix breakage in tlbflush.h BenH's commit `a741e67969` in powerpc.git, although (AFAICT) only intended to affect ppc64, also has side-effects which break 44x. I think 40x, 8xx and Freescale Book E are also affected, though I haven't tested them. The problem lies in unconditionally removing flush_tlb_pending() from the versions of flush_tlb_mm(), flush_tlb_range() and flush_tlb_kernel_range() used on ppc64 - which are also used the embedded platforms mentioned above. The patch below cleans up the convoluted #ifdef logic in tlbflush.h, in the process restoring the necessary flushes for the software TLB platforms. There are three sets of definitions for the flushing hooks: the software TLB versions (revised to avoid using names which appear to related to TLB batching), the 32-bit hash based versions (external functions) amd the 64-bit hash based versions (which implement batching). It also moves the declaration of update_mmu_cache() to always be in tlbflush.h (previously it was in tlbflush.h except for PPC64, where it was in pgtable.h). Booted on Ebony (440GP) and compiled for 64-bit and 32-bit multiplatform. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-24 22:08:56 +10:00
Benjamin Herrenschmidt	370a908db1	[POWERPC] DEBUG_PAGEALLOC for 64-bit Here's an implementation of DEBUG_PAGEALLOC for 64 bits powerpc. It applies on top of the 32 bits patch. Unlike Anton's previous attempt, I'm not using updatepp. I'm removing the hash entries from the bolted mapping (using a map in RAM of all the slots). Expensive but it doesn't really matter, does it ? :-) Memory hot-added doesn't benefit from this unless it's added at an address that is below end_of_DRAM() as calculated at boot time. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> arch/powerpc/Kconfig.debug \| 2 arch/powerpc/mm/hash_utils_64.c \| 84 ++++++++++++++++++++++++++++++++++++++-- 2 files changed, 82 insertions(+), 4 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	88df6e90fa	[POWERPC] DEBUG_PAGEALLOC for 32-bit Here's an implementation of DEBUG_PAGEALLOC for ppc32. It disables BAT mapping and is only tested with Hash table based processor though it shouldn't be too hard to adapt it to others. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> arch/powerpc/Kconfig.debug \| 9 ++++++ arch/powerpc/mm/init_32.c \| 4 +++ arch/powerpc/mm/pgtable_32.c \| 52 +++++++++++++++++++++++++++++++++++++++ arch/powerpc/mm/ppc_mmu_32.c \| 4 ++- include/asm-powerpc/cacheflush.h \| 6 ++++ 5 files changed, 74 insertions(+), 1 deletion(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	ee4f2ea486	[POWERPC] Fix 32-bit mm operations when not using BATs On hash table based 32 bits powerpc's, the hash management code runs with a big spinlock. It's thus important that it never causes itself a hash fault. That code is generally safe (it does memory accesses in real mode among other things) with the exception of the actual access to the code itself. That is, the kernel text needs to be accessible without taking a hash miss exceptions. This is currently guaranteed by having a BAT register mapping part of the linear mapping permanently, which includes the kernel text. But this is not true if using the "nobats" kernel command line option (which can be useful for debugging) and will not be true when using DEBUG_PAGEALLOC implemented in a subsequent patch. This patch fixes this by pre-faulting in the hash table pages that hit the kernel text, and making sure we never evict such a page under hash pressure. Signed-off-by: Benjamin Herrenchmidt <benh@kernel.crashing.org> arch/powerpc/mm/hash_low_32.S \| 22 ++++++++++++++++++++-- arch/powerpc/mm/mem.c \| 3 --- arch/powerpc/mm/mmu_decl.h \| 4 ++++ arch/powerpc/mm/pgtable_32.c \| 11 +++++++---- 4 files changed, 31 insertions(+), 9 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	3be4e6990e	[POWERPC] Cleanup 32-bit map_page The 32 bits map_page() function is used internally by the mm code for early mmu mappings and for ioremap. It should never be called for an address that already has a valid PTE or hash entry, so we add a BUG_ON for that and remove the useless flush_HPTE call. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> arch/powerpc/mm/pgtable_32.c \| 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	a741e67969	[POWERPC] Make tlb flush batch use lazy MMU mode The current tlb flush code on powerpc 64 bits has a subtle race since we lost the page table lock due to the possible faulting in of new PTEs after a previous one has been removed but before the corresponding hash entry has been evicted, which can leads to all sort of fatal problems. This patch reworks the batch code completely. It doesn't use the mmu_gather stuff anymore. Instead, we use the lazy mmu hooks that were added by the paravirt code. They have the nice property that the enter/leave lazy mmu mode pair is always fully contained by the PTE lock for a given range of PTEs. Thus we can guarantee that all batches are flushed on a given CPU before it drops that lock. We also generalize batching for any PTE update that require a flush. Batching is now enabled on a CPU by arch_enter_lazy_mmu_mode() and disabled by arch_leave_lazy_mmu_mode(). The code epects that this is always contained within a PTE lock section so no preemption can happen and no PTE insertion in that range from another CPU. When batching is enabled on a CPU, every PTE updates that need a hash flush will use the batch for that flush. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:38 +10:00
Stephen Rothwell	e2eb63927b	[POWERPC] Rename get_property to of_get_property: arch/powerpc Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 03:55:19 +10:00
Paul Mackerras	721151d004	[POWERPC] Allow drivers to map individual 4k pages to userspace Some drivers have resources that they want to be able to map into userspace that are 4k in size. On a kernel configured with 64k pages we currently end up mapping the 4k we want plus another 60k of physical address space, which could contain anything. This can introduce security problems, for example in the case of an infiniband adaptor where the other 60k could contain registers that some other program is using for its communications. This patch adds a new function, remap_4k_pfn, which drivers can use to map a single 4k page to userspace regardless of whether the kernel is using a 4k or a 64k page size. Like remap_pfn_range, it would typically be called in a driver's mmap function. It only maps a single 4k page, which on a 64k page kernel appears replicated 16 times throughout a 64k page. On a 4k page kernel it reduces to a call to remap_pfn_range. The way this works on a 64k kernel is that a new bit, _PAGE_4K_PFN, gets set on the linux PTE. This alters the way that __hash_page_4K computes the real address to put in the HPTE. The RPN field of the linux PTE becomes the 4k RPN directly rather than being interpreted as a 64k RPN. Since the RPN field is 32 bits, this means that physical addresses being mapped with remap_4k_pfn have to be below 2^44, i.e. 0x100000000000. The patch also factors out the code in arch/powerpc/mm/hash_utils_64.c that deals with demoting a process to use 4k pages into one function that gets called in the various different places where we need to do that. There were some discrepancies between exactly what was done in the various places, such as a call to spu_flush_all_slbs in one case but not in others. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 03:55:18 +10:00
Stephen Rothwell	9213feea6e	[POWERPC] Rename prom_n_size_cells to of_n_size_cells This is more consistent and gets us closer to the Sparc code. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 03:55:18 +10:00
Stephen Rothwell	a8bda5dd4f	[POWERPC] Rename prom_n_addr_cells to of_n_addr_cells This is more consistent and gets us closer to the Sparc code. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 03:55:18 +10:00
Paul Mackerras	e049d1ca30	Merge branch 'linux-2.6' into for-2.6.22	2007-04-13 03:50:03 +10:00
Benjamin Herrenschmidt	94b2a4393c	[POWERPC] Fix spu SLB invalidations The SPU code doesn't properly invalidate SPUs SLBs when necessary, for example when changing a segment size from the hugetlbfs code. In addition, it saves and restores the SLB content on context switches which makes it harder to properly handle those invalidations. This patch removes the saving & restoring for now, something more efficient might be found later on. It also adds a spu_flush_all_slbs(mm) that can be used by the core mm code to flush the SLBs of all SPEs that are running a given mm at the time of the flush. In order to do that, it adds a spinlock to the list of all SPEs and move some bits & pieces from spufs to spu_base.c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2007-03-10 00:07:50 +01:00
David Gibson	eb6de28637	[POWERPC] Allow duplicate lmb_reserve() calls At present calling lmb_reserve() (and hence lmb_add_region()) twice for exactly the same memory region will cause strange behaviour. This makes life difficult when booting from a flat device tree with memory reserve map. Which regions are automatically reserved by the kernel has changed over time, so it's quite possible a newer kernel could attempt to auto-reserve a region which is also explicitly listed in the device tree's reserve map, leading to trouble. This patch avoids the problem by making lmb_reserve() ignore a call to reserve a previously reserved region. It also removes a now redundant test designed to avoid one specific case of the problem noted above. At present, this patch deals only with duplicate reservations of an identical region. Attempting to reserve two different, but overlapping regions will still cause problems. I might post another patch later dealing with this case, but I'm avoiding it now since it is substantially more complicated to deal with, less likely to occur and more likely to indicate a genuine bug elsewhere if it does occur. Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-03-08 15:43:28 +11:00
Linus Torvalds	874ff01bd9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits) Documentation/kernel-docs.txt update. arch/cris: typo in KERN_INFO Storage class should be before const qualifier kernel/printk.c: comment fix update I/O sched Kconfig help texts - CFQ is now default, not AS. Remove duplicate listing of Cris arch from README kbuild: more doc. cleanups doc: make doc. for maxcpus= more visible drivers/net/eexpress.c: remove duplicate comment add a help text for BLK_DEV_GENERIC correct a dead URL in the IP_MULTICAST help text fix the BAYCOM_SER_HDX help text fix SCSI_SCAN_ASYNC help text trivial documentation patch for platform.txt Fix typos concerning hierarchy Fix comment typo "spin_lock_irqrestore". Fix misspellings of "agressive". drivers/scsi/a100u2w.c: trivial typo patch Correct trivial typo in log2.h. Remove useless FIND_FIRST_BIT() macro from cardbus.c. ...	2007-02-19 13:29:02 -08:00
Uwe Kleine-König	1b3c3714cb	Fix typos concerning hierarchy heirarchical, hierachical -> hierarchical heirarchy, hierachy -> hierarchy Signed-off-by: Uwe Kleine-König <zeisberg@informatik.uni-freiburg.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-02-17 19:23:03 +01:00
Benjamin Herrenschmidt	a32525449b	[POWERPC] Fix bug with early ioremap and 64k pages The code for bolting hash entries for ioremap done before proper mm initialization has a grown a bug when using 64K pages on a machine where non-cacheable mappings are demoted to 4K HW pages. The wrong page size index is being passed to the hash table mapping functions causing a crash at boot on some pSeries machines using bare metal linux. This fixes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-02-16 14:00:20 +11:00
Benjamin Herrenschmidt	7ac9a13717	[POWERPC] Fix vDSO page count calculation The recent vDSO consolidation patches broke powerpc due to a mistake in the definition of MAXPAGES constants. This fixes it by moving to a dynamically allocated array of pages instead as I don't like much hard coded size limits. Also move the vdso initialisation to an initcall since it doesn't really need to be done -that- early. Applogies for not catching the breakage earlier, Roland _did_ CC me on his patches a while ago, I got busy with other things and forgot to test them. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-02-13 15:35:52 +11:00
Kumar Gala	8dabba5d1a	[POWERPC] Fix is_power_of_4(x) compile error When building an 85xx kernel we get: CC arch/powerpc/mm/pgtable_32.o arch/powerpc/mm/pgtable_32.c: In function 'io_block_mapping': arch/powerpc/mm/pgtable_32.c:330: error: expected identifier before '(' token arch/powerpc/mm/pgtable_32.c:330: error: expected statement before ')' token The is_power_of_2(x) fixup patch left an extra ')' on the is_power_of_4 macro. There is a similiar issue on the arch/ppc side. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-02-09 09:30:05 -06:00
Johannes Berg	bcff4948c6	[POWERPC] Remove bogus comment about page_is_ram arch/powerpc/mm/mem.c states that page_is_ram is called by the code that implements /dev/mem, which isn't true. Remove the comment. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-02-08 16:08:43 +11:00
Robert P. J. Day	63c2f782e8	[POWERPC] Add "is_power_of_2" checking to log2.h. Add the inline function "is_power_of_2()" to log2.h, where the value zero is not considered to be a power of two. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-02-07 14:03:19 +11:00
Vitaly Bordug	dbbb06b7f6	[POWERPC] 8xx: platform specific mmu updates This is just a straight port of the same done in arch/ppc by Marcelo Tosatti. One used to be [PATCH] ppc32 8xx: update_mmu_cache() needs unconditional tlbie, commit `eb07d964b4` In a nutshell, the board is nearly stuck without this, yet without any visible failure - being just very slow. Signed-off-by: Vitaly Bordug <vbordug@ru.mvista.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-02-07 12:00:32 +11:00
Ishizaki Kou	d649bd7b76	[POWERPC] TLB insertion cleanup This patch changes handling return value of ppc_md.hpte_insert() into the same way as __hash_page_*(). Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-01-24 21:13:59 +11:00
David Gibson	6aa3e1e944	[POWERPC] Fix bogus BUG_ON() in in hugetlb_get_unmapped_area() The powerpc specific version of hugetlb_get_unmapped_area() makes some unwarranted assumptions about what checks have been made to its parameters by its callers. This will lead to a BUG_ON() if a 32-bit process attempts to make a hugepage mapping which extends above TASK_SIZE (4GB). I'm not sure if these assumptions came about because they were valid with earlier versions of the get_unmapped_area() path, or if it was always broken. Nonetheless this patch fixes the logic, and removes the crash. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-01-09 17:03:01 +11:00
Robert P. J. Day	5cbded585d	[PATCH] getting rid of all casts of k[cmz]alloc() calls Run this: #!/bin/sh for f in $(grep -Erl "$[^$]\) k[cmz]alloc" ) ; do echo "De-casting $f..." perl -pi -e "s/ ?= ?$[^$]\) (k[cmz]alloc) \(/ = \1\(/" $f done And then go through and reinstate those cases where code is casting pointers to non-pointers. And then drop a few hunks which conflicted with outstanding work. Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Greg KH <greg@kroah.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Karsten Keil <kkeil@suse.de> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Ian Kent <raven@themaw.net> Cc: Steven French <sfrench@us.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:58 -08:00
Paul Mackerras	0204568a08	[POWERPC] Support ibm,dynamic-reconfiguration-memory nodes For PAPR partitions with large amounts of memory, the firmware has an alternative, more compact representation for the information about the memory in the partition and its NUMA associativity information. This adds the code to the kernel to parse this alternative representation. The other part of this patch is telling the firmware that we can handle the alternative representation. There is however a subtlety here, because the firmware will invoke a reboot if the memory representation we request is different from the representation that firmware is currently using. This is because firmware can't change the representation on the fly. Further, some firmware versions used on POWER5+ machines have a bug where this reboot leaves the machine with an altered value of load-base, which will prevent any kernel booting until it is reset to the normal value (0x4000). Because of this bug, we do NOT set fake_elf.rpanote.new_mem_def = 1, and thus we do not request the new representation on POWER5+ and earlier machines. We do request the new representation on POWER6, which uses the ibm,client-architecture-support call. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-12-11 13:49:49 +11:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
Chen, Kenneth W	39dde65c99	[PATCH] shared page table for hugetlb page Following up with the work on shared page table done by Dave McCracken. This set of patch target shared page table for hugetlb memory only. The shared page table is particular useful in the situation of large number of independent processes sharing large shared memory segments. In the normal page case, the amount of memory saved from process' page table is quite significant. For hugetlb, the saving on page table memory is not the primary objective (as hugetlb itself already cuts down page table overhead significantly), instead, the purpose of using shared page table on hugetlb is to allow faster TLB refill and smaller cache pollution upon TLB miss. With PT sharing, pte entries are shared among hundreds of processes, the cache consumption used by all the page table is smaller and in return, application gets much higher cache hit ratio. One other effect is that cache hit ratio with hardware page walker hitting on pte in cache will be higher and this helps to reduce tlb miss latency. These two effects contribute to higher application performance. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Hugh Dickins <hugh@veritas.com> Cc: Dave McCracken <dmccr@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Adam Litke <agl@us.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:21 -08:00
Stephen Rothwell	0470466dba	[POWERPC] Fix cputable.h for combined build Remove CPU_FTR_16M_PAGE from the cupfeatures mask at runtime on iSeries. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-12-04 20:41:59 +11:00
Arnd Bergmann	e22ba7e381	[POWERPC] ps3: multiplatform build fixes A few code paths need to check whether or not they are running on the PS3's LV1 hypervisor before making hcalls. This introduces a new firmware feature bit for this, FW_FEATURE_PS3_LV1. Now when both PS3 and IBM_CELL_BLADE are enabled, but not PSERIES, FW_FEATURE_PS3_LV1 and FW_FEATURE_LPAR get enabled at compile time, which is a bug. The same problem can also happen for (PPC_ISERIES && !PPC_PSERIES && PPC_SOMETHING_ELSE). In order to solve this, I introduce a new CONFIG_PPC_NATIVE option that is set when at least one platform is selected that can run without a hypervisor and then turns the firmware feature check into a run-time option. The new cell oprofile support that was recently merged does not work on hypervisor based platforms like the PS3, therefore make it depend on PPC_CELL_NATIVE instead of PPC_CELL. This may change if we get oprofile support for PS3. Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>	2006-12-04 20:41:16 +11:00
Geert Uytterhoeven	adaa3a7962	[POWERPC] setup_kcore(): Fix incorrect function name in panic() call. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-12-04 20:39:39 +11:00
Stephen Rothwell	56291e19e3	[POWERPC] iSeries: fix slb.c for combined build Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-12-04 20:39:19 +11:00
Benjamin Herrenschmidt	68a64357d1	[POWERPC] Merge 32 and 64 bits asm-powerpc/io.h powerpc: Merge 32 and 64 bits asm-powerpc/io.h The rework on io.h done for the new hookable accessors made it easier, so I just finished the work and merged 32 and 64 bits io.h for arch/powerpc. arch/ppc still uses the old version in asm-ppc, there is just too much gunk in there that I really can't be bothered trying to cleanup. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-12-04 20:39:05 +11:00
Benjamin Herrenschmidt	3d1ea8e8cb	[POWERPC] Remove ioremap64 and fixup_bigphys_addr In order to suppose platforms with devices above 4Gb on 32 bits platforms with a >32 bits physical address space, we used to have a special ioremap64 along with a fixup routine fixup_bigphys_addr. This shouldn't be necessary anymore as struct resource now supports 64 bits addresses even on 32 bits archs. This patch enables that option when CONFIG_PHYS_64BIT is set and removes ioremap64 and fixup_bigphys_addr. This is a preliminary work for the upcoming merge of 32 and 64 bits io.h Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-12-04 20:39:04 +11:00
Benjamin Herrenschmidt	4cb3cee03d	[POWERPC] Allow hooking of PCI MMIO & PIO accessors on 64 bits This patch reworks the way iSeries hooks on PCI IO operations (both MMIO and PIO) and provides a generic way for other platforms to do so (we have need to do that for various other platforms). While reworking the IO ops, I ended up doing some spring cleaning in io.h and eeh.h which I might want to split into 2 or 3 patches (among others, eeh.h had a lot of useless stuff in it). A side effect is that EEH for PIO should work now (it used to pass IO ports down to the eeh address check functions which is bogus). Also, new are MMIO "repeat" ops, which other archs like ARM already had, and that we have too now: readsb, readsw, readsl, writesb, writesw, writesl. In the long run, I might also make EEH use the hooks instead of wrapping at the toplevel, which would make things even cleaner and relegate EEH completely in platforms/iseries, but we have to measure the performance impact there (though it's really only on MMIO reads) Since I also need to hook on ioremap, I shuffled the functions a bit there. I introduced ioremap_flags() to use by drivers who want to pass explicit flags to ioremap (and it can be hooked). The old __ioremap() is still there as a low level and cannot be hooked, thus drivers who use it should migrate unless they know they want the low level version. The patch "arch provides generic iomap missing accessors" (should be number 4 in this series) is a pre-requisite to provide full iomap API support with this patch. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-12-04 20:38:52 +11:00
Paul Mackerras	79acbb3ff2	Merge branch 'linux-2.6' into for-linus	2006-12-04 15:59:07 +11:00
Hugh Dickins	68589bc353	[PATCH] hugetlb: prepare_hugepage_range check offset too (David:) If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example, because the given file offset is not hugepage aligned - then do_mmap_pgoff will go to the unmap_and_free_vma backout path. But at this stage the vma hasn't been marked as hugepage, and the backout path will call unmap_region() on it. That will eventually call down to the non-hugepage version of unmap_page_range(). On ppc64, at least, that will cause serious problems if there are any existing hugepage pagetable entries in the vicinity - for example if there are any other hugepage mappings under the same PUD. unmap_page_range() will trigger a bad_pud() on the hugepage pud entries. I suspect this will also cause bad problems on ia64, though I don't have a machine to test it on. (Hugh:) prepare_hugepage_range() should check file offset alignment when it checks virtual address and length, to stop MAP_FIXED with a bad huge offset from unmapping before it fails further down. PowerPC should apply the same prepare_hugepage_range alignment checks as ia64 and all the others do. Then none of the alignment checks in hugetlbfs_file_mmap are required (nor is the check for too small a mapping); but even so, move up setting of VM_HUGETLB and add a comment to warn of what David Gibson discovered - if hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region when unwinding from error will go the non-huge way, which may cause bad behaviour on architectures (powerpc and ia64) which segregate their huge mappings into a separate region of the address space. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-14 09:09:27 -08:00
Michael Ellerman	a416dd8d9c	[PATCH] Do a single one-line printk in bad_page_fault() bad_page_fault() prints a message telling the user what type of bad fault we took. The first line of this message is currently implemented as two separate printks. This has the unfortunate effect that if several cpus simultaneously take a bad fault, the first and second parts of the printk get jumbled up, which looks dodge and is hard to read. So do a single one-line printk for each fault type. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-11-13 14:48:56 +11:00
Hugh Dickins	96268889ee	[POWERPC] Make high hugepage areas preempt safe Checking source for other get_paca()->field preemption dangers found that open_high_hpage_areas does a structure copy into its paca while preemption is enabled: unsafe however gcc accomplishes it. Just remove that copy: it's done safely afterwards by on_each_cpu, as in open_low_hpage_areas. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-11-01 14:52:48 +11:00
Geoff Levand	035223fb28	[POWERPC] Make pSeries_lpar_hpte_insert static Change the powerpc hpte_insert routines now called through ppc_md to static scope. Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-10-16 16:33:04 +10:00
Mel Gorman	6391af174a	[PATCH] mm: use symbolic names instead of indices for zone initialisation Arch-independent zone-sizing is using indices instead of symbolic names to offset within an array related to zones (max_zone_pfns). The unintended impact is that ZONE_DMA and ZONE_NORMAL is initialised on powerpc instead of ZONE_DMA and ZONE_HIGHMEM when CONFIG_HIGHMEM is set. As a result, the the machine fails to boot but will boot with CONFIG_HIGHMEM turned off. The following patch properly initialises the max_zone_pfns[] array and uses symbolic names instead of indices in each architecture using arch-independent zone-sizing. Two users have successfully booted their powerpcs with it (one an ibook G4). It has also been boot tested on x86, x86_64, ppc64 and ia64. Please merge for 2.6.19-rc2. Credit to Benjamin Herrenschmidt for identifying the bug and rolling the first fix. Additional credit to Johannes Berg and Andreas Schwab for reporting the problem and testing on powerpc. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-11 11:14:14 -07:00
Paul Mackerras	c730f5b621	Merge branch 'master' of git://oak/home/sfr/kernels/iseries/work	2006-10-04 15:02:27 +10:00
Stephen Rothwell	3f639ee8c5	[POWERPC] implement BEGIN/END_FW_FTR_SECTION and use it an all the obvious places in assembler code. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>	2006-10-03 16:50:21 +10:00
Sukadev Bhattiprolu	f400e198b2	[PATCH] pidspace: is_init() This is an updated version of Eric Biederman's is_init() patch. (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and replaces a few more instances of ->pid == 1 with is_init(). Further, is_init() checks pid and thus removes dependency on Eric's other patches for now. Eric's original description: There are a lot of places in the kernel where we test for init because we give it special properties. Most significantly init must not die. This results in code all over the kernel test ->pid == 1. Introduce is_init to capture this case. With multiple pid spaces for all of the cases affected we are looking for only the first process on the system, not some other process that has pid == 1. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: <lxc-devel@lists.sourceforge.net> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:12 -07:00
Jason Baron	df67b3daea	[PATCH] make PROT_WRITE imply PROT_READ Make PROT_WRITE imply PROT_READ for a number of architectures which don't support write only in hardware. While looking at this, I noticed that some architectures which do not support write only mappings already take the exact same approach. For example, in arch/alpha/mm/fault.c: " if (cause < 0) { if (!(vma->vm_flags & VM_EXEC)) goto bad_area; } else if (!cause) { /* Allow reads even for write-only mappings */ if (!(vma->vm_flags & (VM_READ \| VM_WRITE))) goto bad_area; } else { if (!(vma->vm_flags & VM_WRITE)) goto bad_area; } " Thus, this patch brings other architectures which do not support write only mappings in-line and consistent with the rest. I've verified the patch on ia64, x86_64 and x86. Additional discussion: Several architectures, including x86, can not support write-only mappings. The pte for x86 reserves a single bit for protection and its two states are read only or read/write. Thus, write only is not supported in h/w. Currently, if i 'mmap' a page write-only, the first read attempt on that page creates a page fault and will SEGV. That check is enforced in arch/blah/mm/fault.c. However, if i first write that page it will fault in and the pte will be set to read/write. Thus, any subsequent reads to the page will succeed. It is this inconsistency in behavior that this patch is attempting to address. Furthermore, if the page is swapped out, and then brought back the first read will also cause a SEGV. Thus, any arbitrary read on a page can potentially result in a SEGV. According to the SuSv3 spec, "if the application requests only PROT_WRITE, the implementation may also allow read access." Also as mentioned, some archtectures, such as alpha, shown above already take the approach that i am suggesting. The counter-argument to this raised by Arjan, is that the kernel is enforcing the write only mapping the best it can given the h/w limitations. This is true, however Alan Cox, and myself would argue that the inconsitency in behavior, that is applications can sometimes work/sometimes fails is highly undesireable. If you read through the thread, i think people, came to an agreement on the last patch i posted, as nobody has objected to it... Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Andi Kleen <ak@muc.de> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Ian Molton <spyro@f2s.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:05 -07:00
Mel Gorman	c67c3cb4c9	[PATCH] Have Power use add_active_range() and free_area_init_nodes() Size zones and holes in an architecture independent manner for Power. [judith@osdl.org: build fix] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Keith Mannthey" <kmannth@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-27 08:26:11 -07:00
Stephen Rothwell	5e203d6862	[POWERPC] fix ioremap for a combined kernel Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>	2006-09-25 13:36:31 +10:00
Paul Mackerras	ea0763a7e6	Merge branch 'merge'	2006-08-25 14:56:07 +10:00
Matt Porter	054389f114	[POWERPC] Fix powerpc 44x_mmu build The PIN_SIZE definition name changed, update 44x_mmu.c accordingly. Signed-off-by: Matt Porter <mporter@embeddedalley.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-08-25 13:41:41 +10:00
Adam Litke	c9169f8747	[POWERPC] hugepage BUG fix On Tue, 2006-08-15 at 08:22 -0700, Dave Hansen wrote: > kernel BUG in cache_free_debugcheck at mm/slab.c:2748! Alright, this one is only triggered when slab debugging is enabled. The slabs are assumed to be aligned on a HUGEPTE_TABLE_SIZE boundary. The free path makes use of this assumption and uses the lowest nibble to pass around an index into an array of kmem_cache pointers. With slab debugging turned on, the slab is still aligned, but the "working" object pointer is not. This would break the assumption above that a full nibble is available for the PGF_CACHENUM_MASK. The following patch reduces PGF_CACHENUM_MASK to cover only the two least significant bits, which is enough to cover the current number of 4 pgtable cache types. Then use this constant to mask out the appropriate part of the huge pte pointer. Signed-off-by: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-08-24 10:07:23 +10:00
Michael Neuling	2f6093c847	[POWERPC] Implement SLB shadow buffer This adds a shadow buffer for the SLBs and regsiters it with PHYP. Only the bolted SLB entries (top 3) are shadowed. The SLB shadow buffer tells the hypervisor what the kernel needs to have in the SLB for the kernel to be able to function. The hypervisor can use this information to speed up partition context switches. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-08-08 17:08:56 +10:00
Matt Porter	452b5e2121	[POWERPC] Fix powerpc 44x_mmu build The PIN_SIZE definition name changed, update 44x_mmu.c accordingly. Signed-off-by: Matt Porter <mporter@embeddedalley.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-08-08 17:07:08 +10:00
Jeremy Kerr	a7f67bdf2c	[POWERPC] Constify & voidify get_property() Now that get_property() returns a void *, there's no need to cast its return value. Also, treat the return value as const, so we can constify get_property later. powerpc core changes. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-07-31 15:55:04 +10:00
Michael Ellerman	30f30e1305	[POWERPC] Fix mem= handling when the memory limit is > RMO size There's a bug in my cleaned up mem= handling, if the memory limit is larger than the RMO size we'll erroneously enlarge the RMO size. Fix is to only change the RMO size if the memory limit is less than the current RMO value. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-07-26 01:28:24 +10:00
Michael Ellerman	2d69ff32eb	[POWERPC] Fix a compiler warning in mm/tlb_64.c The compiler doesn't understand that BUG() never returns, so complains that psize isn't set. Just set it to the normal value, which seems to produce nice code and keeps gcc happy. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>	2006-07-13 18:43:25 +10:00
Michael Ellerman	e7c1f69d4f	[POWERPC] Fix mem= handling when the memory limit is > RMO size There's a bug in my cleaned up mem= handling, if the memory limit is larger than the RMO size we'll erroneously enlarge the RMO size. Fix is to only change the RMO size if the memory limit is less than the current RMO value. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-07-07 20:19:16 +10:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Linus Torvalds	3aa590c6b7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (43 commits) [POWERPC] Use little-endian bit from firmware ibm,pa-features property [POWERPC] Make sure smp_processor_id works very early in boot [POWERPC] U4 DART improvements [POWERPC] todc: add support for Time-Of-Day-Clock [POWERPC] Make lparcfg.c work when both iseries and pseries are selected [POWERPC] Fix idr locking in init_new_context [POWERPC] mpc7448hpc2 (taiga) board config file [POWERPC] Add tsi108 pci and platform device data register function [POWERPC] Add general support for mpc7448hpc2 (Taiga) platform [POWERPC] Correct the MAX_CONTEXT definition powerpc: minor cleanups for mpc86xx [POWERPC] Make sure we select CONFIG_NEW_LEDS if ADB_PMU_LED is set [POWERPC] Simplify the code defining the 64-bit CPU features [POWERPC] powerpc: kconfig warning fix [POWERPC] Consolidate some of kernel/misc*.S [POWERPC] Remove unused function call_with_mmu_off [POWERPC] update asm-powerpc/time.h [POWERPC] Clean up it_lp_queue.h [POWERPC] Skip the "copy down" of the kernel if it is already at zero. [POWERPC] Add the use of the firmware soft-reset-nmi to kdump. ...	2006-06-29 11:32:34 -07:00
Sonny Rao	f86c9747fe	[POWERPC] Fix idr locking in init_new_context We always need to serialize accesses to mmu_context_idr. I hit this bug when testing with a small number of mmu contexts. Signed-off-by: Sonny Rao <sonny@burdell.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-29 16:22:46 +10:00
Michael Ellerman	c30a4df3f1	[POWERPC] Use ppc_md.hpte_insert() in htab_bolt_mapping() With the ppc_md htab pointers setup earlier, we can use ppc_md.hpte_insert in htab_bolt_mapping(), rather than deciding which version to call by hand. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-28 11:59:47 +10:00
Michael Ellerman	7d0daae4ae	[POWERPC] powerpc: Initialise ppc_md htab pointers earlier Initialise the ppc_md htab callbacks earlier, in the probe routines. This allows us to call htab_finish_init() from htab_initialize(), and makes it private to hash_utils_64.c. Move htab_finish_init() and make_bl() above htab_initialize() to avoid forward declarations. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-28 11:59:47 +10:00
Chandra Seetharaman	74b85f3790	[PATCH] cpu hotplug: make cpu_notifier related notifier blocks __cpuinit only Make notifier_blocks associated with cpu_notifier as __cpuinitdata. __cpuinitdata makes sure that the data is init time only unless CONFIG_HOTPLUG_CPU is defined. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:41 -07:00
Randy Dunlap	c9cf55285e	[PATCH] add poison.h and patch primary users Localize poison values into one header file for better documentation and easier/quicker debugging and so that the same values won't be used for multiple purposes. Use these constants in core arch., mm, driver, and fs code. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:38 -07:00
Yasunori Goto	bc02af93dd	[PATCH] pgdat allocation for new node add (specify node id) Change the name of old add_memory() to arch_add_memory. And use node id to get pgdat for the node at NODE_DATA(). Note: Powerpc's old add_memory() is defined as __devinit. However, add_memory() is usually called only after bootup. I suppose it may be redundant. But, I'm not well known about powerpc. So, I keep it. (But, __meminit is better at least.) Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:35 -07:00
Anil S Keshavamurthy	4f9e87c045	[PATCH] Notify page fault call chain for powerpc Overloading of page fault notification with the notify_die() has performance issues(since the only interested components for page fault is kprobes and/or kdb) and hence this patch introduces the new notifier call chain exclusively for page fault notifications their by avoiding notifying unnecessary components in the do_page_fault() code path. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:22 -07:00
Linus Torvalds	45c091bb2d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits) [POWERPC] re-enable OProfile for iSeries, using timer interrupt [POWERPC] support ibm,extended-*-frequency properties [POWERPC] Extra sanity check in EEH code [POWERPC] Dont look for class-code in pci children [POWERPC] Fix mdelay badness on shared processor partitions [POWERPC] disable floating point exceptions for init [POWERPC] Unify ppc syscall tables [POWERPC] mpic: add support for serial mode interrupts [POWERPC] pseries: Print PCI slot location code on failure [POWERPC] spufs: one more fix for 64k pages [POWERPC] spufs: fail spu_create with invalid flags [POWERPC] spufs: clear class2 interrupt status before wakeup [POWERPC] spufs: fix Makefile for "make clean" [POWERPC] spufs: remove stop_code from struct spu [POWERPC] spufs: fix spu irq affinity setting [POWERPC] spufs: further abstract priv1 register access [POWERPC] spufs: split the Cell BE support into generic and platform dependant parts [POWERPC] spufs: dont try to access SPE channel 1 count [POWERPC] spufs: use kzalloc in create_spu [POWERPC] spufs: fix initial state of wbox file ... Manually resolved conflicts in: drivers/net/phy/Makefile include/asm-powerpc/spu.h	2006-06-22 22:11:30 -07:00
Jon Loeliger	ee0339f205	[POWERPC] Add starting of secondary 86xx CPUs. Clear the high BATS during load_up_mmu if FTR_HAS_HIGH_BATS. Allow just a bit more time for secondary CPUs to phone home. Signed-off-by: Wei Zhang <Wei.Zhang@freescale.com> Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com> Signed-off-by: Jon Loeliger <jdl@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-21 15:01:28 +10:00
Arnd Bergmann	19242b2407	[PATCH] powerpc: Fix 64k pages on non-partitioned machines The page size encoding passed to tlbie is incorrect for new-style large pages. This fixes it. This doesn't affect anything on older machines because mmu_psize_defs[psize].penc (the page size encoding) is 0 for 4k and 16M pages (the two are distinguished by a separate "is a large page" bit). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-17 10:56:24 -07:00
Anton Blanchard	227318bbde	[POWERPC] Remove stale 64bit on 32bit kernel code Remove some stale POWER3/POWER4/970 on 32bit kernel support. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-15 19:31:26 +10:00
Paul Mackerras	bf72aeba2f	powerpc: Use 64k pages without needing cache-inhibited large pages Some POWER5+ machines can do 64k hardware pages for normal memory but not for cache-inhibited pages. This patch lets us use 64k hardware pages for most user processes on such machines (assuming the kernel has been configured with CONFIG_PPC_64K_PAGES=y). User processes start out using 64k pages and get switched to 4k pages if they use any non-cacheable mappings. With this, we use 64k pages for the vmalloc region and 4k pages for the imalloc region. If anything creates a non-cacheable mapping in the vmalloc region, the vmalloc region will get switched to 4k pages. I don't know of any driver other than the DRM that would do this, though, and these machines don't have AGP. When a region gets switched from 64k pages to 4k pages, we do not have to clear out all the 64k HPTEs from the hash table immediately. We use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page was hashed in as a 64k page or a set of 4k pages. If hash_page is trying to insert a 4k page for a Linux PTE and it sees that it has already been inserted as a 64k page, it first invalidates the 64k HPTE before inserting the 4k HPTE. The hash invalidation routines also use the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a set of 4k HPTEs to remove. With those two changes, we can tolerate a mix of 4k and 64k HPTEs in the hash table, and they will all get removed when the address space is torn down. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-15 10:45:18 +10:00
Paul Mackerras	4306443128	powerpc: Remove unused paca->pgdir field The pgdir field in the paca was a leftover from the dynamic VSIDs patch, and is not used in the current kernel code. This removes it. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-12 18:38:21 +10:00
Paul Mackerras	6218a761bb	powerpc: add context.vdso_base for 32-bit too This adds a vdso_base element to the mm_context_t for 32-bit compiles (both for ARCH=powerpc and ARCH=ppc). This fixes the compile errors that have been reported in arch/powerpc/kernel/signal_32.c. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-11 14:15:17 +10:00
Benjamin Herrenschmidt	c5cf0e30bf	[PATCH] powerpc: Fix buglet with MMU hash management Our MMU hash management code would not set the "C" bit (changed bit) in the hardware PTE when updating a RO PTE into a RW PTE. That would cause the hardware to possibly to a write back to the hash table to set it on the first store access, which in addition to being a performance issue, might also hit a bug when running with native hash management (non-HV) as our code is specifically optimized for the case where no write back happens. Thus there is a very small therocial window were a hash PTE can become corrupted if that HPTE has just been upgraded to read write, a store access happens on it, and that races with another processor evicting that same slot. Since eviction (caused by an almost full hash) is extremely rare, the bug is very unlikely to happen fortunately. This fixes by allowing the updating of the protection bits in the native hash handling to also set (but not clear) the "C" bit, and, in order to also improve performances in the general case, by always setting that bit on newly inserted hash PTE so that writeback really never happens. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-09 21:20:59 +10:00
Michael Ellerman	2babf5c2ec	[PATCH] powerpc: Unify mem= handling We currently do mem= handling in three seperate places. And as benh pointed out I wrote two of them. Now that we parse command line parameters earlier we can clean this mess up. Moving the parsing out of prom_init means the device tree might be allocated above the memory limit. If that happens we'd have to move it. As it happens we already have logic to do that for kdump, so just genericise it. This also means we might have reserved regions above the memory limit, if we do the bootmem allocator will blow up, so we have to modify lmb_enforce_memory_limit() to truncate the reserves as well. Tested on P5 LPAR, iSeries, F50, 44p. Tested moving device tree on P5 and 44p and F50. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-05-19 15:02:15 +10:00

... 3 4 5 6 7 ...

567 Commits