OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Alexander Graf	29d03158f9	KVM: PPC: Remove prog_flags Commit c8f729d408 (KVM: PPC: Deliver program interrupts right away instead of queueing them) made away with all users of prog_flags, so we can just remove it from the headers. Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:17:00 +03:00
Paul Mackerras	9e368f2915	KVM: PPC: book3s_hv: Add support for PPC970-family processors This adds support for running KVM guests in supervisor mode on those PPC970 processors that have a usable hypervisor mode. Unfortunately, Apple G5 machines have supervisor mode disabled (MSR[HV] is forced to 1), but the YDL PowerStation does have a usable hypervisor mode. There are several differences between the PPC970 and POWER7 in how guests are managed. These differences are accommodated using the CPU_FTR_ARCH_201 (PPC970) and CPU_FTR_ARCH_206 (POWER7) CPU feature bits. Notably, on PPC970: * The LPCR, LPID or RMOR registers don't exist, and the functions of those registers are provided by bits in HID4 and one bit in HID0. * External interrupts can be directed to the hypervisor, but unlike POWER7 they are masked by MSR[EE] in non-hypervisor modes and use SRR0/1 not HSRR0/1. * There is no virtual RMA (VRMA) mode; the guest must use an RMO (real mode offset) area. * The TLB entries are not tagged with the LPID, so it is necessary to flush the whole TLB on partition switch. Furthermore, when switching partitions we have to ensure that no other CPU is executing the tlbie or tlbsync instructions in either the old or the new partition, otherwise undefined behaviour can occur. * The PMU has 8 counters (PMC registers) rather than 6. * The DSCR, PURR, SPURR, AMR, AMOR, UAMOR registers don't exist. * The SLB has 64 entries rather than 32. * There is no mediated external interrupt facility, so if we switch to a guest that has a virtual external interrupt pending but the guest has MSR[EE] = 0, we have to arrange to have an interrupt pending for it so that we can get control back once it re-enables interrupts. We do that by sending ourselves an IPI with smp_send_reschedule after hard-disabling interrupts. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:59 +03:00
Paul Mackerras	969391c58a	powerpc, KVM: Split HVMODE_206 cpu feature bit into separate HV and architecture bits This replaces the single CPU_FTR_HVMODE_206 bit with two bits, one to indicate that we have a usable hypervisor mode, and another to indicate that the processor conforms to PowerISA version 2.06. We also add another bit to indicate that the processor conforms to ISA version 2.01 and set that for PPC970 and derivatives. Some PPC970 chips (specifically those in Apple machines) have a hypervisor mode in that MSR[HV] is always 1, but the hypervisor mode is not useful in the sense that there is no way to run any code in supervisor mode (HV=0 PR=0). On these processors, the LPES0 and LPES1 bits in HID4 are always 0, and we use that as a way of detecting that hypervisor mode is not useful. Where we have a feature section in assembly code around code that only applies on POWER7 in hypervisor mode, we use a construct like END_FTR_SECTION_IFSET(CPU_FTR_HVMODE \| CPU_FTR_ARCH_206) The definition of END_FTR_SECTION_IFSET is such that the code will be enabled (not overwritten with nops) only if all bits in the provided mask are set. Note that the CPU feature check in __tlbie() only needs to check the ARCH_206 bit, not the HVMODE bit, because __tlbie() can only get called if we are running bare-metal, i.e. in hypervisor mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:58 +03:00
Paul Mackerras	aa04b4cc5b	KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests This adds infrastructure which will be needed to allow book3s_hv KVM to run on older POWER processors, including PPC970, which don't support the Virtual Real Mode Area (VRMA) facility, but only the Real Mode Offset (RMO) facility. These processors require a physically contiguous, aligned area of memory for each guest. When the guest does an access in real mode (MMU off), the address is compared against a limit value, and if it is lower, the address is ORed with an offset value (from the Real Mode Offset Register (RMOR)) and the result becomes the real address for the access. The size of the RMA has to be one of a set of supported values, which usually includes 64MB, 128MB, 256MB and some larger powers of 2. Since we are unlikely to be able to allocate 64MB or more of physically contiguous memory after the kernel has been running for a while, we allocate a pool of RMAs at boot time using the bootmem allocator. The size and number of the RMAs can be set using the kvm_rma_size=xx and kvm_rma_count=xx kernel command line options. KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability of the pool of preallocated RMAs. The capability value is 1 if the processor can use an RMA but doesn't require one (because it supports the VRMA facility), or 2 if the processor requires an RMA for each guest. This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the pool and returns a file descriptor which can be used to map the RMA. It also returns the size of the RMA in the argument structure. Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION ioctl calls from userspace. To cope with this, we now preallocate the kvm->arch.ram_pginfo array when the VM is created with a size sufficient for up to 64GB of guest memory. Subsequently we will get rid of this array and use memory associated with each memslot instead. This moves most of the code that translates the user addresses into host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level to kvmppc_core_prepare_memory_region. Also, instead of having to look up the VMA for each page in order to check the page size, we now check that the pages we get are compound pages of 16MB. However, if we are adding memory that is mapped to an RMA, we don't bother with calling get_user_pages_fast and instead just offset from the base pfn for the RMA. Typically the RMA gets added after vcpus are created, which makes it inconvenient to have the LPCR (logical partition control register) value in the vcpu->arch struct, since the LPCR controls whether the processor uses RMA or VRMA for the guest. This moves the LPCR value into the kvm->arch struct and arranges for the MER (mediated external request) bit, which is the only bit that varies between vcpus, to be set in assembly code when going into the guest if there is a pending external interrupt request. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:57 +03:00
Paul Mackerras	371fefd6f2	KVM: PPC: Allow book3s_hv guests to use SMT processor modes This lifts the restriction that book3s_hv guests can only run one hardware thread per core, and allows them to use up to 4 threads per core on POWER7. The host still has to run single-threaded. This capability is advertised to qemu through a new KVM_CAP_PPC_SMT capability. The return value of the ioctl querying this capability is the number of vcpus per virtual CPU core (vcore), currently 4. To use this, the host kernel should be booted with all threads active, and then all the secondary threads should be offlined. This will put the secondary threads into nap mode. KVM will then wake them from nap mode and use them for running guest code (while they are still offline). To wake the secondary threads, we send them an IPI using a new xics_wake_cpu() function, implemented in arch/powerpc/sysdev/xics/icp-native.c. In other words, at this stage we assume that the platform has a XICS interrupt controller and we are using icp-native.c to drive it. Since the woken thread will need to acknowledge and clear the IPI, we also export the base physical address of the XICS registers using kvmppc_set_xics_phys() for use in the low-level KVM book3s code. When a vcpu is created, it is assigned to a virtual CPU core. The vcore number is obtained by dividing the vcpu number by the number of threads per core in the host. This number is exported to userspace via the KVM_CAP_PPC_SMT capability. If qemu wishes to run the guest in single-threaded mode, it should make all vcpu numbers be multiples of the number of threads per core. We distinguish three states of a vcpu: runnable (i.e., ready to execute the guest), blocked (that is, idle), and busy in host. We currently implement a policy that the vcore can run only when all its threads are runnable or blocked. This way, if a vcpu needs to execute elsewhere in the kernel or in qemu, it can do so without being starved of CPU by the other vcpus. When a vcore starts to run, it executes in the context of one of the vcpu threads. The other vcpu threads all go to sleep and stay asleep until something happens requiring the vcpu thread to return to qemu, or to wake up to run the vcore (this can happen when another vcpu thread goes from busy in host state to blocked). It can happen that a vcpu goes from blocked to runnable state (e.g. because of an interrupt), and the vcore it belongs to is already running. In that case it can start to run immediately as long as the none of the vcpus in the vcore have started to exit the guest. We send the next free thread in the vcore an IPI to get it to start to execute the guest. It synchronizes with the other threads via the vcore->entry_exit_count field to make sure that it doesn't go into the guest if the other vcpus are exiting by the time that it is ready to actually enter the guest. Note that there is no fixed relationship between the hardware thread number and the vcpu number. Hardware threads are assigned to vcpus as they become runnable, so we will always use the lower-numbered hardware threads in preference to higher-numbered threads if not all the vcpus in the vcore are runnable, regardless of which vcpus are runnable. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:57 +03:00
David Gibson	54738c0971	KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode This improves I/O performance for guests using the PAPR paravirtualization interface by making the H_PUT_TCE hcall faster, by implementing it in real mode. H_PUT_TCE is used for updating virtual IOMMU tables, and is used both for virtual I/O and for real I/O in the PAPR interface. Since this moves the IOMMU tables into the kernel, we define a new KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables. The ioctl returns a file descriptor which can be used to mmap the newly created table. The qemu driver models use them in the same way as userspace managed tables, but they can be updated directly by the guest with a real-mode H_PUT_TCE implementation, reducing the number of host/guest context switches during guest IO. There are certain circumstances where it is useful for userland qemu to write to the TCE table even if the kernel H_PUT_TCE path is used most of the time. Specifically, allowing this will avoid awkwardness when we need to reset the table. More importantly, we will in the future need to write the table in order to restore its state after a checkpoint resume or migration. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:56 +03:00
Paul Mackerras	a8606e20e4	KVM: PPC: Handle some PAPR hcalls in the kernel This adds the infrastructure for handling PAPR hcalls in the kernel, either early in the guest exit path while we are still in real mode, or later once the MMU has been turned back on and we are in the full kernel context. The advantage of handling hcalls in real mode if possible is that we avoid two partition switches -- and this will become more important when we support SMT4 guests, since a partition switch means we have to pull all of the threads in the core out of the guest. The disadvantage is that we can only access the kernel linear mapping, not anything vmalloced or ioremapped, since the MMU is off. This also adds code to handle the following hcalls in real mode: H_ENTER Add an HPTE to the hashed page table H_REMOVE Remove an HPTE from the hashed page table H_READ Read HPTEs from the hashed page table H_PROTECT Change the protection bits in an HPTE H_BULK_REMOVE Remove up to 4 HPTEs from the hashed page table H_SET_DABR Set the data address breakpoint register Plus code to handle the following hcalls in the kernel: H_CEDE Idle the vcpu until an interrupt or H_PROD hcall arrives H_PROD Wake up a ceded vcpu H_REGISTER_VPA Register a virtual processor area (VPA) The code that runs in real mode has to be in the base kernel, not in the module, if KVM is compiled as a module. The real-mode code can only access the kernel linear mapping, not vmalloc or ioremap space. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:55 +03:00
Paul Mackerras	de56a948b9	KVM: PPC: Add support for Book3S processors in hypervisor mode This adds support for KVM running on 64-bit Book 3S processors, specifically POWER7, in hypervisor mode. Using hypervisor mode means that the guest can use the processor's supervisor mode. That means that the guest can execute privileged instructions and access privileged registers itself without trapping to the host. This gives excellent performance, but does mean that KVM cannot emulate a processor architecture other than the one that the hardware implements. This code assumes that the guest is running paravirtualized using the PAPR (Power Architecture Platform Requirements) interface, which is the interface that IBM's PowerVM hypervisor uses. That means that existing Linux distributions that run on IBM pSeries machines will also run under KVM without modification. In order to communicate the PAPR hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code to include/linux/kvm.h. Currently the choice between book3s_hv support and book3s_pr support (i.e. the existing code, which runs the guest in user mode) has to be made at kernel configuration time, so a given kernel binary can only do one or the other. This new book3s_hv code doesn't support MMIO emulation at present. Since we are running paravirtualized guests, this isn't a serious restriction. With the guest running in supervisor mode, most exceptions go straight to the guest. We will never get data or instruction storage or segment interrupts, alignment interrupts, decrementer interrupts, program interrupts, single-step interrupts, etc., coming to the hypervisor from the guest. Therefore this introduces a new KVMTEST_NONHV macro for the exception entry path so that we don't have to do the KVM test on entry to those exception handlers. We do however get hypervisor decrementer, hypervisor data storage, hypervisor instruction storage, and hypervisor emulation assist interrupts, so we have to handle those. In hypervisor mode, real-mode accesses can access all of RAM, not just a limited amount. Therefore we put all the guest state in the vcpu.arch and use the shadow_vcpu in the PACA only for temporary scratch space. We allocate the vcpu with kzalloc rather than vzalloc, and we don't use anything in the kvmppc_vcpu_book3s struct, so we don't allocate it. We don't have a shared page with the guest, but we still need a kvm_vcpu_arch_shared struct to store the values of various registers, so we include one in the vcpu_arch struct. The POWER7 processor has a restriction that all threads in a core have to be in the same partition. MMU-on kernel code counts as a partition (partition 0), so we have to do a partition switch on every entry to and exit from the guest. At present we require the host and guest to run in single-thread mode because of this hardware restriction. This code allocates a hashed page table for the guest and initializes it with HPTEs for the guest's Virtual Real Memory Area (VRMA). We require that the guest memory is allocated using 16MB huge pages, in order to simplify the low-level memory management. This also means that we can get away without tracking paging activity in the host for now, since huge pages can't be paged or swapped. This also adds a few new exports needed by the book3s_hv code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:54 +03:00
Paul Mackerras	3c42bf8a71	KVM: PPC: Split host-state fields out of kvmppc_book3s_shadow_vcpu There are several fields in struct kvmppc_book3s_shadow_vcpu that temporarily store bits of host state while a guest is running, rather than anything relating to the particular guest or vcpu. This splits them out into a new kvmppc_host_state structure and modifies the definitions in asm-offsets.c to suit. On 32-bit, we have a kvmppc_host_state structure inside the kvmppc_book3s_shadow_vcpu since the assembly code needs to be able to get to them both with one pointer. On 64-bit they are separate fields in the PACA. This means that on 64-bit we don't need to copy the kvmppc_host_state in and out on vcpu load/unload, and in future will mean that the book3s_hv code doesn't need a shadow_vcpu struct in the PACA at all. That does mean that we have to be careful not to rely on any values persisting in the hstate field of the paca across any point where we could block or get preempted. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:53 +03:00
Paul Mackerras	923c53caea	powerpc: Set up LPCR for running guest partitions In hypervisor mode, the LPCR controls several aspects of guest partitions, including virtual partition memory mode, and also controls whether the hypervisor decrementer interrupts are enabled. This sets up LPCR at boot time so that guest partitions will use a virtual real memory area (VRMA) composed of 16MB large pages, and hypervisor decrementer interrupts are disabled. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:52 +03:00
Paul Mackerras	df6909e5d5	KVM: PPC: Move guest enter/exit down into subarch-specific code Instead of doing the kvm_guest_enter/exit() and local_irq_dis/enable() calls in powerpc.c, this moves them down into the subarch-specific book3s_pr.c and booke.c. This eliminates an extra local_irq_enable() call in book3s_pr.c, and will be needed for when we do SMT4 guest support in the book3s hypervisor mode code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:51 +03:00
Paul Mackerras	f9e0554dec	KVM: PPC: Pass init/destroy vm and prepare/commit memory region ops down This arranges for the top-level arch/powerpc/kvm/powerpc.c file to pass down some of the calls it gets to the lower-level subarchitecture specific code. The lower-level implementations (in booke.c and book3s.c) are no-ops. The coming book3s_hv.c will need this. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:50 +03:00
Paul Mackerras	3cf658b605	KVM: PPC: Deliver program interrupts right away instead of queueing them Doing so means that we don't have to save the flags anywhere and gets rid of the last reference to to_book3s(vcpu) in arch/powerpc/kvm/book3s.c. Doing so is OK because a program interrupt won't be generated at the same time as any other synchronous interrupt. If a program interrupt and an asynchronous interrupt (external or decrementer) are generated at the same time, the program interrupt will be delivered, which is correct because it has a higher priority, and then the asynchronous interrupt will be masked. We don't ever generate system reset or machine check interrupts to the guest, but if we did, then we would need to make sure they got delivered rather than the program interrupt. The current code would be wrong in this situation anyway since it would deliver the program interrupt as well as the reset/machine check interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:49 +03:00
Paul Mackerras	b01c8b54a1	powerpc, KVM: Rework KVM checks in first-level interrupt handlers Instead of branching out-of-line with the DO_KVM macro to check if we are in a KVM guest at the time of an interrupt, this moves the KVM check inline in the first-level interrupt handlers. This speeds up the non-KVM case and makes sure that none of the interrupt handlers are missing the check. Because the first-level interrupt handlers are now larger, some things had to be move out of line in exceptions-64s.S. This all necessitated some minor changes to the interrupt entry code in KVM. This also streamlines the book3s_32 KVM test. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:48 +03:00
Paul Mackerras	f05ed4d56e	KVM: PPC: Split out code from book3s.c into book3s_pr.c In preparation for adding code to enable KVM to use hypervisor mode on 64-bit Book 3S processors, this splits book3s.c into two files, book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is specific to running the guest in problem state (user mode) and book3s.c contains code which should apply to all Book 3S processors. In doing this, we abstract some details, namely the interrupt offset, updating the interrupt pending flag, and detecting if the guest is in a critical section. These are all things that will be different when we use hypervisor mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:47 +03:00
Paul Mackerras	c4befc58a0	KVM: PPC: Move fields between struct kvm_vcpu_arch and kvmppc_vcpu_book3s This moves the slb field, which represents the state of the emulated SLB, from the kvmppc_vcpu_book3s struct to the kvm_vcpu_arch, and the hpte_hash_[v]pte[_long] fields from kvm_vcpu_arch to kvmppc_vcpu_book3s. This is in accord with the principle that the kvm_vcpu_arch struct represents the state of the emulated CPU, and the kvmppc_vcpu_book3s struct holds the auxiliary data structures used in the emulation. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:46 +03:00
Paul Mackerras	149dbdb185	KVM: PPC: Fix machine checks on 32-bit Book3S Commit 69acc0d3ba ("KVM: PPC: Resolve real-mode handlers through function exports") resulted in vcpu->arch.trampoline_lowmem and vcpu->arch.trampoline_enter ending up with kernel virtual addresses rather than physical addresses. This is OK on 64-bit Book3S machines, which ignore the top 4 bits of the effective address in real mode, but on 32-bit Book3S machines, accessing these addresses in real mode causes machine check interrupts, as the hardware uses the whole effective address as the physical address in real mode. This fixes the problem by using __pa() to convert these addresses to physical addresses. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:45 +03:00
Scott Wood	1aee47a027	KVM: PPC: e500: Don't search over the entire TLB0. Only look in the 4 entries that could possibly contain the entry we're looking for. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:40 +03:00
Liu Yu	dd9ebf1f94	KVM: PPC: e500: Add shadow PID support Dynamically assign host PIDs to guest PIDs, splitting each guest PID into multiple host (shadow) PIDs based on kernel/user and MSR[IS/DS]. Use both PID0 and PID1 so that the shadow PIDs for the right mode can be selected, that correspond both to guest TID = zero and guest TID = guest PID. This allows us to significantly reduce the frequency of needing to invalidate the entire TLB. When the guest mode or PID changes, we just update the host PID0/PID1. And since the allocation of shadow PIDs is global, multiple guests can share the TLB without conflict. Note that KVM does not yet support the guest setting PID1 or PID2 to a value other than zero. This will need to be fixed for nested KVM to work. Until then, we enforce the requirement for guest PID1/PID2 to stay zero by failing the emulation if the guest tries to set them to something else. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:39 +03:00
Liu Yu	08b7fa92b9	KVM: PPC: e500: Stop keeping shadow TLB Instead of a fully separate set of TLB entries, keep just the pfn and dirty status. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:38 +03:00
Scott Wood	a4cd8b23ac	KVM: PPC: e500: enable magic page This is a shared page used for paravirtualization. It is always present in the guest kernel's effective address space at the address indicated by the hypercall that enables it. The physical address specified by the hypercall is not used, as e500 does not have real mode. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:37 +03:00
Scott Wood	9973d54eea	KVM: PPC: e500: Support large page mappings of PFNMAP vmas. This allows large pages to be used on guest mappings backed by things like /dev/mem, resulting in a significant speedup when guest memory is mapped this way (it's useful for directly-assigned MMIO, too). This is not a substitute for hugetlbfs integration, but is useful for configurations where devices are directly assigned on chips without an IOMMU -- in these cases, we need guest physical and true physical to match, and be contiguous, so static reservation and mapping via /dev/mem is the most straightforward way to set things up. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:36 +03:00
Scott Wood	59c1f4e35c	KVM: PPC: e500: Eliminate shadow_pages[], and use pfns instead. This is in line with what other architectures do, and will allow us to map things other than ordinary, unreserved kernel pages -- such as dedicated devices, or large contiguous reserved regions. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:35 +03:00
Scott Wood	0ef309956c	KVM: PPC: e500: don't use MAS0 as intermediate storage. This avoids races. It also means that we use the shadow TLB way, rather than the hardware hint -- if this is a problem, we could do a tlbsx before inserting a TLB0 entry. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:34 +03:00
Scott Wood	6fc4d1eb91	KVM: PPC: e500: Disable preloading TLB1 in tlb_load(). Since TLB1 loading doesn't check the shadow TLB before allocating another entry, you can get duplicates. Once shadow PIDs are enabled in a later patch, we won't need to invalidate the TLB on every switch, so this optimization won't be needed anyway. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:33 +03:00
Scott Wood	4cd35f675b	KVM: PPC: e500: Save/restore SPE state This is done lazily. The SPE save will be done only if the guest has used SPE since the last preemption or heavyweight exit. Restore will be done only on demand, when enabling MSR_SPE in the shadow MSR, in response to an SPE fault or mtmsr emulation. For SPEFSCR, Linux already switches it on context switch (non-lazily), so the only remaining bit is to save it between qemu and the guest. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:32 +03:00
Scott Wood	ecee273fc4	KVM: PPC: booke: use shadow_msr Keep the guest MSR and the guest-mode true MSR separate, rather than modifying the guest MSR on each guest entry to produce a true MSR. Any bits which should be modified based on guest MSR must be explicitly propagated from vcpu->arch.shared->msr to vcpu->arch.shadow_msr in kvmppc_set_msr(). While we're modifying the guest entry code, reorder a few instructions to bury some load latencies. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:32 +03:00
Scott Wood	c51584d52e	powerpc/e500: SPE register saving: take arbitrary struct offset Previously, these macros hardcoded THREAD_EVR0 as the base of the save area, relative to the base register passed. This base offset is now passed as a separate macro parameter, allowing reuse with other SPE save areas, such as used by KVM. Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:31 +03:00
yu liu	685659ee70	powerpc/e500: Save SPEFCSR in flush_spe_to_thread() giveup_spe() saves the SPE state which is protected by MSR[SPE]. However, modifying SPEFSCR does not trap when MSR[SPE]=0. And since SPEFSCR is already saved/restored in _switch(), not all the callers want to save SPEFSCR again. Thus, saving SPEFSCR should not belong to giveup_spe(). This patch moves SPEFSCR saving to flush_spe_to_thread(), and cleans up the caller that needs to save SPEFSCR accordingly. Signed-off-by: Liu Yu <yu.liu@freescale.com> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:30 +03:00
Alexander Graf	a22a2daccf	KVM: PPC: Resolve real-mode handlers through function exports Up until now, Book3S KVM had variables stored in the kernel that a kernel module or the kvm code in the kernel could read from to figure out where some real mode helper functions are located. This is all unnecessary. The high bits of the EA get ignore in real mode, so we can just use the pointer as is. Also, it's a lot easier on relocations when we use the normal way of resolving the address to a function, instead of jumping through hoops. This patch fixes compilation with CONFIG_RELOCATABLE=y. Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:29 +03:00
Stuart Yoder	24294b9a3f	KVM: PPC: fix partial application of "exit timing in ticks" When http://www.spinics.net/lists/kvm-ppc/msg02664.html was applied to produce commit b51e7aa7ed6d8d134d02df78300ab0f91cfff4d2, the removal of the conversion in add_exit_timing was left out. Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:28 +03:00
Benjamin Herrenschmidt	770e1ac5f2	powerpc/mm: Fix memory_block_size_bytes() for non-pseries Just compiling pseries in the kernel causes it to override memory_block_size_bytes() regardless of what is the runtime platform. This cleans up the implementation of that function, fixing a bug or two while at it, so that it's harmless (and potentially useful) for other platforms. Without this, bugs in that code would trigger a WARN_ON() in drivers/base/memory.c when booting some different platforms. If/when we have another platform supporting memory hotplug we might want to either move that out to a generic place or make it a ppc_md. callback. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-07-12 11:16:45 +10:00
Jiri Kosina	b7e9c223be	Merge branch 'master' into for-next Sync with Linus' tree to be able to apply pending patches that are based on newer code already present upstream.	2011-07-11 14:15:55 +02:00
Kumar Gala	6471fc6630	powerpc: Dont require a dma_ops struct to set dma mask The only reason to require a dma_ops struct is to see if it has implemented set_dma_mask. If not we can fall back to setting the mask directly. This resolves an issue with how to sequence the setting of a DMA mask for platform devices. Before we had an issue in that we have no way of setting the DMA mask before the various low level bus notifiers get called that might check it (swiotlb). So now we can do: pdev = platform_device_alloc("foobar", 0); dma_set_mask(&pdev->dev, DMA_BIT_MASK(37)); platform_device_add(pdev); And expect the right thing to happen with the bus notifiers get called via platform_device_add. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-08 00:21:36 -05:00
Kumar Gala	314b02f503	powerpc: implement arch_setup_pdev_archdata We have a long standing issues with platform devices not have a valid dma_mask pointer. This hasn't been an issue to date as no platform device has tried to set its dma_mask value to a non-default value. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-08 00:21:36 -05:00
Becky Bruce	3160b09796	powerpc: Create next_tlbcam_idx percpu variable for FSL_BOOKE This is used to round-robin TLBCAM entries. Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-08 00:21:34 -05:00
Felix Radensky	f6ad160e6f	powerpc/p1022ds: Remove fixed-link property from ethernet nodes. On P1022DS both ethernet controllers are connected to RGMII PHYs accessible via MDIO bus. Remove fixed-link property from ethernet nodes as they only required when fixed link PHYs without MDIO bus are used. Signed-off-by: Felix Radensky <felix@embedded-sol.com> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-08 00:21:34 -05:00
Laurentiu TUDOR	2647aa19fb	powerpc/85xx: Remove stale BUG_ON in mpc85xx_smp_init Under the FSL Hypervisor we triggered a BUG_ON in mpc85xx_smp_init that expected smp_ops.message_pass to be explicity set. However recent changes allows smp_ops.message_pass to be NULL and handled by default code. Thus the BUG_ON isn't relevant anymore. Signed-off-by: Laurentiu TUDOR <Laurentiu.Tudor@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-08 00:21:33 -05:00
Mingkai Hu	3fce1c0ba2	powerpc/85xx: Add p2040 RDB board support P2040RDB Specification: ----------------------- 2Gbyte unbuffered DDR3 SDRAM SO-DIMM(64bit bus) 128 Mbyte NOR flash single-chip memory 256 Kbit M24256 I2C EEPROM 16 Mbyte SPI memory SD connector to interface with the SD memory card dTSEC1: connected to the Vitesse SGMII PHY (VSC8221) dTSEC2: connected to the Vitesse SGMII PHY (VSC8221) dTSEC3: connected to the Vitesse SGMII PHY (VSC8221) dTSEC4: connected to the Vitesse RGMII PHY (VSC8641) dTSEC5: connected to the Vitesse RGMII PHY (VSC8641) I2C1: Real time clock, Temperature sensor I2C2: Vcore Regulator, 256Kbit I2C Bus EEPROM SATA: Lanes C and Land D of Bank2 are connected to two SATA connectors UART: supports two UARTs up to 115200 bps for console USB 2.0: connected via a internal UTMI PHY to two TYPE-A interfaces PCIe: - Lanes E, F, G and H of Bank1 are connected to one x4 PCIe SLOT1 - Lanes C and Land D of Bank2 are connected to one x4 PCIe SLOT2 Signed-off-by: Mingkai Hu <Mingkai.hu@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-08 00:21:32 -05:00
Timur Tabi	59f8df290a	powerpc/85xx: add hypervisor config entries to corenet_smp_defconfig CONFIG_PPC_EPAPR_HV_BYTECHAN adds support for the Freescale hypervisor byte channel tty driver. CONFIG_VIRT_DRIVERS and CONFIG_FSL_HV_MANAGER add support for the Freescale hypervisor management driver. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-08 00:21:31 -05:00
Kumar Gala	8dbb6bc136	powerpc/85xx: Add P5020 SoC device tree include stub Split out common (non-board specific) parts of the SoC related device tree into a stub so multiple board dts files can include it and we can reduce duplication and maintenance effort. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-07 09:33:50 -05:00
Kumar Gala	e5aae727c0	powerpc/85xx: Add P3041 SoC device tree include stub Split out common (non-board specific) parts of the SoC related device tree into a stub so multiple board dts files can include it and we can reduce duplication and maintenance effort. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-07-07 09:33:50 -05:00
Grant Likely	6eae1ace68	gpio: Move mpc5200 gpio driver to drivers/gpio GPIO drivers are getting consolidated into drivers/gpio. While at it, change the driver name to mpc5200-gpio* to avoid collisions. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2011-07-06 11:57:15 -06:00
Peter Zijlstra	29dfc4fd7e	perf, powerpc: Fix build borkage The patch `a8b0ca17b8` ("perf: Remove the nmi parameter from the swevent and overflow interface") missed a spot in the ppc hw_breakpoint code, fix this up so things compile again. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-09pfip95g88s70iwkxu6nnbt@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 16:20:04 +02:00
Avi Kivity	4dc0da8696	perf: Add context field to perf_event The perf_event overflow handler does not receive any caller-derived argument, so many callers need to resort to looking up the perf_event in their local data structure. This is ugly and doesn't scale if a single callback services many perf_events. Fix by adding a context parameter to perf_event_create_kernel_counter() (and derived hardware breakpoints APIs) and storing it in the perf_event. The field can be accessed from the callback as event->overflow_handler_context. All callers are updated. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:38 +02:00
Peter Zijlstra	89d6c0b5bd	perf, arch: Add generic NODE cache events Add a NODE level to the generic cache events which is used to measure local vs remote memory accesses. Like all other cache events, an ACCESS is HIT+MISS, if there is no way to distinguish between reads and writes do reads only etc.. The below needs filling out for !x86 (which I filled out with unsupported events). I'm fairly sure ARM can leave it like that since it doesn't strike me as an architecture that even has NUMA support. SH might have something since it does appear to have some NUMA bits. Sparc64, PowerPC and MIPS certainly want a good look there since they clearly are NUMA capable. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Cc: Anton Blanchard <anton@samba.org> Cc: David Daney <ddaney@caviumnetworks.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:38 +02:00
Peter Zijlstra	a8b0ca17b8	perf: Remove the nmi parameter from the swevent and overflow interface The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:35 +02:00
Peter Zijlstra	4f8b50bbbe	irq_work, ppc: Fix up arch hooks Commit `e360adbe29` ("irq_work: Add generic hardirq context callbacks") fouled up the ppc bit, not properly naming the arch specific function that raises the 'self-IPI'. Cc: Huang Ying <ying.huang@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: stable@kernel.org # 37+ Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-eg0aqien8p1aqvzu9dft6dtv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:02:22 +02:00
Anton Blanchard	af9719c306	powerpc: Use -mtraceback=no gcc 4.7 will be more strict about parsing the -mtraceback option: gcc: error: unrecognized argument in option '-mtraceback=none' gcc: note: valid arguments to '-mtraceback=' are: full no part gcc used to do a 2 char compare so both "no" and "none" would match. Switch to using -mtraceback=no should work everywhere. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-07-01 13:49:27 +10:00
Michael Ellerman	ac5f89c7d8	powerpc: Add jump label support This patch adds support for the new "jump label" feature. Unlike x86 and sparc we just merrily patch the code with no locks etc, as far as I know this is safe, but I'm not really sure what the x86/sparc code is protecting against so maybe it's not. I also don't see any reason for us to implement the poke_early() routine, even though sparc does. [BenH: Updated the patch to upstream generic changes] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-07-01 13:48:55 +10:00
Benjamin Herrenschmidt	87fa35dd88	powerpc/hvsi: Fix conflict with old HVSI driver A mix of think & mismerge on my side caused a problem where both the new hvsi_lib and the old hvsi driver gets compiled and try to define symbols with the same name. This fixes it by renaming the hvsi_lib exported symbols. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-07-01 13:10:21 +10:00
Benjamin Herrenschmidt	9def247a70	powerpc: Fix build problem with default ppc_md.progress commit `a9c0f41b3a` breaks the build on some platforms. The extern declaration must be shielded against assembly. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-07-01 13:00:21 +10:00
Dave Carroll	a9c0f41b3a	powerpc: Add printk companion for ppc_md.progress This patch adds a printk companion to replace the udbg progress function when initmem is freed. Suggested-by: Milton Miller <miltonm@bga.com> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Dave Carroll <dcarroll@astekcorp.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-30 15:28:05 +10:00
Dave Carroll	2773fcc8c4	powerpc: Move free_initmem to common code The free_initmem function is basically duplicated in mm/init_32, and init_64, and is moved to the common 32/64-bit mm/mem.c. All other sections except init were removed in v2.6.15 by `6c45ab992e` (powerpc: Remove section free() and linker script bits), and therefore the bulk of the executed code is identical. This patch also removes updating ppc_md.progress to NULL in the powermac late_initcall. Suggested-by: Milton Miller <miltonm@bga.com> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Dave Carroll <dcarroll@astekcorp.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-30 15:28:05 +10:00
Benjamin Herrenschmidt	6da49a2925	Merge remote branch 'origin/master' into next	2011-06-30 15:23:59 +10:00
Linus Torvalds	78a3cc38f7	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: arch/powerpc: use printk_ratelimited instead of printk_ratelimit powerpc/rtas-rtc: remove sideeffects of printk_ratelimit powerpc/pseries: remove duplicate SCSI_BNX2_ISCSI in pseries_defconfig powerpc/e500: fix breakage with fsl_rio_mcheck_exception powerpc/p1022ds: fix audio-related properties in the device tree powerpc/85xx: fix NAND_CMD_READID read bytes number	2011-06-29 11:03:27 -07:00
Benjamin Herrenschmidt	17bdc6c0e9	powerpc/pseries: Move hvsi support into a library This will allow a different backend to share it Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:37 +10:00
Benjamin Herrenschmidt	4d2bb3f500	powerpc/pseries: Re-implement HVSI as part of hvc_vio On pseries machines, consoles are provided by the hypervisor using a low level get_chars/put_chars type interface. However, this is really just a transport to the service processor which implements them either as "raw" console (networked consoles, HMC, ...) or as "hvsi" serial ports. The later is a simple packet protocol on top of the raw character interface that is supposed to convey additional "serial port" style semantics. In practice however, all it does is provide a way to read the CD line and set/clear our DTR line, that's it. We currently implement the "raw" protocol as an hvc console backend (/dev/hvcN) and the "hvsi" protocol using a separate tty driver (/dev/hvsi0). However this is quite impractical. The arbitrary difference between the two type of devices has been a major source of user (and distro) confusion. Additionally, there's an additional mini -hvsi implementation in the pseries platform code for our low level debug console and early boot kernel messages, which means code duplication, though that low level variant is impractical as it's incapable of doing the initial protocol negociation to establish the link to the FSP. This essentially replaces the dedicated hvsi driver and the platform udbg code completely by extending the existing hvc_vio backend used in "raw" mode so that: - It now supports HVSI as well - We add support for hvc backend providing tiocm{get,set} - It also provides a udbg interface for early debug and boot console This is overall less code, though this will only be obvious once we remove the old "hvsi" driver, which is still available for now. When the old driver is enabled, the new code still kicks in for the low level udbg console, replacing the old mini implementation in the platform code, it just doesn't provide the higher level "hvc" interface. In addition to producing generally simler code, this has several benefits over our current situation: - The user/distro only has to deal with /dev/hvcN for the hypervisor console, avoiding all sort of confusion that has plagued us in the past - The tty, kernel and low level debug console all use the same code base which supports the full protocol establishment process, thus the console is now available much earlier than it used to be with the old HVSI driver. The kernel console works much earlier and udbg is available much earlier too. Hackers can enable a hard coded very-early debug console as well that works with HVSI (previously that was only supported for the "raw" mode). I've tried to keep the same semantics as hvsi relative to how I react to things like CD changes, with some subtle differences though: - I clear DTR on close if HUPCL is set - Current hvsi triggers a hangup if it detects a up->down transition on CD (you can still open a console with CD down). My new implementation triggers a hangup if the link to the FSP is severed, and severs it upon detecting a up->down transition on CD. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:35 +10:00
Benjamin Herrenschmidt	dd2e356a3d	powerpc/udbg: Register udbg console generically When CONFIG_PPC_EARLY_DEBUG is set, call register_early_udbg_console() early from generic code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:32 +10:00
Benjamin Herrenschmidt	048bee7718	powerpc/pseries: Factor HVSI header struct in packet definitions Embed the struct hvsi_header in the various packet definitions rather than open coding it multiple times. Will help provide stronger type checking. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:30 +10:00
Benjamin Herrenschmidt	725e789f22	powerpc/hvsi: Move HVSI protocol definitions to a header file This moves various HVSI protocol definitions from the hvsi.c driver to a header file that can be used later on by a udbg implementation Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:28 +10:00
Dmitry Eremin-Solenikov	8a0360a563	powerpc/maple: Register CPC925 EDAC device on all boards with CPC925 Currently Maple setup code creates cpc925_edac device only on Motorola ATCA-6101 blade. Make setup code check bridge revision and enable EDAC on all U3H bridges. Verified on Momentum MapleD (ppc970fx kit) board. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:26 +10:00
Akinobu Mita	de2780a3d8	powerpc/pseries: Improve error code on reconfiguration notifier failure Reconfiguration notifier call for device node may fail by several reasons, but it always assumes kmalloc failures. This enables reconfiguration notifier call chain to get the actual error code rather than -ENOMEM by converting all reconfiguration notifier calls to return encapsulate error code with notifier_from_errno(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:24 +10:00
Akinobu Mita	3aef19f0a1	powerpc/pseries: Introduce pSeries_reconfig_notify() This introduces pSeries_reconfig_notify() as a just wrapper of blocking_notifier_call_chain() for pSeries_reconfig_chain. This is a preparation to improvement of error code on reconfiguration notifier failure. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:22 +10:00
Dmitry Eremin-Solenikov	e48f7eb27f	powerpc/maple: Enable scom access functions on Maple Enable functions used to access SCOM if PPC_MAPLE is defined: they are used by cpufreq driver to control hardware. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:20 +10:00
Becky Bruce	3d41e0f6d9	powerpc: mem_init should call memblock_is_reserved with phys_addr_t This has been broken for a while but hasn't been an issue until now because nobody was reserving regions at high addresses. Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:18 +10:00
Becky Bruce	72632ce5a4	powerpc: Whitespace fix to include/asm/pgtable-ppc64.h Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:48:16 +10:00
Scott Wood	f67f4ef5fc	powerpc/book3e-64: use a separate TLB handler when linear map is bolted On MMUs such as FSL where we can guarantee the entire linear mapping is bolted, we don't need to worry about linear TLB misses. If on top of that we do a full table walk, we get rid of all recursive TLB faults, and can dispense with some state saving. This gains a few percent on TLB-miss-heavy workloads, and around 50% on a benchmark that had a high rate of virtual page table faults under the normal handler. While touching the EX_TLB layout, remove EX_TLB_MMUCR0, EX_TLB_SRR0, and EX_TLB_SRR1 as they're not used. [BenH: Fixed build with 64K pages (wsp config)] Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 17:47:48 +10:00
Scott Wood	3d97a619ac	powerpc/book3e-64: Reraise doorbell when masked by soft-irq-disable Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 16:40:59 +10:00
Christian Dietrich	76462232c2	arch/powerpc: use printk_ratelimited instead of printk_ratelimit Since printk_ratelimit() shouldn't be used anymore (see comment in include/linux/printk.h), replace it with printk_ratelimited. Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 15:31:01 +10:00
Christian Dietrich	9a8f99fab0	powerpc/rtas-rtc: remove sideeffects of printk_ratelimit Don't use printk_ratelimit() as an additional condition for returning on an error. Because when the ratelimit is reached, printk_ratelimit will return 0 and e.g. in rtas_get_boot_time won't check for an error condition. Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 15:30:43 +10:00
Benjamin Herrenschmidt	3c350a1a55	Merge remote branch 'jwb/next' into next	2011-06-29 12:45:43 +10:00
Michael Neuling	937c190ccd	powerpc/pseries: remove duplicate SCSI_BNX2_ISCSI in pseries_defconfig Remove duplicate assignment of SCSI_BNX2_ISCSI in pseries_defconfig introduced by: `37e0c21e` powerpc/pseries: Enable iSCSI support for a number of cards causes warning: arch/powerpc/configs/pseries_defconfig:151:warning: override: reassigning to symbol SCSI_BNX2_ISCSI Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-06-29 09:28:52 +10:00
Mike Williams	5730849959	powerpc/4xx: Update Canyonlands and Glacier boards DTS to add HW RNG support This will allow the new HW RNG driver to bind on these boards Signed-off-by: Mike Williams <mike@mikebwilliams.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2011-06-28 07:52:07 -04:00
Josh Boyer	61d1baaea2	ppc4xx: Add crypto and RNG entries to Sequoia DTS The Sequoia board has a Security function IP block on it that contains a TRNG. Add the crypto and rng portions of that IP block to the DTS. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>	2011-06-28 07:41:09 -04:00
KAMEZAWA Hiroyuki	c6830c2260	Fix node_start/end_pfn() definition for mm/page_cgroup.c commit `21a3c96` uses node_start/end_pfn(nid) for detection start/end of nodes. But, it's not defined in linux/mmzone.h but defined in /arch/???/include/mmzone.h which is included only under CONFIG_NEED_MULTIPLE_NODES=y. Then, we see mm/page_cgroup.c: In function 'page_cgroup_init': mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn' mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn' So, fixiing page_cgroup.c is an idea... But node_start_pfn()/node_end_pfn() is a very generic macro and should be implemented in the same manner for all archs. (m32r has different implementation...) This patch removes definitions of node_start/end_pfn() in each archs and defines a unified one in linux/mmzone.h. It's not under CONFIG_NEED_MULTIPLE_NODES, now. A result of macro expansion is here (mm/page_cgroup.c) for !NUMA start_pfn = ((&contig_page_data)->node_start_pfn); end_pfn = ({ pg_data_t __pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;}); for NUMA (x86-64) start_pfn = ((node_data[nid])->node_start_pfn); end_pfn = ({ pg_data_t __pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;}); Changelog: - fixed to avoid using "nid" twice in node_end_pfn() macro. Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com> Reported-and-tested-by: Ingo Molnar <mingo@elte.hu> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 14:13:09 -07:00
Timur Tabi	14497d31e6	powerpc/85xx: disable timebase synchronization under the hypervisor The Freescale hypervisor does not allow guests to write to the timebase registers (virtualizing the timebase register was deemed too complicated), so don't try to synchronize the timebase registers when we're running under the hypervisor. This typically happens when kexec support is enabled. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:19 -05:00
Prabhakar Kushwaha	2d05c392b8	powerpc/85xx: Add P1010RDB board support P1010RDB Overview ----------------- 1Gbyte DDR3 (on board DDR) 32Mbyte 16bit NOR flash 32Mbyte SLC NAND Flash 256 Kbit M24256 I2C EEPROM 128 Mbit SPI Flash memory I2C Board 128x8 bit memory SD/MMC connector to interface with the SD memory card 2 SATA interface 1 internal SATA connect to 2.5. 160G SATA2 HDD 1 eSATA connector to rear panel USB 2.0 x1 USB 2.0 port: connected via a UTMI PHY to Mini-AB interface. x1 USB 2.0 port: directly connected to Mini-AB interface Ethernet eTSEC1: Connected to RGMII PHY VSC8641XKO eTSEC2: Connected to SGMII PHY VSC8221 eTSEC3: Connected to SGMII PHY VSC8221 eCAN Two DB-9 female connectors for Field bus interface UART DUART interface: supports two UARTs up to 115200 bps for console display Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:19 -05:00
Timur Tabi	2e5460f31a	powerpc/86xx: enable the framebuffer console on the MPC8610 HPCD Enable framebuffer console support by default in the defconfig on the Freescale MPC8610 HPCD reference board. This allows the boot messages to be shown on the video display. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:18 -05:00
Timur Tabi	c8bfa77b56	powerpc/86xx: improve calculation of DIU pixel clock on the MPC8610 HPCD mpc8610hpcd_set_pixel_clock() calculates the correct value of the PXCLK bits in the CLKDVDR register for a given pixel clock rate. The code which performs this calculation is overly complicated and includes an error estimation routine that doesn't work most of the time anyway. Replace the code with the simpler routine that's currently used on the P1022DS. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:17 -05:00
Timur Tabi	3846e332a9	powerpc/85xx: enable the framebuffer console for the defconfigs Enable framebuffer console support by default in the defconfigs for the Freescale 85xx-based reference board. This allows the boot messages to be shown on the video display on the P1022DS. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:17 -05:00
Timur Tabi	7b93eccf28	powerpc/85xx: clamp the P1022DS DIU pixel clock to allowed values To ensure that the DIU pixel clock will not be set to an invalid value, clamp the PXCLK divider to the allowed range (2-255). This also acts as a limiter for the pixel clock. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:16 -05:00
Scott Wood	ebf714ff37	powerpc/e500mc: Add support for the wait instruction in e500_idle e500mc cannot doze or nap due to an erratum (as well as having a different mechanism than previous e500), but it has a "wait" instruction that is similar to doze. On 64-bit, due to the soft-irq-disable mechanism, the existing book3e_idle should be used instead. Signed-off-by: Vakul Garg <vakul@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:15 -05:00
Kumar Gala	f340fe69f5	powerpc/85xx: Add P4080 SoC device tree include stub Split out common (non-board specific) parts of the SoC related device tree into a stub so multiple board dts files can include it and we can reduce duplication and maintenance effort. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:36:15 -05:00
Timur Tabi	316559588f	powerpc/p1022ds: add missing iounmap calls to platform file The platform file for the Freecale P1022DS reference board is not freeing the ioremap() mapping of the PIXIS and global utilities nodes it creates. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:31:12 -05:00
Dmitry Eremin-Solenikov	c0f589502e	powerpc/85xx: specify interrupt for pq3-localbus devices fsl-lbc driver requires an interrupt to bind to localbus device. Populate 85xx boards' dts trees with lbc interrupt info. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:31:12 -05:00
Dmitry Eremin-Solenikov	67e64f4aee	powerpc/85xx: tqm8540 - add description for onboard flash Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:31:11 -05:00
Lei Xu	04243c4d32	powerpc/85xx: Update device tree to add nand info for p3041ds Signed-off-by: Lei Xu <B33228@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:31:10 -05:00
Lei Xu	045e1690b5	powerpc/85xx: Update device tree to add nand info for p5020ds Signed-off-by: Lei Xu <B33228@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:31:09 -05:00
Prabhakar Kushwaha	08871c097e	powerpc/85xx: Add host-pci(e) bridge only for RC FSL PCIe controller can act as agent(EP) or host(RC). Under Agent(EP) mode the controller will be configured by the host system. So its not required to be registered with the PCI(e) sub-system. We only register the controller if its configured in host(RC) mode. Signed-off-by: Vivek Mahajan <vivek.mahajan@freescale.com> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:31:09 -05:00
Timur Tabi	3907ab2686	powerpc/85xx: add board support for the Freescale hypervisor Add support for the ePAPR-compliant Freescale hypervisor (aka "Topaz") on the Freescale P3041DS, P4080DS, and P5020DS reference boards. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:30:54 -05:00
Timur Tabi	d173ea6b40	powerpc: add Freescale hypervisor partition control functions Add functions to restart and halt the current partition when running under the Freescale hypervisor. These functions should be assigned to various function pointers of the ppc_md structure during the .probe() function for the board: ppc_md.restart = fsl_hv_restart; ppc_md.power_off = fsl_hv_halt; ppc_md.halt = fsl_hv_halt; Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:30:53 -05:00
Ashish Kalra	3a93261f70	powerpc: introduce the ePAPR embedded hypervisor vmpic driver The Freescale ePAPR reference hypervisor provides interrupt controller services via a hypercall interface, instead of emulating the MPIC controller. This is called the VMPIC. The ePAPR "virtual interrupt controller" provides interrupt controller services for external interrupts. External interrupts received by a partition can come from two sources: - Hardware interrupts - hardware interrupts come from external interrupt lines or on-chip I/O devices. - Virtual interrupts - virtual interrupts are generated by the hypervisor as part of some hypervisor service or hypervisor-created virtual device. Both types of interrupts are processed using the same programming model and same set of hypercalls. Signed-off-by: Ashish Kalra <ashish.kalra@freescale.com> Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:30:26 -05:00
Timur Tabi	bd497fc978	powerpc: introduce ePAPR embedded hypervisor hcall interface ePAPR hypervisors provide operating system services via a "hypercall" interface. The following steps need to be performed to make an hcall: 1. Load r11 with the hcall number 2. Load specific other registers with parameters 3. Issue instrucion "sc 1" 4. The return code is in r3 5. Other returned parameters are in other registers. To provide this service to the kernel, these steps are wrapped in inline assembly functions. Standard ePAPR hcalls are in epapr_hcalls.h, and Freescale extensions are in fsl_hcalls.h. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-27 08:30:19 -05:00
Stuart Yoder	6ec36b5848	powerpc: make irq_choose_cpu() available to all PIC drivers Move irq_choose_cpu() into arch/powerpc/kernel/irq.c so that it can be used by other PIC drivers. The function is not MPIC-specific. Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-22 21:44:59 -05:00
Kumar Gala	47fe819e75	powerpc/qe: Limit QE support to ppc32 Only 32-bit SoCs have a QUICC Engine so limit the config option to PPC32. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-22 21:44:58 -05:00
Kumar Gala	6d251ddff8	powerpc/85xx: Add PCI support in 64-bit mode on P5020DS Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-22 21:44:57 -05:00
Kumar Gala	c065488f1a	powerpc/pci: Move FSL fixup from 32-bit to common We need the FSL specific header fixup code on both 32-bit and 64-bit platforms so just move the code into pci-common.c. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-22 21:44:57 -05:00
Roy Zang	2602a21231	powerpc/85xx: Add basic P1023RDS board support The P1023 processor is an e500v2 based SoC that utilizes the DPAA networking architecture. This adds basic board support for non-DPAA functionality (device tree, board file, etc). Signed-off-by: Roy Zang <tie-fei.zang@freescale.com> Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-22 21:44:56 -05:00
Ashish Kalra	1325a684b5	powerpc/85xx: Save scratch registers to thread info instead of using SPRGs. We expect this is actually faster, and we end up needing more space than we can get from the SPRGs in some instances. This is also useful when running as a guest OS - SPRGs4-7 do not have guest versions. 8 slots are allocated in thread_info for this even though we only actually use 4 of them - this allows space for future code to have more scratch space (and we know we'll need it for things like hugetlb). Signed-off-by: Ashish Kalra <Ashish.Kalra@freescale.com> Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2011-06-22 21:44:55 -05:00

1 2 3 4 5 ...

8579 Commits