In order for the RCU to work, the file table array, sets and their sizes must
be updated atomically. Instead of ensuring this through too many memory
barriers, we put the arrays and their sizes in a separate structure. This
patch takes the first step of putting the file table elements in a separate
structure fdtable that is embedded withing files_struct. It also changes all
the users to refer to the file table using files_fdtable() macro. Subsequent
applciation of RCU becomes easier after this.
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For architecture like ia64, the switch stack structure is fairly large
(currently 528 bytes). For context switch intensive application, we found
that significant amount of cache misses occurs in switch_to() function.
The following patch adds a hook in the schedule() function to prefetch
switch stack structure as soon as 'next' task is determined. This allows
maximum overlap in prefetch cache lines for that structure.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Increase the value for the maximum physical address on SN systems.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
These are SN2 only drivers. They should have platform checks to prevent
them from doing evil stuff in GENERIC kernels.
Signed-off-by: Martin Hicks <mort@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Turn off the QLOGIC_FC driver. Supposedly qla2xxx should support
these devices. Do any ia64 machines have one of these devices as
the boot device?
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes a race condition where in system used to hang or sometime
crash within minutes when kprobes are inserted on ISR routine and a task
routine.
The fix has been stress tested on i386, ia64, pp64 and on x86_64. To
reproduce the problem insert kprobes on schedule() and do_IRQ() functions
and you should see hang or system crash.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch addresses a potential race condition for a case where Kprobe has
been removed right after another CPU has taken a break hit.
The way this is addressed here is when the CPU that has taken a break hit
does not find its corresponding kprobe, then we check to see if the
original instruction got replaced with other than break. If it got
replaced with other than break instruction, then we continue to execute
from the replaced instruction, else if we find that it is still a break,
then we let the kernel handle this, as this might be the break instruction
inserted by other than kprobe(may be kernel debugger).
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch contains the ia64 architecture specific changes to prevent the
possible race conditions.
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch converts kcalloc(1, ...) calls to use the new kzalloc() function.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
64 bit architectures all implement their own compatibility sys_open(),
when in fact the difference is simply not forcing the O_LARGEFILE
flag. So use the a common function instead.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I've already sent this to the maintainers, and this is now being sent to a
larger community audience. I have fixed a problem with the ia64 version of
build_sched_domains(), but a similar fix still needs to be made to the
generic build_sched_domains() in kernel/sched.c.
The "dynamic sched domains" functionality has recently been merged into
2.6.13-rcN that sees the dynamic declaration of a cpu-exclusive (a.k.a.
"isolated") cpuset and rebuilds the CPU Scheduler sched domains and sched
groups to separate away the CPUs in this cpu-exclusive cpuset from the
remainder of the non-isolated CPUs. This allows the non-isolated CPUs to
completely ignore the isolated CPUs when doing load-balancing.
Unfortunately, build_sched_domains() expects that a sched domain will
include all the CPUs of each node in the domain, i.e., that no node will
belong in both an isolated cpuset and a non-isolated cpuset. Declaring a
cpuset that violates this presumption will produce flawed data structures
and will oops the kernel.
To trigger the problem (on a NUMA system with >1 CPUs per node):
cd /dev/cpuset
mkdir newcpuset
cd newcpuset
echo 0 >cpus
echo 0 >mems
echo 1 >cpu_exclusive
I have fixed this shortcoming for ia64 NUMA (with multiple CPUs per node).
A similar shortcoming exists in the generic build_sched_domains() (in
kernel/sched.c) for NUMA, and that needs to be fixed also. The fix
involves dynamically allocating sched_group_nodes[] and
sched_group_allnodes[] for each invocation of build_sched_domains(), rather
than using global arrays for these structures. Care must be taken to
remember kmalloc() addresses so that arch_destroy_sched_domains() can
properly kfree() the new dynamic structures.
Signed-off-by: John Hawkes <hawkes@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When handling writes to /proc/irq, current code is re-programming rte
entries directly. This is not recommended and could potentially cause
chipset's to lockup, or cause missing interrupts.
CONFIG_IRQ_BALANCE does this correctly, where it re-programs only when the
interrupt is pending. The same needs to be done for /proc/irq handling as well.
Otherwise user space irq balancers are really not doing the right thing.
- Changed pending_irq_balance_cpumask to pending_irq_migrate_cpumask for
lack of a generic name.
- added move_irq out of IRQ_BALANCE, and added this same to X86_64
- Added new proc handler for write, so we can do deferred write at irq
handling time.
- Display of /proc/irq/XX/smp_affinity used to display CPU_MASKALL, instead
it now shows only active cpu masks, or exactly what was set.
- Provided a common move_irq implementation, instead of duplicating
when using generic irq framework.
Tested on i386/x86_64 and ia64 with CONFIG_PCI_MSI turned on and off.
Tested UP builds as well.
MSI testing: tbd: I have cards, need to look for a x-over cable, although I
did test an earlier version of this patch. Will test in a couple days.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Zwane Mwaikambo <zwane@holomorphy.com>
Grudgingly-acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The config option 'CONFIG_ACPI_DEALLOCATE_IRQ' is no longer
needed. This patch removes it.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The reenabling of psr.ic should really belong to dtr mapping code block.
It make the fall through code fast since it doesn't need to execute the
predicated-off instruction. Logically make more sense as well since psr.ic
was turned off in .map code block.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The exception handler in copy user always expects fault occurs only on
user space address and the fall back recovery code is written with that
very assumption in mind. Recent source code inspection revealed that
while it worked splendid and to the expectation under normal circumstances,
It broke down under unexpected condition where some address calculation
might go outside the legal address range the original copy_user was
called for. This patch is to make copy_user exception handler more robust
and to prevent potential memory corruption.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When copying data from user-space to kernel-space by __copy_user(),
a page_not_present fault sometimes occurs at vmalloced kernel address
because of VHPT pre-fetching.
Ignore the page_not_present fault in ia64_do_page_fault() before
jumping into exception handlers.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Change sn2-specific calls into generic functions. Without this change
the uncached allocator will not work on non-sn2 platforms.
Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
It has been reported that the way Linux handles NODEFER for signals is
not consistent with the way other Unix boxes handle it. I've written a
program to test the behavior of how this flag affects signals and had
several reports from people who ran this on various Unix boxes,
confirming that Linux seems to be unique on the way this is handled.
The way NODEFER affects signals on other Unix boxes is as follows:
1) If NODEFER is set, other signals in sa_mask are still blocked.
2) If NODEFER is set and the signal is in sa_mask, then the signal is
still blocked. (Note: this is the behavior of all tested but Linux _and_
NetBSD 2.0 *).
The way NODEFER affects signals on Linux:
1) If NODEFER is set, other signals are _not_ blocked regardless of
sa_mask (Even NetBSD doesn't do this).
2) If NODEFER is set and the signal is in sa_mask, then the signal being
handled is not blocked.
The patch converts signal handling in all current Linux architectures to
the way most Unix boxes work.
Unix boxes that were tested: DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU
3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX.
* NetBSD was the only other Unix to behave like Linux on point #2. The
main concern was brought up by point #1 which even NetBSD isn't like
Linux. So with this patch, we leave NetBSD as the lonely one that
behaves differently here with #2.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
copy_page.o appeared twice in arch/ia64/lib/Makefile. The
one in global lib-y is wrong where it should be just in
lib-$(CONFIG_ITANIUM).
Both copy_page.o and copy_page_mck.o are build for Itanium2
processor and the link order will pick up the low performing
copy_page function (originally written for itanium processor).
In this case, we really want the copy_page_mck.o for optimized
version.
Signed-off-by: Kenneth Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Patch to support P-state transitions on ia64. This driver is based on ACPI,
and uses the ACPI processor driver interface to find out the P-state support
information for the processor. This driver plugs into generic cpufreq
infrastructure.
Once this driver is loaded successfully, ondemand/userspace governor can be
used to change the CPU frequency dynamically based on load or on request from
userspace process.
Refer :
ACPI specification -
http://www.acpi.info
P-state related PAL calls -
http://developer.intel.com/design/itanium/downloads/24869909.pdf
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
bte_copy() calls calls smp_processor_id(), which will get flagged if
preemption if enabled. raw_smp_processor_id() is used instead
because we are just using it to pick a BTE interface and are not
tied to a specific cpu.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to abstract irq_affinity down to the pci provider level since
different SGI hardware implements this in different ways.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Delete the ability to build an ACPI kernel that does
not include PCI support. When such a machine is created
and it requires a tuned kernel, send a patch.
http://bugzilla.kernel.org/show_bug.cgi?id=1364
Signed-off-by: Len Brown <len.brown@intel.com>
Build issues were mostly in the ACPI=n case -- don't do that.
Select ACPI from IA64_GENERIC.
Add some missing dependencies on ACPI.
Mark BLACKLIST_YEAR and some laptop-only ACPI drivers
as X86-only. Let me know when you get an IA64 Laptop.
Signed-off-by: Len Brown <len.brown@intel.com>
Clean up of SGI SN partitioning related code.
The SN_SAL_GET_SN_INFO SAL call returns the partition ID, making
the SN_SAL_SYSCTL_PARTITION_GET SAL call redundant. Remove sn_partid
and use sn_partition_id.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Update the SN pci device info to use the nearest node function
to allocate driver memory on the nearest node (rather than
defaulting to node 0).
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add a new exported function for determining the nearest node
with CPUs for I/O nodes and fix a bug where the hwperf dynamic
misc device was being registered before misc_init().
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Bugfix to export PCI topology information in /proc/sgi_sn/sn_topology.
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Currently, region numbers are defined in several files, with several
names. For example, we have REGION_KERNEL in asm/page.h and
RGN_KERNEL in pgtable.h
We also have address definitions that should depend on the
RGN_XXX macros, but are currently just long constants.
The following patch reorganises all the definitions so that they have
the same form (RGN_XXX), are in one place, and that addresses that
depend on RGN_XXX are derived from them.
(This is a necessary but not sufficient patch to allow UML-like
operation on IA64).
Thanks to David Mosberger for catching the change I missed in mmu_context.h.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Kernel 2.6 doesn't support egcs, and I didn't find any user of this
function.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Removed IA64 architecture specific users of asm/segment.h
The removal of asm-ia64/segment.h itself can wait until all
of the kernel source has been purged of references.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
pcibios_bus_to_resource is exported on all architectures except ia64
and sparc. Add exports for the two missing architectures. Needed when
Yenta socket support is compiled as a module.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>