Commit Graph

312 Commits

Author SHA1 Message Date
Yasunori Goto 281dd25cdc [PATCH] swiotlb: make sure initial DMA allocations really are in DMA memory
This introduces a limit parameter to the core bootmem allocator; The new
parameter indicates that physical memory allocated by the bootmem
allocator should be within the requested limit.

We also introduce alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit,
alloc_bootmem_low_pages_node_limit apis, but alloc_bootmem_low_pages_limit
is the only api used for swiotlb.

The existing alloc_bootmem_low_pages() api could instead have been
changed and made to pass right limit to the core allocator.  But that
would make the patch more intrusive for 2.6.14, as other arches use
alloc_bootmem_low_pages().  We may be done that post 2.6.14 as a
cleanup.

With this, swiotlb gets memory within 4G for both x86_64 and ia64
arches.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:11:33 -07:00
Bryan Sutula 76e677e25d [IA64] Avoid kernel hang during CMC interrupt storm
I've noticed a kernel hang during a storm of CMC interrupts, which was
tracked down to the continual execution of the interrupt handler.

There's code in the CMC handler that's supposed to disable CMC
interrupts and switch to polling mode when it sees a bunch of CMCs.
Because disabling CMCs across all CPUs isn't safe in interrupt context,
the disable is done with a schedule_work().  But with continual CMC
interrupts, the schedule_work() never gets executed.

The following patch immediately disables CMC interrupts for the current
CPU.  This then allows (at least) one CPU to ignore CMC interrupts,
execute the schedule_work() code, and disable CMC interrupts on the rest
of the CPUs.

Acked-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Bryan Sutula <Bryan.Sutula@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-10-06 15:04:11 -07:00
Hidetoshi Seto 4881e2cd25 [IA64] MCA recovery verify pfn_valid
Verify the pfn is valid before calling pfn_to_page(),
and cut isolation message if nothing was done.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-22 13:27:59 -07:00
Keith Owens 20bb86852a [IA64] Wire in the MCA/INIT handler stacks
Wire the MCA/INIT handler stacks into DTR[2] and track them in
IA64_KR(CURRENT_STACK).  This gives the MCA/INIT handler stacks the
same TLB status as normal kernel stacks.  Reload the old CURRENT_STACK
data on return from OS to SAL.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-22 13:24:19 -07:00
Peter Chubb 83a78d9ba7 [IA64] Fix simscsi for new SCSI midlayer
The sd driver now uses scsi_execute_req() for almost everything.
scsi_execute_req() converts requests into scatterlists.

Fix the HP SCSI disk simulator to understand scatterlists for
more commands.

Without this patch the current kernel will not boot on the simulator
(the disks are always detected as having no sectors, and so cannot be
mounted).

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-22 10:42:39 -07:00
Dipankar Sarma 4fb3a53860 [PATCH] files: fix preemption issues
With the new fdtable locking rules, you have to protect fdtable with either
->file_lock or rcu_read_lock/unlock().  There are some places where we
aren't doing either.  This patch fixes those places.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17 11:50:02 -07:00
Hidetoshi Seto 20305e5972 [IA64] mca_drv cleanup
There were some trailing white spaces, long lines, brackets in
weird style etc.  This patch cleans them up.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-16 10:39:40 -07:00
Peter Chubb 24b8e0cc09 [IA64] Remove warnings for gcc 4.0 IA64 compilation.
This patch removes some compilation warnings, mostly
trivially. acpi.c fix also noted by Kenji Kaneshige.

Signed-off-by; Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-16 09:45:27 -07:00
David S. Miller 4db2ce0199 [LIB]: Consolidate _atomic_dec_and_lock()
Several implementations were essentialy a common piece of C code using
the cmpxchg() macro.  Put the implementation in one spot that everyone
can share, and convert sparc64 over to using this.

Alpha is the lone arch-specific implementation, which codes up a
special fast path for the common case in order to avoid GP reloading
which a pure C version would require.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 21:47:01 -07:00
Tony Luck deb75f3c29 Pull fix-offsets-h into release branch 2005-09-14 14:14:45 -07:00
Hugh Dickins 2fd4ef85e0 [PATCH] error path in setup_arg_pages() misses vm_unacct_memory()
Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
security_vm_enough_memory tend to forget to vm_unacct_memory when a
failure occurs further down (typically in setup_arg_pages variants).

These are all users of insert_vm_struct, and that reservation will only
be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.

So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
Committed_AS each time they're run.  But don't add VM_ACCOUNT to them,
it's inappropriate to reserve against the very unlikely case that gdb
be used to COW a vDSO page - we ought to do something about that in
do_wp_page, but there are yet other inconsistencies to be resolved.

The safe and economical way to fix this is to let insert_vm_struct do
the security_vm_enough_memory check when it finds VM_ACCOUNT is set.

And the MIPS irix_brk has been calling security_vm_enough_memory before
calling do_brk which repeats it, doubly accounting and so also leaking.
Remove that, and all the fs and arch calls to security_vm_enough_memory:
give it a less misleading name later on.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-14 11:18:13 -07:00
Tony Luck 82f1b07b9a [IA64] fix circular dependency on generation of asm-offsets.h
Fix?  One ugly hack is replaced by a different ugly hack.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-13 08:50:39 -07:00
Tsuneo.Yoshioka@f-secure.com 83b942bd34 [PATCH] x86-64: Fix 32bit sendfile
If we use 64bit kernel on ia64/x86_64/s390 architecture, and we run
32bit binary on 32bit compatibility mode, sendfile system call seems be
not set offset argument.

This is because sendfile's return value is not zero but the code regards
the result by return value is zero or not.

This problem will be affect to ia64/x86_64/s390 and not affect to other
architecture does not affect other architecture (mips/parisc/ppc64/sparc64).

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Linus Torvalds 357d596bd5 Merge branch 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6 2005-09-11 15:51:40 -07:00
Tony Luck d67eb16f5d Pull sn-features into release branch 2005-09-11 14:34:23 -07:00
Tony Luck c85b2a5fe2 Pull sim-fixes into release branch 2005-09-11 14:27:15 -07:00
Keith Owens 49a28cc8fd [IA64] MCA/INIT: remove obsolete unwind code
Delete the special case unwind code that was only used by the old
MCA/INIT handler.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:09:34 -07:00
Keith Owens 05f335ea04 [IA64] MCA/INIT: remove the physical mode path from minstate.h
Remove the physical mode path from minstate.h.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:09:12 -07:00
Keith Owens 7f613c7d22 [PATCH] MCA/INIT: use per cpu stacks
The bulk of the change.  Use per cpu MCA/INIT stacks.  Change the SAL
to OS state (sos) to be per process.  Do all the assembler work on the
MCA/INIT stacks, leaving the original stack alone.  Pass per cpu state
data to the C handlers for MCA and INIT, which also means changing the
mca_drv interfaces slightly.  Lots of verification on whether the
original stack is usable before converting it to a sleeping process.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:08:41 -07:00
Keith Owens 289d773ee8 [IA64] MCA/INIT: avoid reading INIT record during INIT event
Reading the INIT record from SAL during the INIT event has proved to be
unreliable, and a source of hangs during INIT processing.  The new
MCA/INIT handlers remove the need to get the INIT record from SAL.
Change salinfo.c so mca.c can just flag that a new record is available,
without having to read the record during INIT processing.  This patch
can be applied without the new MCA/INIT handlers.

Also clean up some usage of NR_CPUS which should have been using
cpu_online().

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:02:43 -07:00
Sam Ravnborg 5bb7826900 kbuild: rename prepare to archprepare to fix dependency chain
When introducing the generic asm-offsets.h support the dependency
chain for the prepare targets was changed. All build scripts expecting
include/asm/asm-offsets.h to be made when using the prepare target would broke.
With the limited number of prepare targets left in arch Makefiles
the trivial solution was to introduce a new arch specific target: archprepare

The dependency chain looks like this now:

prepare
  |
  +--> prepare0
         |
         +--> archprepare
                |
		+--> scripts_basic
                +--> prepare1
                       |
                       +---> prepare2
                               |
                               +--> prepare3

So prepare 3 is processed before prepare2 etc.
This guaantees that the asm symlink, version.h, scripts_basic
are all updated before archprepare is processed.

prepare0 which build the asm-offsets.h file will need the
actions performed by archprepare.

The head target is now named prepare, because users scripts will most
likely use that target, but prepare-all has been kept for compatibility.
Updated Documentation/kbuild/makefiles.txt.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-09-11 22:30:22 +02:00
Ingo Molnar fb1c8f93d8 [PATCH] spinlock consolidation
This patch (written by me and also containing many suggestions of Arjan van
de Ven) does a major cleanup of the spinlock code.  It does the following
things:

 - consolidates and enhances the spinlock/rwlock debugging code

 - simplifies the asm/spinlock.h files

 - encapsulates the raw spinlock type and moves generic spinlock
   features (such as ->break_lock) into the generic code.

 - cleans up the spinlock code hierarchy to get rid of the spaghetti.

Most notably there's now only a single variant of the debugging code,
located in lib/spinlock_debug.c.  (previously we had one SMP debugging
variant per architecture, plus a separate generic one for UP builds)

Also, i've enhanced the rwlock debugging facility, it will now track
write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
All locks have lockup detection now, which will work for both soft and hard
spin/rwlock lockups.

The arch-level include files now only contain the minimally necessary
subset of the spinlock code - all the rest that can be generalized now
lives in the generic headers:

 include/asm-i386/spinlock_types.h       |   16
 include/asm-x86_64/spinlock_types.h     |   16

I have also split up the various spinlock variants into separate files,
making it easier to see which does what. The new layout is:

   SMP                         |  UP
   ----------------------------|-----------------------------------
   asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
   linux/spinlock_types.h      |  linux/spinlock_types.h
   asm/spinlock_smp.h          |  linux/spinlock_up.h
   linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
   linux/spinlock.h            |  linux/spinlock.h

/*
 * here's the role of the various spinlock/rwlock related include files:
 *
 * on SMP builds:
 *
 *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
 *                        initializers
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
 *                        implementations, mostly inline assembly code
 *
 *   (also included on UP-debug builds:)
 *
 *  linux/spinlock_api_smp.h:
 *                        contains the prototypes for the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 *
 * on UP builds:
 *
 *  linux/spinlock_type_up.h:
 *                        contains the generic, simplified UP spinlock type.
 *                        (which is an empty structure on non-debug builds)
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  linux/spinlock_up.h:
 *                        contains the __raw_spin_*()/etc. version of UP
 *                        builds. (which are NOPs on non-debug, non-preempt
 *                        builds)
 *
 *   (included on UP-non-debug builds:)
 *
 *  linux/spinlock_api_up.h:
 *                        builds the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 */

All SMP and UP architectures are converted by this patch.

arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
crosscompilers.  m32r, mips, sh, sparc, have not been tested yet, but should
be mostly fine.

From: Grant Grundler <grundler@parisc-linux.org>

  Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
  Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
  non-SMP kernels.  That should be trivial to fix up later if necessary.

  I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
  some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
  are well tested and contained entirely inside arch specific code.  I do NOT
  expect any new issues to arise with them.

 If someone does ever need to use debug/metrics with them, then they will
  need to unravel this hairball between spinlocks, atomic ops, and bit ops
  that exist only because parisc has exactly one atomic instruction: LDCW
  (load and clear word).

From: "Luck, Tony" <tony.luck@intel.com>

   ia64 fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:21 -07:00
Linus Torvalds 486a153f0e Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild 2005-09-09 15:46:49 -07:00
Ingo Molnar a9f6a0dd54 [PATCH] more SPIN_LOCK_UNLOCKED -> DEFINE_SPINLOCK conversions
This converts the final 20 DEFINE_SPINLOCK holdouts.  (another 580 places
are already using DEFINE_SPINLOCK).  Build tested on x86.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:48 -07:00
Dipankar Sarma badf16621c [PATCH] files: break up files struct
In order for the RCU to work, the file table array, sets and their sizes must
be updated atomically.  Instead of ensuring this through too many memory
barriers, we put the arrays and their sizes in a separate structure.  This
patch takes the first step of putting the file table elements in a separate
structure fdtable that is embedded withing files_struct.  It also changes all
the users to refer to the file table using files_fdtable() macro.  Subsequent
applciation of RCU becomes easier after this.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:55 -07:00
Chen, Kenneth W 383f2835eb [PATCH] Prefetch kernel stacks to speed up context switch
For architecture like ia64, the switch stack structure is fairly large
(currently 528 bytes).  For context switch intensive application, we found
that significant amount of cache misses occurs in switch_to() function.
The following patch adds a hook in the schedule() function to prefetch
switch stack structure as soon as 'next' task is determined.  This allows
maximum overlap in prefetch cache lines for that structure.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:31 -07:00
Sam Ravnborg 39e01cb874 kbuild: ia64 use generic asm-offsets.h support
Delete obsolete stuff from arch Makefile
Rename file to asm-offsets.h
The trick used in the arch Makefile to circumvent the circular
dependency is kept.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-09-09 22:03:13 +02:00
Tony Luck 344a076110 [IA64] Manual merge fix for 3 files
arch/ia64/Kconfig
	arch/ia64/kernel/acpi.c
	include/asm-ia64/irq.h

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-08 14:27:13 -07:00
Jack Steiner 9b17e7e74e [IA64] Increase max physical address for SN platforms
Increase the value for the maximum physical address on SN systems.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-08 13:53:38 -07:00
Dean Nelson 408865ce48 [IA64] ensure XPC and XPNET are loaded on sn2 platforms only
These are SN2 only drivers.  They should have platform checks to prevent
them from doing evil stuff in GENERIC kernels.

Signed-off-by: Martin Hicks <mort@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-08 13:53:09 -07:00
Martin Hicks 087f902686 [IA64] defconfig: turn off QLOGIC_FC
Turn off the QLOGIC_FC driver.  Supposedly qla2xxx should support
these devices.  Do any ia64 machines have one of these devices as
the boot device?

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-08 13:48:50 -07:00
Len Brown 64e47488c9 Merge linux-2.6 with linux-acpi-2.6 2005-09-08 01:45:47 -04:00
viro@ZenIV.linux.org.uk e72225d160 [PATCH] bogus #if (simserial)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 17:17:34 -07:00
Keshavamurthy Anil S deac66ae45 [PATCH] kprobes: fix bug when probed on task and isr functions
This patch fixes a race condition where in system used to hang or sometime
crash within minutes when kprobes are inserted on ISR routine and a task
routine.

The fix has been stress tested on i386, ia64, pp64 and on x86_64.  To
reproduce the problem insert kprobes on schedule() and do_IRQ() functions
and you should see hang or system crash.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:58:01 -07:00
Keshavamurthy Anil S 661e5a3d99 [PATCH] Kprobes/IA64: fix race when break hits and kprobe not found
This patch addresses a potential race condition for a case where Kprobe has
been removed right after another CPU has taken a break hit.

The way this is addressed here is when the CPU that has taken a break hit
does not find its corresponding kprobe, then we check to see if the
original instruction got replaced with other than break.  If it got
replaced with other than break instruction, then we continue to execute
from the replaced instruction, else if we find that it is still a break,
then we let the kernel handle this, as this might be the break instruction
inserted by other than kprobe(may be kernel debugger).

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:58:00 -07:00
Prasanna S Panchamukhi 1f7ad57b75 [PATCH] Kprobes: prevent possible race conditions ia64 changes
This patch contains the ia64 architecture specific changes to prevent the
possible race conditions.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:58:00 -07:00
Pekka Enberg f96cb1f058 [PATCH] IA64: convert kcalloc to kzalloc
This patch converts kcalloc(1, ...) calls to use the new kzalloc() function.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:45 -07:00
Miklos Szeredi e922efc342 [PATCH] remove duplicated sys_open32() code from 64bit archs
64 bit architectures all implement their own compatibility sys_open(),
when in fact the difference is simply not forcing the O_LARGEFILE
flag.  So use the a common function instead.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:43 -07:00
John Hawkes 9c1cfda20a [PATCH] cpusets: Move the ia64 domain setup code to the generic code
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:40 -07:00
John Hawkes f68f447e83 [PATCH] ia64 cpuset + build_sched_domains() mangles structures
I've already sent this to the maintainers, and this is now being sent to a
larger community audience.  I have fixed a problem with the ia64 version of
build_sched_domains(), but a similar fix still needs to be made to the
generic build_sched_domains() in kernel/sched.c.

The "dynamic sched domains" functionality has recently been merged into
2.6.13-rcN that sees the dynamic declaration of a cpu-exclusive (a.k.a.
"isolated") cpuset and rebuilds the CPU Scheduler sched domains and sched
groups to separate away the CPUs in this cpu-exclusive cpuset from the
remainder of the non-isolated CPUs.  This allows the non-isolated CPUs to
completely ignore the isolated CPUs when doing load-balancing.

Unfortunately, build_sched_domains() expects that a sched domain will
include all the CPUs of each node in the domain, i.e., that no node will
belong in both an isolated cpuset and a non-isolated cpuset.  Declaring a
cpuset that violates this presumption will produce flawed data structures
and will oops the kernel.

To trigger the problem (on a NUMA system with >1 CPUs per node):
   cd /dev/cpuset
   mkdir newcpuset
   cd newcpuset
   echo 0 >cpus
   echo 0 >mems
   echo 1 >cpu_exclusive

I have fixed this shortcoming for ia64 NUMA (with multiple CPUs per node).
A similar shortcoming exists in the generic build_sched_domains() (in
kernel/sched.c) for NUMA, and that needs to be fixed also.  The fix
involves dynamically allocating sched_group_nodes[] and
sched_group_allnodes[] for each invocation of build_sched_domains(), rather
than using global arrays for these structures.  Care must be taken to
remember kmalloc() addresses so that arch_destroy_sched_domains() can
properly kfree() the new dynamic structures.

Signed-off-by: John Hawkes <hawkes@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:39 -07:00
Ashok Raj 54d5d42404 [PATCH] x86/x86_64: deferred handling of writes to /proc/irqxx/smp_affinity
When handling writes to /proc/irq, current code is re-programming rte
entries directly. This is not recommended and could potentially cause
chipset's to lockup, or cause missing interrupts.

CONFIG_IRQ_BALANCE does this correctly, where it re-programs only when the
interrupt is pending. The same needs to be done for /proc/irq handling as well.
Otherwise user space irq balancers are really not doing the right thing.

- Changed pending_irq_balance_cpumask to pending_irq_migrate_cpumask for
  lack of a generic name.
- added move_irq out of IRQ_BALANCE, and added this same to X86_64
- Added new proc handler for write, so we can do deferred write at irq
  handling time.
- Display of /proc/irq/XX/smp_affinity used to display CPU_MASKALL, instead
  it now shows only active cpu masks, or exactly what was set.
- Provided a common move_irq implementation, instead of duplicating
  when using generic irq framework.

Tested on i386/x86_64 and ia64 with CONFIG_PCI_MSI turned on and off.
Tested UP builds as well.

MSI testing: tbd: I have cards, need to look for a x-over cable, although I
did test an earlier version of this patch.  Will test in a couple days.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Zwane Mwaikambo <zwane@holomorphy.com>
Grudgingly-acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:15 -07:00
Kenji Kaneshige 697eaad417 [IA64] Minor cleanups - remove CONFIG_ACPI_DEALLOCATE_IRQ
The config option 'CONFIG_ACPI_DEALLOCATE_IRQ' is no longer
needed. This patch removes it.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-07 14:00:08 -07:00
Chen, Kenneth W 0232622324 [IA64] minor performance tune-up in ia64_switch_to
The reenabling of psr.ic should really belong to dtr mapping code block.
It make the fall through code fast since it doesn't need to execute the
predicated-off instruction.  Logically make more sense as well since psr.ic
was turned off in .map code block.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-07 13:56:23 -07:00
Chen, Kenneth W 295bd89279 [IA64] make exception handler in copy_user more robust
The exception handler in copy user always expects fault occurs only on
user space address and the fall back recovery code is written with that
very assumption in mind.  Recent source code inspection revealed that
while it worked splendid and to the expectation under normal circumstances,
It broke down under unexpected condition where some address calculation
might go outside the legal address range the original copy_user was
called for.  This patch is to make copy_user exception handler more robust
and to prevent potential memory corruption.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-07 08:53:16 -07:00
Kiyoshi Ueda 63028aa7f5 [IA64] page_not_present fault in region 5 is normal
When copying data from user-space to kernel-space by __copy_user(),
a page_not_present fault sometimes occurs at vmalloced kernel address
because of VHPT pre-fetching.

Ignore the page_not_present fault in ia64_do_page_fault() before
jumping into exception handlers.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-06 16:06:58 -07:00
Len Brown 129521dcc9 Merge linux-2.6 into linux-acpi-2.6 test 2005-09-03 02:44:09 -04:00
Martin Hicks a994018a5f [IA64] uncached allocator: use generic (not sn2 specific) functions
Change sn2-specific calls into generic functions.  Without this change
the uncached allocator will not work on non-sn2 platforms.

Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-31 14:18:04 -07:00
Jack Steiner a1cddb8892 [IA64-SGI] Add new vendor-specific SAL calls for:
- notifying the PROM of specific features that are supported by the OS.
  This is used to enable PROM feature if and only if the corresponding
  feature is implemented in the OS

- fetch feature sets that are supported by the current PROM. This allows
  the OS to selectively enable features when the PROM support is available.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-31 11:00:53 -07:00
Peter Chubb 6cf07a8cc8 [IA64] Fix nasty VMLPT problem...
I've solved the problem I was having with the simulator and not
booting Debian.

The problem is that the number of bits for the virtual linear array
short-format VHPT (Virtually mapped linear page table, VMLPT for
short) is being tested incorrectly. 

There are two problems:
      1. The PAL call that should tell the kernel the size of the
      virtual address space isn't implemented for the simulator, so
      the kernel uses the default 50.  This is addressed separately
      in dc90e95f31

      2.  In arch/ia64/mm/init.c there's code to calcualte the size
      of the VMLPT based on the number of implemented virtual address
      bits and the page size.  It checks to see if the VMLPT base
      address overlaps the top of the mapped region, but this check
      doesn't allow for the address space hole, and in fact will
      never trigger.

Here's an alternative test and panic, that I think is more accurate.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-31 08:35:22 -07:00
Peter Chubb 714d2dc149 [IA64] Allow /proc/pal/cpu0/vm_info under the simulator
Not all of the PAL VM calls are implemented for the SKI simulator.
Don't just give up if one fails, print information from the calls
that succeed.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-31 08:34:51 -07:00