Commit Graph

118041 Commits

Author SHA1 Message Date
Christoph Hellwig 9eaead51be [XFS] implement generic xfs_btree_rshift
Make the btree right shift code generic. Based on a patch from David
Chinner with lots of changes to follow the original btree implementations
more closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32196a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:43 +11:00
Christoph Hellwig 278d0ca14e [XFS] implement generic xfs_btree_update
From: Dave Chinner <dgc@sgi.com>

The most complicated part here is the lastrec tracking for the alloc
btree. Most logic is in the update_lastrec method which has to do some
hopefully good enough dirty magic to maintain it.

[hch: split out from bigger patch and a rework of the lastrec

logic]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32194a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:32 +11:00
Christoph Hellwig 38bb74237d [XFS] implement generic xfs_btree_updkey
From: Dave Chinner <dgc@sgi.com>

Note that there are many > 80 char lines introduced due to the
xfs_btree_key casts. But the places where this happens is throw-away code
once the whole btree code gets merged into a common implementation.

The same is true for the temporary xfs_alloc_log_keys define to the new
name. All old users will be gone after a few patches.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32193a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:22 +11:00
Christoph Hellwig fe033cc848 [XFS] implement generic xfs_btree_lookup
From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32192a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:09 +11:00
Christoph Hellwig 8df4da4a0a [XFS] implement generic xfs_btree_decrement
From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32191a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:58 +11:00
Christoph Hellwig 637aa50f46 [XFS] implement generic xfs_btree_increment
From: Dave Chinner <dgc@sgi.com>

Because this is the first major generic btree routine this patch includes
some infrastrucure, first a few routines to deal with a btree block that
can be either in short or long form, second xfs_btree_read_buf_block,
which is the new central routine to read a btree block given a cursor, and
third the new xfs_btree_ptr_addr routine to calculate the address for a
given btree pointer record.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32190a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:45 +11:00
Christoph Hellwig 65f1eaeac0 [XFS] add helpers for addressing entities inside a btree block
Add new helpers in xfs_btree.c to find the record, key and block pointer
entries inside a btree block. To implement this genericly the
->get_maxrecs methods and two new xfs_btree_ops entries for the key and
record sizes are used. Also add a big comment describing how the
addressing inside a btree block works.

Note that these helpers are unused until users are introduced in the next
patches and this patch will thus cause some harmless compiler warnings.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32189a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:34 +11:00
Christoph Hellwig ce5e42db42 [XFS] add get_maxrecs btree operation
Factor xfs_btree_maxrecs into a per-btree operation.

The get_maxrecs method is based on a patch from Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32188a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:23 +11:00
Christoph Hellwig 8c4ed633e6 [XFS] make btree tracing generic
Make the existing bmap btree tracing generic so that it applies to all
btree types.

Some fragments lifted from a patch by Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32187a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:13 +11:00
David Chinner 854929f058 [XFS] add new btree statistics
From: Dave Chinner <dgc@sgi.com>

Introduce statistics coverage of all the btrees and cover all the btree
operations, not just some.

Invaluable for determining test code coverage of all the btree
operations....

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32184a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
2008-10-30 16:55:03 +11:00
Christoph Hellwig a23f6ef8ce [XFS] refactor btree validation helpers
Move the various btree validation helpers around in xfs_btree.c so that
they are close to each other and in common #ifdef DEBUG sections.

Also add a new xfs_btree_check_ptr helper to check a btree ptr that can be
either long or short form.

Split out from a bigger patch from Dave Chinner with various small changes
applied by me.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32183a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:53 +11:00
Christoph Hellwig b524bfeee2 [XFS] refactor xfs_btree_readahead
From: Dave Chinner <dgc@sgi.com>

Refactor xfs_btree_readahead to make it more readable:

(a) remove the inline xfs_btree_readahead wrapper and move all checks out

of line into the main routine.

(b) factor out helpers for short/long form btrees

(c) move check for root in inodes from the callers into
xfs_btree_readahead

[hch: split out from a big patch and minor cleanups]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32182a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:43 +11:00
Christoph Hellwig e99ab90d6a [XFS] add a long pointers flag to xfs_btree_cur
Add a flag to the xfs btree cursor when using long (64bit) block pointers
instead of checking btnum == XFS_BTNUM_BMAP.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32181a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:33 +11:00
Christoph Hellwig 8186e517fa [XFS] make btree root in inode support generic
The bmap btree is rooted in the inode and not in a disk block. Make the
support for this feature more generic by adding a btree flag to for this
feature instead of relying on the XFS_BTNUM_BMAP btnum check.

Also clean up xfs_btree_get_block where this new flag is used.

Based upon a patch from Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32180a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:22 +11:00
Christoph Hellwig de227dd960 [XFS] add generic btree types
Add generic union types for btree pointers, keys and records. The generic
btree pointer contains either a 32 and 64bit big endian scalar for short
and long form btrees, and the key and record contain the relevant type for
each possible btree.

Split out from a bigger patch from Dave Chinner and simplified a little
further.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32178a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:12 +11:00
Christoph Hellwig 561f7d1739 [XFS] split up xfs_btree_init_cursor
xfs_btree_init_cursor contains close to little shared code for the
different btrees and will get even more non-common code in the future.
Split it up into one routine per btree type.

Because xfs_btree_dup_cursor needs to call the init routine for a generic
btree cursor add a new btree operation vector that contains a dup_cursor
method that initializes a new cursor based on an existing one.

The btree operations vector is based on an idea and code from Dave Chinner
and will grow more entries later during this series.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32176a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:53:59 +11:00
Christoph Hellwig f2277f06e6 [XFS] kill struct xfs_btree_hdr
This type is only embedded in struct xfs_btree_block and never used
directly. By moving the fields directly into struct xfs_btree_block a lot
of the macros for struct xfs_btree_sblock and struct xfs_btree_lblock can
be used for struct xfs_btree_block too now which helps greatly with some
of the migrations during implementing the generic btree code.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32174a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:53:47 +11:00
Lachlan McIlroy f338f90364 [XFS] Unlock inode before calling xfs_idestroy()
Lock debugging reported the ilock was being destroyed without being
unlocked. We don't need to lock the inode until we are going to insert it
into the radix tree.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32159a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:53:38 +11:00
Lachlan McIlroy a357a12156 [XFS] Fix use-after-free with log and quotas
Destroying the quota stuff on unmount can access the log - ie
XFS_QM_DONE() ends up in xfs_dqunlock() which calls
xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the
log has already been destroyed. Just move the cleanup of the quota code
earlier in xfs_unmountfs() before the call to xfs_log_unmount(). Moving
XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot.

SGI-PV: 987086

SGI-Modid: xfs-linux-melb:xfs-kern:32148a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
2008-10-30 16:53:25 +11:00
Barry Naujok 46039928c9 [XFS] Remove final remnants of dirv1 macros and other stuff
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:32002a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 16:52:35 +11:00
Lachlan McIlroy d07c60e54f [XFS] Use xfs_idestroy() to cleanup an inode.
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31927a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:50:35 +11:00
Lachlan McIlroy be8b78a626 [XFS] Remove kmem_zone_t argument from xfs_inode_init_once()
kmem cache constructor no longer takes a kmem_zone_t argument.

SGI-PV: 957103

SGI-Modid: xfs-linux-melb:xfs-kern:32254a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 16:42:34 +11:00
David Chinner 07c8f67587 [XFS] Make use of the init-once slab optimisation.
To avoid having to initialise some fields of the XFS inode on every
allocation, we can use the slab init-once feature to initialise them. All
we have to guarantee is that when we free the inode, all it's entries are
in the initial state. Add asserts where possible to ensure debug kernels
check this initial state before freeing and after allocation.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31925a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:11:59 +11:00
Linus Torvalds e946217e4f Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  ftrace: fix current_tracer error return
  tracing: fix a build error on alpha
  ftrace: use a real variable for ftrace_nop in x86
  tracing/ftrace: make boot tracer select the sched_switch tracer
  tracepoint: check if the probe has been registered
  asm-generic: define DIE_OOPS in asm-generic
  trace: fix printk warning for u64
  ftrace: warning in kernel/trace/ftrace.c
  ftrace: fix build failure
  ftrace, powerpc, sparc64, x86: remove notrace from arch ftrace file
  ftrace: remove ftrace hash
  ftrace: remove mcount set
  ftrace: remove daemon
  ftrace: disable dynamic ftrace for all archs that use daemon
  ftrace: add ftrace warn on to disable ftrace
  ftrace: only have ftrace_kill atomic
  ftrace: use probe_kernel
  ftrace: comment arch ftrace code
  ftrace: return error on failed modified text.
  ftrace: dynamic ftrace process only text section
  ...
2008-10-28 09:52:25 -07:00
Linus Torvalds a186576925 Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: ia64: Makefile fix for forcing to re-generate asm-offsets.h
  KVM: Future-proof device assignment ABI
  KVM: ia64: Fix halt emulation logic
  KVM: Fix guest shared interrupt with in-kernel irqchip
  KVM: MMU: sync root on paravirt TLB flush
2008-10-28 09:50:11 -07:00
Linus Torvalds 0d8762c9ee Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: fix irqs on/off ip tracing
  lockdep: minor fix for debug_show_all_locks()
  x86: restore the old swiotlb alloc_coherent behavior
  x86: use GFP_DMA for 24bit coherent_dma_mask
  swiotlb: remove panic for alloc_coherent failure
  xen: compilation fix of drivers/xen/events.c on IA64
  xen: portability clean up and some minor clean up for xencomm.c
  xen: don't reload cr3 on suspend
  kernel/resource: fix reserve_region_with_split() section mismatch
  printk: remove unused code from kernel/printk.c
2008-10-28 09:49:27 -07:00
Linus Torvalds cf76dddb22 Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: make variable static
2008-10-28 09:48:25 -07:00
Linus Torvalds 8ca6215502 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: fix documentation reference for sched_min_granularity_ns
  sched: virtual time buddy preemption
  sched: re-instate vruntime based wakeup preemption
  sched: weaken sync hint
  sched: more accurate min_vruntime accounting
  sched: fix a find_busiest_group buglet
  sched: add CONFIG_SMP consistency
2008-10-28 09:46:20 -07:00
Linus Torvalds f8245e91a5 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, memory hotplug: remove wrong -1 in calling init_memory_mapping()
  x86: keep the /proc/meminfo page count correct
  x86/uv: memory allocation at initialization
  xen: fix Xen domU boot with batched mprotect
2008-10-28 09:45:31 -07:00
Linus Torvalds b30fc14c5c Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] s390: Fix build for !CONFIG_S390_GUEST + CONFIG_VIRTIO_CONSOLE
  [S390] No more 4kb stacks.
  [S390] Change default IPL method to IPL_VM.
  [S390] tape: disable interrupts in tape_open and tape_release
  [S390] appldata: unsigned ops->size cannot be negative
  [S390] tape block: complete request with correct locking
  [S390] Fix sysdev class file creation.
  [S390] pgtables: Fix race in enable_sie vs. page table ops
  [S390] qdio: remove incorrect memset
  [S390] qdio: prevent double qdio shutdown in case of I/O errors
2008-10-28 09:44:59 -07:00
Linus Torvalds 3c136f29ba Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: ahci enclosure management bit mask
  libata: ahci enclosure management led sync
  pata_ninja32: suspend/resume support
  libata: Fix LBA48 on pata_it821x RAID volumes.
  libata: clear saved xfer_mode and ncq_enabled on device detach
  sata_sil24: configure max read request size to 4k
  libata: add missing kernel-doc
  libata: fix device iteration bugs
  ahci: Add support for Promise PDC42819
  ata: Switch all my stuff to a common address
2008-10-28 09:42:48 -07:00
Steven Rostedt 60063a6623 ftrace: fix current_tracer error return
The commit (in linux-tip) c2931e05ec
 ( ftrace: return an error when setting a nonexistent tracer )
added useful code that would error when a bad tracer was written into
the current_tracer file.

But this had a bug if the amount written was more than the amount read by
that code. The first iteration would set the tracer correctly, but since
it did not consume the rest of what was written (usually whitespace), the
userspace utility would continue to write what was not consumed. This
second iteration would fail to find a tracer and return -EINVAL. Funny
thing is that the tracer would have already been set.

This patch just consumes all the data that is written to the file.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:33:47 +01:00
Xiantao Zhang e45948b071 KVM: ia64: Makefile fix for forcing to re-generate asm-offsets.h
To avoid using stale asm-offsets.h.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-28 14:22:16 +02:00
Avi Kivity bb45e202e6 KVM: Future-proof device assignment ABI
Reserve some space so we can add more data.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-28 14:22:15 +02:00
Xiantao Zhang decc90162a KVM: ia64: Fix halt emulation logic
Common halt logic was changed by x86 and did not update ia64.  This patch
updates halt for ia64.

Fixes a regression causing guests to hang with more than 2 vcpus.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-28 14:22:14 +02:00
Sheng Yang 5550af4df1 KVM: Fix guest shared interrupt with in-kernel irqchip
Every call of kvm_set_irq() should offer an irq_source_id, which is
allocated by kvm_request_irq_source_id(). Based on irq_source_id, we
identify the irq source and implement logical OR for shared level
interrupts.

The allocated irq_source_id can be freed by kvm_free_irq_source_id().

Currently, we support at most sizeof(unsigned long) different irq sources.

[Amit: - rebase to kvm.git HEAD
       - move definition of KVM_USERSPACE_IRQ_SOURCE_ID to common file
       - move kvm_request_irq_source_id to the update_irq ioctl]

[Xiantao: - Add kvm/ia64 stuff and make it work for kvm/ia64 guests]

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-28 14:21:34 +02:00
Marcelo Tosatti 6ad9f15c94 KVM: MMU: sync root on paravirt TLB flush
The pvmmu TLB flush handler should request a root sync, similarly to
a native read-write CR3.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-28 14:09:27 +02:00
Heiko Carstens 6afe40b4da lockdep: fix irqs on/off ip tracing
Impact: fix lockdep lock-api-caller output when irqsoff tracing is enabled

81d68a96 "ftrace: trace irq disabled critical timings" added wrappers around
trace_hardirqs_on/off_caller. However these functions use
__builtin_return_address(0) to figure out which function actually disabled
or enabled irqs. The result is that we save the ips of trace_hardirqs_on/off
instead of the real caller. Not very helpful.

However since the patch from Steven the ip already gets passed. So use that
and get rid of __builtin_return_address(0) in these two functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 11:19:07 +01:00
Christian Borntraeger ea4bfdf52a [S390] s390: Fix build for !CONFIG_S390_GUEST + CONFIG_VIRTIO_CONSOLE
The s390 kernel does not compile if virtio console is enabled, but guest
support is disabled:

  LD      .tmp_vmlinux1
arch/s390/kernel/built-in.o: In function `setup_arch':
/space/linux-2.5/arch/s390/kernel/setup.c:773: undefined reference to
`s390_virtio_console_init'

The fix is related to
commit 99e65c92f2
Author: Christian Borntraeger <borntraeger@de.ibm.com>
Date:   Fri Jul 25 15:50:04 2008 +0200
    KVM: s390: Fix guest kconfig

Which changed the build process to build kvm_virtio.c only if CONFIG_S390_GUEST
is set. We must ifdef the prototype in the header file accordingly.

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:06 +01:00
Heiko Carstens 7f5a8ba6b0 [S390] No more 4kb stacks.
We got a stack overflow with a small stack configuration on a 32 bit
system. It just looks like as 4kb isn't enough and too dangerous.
So lets get rid of 4kb stacks on 32 bit.

But one thing I completely dislike about the call trace below is that
just for debugging or tracing purposes sprintf gets called (cio_start_key):

	/* process condition code */
	sprintf(dbf_txt, "ccode:%d", ccode);
	CIO_TRACE_EVENT(4, dbf_txt);

But maybe its just me who thinks that this could be done better.

    <4>Kernel stack overflow.
    <4>Modules linked in: dm_multipath sunrpc bonding qeth_l2 dm_mod qeth ccwgroup vmur
    <4>CPU: 1 Not tainted 2.6.27-30.x.20081015-s390default #1
    <4>Process httpd (pid: 3807, task: 20ae2df8, ksp: 1666fb78)
    <4>Krnl PSW : 040c0000 8027098a (number+0xe/0x348)
    <4>           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0
    <4>Krnl GPRS: 00d43318 0027097c 1666f277 9666f270
    <4>           00000000 00000000 0000000a ffffffff
    <4>           9666f270 1666f228 1666f277 1666f098
    <4>           00000002 80270982 80271016 1666f098
    <4>Krnl Code: 8027097e: f0340dd0a7f1	srp	3536(4,%r0),2033(%r10),4
    <4>           80270984: 0f00		clcl	%r0,%r0
    <4>           80270986: a7840001		brc	8,80270988
    <4>          >8027098a: 18ef		lr	%r14,%r15
    <4>           8027098c: a7faff68		ahi	%r15,-152
    <4>           80270990: 18bf		lr	%r11,%r15
    <4>           80270992: 18a2		lr	%r10,%r2
    <4>           80270994: 1893		lr	%r9,%r3

Modified calltrace with annotated stackframe size of each function:

stackframe size
    |
 0 304 vsnprintf+850 [0x271016]
 1  72 sprintf+74 [0x271522]
 2  56 cio_start_key+262 [0x2d4c16]
 3  56 ccw_device_start_key+222 [0x2dfe92]
 4  56 ccw_device_start+40 [0x2dff28]
 5  48 raw3215_start_io+104 [0x30b0f8]
 6  56 raw3215_write+494 [0x30ba0a]
 7  40 con3215_write+68 [0x30bafc]
 8  40 __call_console_drivers+146 [0x12b0fa]
 9  32 _call_console_drivers+102 [0x12b192]
10  64 release_console_sem+268 [0x12b614]
11 168 vprintk+462 [0x12bca6]
12  72 printk+68 [0x12bfd0]
13 256 __print_symbol+50 [0x15a882]
14  56 __show_trace+162 [0x103d06]
15  32 show_trace+224 [0x103e70]
16  48 show_stack+152 [0x103f20]
17  56 dump_stack+126 [0x104612]
18  96 __alloc_pages_internal+592 [0x175004]
19  80 cache_alloc_refill+776 [0x196f3c]
20  40 __kmalloc+258 [0x1972ae]
21  40 __alloc_skb+94 [0x328086]
22  32 pskb_copy+50 [0x328252]
23  32 skb_realloc_headroom+110 [0x328a72]
24 104 qeth_l2_hard_start_xmit+378 [0x7803bfde]
25  56 dev_hard_start_xmit+450 [0x32ef6e]
26  56 __qdisc_run+390 [0x3425d6]
27  48 dev_queue_xmit+410 [0x331e06]
28  40 ip_finish_output+308 [0x354ac8]
29  56 ip_output+218 [0x355b6e]
30  24 ip_local_out+56 [0x354584]
31 120 ip_queue_xmit+300 [0x355cec]
32  96 tcp_transmit_skb+812 [0x367da8]
33  40 tcp_push_one+158 [0x369fda]
34 112 tcp_sendmsg+852 [0x35d5a0]
35 240 sock_sendmsg+164 [0x32035c]
36  56 kernel_sendmsg+86 [0x32064a]
37  88 sock_no_sendpage+98 [0x322b22]
38 104 tcp_sendpage+70 [0x35cc1e]
39  48 sock_sendpage+74 [0x31eb66]
40  64 pipe_to_sendpage+102 [0x1c4b2e]
41  64 __splice_from_pipe+120 [0x1c5340]
42  72 splice_from_pipe+90 [0x1c57e6]
43  56 generic_splice_sendpage+38 [0x1c5832]
44  48 do_splice_from+104 [0x1c4c38]
45  48 direct_splice_actor+52 [0x1c4c88]
46  80 splice_direct_to_actor+180 [0x1c4f80]
47  72 do_splice_direct+70 [0x1c5112]
48  64 do_sendfile+360 [0x19de18]
49  72 sys_sendfile64+126 [0x19df32]
50 336 sysc_do_restart+18 [0x111a1a]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:06 +01:00
Heiko Carstens 46e7951f94 [S390] Change default IPL method to IPL_VM.
allyesconfig and allmodconfig built kernels have a tape IPL record.
A the vmreader record makes much more sense, since hardly anybody will
ever IPL a kernel from tape. So change the default.
As I side effect I can test these kernels without fiddling around with
the kernel config ;)

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:06 +01:00
Frank Munzert b3c21e4919 [S390] tape: disable interrupts in tape_open and tape_release
Get tape device lock with interrupts disabled. Otherwise lockdep will issue a
warning similar to:

=================================
[ INFO: inconsistent lock state ]
2.6.27 #1
---------------------------------
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
vol_id/2903 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (sch->lock){++..}, at: [<000003e00004c7a2>] tape_open+0x42/0x1a4 [tape]
{in-hardirq-W} state was registered at:
  [<000000000007ce5c>] __lock_acquire+0x894/0xa74
  [<000000000007d0ce>] lock_acquire+0x92/0xb8
  [<0000000000345154>] _spin_lock+0x5c/0x9c
  [<0000000000202264>] do_IRQ+0x124/0x1f0
  [<0000000000026610>] io_return+0x0/0x8

irq event stamp: 847
hardirqs last  enabled at (847): [<000000000007aca6>] trace_hardirqs_on+0x2a/0x38
hardirqs last disabled at (846): [<0000000000076ca2>] trace_hardirqs_off+0x2a/0x38
softirqs last  enabled at (0): [<000000000004909e>] copy_process+0x43e/0x11f4
softirqs last disabled at (0): [<0000000000000000>] 0x0

other info that might help us debug this:
1 lock held by vol_id/2903:
 #0:  (&bdev->bd_mutex){--..}, at: [<000000000010e0f4>] do_open+0x78/0x358

stack backtrace:
CPU: 1 Not tainted 2.6.27 #1},
Process vol_id (pid: 2903, task: 000000003d4c0000, ksp: 000000003d4e3b10)
0400000000000000 000000003d4e3830 0000000000000002 0000000000000000
       000000003d4e38d0 000000003d4e3848 000000003d4e3848 00000000000168a8
       0000000000000000 000000003d4e3b10 0000000000000000 0000000000000000
       000000003d4e3830 000000000000000c 000000003d4e3830 000000003d4e38a0
       000000000034aa98 00000000000168a8 000000003d4e3830 000000003d4e3880
Call Trace:
([<000000000001681c>] show_trace+0x138/0x158)
 [<0000000000016902>] show_stack+0xc6/0xf8
 [<00000000000170d4>] dump_stack+0xb0/0xc0
 [<0000000000078810>] print_usage_bug+0x1e8/0x228
 [<000000000007a71c>] mark_lock+0xb14/0xd24
 [<000000000007cd5a>] __lock_acquire+0x792/0xa74
 [<000000000007d0ce>] lock_acquire+0x92/0xb8
 [<0000000000345154>] _spin_lock+0x5c/0x9c
 [<000003e00004c7a2>] tape_open+0x42/0x1a4 [tape]
 [<000003e00005185c>] tapeblock_open+0x98/0xd0 [tape]

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:05 +01:00
Roel Kluin 13f8b7c5e6 [S390] appldata: unsigned ops->size cannot be negative
unsigned ops->size cannot be negative

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:04 +01:00
Frank Munzert 7a4a1ccd44 [S390] tape block: complete request with correct locking
__blk_end_request must be called with request queue lock held. We need to use
blk_end_request rather than  __blk_end_request.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:04 +01:00
Heiko Carstens da5aae7036 [S390] Fix sysdev class file creation.
Use sysdev_class_create_file() to create create sysdev class attributes
instead of sysfs_create_file(). Using sysfs_create_file() wasn't a very
good idea since the show and store functions have a different amount of
parameters for sysfs files and sysdev class files.
In particular the pointer to the buffer is the last argument and
therefore accesses to random memory regions happened.
Still worked surprisingly well until we got a kernel panic.

Cc: stable@kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:03 +01:00
Christian Borntraeger 250cf776f7 [S390] pgtables: Fix race in enable_sie vs. page table ops
The current enable_sie code sets the mm->context.pgstes bit to tell
dup_mm that the new mm should have extended page tables. This bit is also
used by the s390 specific page table primitives to decide about the page
table layout - which means context.pgstes has two meanings. This can cause
any kind of bugs. For example  - e.g. shrink_zone can call
ptep_clear_flush_young while enable_sie is running. ptep_clear_flush_young
will test for context.pgstes. Since enable_sie changed that value of the old
struct mm without changing the page table layout ptep_clear_flush_young will
do the wrong thing.
The solution is to split pgstes into two bits
- one for the allocation
- one for the current state

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:03 +01:00
Jan Glauber 2c78091405 [S390] qdio: remove incorrect memset
Remove the memset since zeroing the string is not needed and use
snprintf instead of sprintf.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:03 +01:00
Jan Glauber 7c045aa2c8 [S390] qdio: prevent double qdio shutdown in case of I/O errors
In case of I/O errors on a qdio subchannel qdio_shutdown may be
called twice by the qdio driver and by zfcp. Remove the
superfluous shutdown from qdio and let the upper layer driver
handle the error condition.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:02 +01:00
qinghuang feng 46fec7ac40 lockdep: minor fix for debug_show_all_locks()
When we failed to get tasklist_lock eventually (count equals 0),
we should only print " ignoring it.\n", and not print
" locked it.\n" needlessly.

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 10:30:04 +01:00
Frederic Weisbecker 21798a84ab tracing: fix a build error on alpha
Impact: build fix on Alpha

When tracing is enabled, some arch have included <linux/irqflags.h>
on their <asm/system.h> but others like alpha or m68k don't.

Build error on alpha:

kernel/trace/trace.c: In function 'tracing_cpumask_write':
kernel/trace/trace.c:2145: error: implicit declaration of function 'raw_local_irq_disable'
kernel/trace/trace.c:2162: error: implicit declaration of function 'raw_local_irq_enable'

Tested on Alpha through a cross-compiler (should correct a similar issue on m68k).

Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 09:53:28 +01:00