Commit Graph

24896 Commits

Author SHA1 Message Date
Paulius Zaleckas 092d392119 V4L/DVB (8337): soc_camera: make videobuf independent
Makes SoC camera videobuf independent. Includes all necessary changes for
PXA camera driver (currently the only driver using soc_camera in the mainline).
These changes are important for the future soc_camera based drivers.

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:17 -03:00
Jean Delvare 5a367dfb73 V4L/DVB (8246): tvaudio: Stop I2C driver ID abuse
The tvaudio driver is using "official" I2C device IDs for internal
purpose. There must be some historical reason behind this but anyway,
it shouldn't do that. As the stored values are never used, the easiest
way to fix the problem is simply to remove them altogether.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:18:38 -03:00
Jean Delvare be99af6679 V4L/DVB (8245): ovcamchip: Delete stray I2C bus ID
I2C_HW_SMBUS_OVFX2 is referenced in ovcamchip_core.c, but no bus uses
this driver ID, so we can remove the reference. As far as I can see,
the Cypress FX2 webcam is handled by a different driver (dvb-usb).

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:18:34 -03:00
Hans Verkuil f87086e302 v4l-dvb: remove legacy checks to allow support for kernels < 2.6.10
Also remove some blank lines that were used to split compat code at -devel
tree.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:17:52 -03:00
Hans de Goede ab8f12cf8e V4L/DVB (8197): gspca: pac207 frames no more decoded in the subdriver.
videodev2: New pixfmt
pac207:   Remove the specific decoding.
main:     get_buff_size operation added for the subdriver.

Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:17:02 -03:00
Hans de Goede 54ab92ca05 V4L/DVB (8194): gspca: Fix the format of the low resolution mode of spca561.
The low (half) res modes of the spca561 are not spca561 compressed, but are
raw bayer, this patches fixes this and adds a PIX_FMT define for the GBRG
bayer format used by the spca561 in low res mode.

Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:16:47 -03:00
Jean-Francois Moine 6a7eba24e4 V4L/DVB (8157): gspca: all subdrivers
- remaning subdrivers added
- remove the decoding helper and some specific frame decodings

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:14:49 -03:00
Al Viro a36ef6b1e0 V4L/DVB (8128): saa7146: ->cpu_addr and friends are little-endian
Annotations + stop saa7146_i2c from playing fast and loose with
reuse of ->cpu_addr for host-endian.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:13:14 -03:00
Hans Verkuil e0e31cdb91 V4L/DVB (8105): cx2341x: add TS capability
The cx18 can support transport streams with newer firmwares. Add a TS
capability to the generic cx2341x module.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:11:55 -03:00
Hans Verkuil 21575c1312 V4L/DVB (8103): videodev: fix/improve ioctl debugging
Various ioctl debugging fixes and improvements:

- use %x rather than %d for control IDs and bitmask fields
- make two arrays const
- show the whole control array for the ext_ctrl ioctls
- print pix_fmt for V4L2_BUF_TYPE_VIDEO_OUTPUT
- show full type name rather than an integer
- fix CROPCAP debugging
- fix G/S_TUNER debugging
- show error code in case of an error
- other small cleanups

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:11:44 -03:00
brandon@ifup.org 539a7555b3 V4L/DVB (8078): Introduce "index" attribute for persistent video4linux device nodes
A number of V4L drivers have a mod param to specify their preferred minors.
This is because it is often desirable for applications to have a static /dev
name for a particular device.  However, using minors has several disadvantages:

  1) the requested minor may already be taken
  2) using a mod param is driver specific
  3) it requires every driver to add a param
  4) requires configuration by hand

This patch introduces an "index" attribute that when combined with udev rules
can create static device paths like this:

/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video1
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video2

$ ls -la /dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
lrwxrwxrwx 1 root root 12 2008-04-28 00:02 /dev/v4l/by-path/pci-0000:00:1d.2-usb-0:1:1.0-video0 -> ../../video1

These paths are steady across reboots and should be resistant to rearranging
across Kernel versions.

video_register_device_index is available to drivers to request a
specific index number.

Signed-off-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Kees Cook <kees@outflux.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:10:27 -03:00
Hans Verkuil 78b526a435 V4L/DVB (7949): videodev: renamed the vidioc_*_fmt_* callbacks
The naming for the callbacks that handle the VIDIOC_ENUM_FMT and
VIDIOC_S/G/TRY_FMT ioctls was very confusing. Renamed it to match
the v4l2_buf_type name.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:32 -03:00
Hans Verkuil 0e3bd2b999 V4L/DVB (7948): videodev: add missing vidioc_try_fmt_sliced_vbi_output and VIDIOC_ENUMOUTPUT handling
There was no vidioc_try_fmt_sliced_vbi_output, instead vidioc_try_fmt_vbi_output
was reused.

The VIDIOC_ENUMOUTPUT handling was missing altogether, even though the callback
existed.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:27 -03:00
Hans Verkuil b2de2313f1 V4L/DVB (7947): videodev: add vidioc_g_std callback.
The default videodev behavior for VIDIOC_G_STD is not correct for all devices.
Add a new callback that drivers can use instead.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:22 -03:00
Tobias Lorenz 1d0ba5f378 V4L/DVB (7942): Hardware frequency seek ioctl interface
Signed-off-by: Tobias Lorenz <tobias.lorenz@gmx.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:12 -03:00
Marcelo Tosatti 34d4cb8fca KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction
Flush the shadow mmu before removing regions to avoid stale entries.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:40 +03:00
Avi Kivity d6e88aec07 KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
Fixes compilation with CONFIG_VMI enabled.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:39 +03:00
Christian Borntraeger 180c12fb22 KVM: s390: rename private structures
While doing some tests with our lcrash implementation I have seen a
naming conflict with prefix_info in kvm_host.h vs. addrconf.h

To avoid future conflicts lets rename private definitions in
asm/kvm_host.h by adding the kvm_s390 prefix.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:37 +03:00
Avi Kivity 7a5b56dfd3 KVM: x86 emulator: lazily evaluate segment registers
Instead of prefetching all segment bases before emulation, read them at the
last moment.  Since most of them are unneeded, we save some cycles on
Intel machines where this is a bit expensive.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:35 +03:00
Avi Kivity f5b4edcd52 KVM: x86 emulator: simplify rip relative decoding
rip relative decoding is relative to the instruction pointer of the next
instruction; by moving address adjustment until after decoding is complete,
we remove the need to determine the instruction size.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:34 +03:00
Tan, Li 9ef621d3be KVM: Support mixed endian machines
Currently kvmtrace is not portable. This will prevent from copying a
trace file from big-endian target to little-endian workstation for analysis.
In the patch, kernel outputs metadata containing a magic number to trace
log, and changes 64-bit words to be u64 instead of a pair of u32s.

Signed-off-by: Tan Li <li.tan@intel.com>
Acked-by: Jerone Young <jyoung5@us.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:32 +03:00
Laurent Vivier 7f39f8ac17 KVM: Add coalesced MMIO support (ia64 part)
This patch enables coalesced MMIO for ia64 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

[akpm: fix compile error on ia64]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier 588968b6b7 KVM: Add coalesced MMIO support (powerpc part)
This patch enables coalesced MMIO for powerpc architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier 542472b53e KVM: Add coalesced MMIO support (x86 part)
This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier 5f94c1741b KVM: Add coalesced MMIO support (common part)
This patch adds all needed structures to coalesce MMIOs.
Until an architecture uses it, it is not compiled.

Coalesced MMIO introduces two ioctl() to define where are the MMIO zones that
can be coalesced:

- KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone.
  It requests one parameter (struct kvm_coalesced_mmio_zone) which defines
  a memory area where MMIOs can be coalesced until the next switch to
  user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX.

- KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside
  the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone).

The userspace client can check kernel coalesced MMIO availability by asking
ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability.
The ioctl() call to KVM_CAP_COALESCED_MMIO will return 0 if not supported,
or the page offset where will be stored the ring buffer.
The page offset depends on the architecture.

After an ioctl(KVM_RUN), the first page of the KVM memory mapped points to
a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is
an offset to the coalesced MMIO ring expressed in PAGE_SIZE relatively
to the address of the start of th kvm_run structure. The MMIO ring buffer
is defined by the structure kvm_coalesced_mmio_ring.

[akio: fix oops during guest shutdown]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier 92760499d0 KVM: kvm_io_device: extend in_range() to manage len and write attribute
Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).

This modification allows to use kvm_io_device with coalesced MMIO.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:30 +03:00
Guillaume Thouvenin 3e6e0aab1b KVM: Prefixes segment functions that will be exported with "kvm_"
Prefixes functions that will be exported with kvm_.
We also prefixed set_segment() even if it still static
to be coherent.

signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:27 +03:00
Avi Kivity 9ba075a664 KVM: MTRR support
Add emulation for the memory type range registers, needed by VMware esx 3.5,
and by pci device assignment.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Avi Kivity 81609e3e26 KVM: Order segment register constants in the same way as cpu operand encoding
This can be used to simplify the x86 instruction decoder.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang f08864b42a KVM: VMX: Enable NMI with in-kernel irqchip
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang 3419ffc8e4 KVM: IOAPIC/LAPIC: Enable NMI support
[avi: fix ia64 build breakage]

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00
Avi Kivity 7cc8883074 KVM: Remove decache_vcpus_on_cpu() and related callbacks
Obsoleted by the vmx-specific per-cpu list.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00
Avi Kivity 4ecac3fd6d KVM: Handle virtualization instruction #UD faults during reboot
KVM turns off hardware virtualization extensions during reboot, in order
to disassociate the memory used by the virtualization extensions from the
processor, and in order to have the system in a consistent state.
Unfortunately virtual machines may still be running while this goes on,
and once virtualization extensions are turned off, any virtulization
instruction will #UD on execution.

Fix by adding an exception handler to virtualization instructions; if we get
an exception during reboot, we simply spin waiting for the reset to complete.
If it's a true exception, BUG() so we can have our stack trace.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:41:43 +03:00
Avi Kivity 1b7fcd3263 KVM: MMU: Fix false flooding when a pte points to page table
The KVM MMU tries to detect when a speculative pte update is not actually
used by demand fault, by checking the accessed bit of the shadow pte.  If
the shadow pte has not been accessed, we deem that page table flooded and
remove the shadow page table, allowing further pte updates to proceed
without emulation.

However, if the pte itself points at a page table and only used for write
operations, the accessed bit will never be set since all access will happen
through the emulator.

This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels.
The kernel points a kmap_atomic() pte at a page table, and then
proceeds with read-modify-write operations to look at the dirty and accessed
bits.  We get a false flood trigger on the kmap ptes, which results in the
mmu spending all its time setting up and tearing down shadows.

Fix by setting the shadow accessed bit on emulated accesses.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:40:50 +03:00
Joerg Roedel d2ebb4103f KVM: SVM: add tracing support for TDP page faults
To distinguish between real page faults and nested page faults they should be
traced as different events. This is implemented by this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:40:48 +03:00
Ingo Molnar d986434a7d Merge branch 'sched/urgent' into sched/devel 2008-07-20 11:01:29 +02:00
Peter Zijlstra 31656519e1 sched, x86: clean up hrtick implementation
random uvesafb failures were reported against Gentoo:

  http://bugs.gentoo.org/show_bug.cgi?id=222799

and Mihai Moldovan bisected it back to:

> 8f4d37ec07 is first bad commit
> commit 8f4d37ec07
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date:   Fri Jan 25 21:08:29 2008 +0100
>
>    sched: high-res preemption tick

Linus suspected it to be hrtick + vm86 interaction and observed:

> Btw, Peter, Ingo: I think that commit is doing bad things. They aren't
> _incorrect_ per se, but they are definitely bad.
>
> Why?
>
> Using random _TIF_WORK_MASK flags is really impolite for doing
> "scheduling" work. There's a reason that arch/x86/kernel/entry_32.S
> special-cases the _TIF_NEED_RESCHED flag: we don't want to exit out of
> vm86 mode unnecessarily.
>
> See the "work_notifysig_v86" label, and how it does that
> "save_v86_state()" thing etc etc.

Right, I never liked having to fiddle with those TIF flags. Initially I
needed it because the hrtimer base lock could not nest in the rq lock.
That however is fixed these days.

Currently the only reason left to fiddle with the TIF flags is remote
wakeups. We cannot program a remote cpu's hrtimer. I've been thinking
about using the new and improved IPI function call stuff to implement
hrtimer_start_on().

However that does require that smp_call_function_single(.wait=0) works
from interrupt context - /me looks at the latest series from Jens - Yes
that does seem to be supported, good.

Here's a stab at cleaning this stuff up ...

Mihai reported test success as well.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Mihai Moldovan <ionic@ionic.de>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:37:28 +02:00
Mike Travis 80422d3431 cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP
* Rename CPUMASK_VAR --> CPUMASK_PTR (and simplify)

  * Fix a semantic error in CPUMASK_ALLOC

  * Add a bit of commentry to cpumask.h

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:21:12 +02:00
Mike Travis 94a1e869c7 NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
* Slight optimization when getting one's own cpu_info percpu data.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:21:11 +02:00
Ingo Molnar c4dc59ae7a x86, VisWS: turn into generic arch, eliminate leftover files
remove unused leftovers.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:31:24 +02:00
Yinghai Lu 63b5d7af25 x86: add ->pre_time_init to x86_quirks
so NUMAQ can use that to call numaq_pre_time_init()

This allows us to remove a NUMAQ special from arch/x86/kernel/setup.c.

(and paves the way to remove the NUMAQ subarch)

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:25:52 +02:00
Yinghai Lu 64898a8bad x86: extend and use x86_quirks to clean up NUMAQ code
add these new x86_quirks methods:

	int *mpc_record;
	int (*mpc_apic_id)(struct mpc_config_processor *m);
	void (*mpc_oem_bus_info)(struct mpc_config_bus *m, char *name);
	void (*mpc_oem_pci_bus)(struct mpc_config_bus *m);
	void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable,
                                    unsigned short oemsize);

... and move NUMAQ related mps table handling to numaq_32.c.

also move the call to smp_read_mpc_oem() to smp_read_mpc() directly.

Should not change functionality, albeit it would be nice to get it
tested on real NUMAQ as well ...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:25:52 +02:00
Yinghai Lu 3c9cb6de1e x86: introduce x86_quirks
introduce x86_quirks array of boot-time quirk methods.

No change in functionality intended.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:18:17 +02:00
Jussi Kivilinna 175f9c1bba net_sched: Add size table for qdiscs
Add size table functions for qdiscs and calculate packet size in
qdisc_enqueue().

Based on patch by Patrick McHardy
 http://marc.info/?l=linux-netdev&m=115201979221729&w=2

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:47 -07:00
Jussi Kivilinna 0abf77e55a net_sched: Add accessor function for packet length for qdiscs
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:27 -07:00
Jussi Kivilinna 5f86173bdf net_sched: Add qdisc_enqueue wrapper
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:04 -07:00
YOSHIFUJI Hideaki 230b183921 net: Use standard structures for generic socket address structures.
Use sockaddr_storage{} for generic socket address storage
and ensures proper alignment.
Use sockaddr{} for pointers to omit several casts.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 22:35:47 -07:00
David S. Miller 407d819cf0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2008-07-19 00:30:39 -07:00
Denis V. Lunev 7abbcd6a4c ipv6: remove unused macros from net/ipv6.h
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:29:42 -07:00
Denis V. Lunev 725a8ff04a ipv6: remove unused parameter from ip6_ra_control
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:28:58 -07:00
Adam Langley 4389dded77 tcp: Remove redundant checks when setting eff_sacks
Remove redundant checks when setting eff_sacks and make the number of SACKs a
compile time constant. Now that the options code knows how many SACK blocks can
fit in the header, we don't need to have the SACK code guessing at it.

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:07:02 -07:00
Adam Langley 33ad798c92 tcp: options clean up
This should fix the following bugs:
  * Connections with MD5 signatures produce invalid packets whenever SACK
    options are included
  * MD5 signatures are counted twice in the MSS calculations

Behaviour changes:
  * A SYN with MD5 + SACK + TS elicits a SYNACK with MD5 + SACK

    This is because we can't fit any SACK blocks in a packet with MD5 + TS
    options. There was discussion about disabling SACK rather than TS in
    order to fit in better with old, buggy kernels, but that was deemed to
    be unnecessary.

  * SYNs with MD5 don't include a TS option

    See above.

Additionally, it removes a bunch of duplicated logic for calculating options,
which should help avoid these sort of issues in the future.

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:04:31 -07:00
Adam Langley 49a72dfb88 tcp: Fix MD5 signatures for non-linear skbs
Currently, the MD5 code assumes that the SKBs are linear and, in the case
that they aren't, happily goes off and hashes off the end of the SKB and
into random memory.

Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy
Polyakov. Also includes a couple of missed route_caps from Stephen's patch
in [2].

[1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2
[2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:01:42 -07:00
Harvey Harrison 336d3262df sctp: remove unnecessary byteshifting, calculate directly in big-endian
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:07:09 -07:00
Vlad Yasevich 7dab83de50 sctp: Support ipv6only AF_INET6 sockets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:05:40 -07:00
Stephen Hemminger c1e20f7c8b tcp: RTT metrics scaling
Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in
kernel units (jiffies) and this leaks out through the netlink API to
user space where the units for jiffies are unknown.

This patches changes the kernel to convert to/from milliseconds. This
changes the ABI, but milliseconds seemed like the most natural unit
for these parameters.  Values available via syscall in
/proc/net/rt_cache and netlink will be in milliseconds.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:02:15 -07:00
David S. Miller 3072367300 pkt_sched: Manage qdisc list inside of root qdisc.
Idea is from Patrick McHardy.

Instead of managing the list of qdiscs on the device level, manage it
in the root qdisc of a netdev_queue.  This solves all kinds of
visibility issues during qdisc destruction.

The way to iterate over all qdiscs of a netdev_queue is to visit
the netdev_queue->qdisc, and then traverse it's list.

The only special case is to ignore builting qdiscs at the root when
dumping or doing a qdisc_lookup().  That was not needed previously
because builtin qdiscs were not added to the device's qdisc_list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 22:50:15 -07:00
Matthew Garrett 92c4989092 Input: add switch for dock events
Add a SW_DOCK switch to input.h.  ACPI docks currently send their docking
status as a uevent, but not all docks are ACPI or correspond to a device.
In that case, it makes more sense to simply generate an input event on
docking or undocking.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-07-19 00:52:43 -04:00
Mark Brown 5ec461d083 Input: add microphone insert switch definition
Add a new switch type to the input API for reporting microphone
insertion. This will be used by the ALSA jack reporting API.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-07-19 00:52:36 -04:00
David S. Miller 72b25a913e pkt_sched: Get rid of u32_list.
The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.

Just add an explicit node pointer for u32 to struct Qdisc an do
away with this global list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 20:54:17 -07:00
Patrick McHardy 8913336a7e packet: add PACKET_RESERVE sockopt
Add new sockopt to reserve some headroom in the mmaped ring frames in
front of the packet payload. This can be used f.i. when the VLAN header
needs to be (re)constructed to avoid moving the entire payload.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 18:05:19 -07:00
venkatesh.pallipadi@intel.com ae79cdaacb x86: Add a arch directory for x86 under debugfs
Add a directory for x86 arch under debugfs. Can be used to accumulate all
x86 specific debugfs files.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 17:22:04 -07:00
Jan Beulich 48fe4a76e2 x86: i386: reduce boot fixmap space
As 256 entries are needed, aligning to a 256-entry boundary is
sufficient and still guarantees the single pte table requirement.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 16:17:52 -07:00
Jan Beulich 08ad8afaa0 x86: reduce force_mwait visibility
It's not used anywhere outside its single referencing file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 15:55:09 -07:00
Jan Beulich 08e1a13e7d x86: reduce forbid_dac's visibility
It's not used anywhere outside its declaring file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 14:39:37 -07:00
Mike Travis 77586c2bda cpumask: Provide a generic set of CPUMASK_ALLOC macros
* Provide a generic set of CPUMASK_ALLOC macros patterned after the
    SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
    variables are declared on the stack to reduce the amount of stack
    space required.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 22:03:00 +02:00
Mike Travis 65c0118453 cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
* This patch replaces the dangerous lvalue version of cpumask_of_cpu
    with new cpumask_of_cpu_ptr macros.  These are patterned after the
    node_to_cpumask_ptr macros.

    In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
    the cpumask_of_cpu_map[cpu] entry is used.  The cpumask_of_cpu_map
    is provided when there is a large NR_CPUS count, reducing
    greatly the amount of code generated and stack space used for
    cpumask_of_cpu().  The pointer to the cpumask_t value is needed for
    calling set_cpus_allowed_ptr() to reduce the amount of stack space
    needed to pass the cpumask_t value.

    If there isn't a cpumask_of_cpu_map[], then a temporary variable is
    declared and filled in with value from cpumask_of_cpu(cpu) as well as
    a pointer variable pointing to this temporary variable.  Afterwards,
    the pointer is used to reference the cpumask value.  The compiler
    will optimize out the extra dereference through the pointer as well
    as the stack space used for the pointer, resulting in identical code.

    A good example of the orthogonal usages is in net/sunrpc/svc.c:

	case SVC_POOL_PERCPU:
	{
		unsigned int cpu = m->pool_to[pidx];
		cpumask_of_cpu_ptr(cpumask, cpu);

		*oldmask = current->cpus_allowed;
		set_cpus_allowed_ptr(current, cpumask);
		return 1;
	}
	case SVC_POOL_PERNODE:
	{
		unsigned int node = m->pool_to[pidx];
		node_to_cpumask_ptr(nodecpumask, node);

		*oldmask = current->cpus_allowed;
		set_cpus_allowed_ptr(current, nodecpumask);
		return 1;
	}

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 22:02:57 +02:00
Ingo Molnar bb2c018b09 Merge branch 'linus' into cpus4096
Conflicts:

	drivers/acpi/processor_throttling.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 22:00:54 +02:00
Jaswinder Singh 6ac8d51f01 x86: introducing asm-x86/traps.h
Declaring x86 traps under one hood.
Declaring x86 do_traps before defining them.

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 18:51:57 +02:00
Ingo Molnar f1b0c8d3d3 Merge branch 'linus' into x86/amd-iommu 2008-07-18 18:43:08 +02:00
Thomas Petazzoni 9781f39fd2 x86: consolidate the definition of the force_mwait variable
The force_mwait variable iss defined either in
arch/x86/kernel/cpu/amd.c or in arch/x86/kernel/setup_64.c, but it is
only initialized and used in arch/x86/kernel/process.c. This patch
moves the declaration to arch/x86/kernel/process.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 18:39:19 +02:00
Herton Ronaldo Krzesinski 723edb5060 Fix typos from signal_32/64.h merge
Fallout from commit 33185c504f ("x86:
merge signal_32/64.h")

Thanks to Dick Streefland who provided an useful testcase on
http://lkml.org/lkml/2008/3/17/205 (only applicable to 2.6.24.x), that
helped a lot as a deterministic way to bisect an issue that leaded to
this fix.

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 17:59:13 +02:00
Russ Anderson 7019cc2dd6 x86 BIOS interface for RTC on SGI UV
Real-time code needs to know the number of cycles per second
on SGI UV.  The information is provided via a run time BIOS
call.  This patch provides the linux side of that interface.
This is the first of several run time BIOS calls to be defined
in uv/bios.h and bios_uv.c.

Note that BIOS_CALL() is just a stub for now.  The bios
side is being worked on.

Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:35:14 +02:00
Alexander van Heukelum 8450e85399 x86, cleanup: fix description of __fls(): __fls(0) is undefined
Ricardo M. Correia spotted that the use of __fls() in fls64() did
not seem to make sense. In fact fls64()'s implementation is fine,
but the description of __fls() was wrong. Fix that.

Reported-by: "Ricardo M. Correia" <Ricardo.M.Correia@Sun.COM>
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:32:38 +02:00
Maciej W. Rozycki 35b680557f x86: more apic debugging
[ mingo@elte.hu: picked up this patch from Maciej, lets make apic=debug
                 print out more info - we had a lot of APIC changes ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:27:51 +02:00
Maciej W. Rozycki baa1318841 x86: APIC: Make apic_verbosity unsigned
As a microoptimisation, make apic_verbosity unsigned.  This will make
apic_printk(APIC_QUIET, ...) expand into just printk(...) with the
surrounding condition and a reference to apic_verbosity removed.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:27:43 +02:00
Yinghai Lu 1f067167a8 x86: seperate memtest from init_64.c
it's separate functionality that deserves its own file.

This also prepares 32-bit memtest support.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:10:27 +02:00
Ingo Molnar 1b427c153a sched: fix build error, provide partition_sched_domains() unconditionally
provide an empty partition_sched_domains() definition for the UP case:

 include/linux/cpuset.h: In function ‘rebuild_sched_domains':
 include/linux/cpuset.h:163: error: implicit declaration of function ‘partition_sched_domains'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:02:46 +02:00
Harvey Harrison 3217256188 x86: suppress sparse returning void warnings
include/asm/paravirt.h:1404:2: warning: returning void-valued expression
include/asm/paravirt.h:1414:2: warning: returning void-valued expression

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:42:01 +02:00
Ingo Molnar 2fb5e1e101 Merge branch 'linus' into x86/paravirt-spinlocks
Conflicts:

	arch/x86/kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:41:27 +02:00
Max Krasnyansky e761b77252 cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
This is based on Linus' idea of creating cpu_active_map that prevents
scheduler load balancer from migrating tasks to the cpu that is going
down.

It allows us to simplify domain management code and avoid unecessary
domain rebuilds during cpu hotplug event handling.

Please ignore the cpusets part for now. It needs some more work in order
to avoid crazy lock nesting. Although I did simplfy and unify domain
reinitialization logic. We now simply call partition_sched_domains() in
all the cases. This means that we're using exact same code paths as in
cpusets case and hence the test below cover cpusets too.
Cpuset changes to make rebuild_sched_domains() callable from various
contexts are in the separate patch (right next after this one).

This not only boots but also easily handles
	while true; do make clean; make -j 8; done
and
	while true; do on-off-cpu 1; done
at the same time.
(on-off-cpu 1 simple does echo 0/1 > /sys/.../cpu1/online thing).

Suprisingly the box (dual-core Core2) is quite usable. In fact I'm typing
this on right now in gnome-terminal and things are moving just fine.

Also this is running with most of the debug features enabled (lockdep,
mutex, etc) no BUG_ONs or lockdep complaints so far.

I believe I addressed all of the Dmitry's comments for original Linus'
version. I changed both fair and rt balancer to mask out non-active cpus.
And replaced cpu_is_offline() with !cpu_active() in the main scheduler
code where it made sense (to me).

Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:22:25 +02:00
Sebastian Siewior 2b7207a6b5 ftrace: copy + paste typo in asm/ftrace.h
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <srostedt@redhat.co>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:14:08 +02:00
Pavel Emelyanov b6fcbdb4f2 proc: consolidate per-net single-release callers
They are symmetrical to single_open ones :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:07:44 -07:00
Pavel Emelyanov de05c557b2 proc: consolidate per-net single_open callers
There are already 7 of them - time to kill some duplicate code.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:07:21 -07:00
Pavel Emelyanov 923c6586b0 mib: put icmpmsg statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:04:22 -07:00
Pavel Emelyanov b60538a0d7 mib: put icmp statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:04:02 -07:00
Pavel Emelyanov 386019d351 mib: put udplite statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:45 -07:00
Pavel Emelyanov 2f275f91a4 mib: put udp statistics on struct net
Similar to... ouch, I repeat myself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:27 -07:00
Pavel Emelyanov 61a7e26028 mib: put net statistics on struct net
Similar to ip and tcp ones :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:08 -07:00
Pavel Emelyanov a20f5799ca mib: put ip statistics on struct net
Similar to tcp one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:02:42 -07:00
Pavel Emelyanov 57ef42d59d mib: put tcp statistics on struct net
Proc temporary uses stats from init_net.

BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:02:08 -07:00
Pavel Emelyanov 852566f53c mib: add netns/mib.h file
The only structure declared within is the netns_mib, which will
carry all our mibs within. I didn't put the mibs in the existing
netns_xxx structures to make it possible to mark this one as
properly aligned and get in a separate "read-mostly" cache-line.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:01:24 -07:00
Maciej W. Rozycki 593f4a788e x86: APIC: remove apic_write_around(); use alternatives
Use alternatives to select the workaround for the 11AP Pentium erratum
for the affected steppings on the fly rather than build time.  Remove the
X86_GOOD_APIC configuration option and replace all the calls to
apic_write_around() with plain apic_write(), protecting accesses to the
ESR as appropriate due to the 3AP Pentium erratum.  Remove
apic_read_around() and all its invocations altogether as not needed.
Remove apic_write_atomic() and all its implementing backends.  The use of
ASM_OUTPUT2() is not strictly needed for input constraints, but I have
used it for readability's sake.

I had the feeling no one else was brave enough to do it, so I went ahead
and here it is.  Verified by checking the generated assembly and tested
with both a 32-bit and a 64-bit configuration, also with the 11AP
"feature" forced on and verified with gdb on /proc/kcore to work as
expected (as an 11AP machines are quite hard to get hands on these days).
Some script complained about the use of "volatile", but apic_write() needs
it for the same reason and is effectively a replacement for writel(), so I
have disregarded it.

I am not sure what the policy wrt defconfig files is, they are generated
and there is risk of a conflict resulting from an unrelated change, so I
have left changes to them out.  The option will get removed from them at
the next run.

Some testing with machines other than mine will be needed to avoid some
stupid mistake, but despite its volume, the change is not really that
intrusive, so I am fairly confident that because it works for me, it will
everywhere.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 12:51:21 +02:00
David S. Miller 49997d7515 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	Documentation/powerpc/booting-without-of.txt
	drivers/atm/Makefile
	drivers/net/fs_enet/fs_enet-main.c
	drivers/pci/pci-acpi.c
	net/8021q/vlan.c
	net/iucv/iucv.c
2008-07-18 02:39:39 -07:00
Ingo Molnar 48ae744434 Merge branch 'linus' into x86/step 2008-07-18 10:14:56 +02:00
David S. Miller 432e8765f0 sparc64: Add missing hypervisor service group numbers.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 00:43:52 -07:00
David S. Miller f7fe93344f sparc64: Remove 4MB and 512K base page size options.
Adrian Bunk reported that enabling 4MB page size breaks the build.
The problem is that MAX_ORDER combined with the page shift exceeds the
SECTION_SIZE_BITS we use in asm-sparc64/sparsemem.h

There are several ways I suppose we could work around this.  For one
we could define a CONFIG_FORCE_MAX_ZONEORDER to decrease MAX_ORDER in
these higher page size cases.

But I also know that these page size cases are broken wrt. TLB miss
handling especially on pre-hypervisor systems, and there isn't an easy
way to fix that.

These options were meant to be fun experimental hacks anyways, and
only 8K and 64K make any sense to support.

So remove 512K and 4M base page size support.  Of course, we still
support these page sizes for huge pages.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 23:44:53 -07:00
David S. Miller d172ad18f9 sparc64: Convert to generic helpers for IPI function calls.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 23:44:50 -07:00
Sam Ravnborg f5e706ad88 sparc: join the remaining header files
With this commit all sparc64 header files are moved to asm-sparc.
The remaining files (71 files) were too different to be trivially
merged so divide them up in a _32.h and a _64.h file which
are both included from the file with no bit size.

The following script were used:
cd include
FILES=`wc -l asm-sparc64/*h | grep -v '^     1' | cut -b 20-`

for FILE in ${FILES}; do
  echo $FILE:
  BASE=`echo $FILE | cut -d '.' -f 1`
  FN32=${BASE}_32.h
  FN64=${BASE}_64.h
  GUARD=___ASM_SPARC_`echo $BASE | tr '-' '_' | tr [:lower:] [:upper:]`_H
  git mv asm-sparc/$FILE asm-sparc/$FN32
  git mv asm-sparc64/$FILE asm-sparc/$FN64
  echo git mv done
  printf "#ifndef %s\n" $GUARD                             >   asm-sparc/$FILE
  printf "#define %s\n" $GUARD                             >>  asm-sparc/$FILE
  printf "#if defined(__sparc__) && defined(__arch64__)\n" >>  asm-sparc/$FILE
  printf "#include <asm-sparc/%s>\n" $FN64                 >>  asm-sparc/$FILE
  printf "#else\n"                                         >>  asm-sparc/$FILE
  printf "#include <asm-sparc/%s>\n" $FN32                 >>  asm-sparc/$FILE
  printf "#endif\n"                                        >>  asm-sparc/$FILE
  printf "#endif\n"                                        >>  asm-sparc/$FILE
  git add asm-sparc/$FILE
  echo new file done
  printf "#include <asm-sparc/%s>\n" $FILE                 >  asm-sparc64/$FILE
  git add asm-sparc64/$FILE
  echo sparc64 file done
done

The guard contains three '_' to avoid conflict with existing guards.
In additing the two Kbuild files are emptied to avoid breaking
headers_* targets.
We will reintroduce the exported header files when the necessary
kbuild changes are merged.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:55:51 -07:00
Sam Ravnborg 5e3609f60c sparc: merge header files with trivial differences
A manual inspection revealed that the following headerfiles
contained only trivial differences:
hw_irq.h idprom.h kmap_types.h kvm.h spinlock_types.h sunbpp.h unaligned.h

The only noteworthy change are that sparc64 had a volatile
qualifer that sparc missed in spinlock_types.h.

In addition a few comments were updated.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:45:02 -07:00
Sam Ravnborg 075ae52532 sparc: when header files are equal use asm-sparc version
Used the following script to find equal header files:
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
	cmp -s asm-sparc/$FILE asm-sparc64/$FILE;
	if [ $? = 0 ]; then
		printf "#include <asm-sparc/%s>\n" $FILE > asm-sparc64/$FILE
	fi
done

A few of the equal files are a simple include from
asm-generic, but by including the file from asm-sparc
we know they are equal for sparc and sparc64.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:44:58 -07:00
Sam Ravnborg a00736e936 sparc: copy sparc64 specific files to asm-sparc
Used the following script to copy the files:
cd include
set -e
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
	if [ -f asm-sparc/$FILE ]; then
		echo $FILE exist in asm-sparc
	else
		git mv asm-sparc64/$FILE asm-sparc/$FILE
		printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
		git add asm-sparc64/$FILE
	fi
done

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:44:53 -07:00
Sam Ravnborg bdc3135ac9 sparc: Merge asm-sparc{,64}/asi.h
Joined the two files as they contain distinct definitions.
Inspired by patch from: Adrian Bunk <bunk@kernel.org>

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@kernel.org>
2008-07-17 21:42:30 -07:00
Sam Ravnborg b1a8bf92a0 sparc: export openprom.h to userspace
sparc64 exports openprom.h to userspace so let sparc follow
the example.
As openprom.h pulled in another not-for-export vaddrs.h header
file it required a few changes to fix the build.

The definition af VMALLOC_* were moved to pgtable as this is
where sparc64 has them.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:23 -07:00
Sam Ravnborg b444b9a5a1 sparc: Merge asm-sparc{,64}/types.h
Copy content of sparc64 file to sparc file.
There is only minimal possibilities for further unification.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:19 -07:00
Sam Ravnborg c6d1b0e3d2 sparc: Merge asm-sparc{,64}/termios.h
Bring the commit e55c57e0b5
("[SPARC64]: Report any user access faults in termios accessors")
over to sparc when unifying the two files.
The diff was manually inspected to contain no
other relevant changes.

This unification therefore changes functionality of sparc.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:15 -07:00
Sam Ravnborg 943d0e8613 sparc: Merge asm-sparc{,64}/termbits.h
The type of tcflag_t differs from 32 and 64 bit.
For 32 bit it is long
For 64 bit it is int

Altough these have same size then I was not sure that
it was OK to change the 64 bit version to long as this
is part of the ABI so it was made conditional.

:$ diff -u include/asm-sparc/termbits.h include/asm-sparc64/termbits.h
:-- include/asm-sparc/termbits.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/termbits.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,11 +1,11 @@
:-#ifndef _SPARC_TERMBITS_H
:-#define _SPARC_TERMBITS_H
:+#ifndef _SPARC64_TERMBITS_H
:+#define _SPARC64_TERMBITS_H
:
: #include <linux/posix_types.h>
:
: typedef unsigned char   cc_t;
: typedef unsigned int    speed_t;
:-typedef unsigned long   tcflag_t;
:+typedef unsigned int    tcflag_t;
:
: #define NCC 8
: struct termio {
:@@ -102,7 +102,7 @@
: #define IXANY	0x00000800
: #define IXOFF	0x00001000
: #define IMAXBEL	0x00002000
:-#define IUTF8   0x00004000
:+#define IUTF8	0x00004000
:
: /* c_oflag bits */
: #define OPOST	0x00000001
:@@ -171,7 +171,6 @@
: #define HUPCL	  0x00000400
: #define CLOCAL	  0x00000800
: #define CBAUDEX   0x00001000
:-/* We'll never see these speeds with the Zilogs, but for completeness... */
: #define  BOTHER   0x00001000
: #define  B57600   0x00001001
: #define  B115200  0x00001002
:@@ -199,7 +198,7 @@
: #define B3500000  0x00001012
: #define B4000000  0x00001013  */
: #define CIBAUD	  0x100f0000  /* input baud rate (not used) */
:-#define CMSPAR	  0x40000000  /* mark or space (stick) parity */
:+#define CMSPAR    0x40000000  /* mark or space (stick) parity */
: #define CRTSCTS	  0x80000000  /* flow control */
:
: #define IBSHIFT	  16		/* Shift from CBAUD to CIBAUD */
:@@ -258,4 +257,4 @@
: #define	TCSADRAIN	1
: #define	TCSAFLUSH	2
:
:-#endif /* !(_SPARC_TERMBITS_H) */
:+#endif /* !(_SPARC64_TERMBITS_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:12 -07:00
Sam Ravnborg 7c4285d836 sparc: Merge asm-sparc{,64}/setup.h
COMMAND_LINE_SIZE differ for 32 and 64 bit.
256 versus 2048

:$ diff -u include/asm-sparc/setup.h include/asm-sparc64/setup.h
:-- include/asm-sparc/setup.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/setup.h	2008-06-13 06:42:07.000000000 +0200
:@@ -2,9 +2,9 @@
:  *	Just a place holder.
:  */
:
:-#ifndef _SPARC_SETUP_H
:-#define _SPARC_SETUP_H
:+#ifndef _SPARC64_SETUP_H
:+#define _SPARC64_SETUP_H
:
:-#define COMMAND_LINE_SIZE	256
:+#define COMMAND_LINE_SIZE	2048
:
:-#endif /* _SPARC_SETUP_H */
:+#endif /* _SPARC64_SETUP_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:09 -07:00
Sam Ravnborg 68a61c8d87 sparc: Merge asm-sparc{,64}/resource.h
RLIM_INFINITY differ from 32 and 64 bit.
The rest is equal.

:$ diff -u include/asm-sparc/resource.h include/asm-sparc64/resource.h
:-- include/asm-sparc/resource.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/resource.h	2008-06-13 06:46:39.000000000 +0200
:@@ -1,11 +1,11 @@
: /*
:  * resource.h: Resource definitions.
:  *
:- * Copyright (C) 1995 David S. Miller (davem@caip.rutgers.edu)
:+ * Copyright (C) 1996 David S. Miller (davem@caip.rutgers.edu)
:  */
:
:-#ifndef _SPARC_RESOURCE_H
:-#define _SPARC_RESOURCE_H
:+#ifndef _SPARC64_RESOURCE_H
:+#define _SPARC64_RESOURCE_H
:
: /*
:  * These two resource limit IDs have a Sparc/Linux-specific ordering,
:@@ -14,13 +14,6 @@
: #define RLIMIT_NOFILE		6	/* max number of open files */
: #define RLIMIT_NPROC		7	/* max number of processes */
:
:-/*
:- * SuS says limits have to be unsigned.
:- * We make this unsigned, but keep the
:- * old value for compatibility:
:- */
:-#define RLIM_INFINITY		0x7fffffff
:-
: #include <asm-generic/resource.h>
:
:-#endif /* !(_SPARC_RESOURCE_H) */
:+#endif /* !(_SPARC64_RESOURCE_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:07 -07:00
Sam Ravnborg 7acc483d21 sparc: Merge asm-sparc{,64}/fbio.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:04 -07:00
Sam Ravnborg fc86029910 sparc: copy asm-sparc64/fbio.h to asm-sparc
There were only a few trivial changes and a few additions
in the sparc64 variant of this file.
This patch copies the sparc64 specific bits to the sparc version
of fbio.h so they are equal. A later patch will merge the two.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:55 -07:00
Sam Ravnborg f92ffa12f4 sparc: Merge asm-sparc{,64}/mman.h
Renaming the function sparc64_mmap_check() to
sparc_mmap_check() was enough to make the two
header files identical.

:$ diff -u include/asm-sparc/mman.h include/asm-sparc64/mman.h
:-- include/asm-sparc/mman.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/mman.h	2008-06-13 06:46:39.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef __SPARC_MMAN_H__
:-#define __SPARC_MMAN_H__
:+#ifndef __SPARC64_MMAN_H__
:+#define __SPARC64_MMAN_H__
:
: #include <asm-generic/mman.h>
:
:@@ -23,9 +23,9 @@
:
: #ifdef __KERNEL__
: #ifndef __ASSEMBLY__
:-#define arch_mmap_check(addr,len,flags)	sparc_mmap_check(addr,len)
:-int sparc_mmap_check(unsigned long addr, unsigned long len);
:+#define arch_mmap_check(addr,len,flags)	sparc64_mmap_check(addr,len)
:+int sparc64_mmap_check(unsigned long addr, unsigned long len);
: #endif
: #endif
:
:-#endif /* __SPARC_MMAN_H__ */
:+#endif /* __SPARC64_MMAN_H__ */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:51 -07:00
Sam Ravnborg 2d1419624c sparc: Merge asm-sparc{,64}/shmbuf.h
Padding in the shmbuf structure made conditional
as only 32 bit sparc did so.

:$ diff -u include/asm-sparc/shmbuf.h include/asm-sparc64/shmbuf.h
:-- include/asm-sparc/shmbuf.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/shmbuf.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,23 +1,19 @@
:-#ifndef _SPARC_SHMBUF_H
:-#define _SPARC_SHMBUF_H
:+#ifndef _SPARC64_SHMBUF_H
:+#define _SPARC64_SHMBUF_H
:
: /*
:- * The shmid64_ds structure for sparc architecture.
:+ * The shmid64_ds structure for sparc64 architecture.
:  * Note extra padding because this structure is passed back and forth
:  * between kernel and user space.
:  *
:  * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
:  */
:
: struct shmid64_ds {
: 	struct ipc64_perm	shm_perm;	/* operation perms */
:-	unsigned int		__pad1;
: 	__kernel_time_t		shm_atime;	/* last attach time */
:-	unsigned int		__pad2;
: 	__kernel_time_t		shm_dtime;	/* last detach time */
:-	unsigned int		__pad3;
: 	__kernel_time_t		shm_ctime;	/* last change time */
: 	size_t			shm_segsz;	/* size of segment (bytes) */
: 	__kernel_pid_t		shm_cpid;	/* pid of creator */
:@@ -39,4 +35,4 @@
: 	unsigned long	__unused4;
: };
:
:-#endif /* _SPARC_SHMBUF_H */
:+#endif /* _SPARC64_SHMBUF_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:48 -07:00
Sam Ravnborg fcb07081f2 sparc: Merge asm-sparc{,64}/sembuf.h
Padding in the sembuf structure made conditional
as only 32 bit sparc did so.

:$ diff -u include/asm-sparc/sembuf.h include/asm-sparc64/sembuf.h
:-- include/asm-sparc/sembuf.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/sembuf.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,21 +1,18 @@
:-#ifndef _SPARC_SEMBUF_H
:-#define _SPARC_SEMBUF_H
:+#ifndef _SPARC64_SEMBUF_H
:+#define _SPARC64_SEMBUF_H
:
: /*
:- * The semid64_ds structure for sparc architecture.
:+ * The semid64_ds structure for sparc64 architecture.
:  * Note extra padding because this structure is passed back and forth
:  * between kernel and user space.
:  *
:  * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
:  */
:
: struct semid64_ds {
: 	struct ipc64_perm sem_perm;		/* permissions .. see ipc.h */
:-	unsigned int	__pad1;
: 	__kernel_time_t	sem_otime;		/* last semop time */
:-	unsigned int	__pad2;
: 	__kernel_time_t	sem_ctime;		/* last change time */
: 	unsigned long	sem_nsems;		/* no. of semaphores in array */
: 	unsigned long	__unused1;

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:44 -07:00
Sam Ravnborg 100b10d752 sparc: Merge asm-sparc{,64}/msgbuf.h
Padding from 32 bit sparc kept using preprocessor magic

:$ diff -u include/asm-sparc/msgbuf.h include/asm-sparc64/msgbuf.h
:-- include/asm-sparc/msgbuf.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/msgbuf.h	2008-06-13 06:42:07.000000000 +0200
:@@ -7,17 +7,13 @@
:  * between kernel and user space.
:  *
:  * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
:  */
:
: struct msqid64_ds {
: 	struct ipc64_perm msg_perm;
:-	unsigned int   __pad1;
: 	__kernel_time_t msg_stime;	/* last msgsnd time */
:-	unsigned int   __pad2;
: 	__kernel_time_t msg_rtime;	/* last msgrcv time */
:-	unsigned int   __pad3;
: 	__kernel_time_t msg_ctime;	/* last change time */
: 	unsigned long  msg_cbytes;	/* current number of bytes on queue */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:42 -07:00
Sam Ravnborg 6d1f4b88ee sparc: Merge asm-sparc{,64}/fcntl.h
The definition of O_NDELAY differed - the rest was equal

:$ diff -u include/asm-sparc/fcntl.h include/asm-sparc64/fcntl.h
:-- include/asm-sparc/fcntl.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/fcntl.h	2008-06-13 06:46:39.000000000 +0200
:@@ -1,8 +1,9 @@
:-#ifndef _SPARC_FCNTL_H
:-#define _SPARC_FCNTL_H
:+#ifndef _SPARC64_FCNTL_H
:+#define _SPARC64_FCNTL_H
:
: /* open/fcntl - O_SYNC is only implemented on blocks devices and on files
:    located on an ext2 file system */
:+#define O_NDELAY	0x0004
: #define O_APPEND	0x0008
: #define FASYNC		0x0040	/* fcntl, for BSD compatibility */
: #define O_CREAT		0x0200	/* not fcntl */
:@@ -10,7 +11,6 @@
: #define O_EXCL		0x0800	/* not fcntl */
: #define O_SYNC		0x2000
: #define O_NONBLOCK	0x4000
:-#define O_NDELAY	(0x0004 | O_NONBLOCK)
: #define O_NOCTTY	0x8000	/* not fcntl */
: #define O_LARGEFILE	0x40000
: #define O_DIRECT        0x100000 /* direct disk access hint */
:@@ -29,8 +29,7 @@
: #define F_UNLCK		3
:
: #define __ARCH_FLOCK_PAD	short __unused;
:-#define __ARCH_FLOCK64_PAD	short __unused;
:
: #include <asm-generic/fcntl.h>
:
:-#endif
:+#endif /* !(_SPARC64_FCNTL_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:39 -07:00
Sam Ravnborg e880e8701c sparc: Merge asm-sparc{,64}/sockios.h
:$ diff -u include/asm-sparc/sockios.h include/asm-sparc64/sockios.h
:-- include/asm-sparc/sockios.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/sockios.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _ASM_SPARC_SOCKIOS_H
:-#define _ASM_SPARC_SOCKIOS_H
:+#ifndef _ASM_SPARC64_SOCKIOS_H
:+#define _ASM_SPARC64_SOCKIOS_H
:
: /* Socket-level I/O control calls. */
: #define FIOSETOWN 	0x8901
:@@ -10,5 +10,5 @@
: #define SIOCGSTAMP	0x8906		/* Get stamp (timeval) */
: #define SIOCGSTAMPNS	0x8907		/* Get stamp (timespec) */
:
:-#endif /* !(_ASM_SPARC_SOCKIOS_H) */
:+#endif /* !(_ASM_SPARC64_SOCKIOS_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:35 -07:00
Sam Ravnborg c8b8be5427 sparc: Merge asm-sparc{,64}/socket.h
:$ diff -u include/asm-sparc/socket.h include/asm-sparc64/socket.h
:-- include/asm-sparc/socket.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/socket.h	2008-06-13 06:46:39.000000000 +0200
:@@ -48,11 +48,10 @@
: #define SO_TIMESTAMPNS		0x0021
: #define SCM_TIMESTAMPNS		SO_TIMESTAMPNS
:
:-#define SO_MARK			0x0022
:-
: /* Security levels - as per NRL IPv6 - don't actually do anything */
: #define SO_SECURITY_AUTHENTICATION		0x5001
: #define SO_SECURITY_ENCRYPTION_TRANSPORT	0x5002
: #define SO_SECURITY_ENCRYPTION_NETWORK		0x5004
:
:+#define SO_MARK			0x0022
: #endif /* _ASM_SOCKET_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:27 -07:00
Sam Ravnborg 62e612f0ab sparc: Merge asm-sparc{,64}/poll.h
:$ diff -u include/asm-sparc/poll.h include/asm-sparc64/poll.h
:-- include/asm-sparc/poll.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/poll.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef __SPARC_POLL_H
:-#define __SPARC_POLL_H
:+#ifndef __SPARC64_POLL_H
:+#define __SPARC64_POLL_H
:
: #define POLLWRNORM	POLLOUT
: #define POLLWRBAND	256

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:24 -07:00
Sam Ravnborg 4835bd988e sparc: Merge asm-sparc{,64}/param.h
:$ diff -u include/asm-sparc/param.h include/asm-sparc64/param.h
:-- include/asm-sparc/param.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/param.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,5 +1,6 @@
:-#ifndef _ASMSPARC_PARAM_H
:-#define _ASMSPARC_PARAM_H
:+#ifndef _ASMSPARC64_PARAM_H
:+#define _ASMSPARC64_PARAM_H
:+
:
: #ifdef __KERNEL__
: # define HZ		CONFIG_HZ	/* Internal kernel timer frequency */
:@@ -19,4 +20,4 @@
:
: #define MAXHOSTNAMELEN	64	/* max length of hostname */
:
:-#endif
:+#endif /* _ASMSPARC64_PARAM_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:20 -07:00
Sam Ravnborg f1ba03cac2 sparc: Merge asm-sparc{,64}/ioctls.h
Trivial differenses in comments - used the version from sparc64

:$ diff -u include/asm-sparc/ioctls.h include/asm-sparc64/ioctls.h
:-- include/asm-sparc/ioctls.h	2008-06-13 08:46:29.000000000 +0200
:++ include/asm-sparc64/ioctls.h	2008-06-13 08:46:29.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _ASM_SPARC_IOCTLS_H
:-#define _ASM_SPARC_IOCTLS_H
:+#ifndef _ASM_SPARC64_IOCTLS_H
:+#define _ASM_SPARC64_IOCTLS_H
:
: #include <asm/ioctl.h>
:
:@@ -22,7 +22,7 @@
:
: /* Note that all the ioctls that are not available in Linux have a
:  * double underscore on the front to: a) avoid some programs to
:- * thing we support some ioctls under Linux (autoconfiguration stuff)
:+ * think we support some ioctls under Linux (autoconfiguration stuff)
:  */
: /* Little t */
: #define TIOCGETD	_IOR('t', 0, int)
:@@ -110,7 +110,7 @@
: #define TIOCSERGETLSR   0x5459 /* Get line status register */
: #define TIOCSERGETMULTI 0x545A /* Get multiport config  */
: #define TIOCSERSETMULTI 0x545B /* Set multiport config */
:-#define TIOCMIWAIT	0x545C /* Wait input */
:+#define TIOCMIWAIT	0x545C /* Wait for change on serial input line(s) */
: #define TIOCGICOUNT	0x545D /* Read serial port inline interrupt counts */
:
: /* Kernel definitions */
:@@ -133,4 +133,4 @@
: #define TIOCPKT_NOSTOP		16
: #define TIOCPKT_DOSTOP		32
:
:-#endif /* !(_ASM_SPARC_IOCTLS_H) */
:+#endif /* !(_ASM_SPARC64_IOCTLS_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:16 -07:00
Sam Ravnborg 278864fac7 sparc: Merge asm-sparc{,64}/ioctl.h
:$ diff -u include/asm-sparc/ioctl.h include/asm-sparc64/ioctl.h
:-- include/asm-sparc/ioctl.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/ioctl.h	2008-06-13 08:46:29.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _SPARC_IOCTL_H
:-#define _SPARC_IOCTL_H
:+#ifndef _SPARC64_IOCTL_H
:+#define _SPARC64_IOCTL_H
:
:/*
:* Our DIR and SIZE overlap in order to simulteneously provide
:@@ -64,4 +64,4 @@
:#define IOCSIZE_MASK    (_IOC_XSIZEMASK << _IOC_SIZESHIFT)
:#define IOCSIZE_SHIFT   (_IOC_SIZESHIFT)
:
:-#endif /* !(_SPARC_IOCTL_H) */
:+#endif /* !(_SPARC64_IOCTL_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:06 -07:00
Sam Ravnborg 09d3e1baa1 sparc: copy exported sparc64 specific header files to asm-sparc
Copy was done using the following simple script:

set -e
SPARC64="h display7seg.h envctrl.h psrcompat.h pstate.h uctx.h utrap.h watchdog.h"
for FILE in ${SPARC64}; do
	if [ -f asm-sparc/$FILE ]; then
		echo $FILE exist in asm-sparc
	fi
	cat asm-sparc64/$FILE > asm-sparc/$FILE
	printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
done

The name of the copied files are added to asm-sparc/Kbuild
to keep "make headers_check" functional.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:40:49 -07:00
David S. Miller 597631f2ea sparc64 Kbuild: apb.h and bbc.h should not be exported to userspace
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:47 -07:00
Sam Ravnborg 3f261e829f sparc: Merge include/asm-sparc{,64}/perfctr.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:42 -07:00
Sam Ravnborg fc491d7da8 sparc: Merge include/asm-sparc{,64}/openpromio.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:36 -07:00
Adrian Bunk d6eaadfbbd sparc: remove PROM_AP1000
This seems to be left from the long gone AP1000 support.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:24 -07:00
Adrian Bunk 908f5162ca sparc64/kernel/: make code static
This patch makes the following needlessly global code static:
- central.c: struct central_bus
- central.c: struct fhc_list
- central.c: apply_fhc_ranges()
- central.c: apply_central_ranges()
- ds.c: struct ds_states_template[]
- pci_msi.c: sparc64_setup_msi_irq()
- pci_msi.c: sparc64_teardown_msi_irq()
- pci_sun4v.c: struct sun4v_dma_ops
- sys_sparc32.c: cp_compat_stat64()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:08 -07:00
Adrian Bunk 50215d6511 sparc/mm/: possible cleanups
This patch contains the following possible cleanups:
- make the following needlessly global code static:
  - fault.c: force_user_fault()
  - init.c: calc_max_low_pfn()
  - init.c: pgt_cache_water[]
  - init.c: map_high_region()
  - srmmu.c: hwbug_bitmask
  - srmmu.c: srmmu_swapper_pg_dir
  - srmmu.c: srmmu_context_table
  - srmmu.c: is_hypersparc
  - srmmu.c: srmmu_cache_pagetables
  - srmmu.c: srmmu_nocache_size
  - srmmu.c: srmmu_nocache_end
  - srmmu.c: srmmu_get_nocache()
  - srmmu.c: srmmu_free_nocache()
  - srmmu.c: srmmu_early_allocate_ptable_skeleton()
  - srmmu.c: srmmu_nocache_calcsize()
  - srmmu.c: srmmu_nocache_init()
  - srmmu.c: srmmu_alloc_thread_info()
  - srmmu.c: early_pgtable_allocfail()
  - srmmu.c: srmmu_early_allocate_ptable_skeleton()
  - srmmu.c: srmmu_allocate_ptable_skeleton()
  - srmmu.c: srmmu_inherit_prom_mappings()
  - sunami.S: tsunami_copy_1page
- remove the following unused code:
  - init.c: struct sparc_aliases

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:01 -07:00
Adrian Bunk c61c65cdcd sparc/kernel/: possible cleanups
This patch contains the following possible cleanups:
- make the following needlessly global code static:
  - apc.c: apc_swift_idle()
  - ebus.c: ebus_blacklist_irq()
  - ebus.c: fill_ebus_child()
  - ebus.c: fill_ebus_device()
  - entry.S: syscall_is_too_hard
  - etra: tsetup_sun4c_stackchk
  - head.S: cputyp
  - head.S: prom_vector_p
  - idprom.c: Sun_Machines[]
  - ioport.c: _sparc_find_resource()
  - ioport.c: create_proc_read_entry()
  - irq.c: struct sparc_irq[]
  - rtrap.S: sun4c_rett_stackchk
  - setup.c: prom_sync_me()
  - setup.c: boot_flags
  - sun4c_irq.c: sun4c_sbint_to_irq()
  - sun4d_irq.c: sbus_tid[]
  - sun4d_irq.c: struct sbus_actions
  - sun4d_irq.c: sun4d_sbint_to_irq()
  - sun4m_irq.c: sun4m_sbint_to_irq()
  - sun4m_irq.c: sun4m_get_irqmask()
  - sun4m_irq.c: sun4m_timers
  - sun4m_smp.c: smp4m_cross_call()
  - sun4m_smp.c: smp4m_blackbox_id()
  - sun4m_smp.c: smp4m_blackbox_current()
  - time.c: sp_clock_typ
  - time.c: sbus_time_init()
  - traps.c: instruction_dump()
  - wof.S: spwin_sun4c_stackchk
  - wuf.S: sun4c_fwin_stackchk
- #if 0 the following unused code:
  - process.c: sparc_backtrace_lock
  - process.c: __show_backtrace()
  - process.c: show_backtrace()
  - process.c: smp_show_backtrace_all_cpus()
- remove the following unused code:
  - entry.S: __handle_exception
  - smp.c: smp_num_cpus
  - smp.c: smp_activated
  - smp.c: __cpu_number_map[]
  - smp.c: __cpu_logical_map[]
  - smp.c: bitops_spinlock
  - traps.c: trap_curbuf
  - traps.c: trapbuf[]
  - traps.c: linux_smp_still_initting
  - traps.c: thiscpus_tbr
  - traps.c: thiscpus_mid

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:37:46 -07:00
David S. Miller 93245dd6d3 pkt_sched: Don't used locked skb_queue_purge() in __qdisc_reset_queue()
We have to have exclusive access to the given qdisc anyways, so
doing even more locking is superfluous.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:32 -07:00
David S. Miller 8387400092 pkt_sched: Kill netdev_queue lock.
We can simply use the qdisc->q.lock for all of the
qdisc tree synchronization.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:30 -07:00
David S. Miller c7e4f3bbb4 pkt_sched: Kill qdisc_lock_tree and qdisc_unlock_tree.
No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:29 -07:00
David S. Miller 78a5b30b73 pkt_sched: Rework {sch,tbf}_tree_lock().
Make sch_tree_lock() lock the qdisc's root.  All of the
users hold the RTNL semaphore and the root qdisc is not
changing.

Implement tbf_tree_{lock,unlock}() simply in terms of
sch_tree_{lock,unlock}().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:28 -07:00
David S. Miller ead81cc5fc netdevice: Move qdisc_list back into net_device proper.
And give it it's own lock.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:26 -07:00
David S. Miller 37437bb2e1 pkt_sched: Schedule qdiscs instead of netdev_queue.
When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.

Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.

Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:20 -07:00
David S. Miller 7698b4fcab pkt_sched: Add and use qdisc_root() and qdisc_root_lock().
When code wants to lock the qdisc tree state, the logic
operation it's doing is locking the top-level qdisc that
sits of the root of the netdev_queue.

Add qdisc_root_lock() to represent this and convert the
easiest cases.

In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:19 -07:00
David S. Miller e2627c8c22 pkt_sched: Make QDISC_RUNNING a qdisc state.
Currently it is associated with a netdev_queue, but when we have
qdisc sharing that no longer makes any sense.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:18 -07:00
David S. Miller d3b753db7c pkt_sched: Move gso_skb into Qdisc.
We liberate any dangling gso_skb during qdisc destruction.

It really only matters for the root qdisc.  But when qdiscs
can be shared by multiple netdev_queue objects, we can't
have the gso_skb in the netdev_queue any more.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:18 -07:00
David S. Miller 92831bc395 netdev: Kill plain netif_schedule()
No more users.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:16 -07:00
David S. Miller 51cb6db0f5 mac80211: Reimplement WME using ->select_queue().
The only behavior change is that we do not drop packets under any
circumstances.  If that is absolutely needed, we could easily add it
back.

With cleanups and help from Johannes Berg.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:12 -07:00
David S. Miller eae792b722 netdev: Add netdev->select_queue() method.
Devices or device layers can set this to control the queue selection
performed by dev_pick_tx().

This function runs under RCU protection, which allows overriding
functions to have some way of synchronizing with things like dynamic
->real_num_tx_queues adjustments.

This makes the spinlock prefetch in dev_queue_xmit() a little bit
less effective, but that's the price right now for correctness.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:10 -07:00
David S. Miller e3c50d5d25 netdev: netdev_priv() can now be sane again.
The private area of a netdev is now at a fixed offset once more.

Unfortunately, some assumptions that netdev_priv() == netdev->priv
crept back into the tree.  In particular this happened in the
loopback driver.  Make it use netdev->ml_priv.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:09 -07:00
David S. Miller 6b0fb1261a netdev: Kill struct net_device_subqueue and netdev->egress_subqueue*
No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:08 -07:00
David S. Miller fd2ea0a79f net: Use queue aware tests throughout.
This effectively "flips the switch" by making the core networking
and multiqueue-aware drivers use the new TX multiqueue structures.

Non-multiqueue drivers need no changes.  The interfaces they use such
as netif_stop_queue() degenerate into an operation on TX queue zero.
So everything "just works" for them.

Code that really wants to do "X" to all TX queues now invokes a
routine that does so, such as netif_tx_wake_all_queues(),
netif_tx_stop_all_queues(), etc.

pktgen and netpoll required a little bit more surgery than the others.

In particular the pktgen changes, whilst functional, could be largely
improved.  The initial check in pktgen_xmit() will sometimes check the
wrong queue, which is mostly harmless.  The thing to do is probably to
invoke fill_packet() earlier.

The bulk of the netpoll changes is to make the code operate solely on
the TX queue indicated by by the SKB queue mapping.

Setting of the SKB queue mapping is entirely confined inside of
net/core/dev.c:dev_pick_tx().  If we end up needing any kind of
special semantics (drops, for example) it will be implemented here.

Finally, we now have a "real_num_tx_queues" which is where the driver
indicates how many TX queues are actually active.

With IGB changes from Jeff Kirsher.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:07 -07:00
David S. Miller 1d8ae3fdeb pkt_sched: Remove RR scheduler.
This actually fixes a bug added by the RR scheduler changes.  The
->bands and ->prio2band parameters were being set outside of the
sch_tree_lock() and thus could result in strange behavior and
inconsistencies.

It might be possible, in the new design (where there will be one qdisc
per device TX queue) to allow similar functionality via a TX hash
algorithm for RR but I really see no reason to export this aspect of
how these multiqueue cards actually implement the scheduling of the
the individual DMA TX rings and the single physical MAC/PHY port.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:04 -07:00
David S. Miller 09e83b5d7d netdev: Kill NETIF_F_MULTI_QUEUE.
There is no need for a feature bit for something that
can be tested by simply checking the TX queue count.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:03 -07:00
David S. Miller e8a0464cc9 netdev: Allocate multiple queues for TX.
alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.

Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces.  This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.

Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:00 -07:00
Dan Williams 2a46fa13d7 iop_adma: cleanup iop_chan_xor_slot_count
- use a table for iop13xx, trade text for data
- shrink the iop3xx to a cache line

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:56 -07:00
Dan Williams 0839875e0c async_tx: make async_tx_test_ack a boolean routine
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:56 -07:00
Dan Williams 3dce017137 async_tx: remove depend_tx from async_tx_sync_epilog
All callers of async_tx_sync_epilog have called async_tx_quiesce on the
depend_tx, so async_tx_sync_epilog need only call the callback to
complete the operation.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:55 -07:00
Dan Williams d2c52b7983 async_tx: export async_tx_quiesce
Replace open coded "wait and acknowledge" instances with async_tx_quiesce.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:55 -07:00
Joel Becker a6795e9ebb configfs: Allow ->make_item() and ->make_group() to return detailed errors.
The configfs operations ->make_item() and ->make_group() currently
return a new item/group.  A return of NULL signifies an error.  Because
of this, -ENOMEM is the only return code bubbled up the stack.

Multiple folks have requested the ability to return specific error codes
when these operations fail.  This patch adds that ability by changing the
->make_item/group() ops to return ERR_PTR() values.  These errors are
bubbled up appropriately.  NULL returns are changed to -ENOMEM for
compatibility.

Also updated are the in-kernel users of configfs.

This is a rework of reverted commit 11c3b79218.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2008-07-17 15:21:29 -07:00
Ingo Molnar 393d81aa02 Merge branch 'linus' into xen-64bit 2008-07-17 23:57:20 +02:00
Joel Becker f89ab8619e Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."
This reverts commit 11c3b79218.  The code
will move to PTR_ERR().

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2008-07-17 14:53:48 -07:00
H. Peter Anvin 4fdf08b5bf x86: unify and correct the GDT_ENTRY() macro
Merge the GDT_ENTRY() macro between arch/x86/boot/pm.c and
arch/x86/kernel/acpi/sleep.c and put the new one in
<asm-x86/segment.h>.

While we're at it, correct the bitmasks for the limit and flags.  The
new version relies on using ULL constants in order to cause type
promotion rather than explicit casts; this avoids having to include
<linux/types.h> in <asm-x86/segments.h>.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-17 11:29:24 -07:00
Dimitri Sivanich 8cac39b99b [IA64] Update ia64 mmr list for SGI uv
This patch updates the ia64 mmr list for SGI uv.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-07-17 11:27:05 -07:00
Linus Torvalds 5b664cb235 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  [PATCH] ocfs2: fix oops in mmap_truncate testing
  configfs: call drop_link() to cleanup after create_link() failure
  configfs: Allow ->make_item() and ->make_group() to return detailed errors.
  configfs: Fix failing mkdir() making racing rmdir() fail
  configfs: Fix deadlock with racing rmdir() and rename()
  configfs: Make configfs_new_dirent() return error code instead of NULL
  configfs: Protect configfs_dirent s_links list mutations
  configfs: Introduce configfs_dirent_lock
  ocfs2: Don't snprintf() without a format.
  ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs
  ocfs2/net: Silence build warnings on sparc64
  ocfs2: Handle error during journal load
  ocfs2: Silence an error message in ocfs2_file_aio_read()
  ocfs2: use simple_read_from_buffer()
  ocfs2: fix printk format warnings with OCFS2_FS_STATS=n
  [PATCH 2/2] ocfs2: Instrument fs cluster locks
  [PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option
2008-07-17 10:55:51 -07:00
Tony Luck fca515fbfa Pull pvops into release branch 2008-07-17 10:53:37 -07:00
Linus Torvalds 2b04be7e8a Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix asm/e820.h for userspace inclusion
  x86: fix numaq_tsc_disable
  x86: fix kernel_physical_mapping_init() for large x86 systems
2008-07-17 10:38:59 -07:00
Rusty Russell 2567d71cc7 x86: fix asm/e820.h for userspace inclusion
asm-x86/e820.h is included from userspace.  'x86: make e820.c to have
common functions' (b79cd8f126) broke it:

	make -C Documentation/lguest
	cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c  -lz -o lguest
	In file included from ../../include/asm-x86/bootparam.h:8,
	                 from lguest.c:45:
	../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
	...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-17 19:28:48 +02:00
Linus Torvalds 42fea1f385 Merge branch 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace
* 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
  fix dangling zombie when new parent ignores children
  do_wait: return security_task_wait() error code in place of -ECHILD
  ptrace children revamp
  do_wait reorganization
2008-07-17 09:15:23 -07:00
Jan Glauber 779e6e1c72 [S390] qdio: new qdio driver.
List of major changes:
- split qdio driver into several files
- seperation of thin interrupt code
- improved handling for multiple thin interrupt devices
- inbound and outbound processing now always runs in tasklet context
- significant less tasklet schedules per interrupt needed
- merged qebsm with non-qebsm handling
- cleanup qdio interface and added kerneldoc
- coding style

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com>
Reviewed-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-07-17 17:22:10 +02:00
Adrian Bunk 626f311737 [S390] chsc headers userspace cleanup
Kernel headers shouldn't expose functions to userspace.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-07-17 17:22:08 +02:00
Neil Horman 9a6d276e85 core: add stat to track unresolved discards in neighbor cache
in __neigh_event_send, if we have a neighbour entry which is in
NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
to the neighbours arp_queue, which is default capped to a length of 3
skbs.  If that queue exceeds its set length, it will drop an skb on
the queue to enqueue the newly arrived skb.  This results in a drop
for which we have no statistics incremented.  This patch adds an
unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
lost frames.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:50:49 -07:00
Pavel Emelyanov ed88098e25 mib: add net to NET_ADD_STATS_USER
Done with NET_XXX_STATS macros :)

To be continued...

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:32:45 -07:00
Pavel Emelyanov f2bf415cfe mib: add net to NET_ADD_STATS_BH
This one is tricky. 

The thing is that this macro is only used when killing tw buckets, 
but since this killer is promiscuous wrt to which net each particular
tw belongs to, I have to use it only when NET_NS is off. When the net
namespaces are on, I use the INET_INC_STATS_BH for each bucket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:32:25 -07:00
Pavel Emelyanov 6f67c817fc mib: add net to NET_INC_STATS_USER
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:31:39 -07:00
Pavel Emelyanov de0744af1f mib: add net to NET_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:31:16 -07:00
Pavel Emelyanov 4e6734447d mib: add net to NET_INC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:30:14 -07:00
Pavel Emelyanov 5c52ba170f sock: add net to prot->enter_memory_pressure callback
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't
have where to get the net from.

I decided to add a sk argument, not the net itself, only to factor
all the required sock_net(sk) calls inside the enter_memory_pressure 
callback itself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:28:10 -07:00
Pavel Emelyanov cf1100a7a4 mib: add net to TCP_ADD_STATS_USER
Now we're done with the TCP_XXX_STATS macros.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:27:38 -07:00
Pavel Emelyanov 74688e487a mib: add net to TCP_DEC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:46 -07:00
Pavel Emelyanov 63231bddf6 mib: add net to TCP_INC_STATS_BH
Same as before - the sock is always there to get the net from,
but there are also some places with the net already saved on 
the stack.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:25 -07:00
Pavel Emelyanov 81cc8a75d9 mib: add net to TCP_INC_STATS
Fortunately (almost) all the TCP code has a sock to get the net from :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:04 -07:00
Pavel Emelyanov a9c19329ec tcp: add net to tcp_mib_init
This one sets TCP MIBs after zeroing them, and thus requires
the net.

The existing single caller can use init_net (temporarily).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:21:42 -07:00
Pavel Emelyanov f10f84314d mib: drop unused TCP_XXX_STATS macros
TCP_INC_STATS_USER and TCP_ADD_STATS_BH are currently unused.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:21:20 -07:00
Pavel Emelyanov c5346fe396 mib: add net to IP_ADD_STATS_BH
Very simple - only ip_evictor (fragments) requires such.
This patch ends up the IP_XXX_STATS patching.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:20:33 -07:00
Pavel Emelyanov 7c73a6faff mib: add net to IP_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:20:11 -07:00
Pavel Emelyanov 5e38e27044 mib: add net to IP_INC_STATS
All the callers already have either the net itself, or the place
where to get it from.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:19:49 -07:00
Pavel Emelyanov c6f8f7e3bb mib: drop unused IP_INC_STATS_USER
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:19:26 -07:00
Roland McGrath f470021adb ptrace children revamp
ptrace no longer fiddles with the children/sibling links, and the
old ptrace_children list is gone.  Now ptrace, whether of one's own
children or another's via PTRACE_ATTACH, just uses the new ptraced
list instead.

There should be no user-visible difference that matters.  The only
change is the order in which do_wait() sees multiple stopped
children and stopped ptrace attachees.  Since wait_task_stopped()
was changed earlier so it no longer reorders the children list, we
already know this won't cause any new problems.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 18:02:33 -07:00
Linus Torvalds dc7c65db28 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
  Revert "x86/PCI: ACPI based PCI gap calculation"
  PCI: remove unnecessary volatile in PCIe hotplug struct controller
  x86/PCI: ACPI based PCI gap calculation
  PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
  PCI PM: Fix pci_prepare_to_sleep
  x86/PCI: Fix PCI config space for domains > 0
  Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
  PCI: Simplify PCI device PM code
  PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
  PCI ACPI: Rework PCI handling of wake-up
  ACPI: Introduce new device wakeup flag 'prepared'
  ACPI: Introduce acpi_device_sleep_wake function
  PCI: rework pci_set_power_state function to call platform first
  PCI: Introduce platform_pci_power_manageable function
  ACPI: Introduce acpi_bus_power_manageable function
  PCI: make pci_name use dev_name
  PCI: handle pci_name() being const
  PCI: add stub for pci_set_consistent_dma_mask()
  PCI: remove unused arch pcibios_update_resource() functions
  PCI: fix pci_setup_device()'s sprinting into a const buffer
  ...

Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.
2008-07-16 17:25:46 -07:00
Kumar Gala 6cfd8990e2 powerpc: rework FSL Book-E PTE access and TLB miss
This converts the FSL Book-E PTE access and TLB miss handling to match
with the recent changes to 44x that introduce support for non-atomic PTE
operations in pgtable-ppc32.h and removes write back to the PTE from
the TLB miss handlers. In addition, the DSI interrupt code no longer
tries to fixup write permission, this is left to generic code, and
_PAGE_HWWRITE is gone.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:51 -05:00
Kumar Gala b219108cba fs_enet: Remove !CONFIG_PPC_CPM_NEW_BINDING code
Now that arch/ppc is gone we always define CONFIG_PPC_CPM_NEW_BINDING so
we can remove all the code associated with !CONFIG_PPC_CPM_NEW_BINDING.

Also fixed some asm/of_platform.h to linux/of_platform.h (and of_device.h)

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:49 -05:00
Scott Wood d87eb12785 gianfar: Add magic packet and suspend/resume support.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:47 -05:00
Andy Fleming 7e1cc9c55a powerpc: Fix a bunch of sparse warnings in the qe_lib
Mostly having to do with not marking things __iomem.  And some failure
to use appropriate accessors to read MMIO regs.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:45 -05:00
Scott Wood d49747bdfb powerpc/mpc83xx: Power Management support
Basic PM support for 83xx.  Standby is implemented as sleep.
Suspend-to-RAM is implemented as "deep sleep" (with the processor
turned off) on 831x.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:30 -05:00
Linus Torvalds 8a0ca91e1d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (68 commits)
  sdio_uart: Fix SDIO break control to now return success or an error
  mmc: host driver for Ricoh Bay1Controllers
  sdio: sdio_io.c Fix sparse warnings
  sdio: fix the use of hard coded timeout value.
  mmc: OLPC: update vdd/powerup quirk comment
  mmc: fix spares errors of sdhci.c
  mmc: remove multiwrite capability
  wbsd: fix bad dma_addr_t conversion
  atmel-mci: Driver for Atmel on-chip MMC controllers
  mmc: fix sdio_io sparse errors
  mmc: wbsd.c fix shadowing of 'dma' variable
  MMC: S3C24XX: Refuse incorrectly aligned transfers
  MMC: S3C24XX: Add maintainer entry
  MMC: S3C24XX: Update error debugging.
  MMC: S3C24XX: Add media presence test to request handling.
  MMC: S3C24XX: Fix use of msecs where jiffies are needed
  MMC: S3C24XX: Add MODULE_ALIAS() entries for the platform devices
  MMC: S3C24XX: Fix s3c2410_dma_request() return code check.
  MMC: S3C24XX: Allow card-detect on non-IRQ capable pin
  MMC: S3C24XX: Ensure host->mrq->data is valid
  ...

Manually fixed up bogus executable bits on drivers/mmc/core/sdio_io.c
and include/linux/mmc/sdio_func.h when merging.
2008-07-16 15:17:52 -07:00
Linus Torvalds 9c1be0c471 Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6
* 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6:
  UBIFS: include to compilation
  UBIFS: add new flash file system
  UBIFS: add brief documentation
  MAINTAINERS: add UBIFS section
  do_mounts: allow UBI root device name
  VFS: export sync_sb_inodes
  VFS: move inode_lock into sync_sb_inodes
2008-07-16 15:02:57 -07:00
Linus Torvalds 42fdd144a4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
  IDE: Report errors during drive reset back to user space
  Update documentation of HDIO_DRIVE_RESET ioctl
  IDE: Remove unused code
  IDE: Fix HDIO_DRIVE_RESET handling
  hd.c: remove the #include <linux/mc146818rtc.h>
  update the BLK_DEV_HD help text
  move ide/legacy/hd.c to drivers/block/
  ide/legacy/hd.c: use late_initcall()
  remove BLK_DEV_HD_ONLY
  ide: endian annotations in ide-floppy.c
  ide-floppy: zero out the whole struct ide_atapi_pc on init
  ide-floppy: fold idefloppy_create_test_unit_ready_cmd into idefloppy_open
  ide-cd: move request prep chunk from cdrom_do_newpc_cont to rq issue path
  ide-cd: move request prep from cdrom_start_rw_cont to rq issue path
  ide-cd: move request prep from cdrom_start_seek_continuation to rq issue path
  ide-cd: fold cdrom_start_seek into ide_cd_do_request
  ide-cd: simplify request issuing path
  ide-cd: mv ide_do_rw_cdrom ide_cd_do_request
  ide-cd: cdrom_start_seek: remove unused argument block
  ide-cd: ide_do_rw_cdrom: add the catch-all bad request case to the if-else block
  ...
2008-07-16 14:53:54 -07:00
Linus Torvalds 4314652bb4 Merge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6
* 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6: (87 commits)
  Fix FADT parsing
  Add the ability to reset the machine using the RESET_REG in ACPI's FADT table.
  ACPI: use dev_printk when possible
  PNPACPI: add support for HP vendor-specific CCSR descriptors
  PNP: avoid legacy IDE IRQs
  PNP: convert resource options to single linked list
  ISAPNP: handle independent options following dependent ones
  PNP: remove extra 0x100 bit from option priority
  PNP: support optional IRQ resources
  PNP: rename pnp_register_*_resource() local variables
  PNPACPI: ignore _PRS interrupt numbers larger than PNP_IRQ_NR
  PNP: centralize resource option allocations
  PNP: remove redundant pnp_can_configure() check
  PNP: make resource assignment functions return 0 (success) or -EBUSY (failure)
  PNP: in debug resource dump, make empty list obvious
  PNP: improve resource assignment debug
  PNP: increase I/O port & memory option address sizes
  PNP: introduce pnp_irq_mask_t typedef
  PNP: make resource option structures private to PNP subsystem
  PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM
  ...
2008-07-16 14:52:12 -07:00
Martin K. Petersen d442cc44c0 block: Trivial fix for blk_integrity_rq()
Fail integrity check gracefully when request does not have a bio
attached (BLOCK_PC).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-16 14:51:41 -07:00
Linus Torvalds 8df1b049bc Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (82 commits)
  NFSv4: Remove BKL from the nfsv4 state recovery
  SUNRPC: Remove the BKL from the callback functions
  NFS: Remove BKL from the readdir code
  NFS: Remove BKL from the symlink code
  NFS: Remove BKL from the sillydelete operations
  NFS: Remove the BKL from the rename, rmdir and unlink operations
  NFS: Remove BKL from NFS lookup code
  NFS: Remove the BKL from nfs_link()
  NFS: Remove the BKL from the inode creation operations
  NFS: Remove BKL usage from open()
  NFS: Remove BKL usage from the write path
  NFS: Remove the BKL from the permission checking code
  NFS: Remove attribute update related BKL references
  NFS: Remove BKL requirement from attribute updates
  NFS: Protect inode->i_nlink updates using inode->i_lock
  nfs: set correct fl_len in nlmclnt_test()
  SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon
  SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier
  SUNRPC: None of rpcb_create's callers wants a privileged source port
  SUNRPC: Introduce a specific rpcb_create for contacting localhost
  ...
2008-07-16 14:49:49 -07:00
Aaron Durbin 4d3870431d Add the ability to reset the machine using the RESET_REG in ACPI's FADT table.
Signed-off-by: Aaron Durbin <adurbin@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-07-16 23:27:08 +02:00
Bjorn Helgaas 1f32ca31e7 PNP: convert resource options to single linked list
ISAPNP, PNPBIOS, and ACPI describe the "possible resource settings" of
a device, i.e., the possibilities an OS bus driver has when it assigns
I/O port, MMIO, and other resources to the device.

PNP used to maintain this "possible resource setting" information in
one independent option structure and a list of dependent option
structures for each device.  Each of these option structures had lists
of I/O, memory, IRQ, and DMA resources, for example:

  dev
    independent options
      ind-io0  -> ind-io1  ...
      ind-mem0 -> ind-mem1 ...
      ...
    dependent option set 0
      dep0-io0  -> dep0-io1  ...
      dep0-mem0 -> dep0-mem1 ...
      ...
    dependent option set 1
      dep1-io0  -> dep1-io1  ...
      dep1-mem0 -> dep1-mem1 ...
      ...
    ...

This data structure was designed for ISAPNP, where the OS configures
device resource settings by writing directly to configuration
registers.  The OS can write the registers in arbitrary order much
like it writes PCI BARs.

However, for PNPBIOS and ACPI devices, the OS uses firmware interfaces
that perform device configuration, and it is important to pass the
desired settings to those interfaces in the correct order.  The OS
learns the correct order by using firmware interfaces that return the
"current resource settings" and "possible resource settings," but the
option structures above doesn't store the ordering information.

This patch replaces the independent and dependent lists with a single
list of options.  For example, a device might have possible resource
settings like this:

  dev
    options
      ind-io0 -> dep0-io0 -> dep1->io0 -> ind-io1 ...

All the possible settings are in the same list, in the order they
come from the firmware "possible resource settings" list.  Each entry
is tagged with an independent/dependent flag.  Dependent entries also
have a "set number" and an optional priority value.  All dependent
entries must be assigned from the same set.  For example, the OS can
use all the entries from dependent set 0, or all the entries from
dependent set 1, but it cannot mix entries from set 0 with entries
from set 1.

Prior to this patch PNP didn't keep track of the order of this list,
and it assigned all independent options first, then all dependent
ones.  Using the example above, that resulted in a "desired
configuration" list like this:

  ind->io0 -> ind->io1 -> depN-io0 ...

instead of the list the firmware expects, which looks like this:

  ind->io0 -> depN-io0 -> ind-io1 ...

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:07 +02:00
Bjorn Helgaas d5ebde6ef5 PNP: support optional IRQ resources
This patch adds an IORESOURCE_IRQ_OPTIONAL flag for use when
assigning resources to a device.  If the flag is set and we are
unable to assign an IRQ to the device, we can leave the IRQ
disabled but allow the overall resource allocation to succeed.

Some devices request an IRQ, but can run without an IRQ
(possibly with degraded performance).  This flag lets us run
the device without the IRQ instead of just leaving the
device disabled.

This is a reimplementation of this previous change by Rene
Herman <rene.herman@gmail.com>:
    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3b73a223661ed137c5d3d2635f954382e94f5a43

I reimplemented this for two reasons:
    - to prepare for converting all resource options into a single linked
      list, as opposed to the per-resource-type lists we have now, and
    - to preserve the order and number of resource options.

In PNPBIOS and ACPI, we configure a device by giving firmware a
list of resource assignments.  It is important that this list
has exactly the same number of resources, in the same order,
as the "template" list we got from the firmware in the first
place.

The problem of a sound card MPU401 being left disabled for want of
an IRQ was reported by Uwe Bugla <uwe.bugla@gmx.de>.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:07 +02:00
Bjorn Helgaas a1802c4295 PNP: make resource option structures private to PNP subsystem
Nothing outside the PNP subsystem should need access to a
device's resource options, so this patch moves the option
structure declarations to a private header file.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:06 +02:00
Bjorn Helgaas 08c9f262f2 PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM
PNP previously defined PNP_PORT_FLAG_16BITADDR and PNP_PORT_FLAG_FIXED
in a private header file, but put those flags in struct resource.flags
fields.  Better to make them IORESOURCE_IO_* flags like the existing
IRQ, DMA, and MEM flags.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:06 +02:00
Bjorn Helgaas 57fd51a8be PNP: add pnp_possible_config() -- can a device could be configured this way?
As part of a heuristic to identify modem devices, 8250_pnp.c
checks to see whether a device can be configured at any of the
legacy COM port addresses.

This patch moves the code that traverses the PNP "possible resource
options" from 8250_pnp.c to the PNP subsystem.  This encapsulation
is important because a future patch will change the implementation
of those resource options.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:06 +02:00
Bjorn Helgaas aee3ad815d PNP: replace pnp_resource_table with dynamically allocated resources
PNP used to have a fixed-size pnp_resource_table for tracking the
resources used by a device.  This table often overflowed, so we've
had to increase the table size, which wastes memory because most
devices have very few resources.

This patch replaces the table with a linked list of resources where
the entries are allocated on demand.

This removes messages like these:

    pnpacpi: exceeded the max number of IO resources
    00:01: too many I/O port resources

References:

    http://bugzilla.kernel.org/show_bug.cgi?id=9535
    http://bugzilla.kernel.org/show_bug.cgi?id=9740
    http://lkml.org/lkml/2007/11/30/110

This patch also changes the way PNP uses the IORESOURCE_UNSET,
IORESOURCE_AUTO, and IORESOURCE_DISABLED flags.

Prior to this patch, the pnp_resource_table entries used the flags
like this:

    IORESOURCE_UNSET
	This table entry is unused and available for use.  When this flag
	is set, we shouldn't look at anything else in the resource structure.
	This flag is set when a resource table entry is initialized.

    IORESOURCE_AUTO
	This resource was assigned automatically by pnp_assign_{io,mem,etc}().

	This flag is set when a resource table entry is initialized and
	cleared whenever we discover a resource setting by reading an ISAPNP
	config register, parsing a PNPBIOS resource data stream, parsing an
	ACPI _CRS list, or interpreting a sysfs "set" command.

	Resources marked IORESOURCE_AUTO are reinitialized and marked as
	IORESOURCE_UNSET by pnp_clean_resource_table() in these cases:

	    - before we attempt to assign resources automatically,
	    - if we fail to assign resources automatically,
	    - after disabling a device

    IORESOURCE_DISABLED
	Set by pnp_assign_{io,mem,etc}() when automatic assignment fails.
	Also set by PNPBIOS and PNPACPI for:

	    - invalid IRQs or GSI registration failures
	    - invalid DMA channels
	    - I/O ports above 0x10000
	    - mem ranges with negative length

After this patch, there is no pnp_resource_table, and the resource list
entries use the flags like this:

    IORESOURCE_UNSET
	This flag is no longer used in PNP.  Instead of keeping
	IORESOURCE_UNSET entries in the resource list, we remove
	entries from the list and free them.

    IORESOURCE_AUTO
	No change in meaning: it still means the resource was assigned
	automatically by pnp_assign_{port,mem,etc}(), but these functions
	now set the bit explicitly.

	We still "clean" a device's resource list in the same places,
	but rather than reinitializing IORESOURCE_AUTO entries, we
	just remove them from the list.

	Note that IORESOURCE_AUTO entries are always at the end of the
	list, so removing them doesn't reorder other list entries.
	This is because non-IORESOURCE_AUTO entries are added by the
	ISAPNP, PNPBIOS, or PNPACPI "get resources" methods and by the
	sysfs "set" command.  In each of these cases, we completely free
	the resource list first.

    IORESOURCE_DISABLED
	In addition to the cases where we used to set this flag, ISAPNP now
	adds an IORESOURCE_DISABLED resource when it reads a configuration
	register with a "disabled" value.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Bjorn Helgaas 20bfdbba72 PNP: make pnp_{port,mem,etc}_start(), et al work for invalid resources
Some callers use pnp_port_start() and similar functions without
making sure the resource is valid.  This patch makes us fall
back to returning the initial values if the resource is not
valid or not even present.

This mostly preserves the previous behavior, where we would just
return the initial values set by pnp_init_resource_table().  The
original 2.6.25 code didn't range-check the "bar", so it would
return garbage if the bar exceeded the table size.  This code
returns sensible values instead.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Zhao Yakui da5e09a1b3 ACPI : Create "idle=nomwait" bootparam
"idle=nomwait" disables the use of the MWAIT
instruction from both C1 (C1_FFH) and deeper (C2C3_FFH)
C-states.

When MWAIT is unavailable, the BIOS and OS generally
negotiate to use the HALT instruction for C1,
and use IO accesses for deeper C-states.

This option is useful for power and performance
comparisons, and also to work around BIOS bugs
where broken MWAIT support is advertised.

http://bugzilla.kernel.org/show_bug.cgi?id=10807
http://bugzilla.kernel.org/show_bug.cgi?id=10914

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Zhao Yakui c1e3b377ad ACPI: Create "idle=halt" bootparam
"idle=halt" limits the idle loop to using
the halt instruction.  No MWAIT, no IO accesses,
no C-states deeper than C1.

If something is broken in the idle code,
"idle=halt" is a less severe workaround
than "idle=poll" which disables all power savings.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Zhang Rui 71b58cbb0c ACPI: Enhance /sys/firmware/interrupts to allow enable/disable/clear from user-space
Allow users to enable/disable/clear a specific & valid GPE/Fixed Event
in user space.

This is useful for debugging, especially for some
interrupt storm issues.

All wakeup GPEs are disabled and they can not be enabled at runtime,
and we mark them as invalid.

All GPEs that don't have a _Lxx/_Exx method are marked as invalid.

All Fixed Events that don't have an event handler are marked as invalid
and they can't be enabled until an event handler is registered.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ling Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore 9c9f6d052d ACPICA: Update version to 20080609
Update version to 20080609.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore 71d993e115 ACPICA: Cleanup debug operand dump mechanism
Eliminated unnecessary operands; eliminated use of negative index
in loop.  Operands now displayed in correct order, not backwards.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore 75e5b5fb77 ACPICA: Update disassembler for DMAR table changes
Now supports the 2007 intel Virtualization Technology for Directed
I/O specification.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore 19d0cfe9dd ACPICA: Update DMAR and SRAT table definitions
Synchronized tables with current specifications.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore b25d2a470b ACPICA: Update version to 20080514
Update version to 20080514

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore 4b8ed63167 ACPICA: Add const qualifier for appropriate string constants
Mostly MODULE_NAME and printf format strings.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore 67a119f990 ACPICA: Eliminate acpi_native_uint type v2
No longer needed; replaced mostly with u32, but also acpi_size
where a type that changes 32/64 bit on 32/64-bit platforms is
required.

v2: Fix a cast of a 32-bit int to a pointer in ACPI to avoid a compiler warning.
from David Howells

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Bob Moore 11f2a61ab4 ACPICA: Fix possible negative array index in acpi_ut_validate_exception
Added NULL fields to the exception string arrays to eliminate
the -1 subtraction on the SubStatus field.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Jan Beulich 6719561f9b ACPICA: Update tracking macros to reduce code/data size
Changed ACPI_MODULE_NAME and ACPI_FUNCTION_NAME to use arrays of
strings instead of pointers to static strings. Jan Beulich and
Bob Moore.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Bob Moore c91d924e3a ACPICA: Fix for hang on GPE method invocation
Fixes problem where the new method argument count validation mechanism
will enter an infinite loop when a GPE method is dispatched.
Problem fixed be removing the obsolete code that passes GPE block
information to the notify handler via the control method parameter pointer.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Bob Moore f3454ae810 ACPICA: Add argument count checking to control method invocation via acpi_evaluate_object
Error if too few arguments, warning if too many. This applies
only to external programmatic control method execution, not
method-to-method calls within the AML.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Rafael J. Wysocki ebb12db51f Freezer: Introduce PF_FREEZER_NOSIG
The freezer currently attempts to distinguish kernel threads from
user space tasks by checking if their mm pointer is unset and it
does not send fake signals to kernel threads.  However, there are
kernel threads, mostly related to networking, that behave like
user space tasks and may want to be sent a fake signal to be frozen.

Introduce the new process flag PF_FREEZER_NOSIG that will be set
by default for all kernel threads and make the freezer only send
fake signals to the tasks having PF_FREEZER_NOSIG unset.  Provide
the set_freezable_with_signal() function to be called by the kernel
threads that want to be sent a fake signal for freezing.

This patch should not change the freezer's observable behavior.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:03 +02:00
David Brownell 2fe2de5f6c ACPI PM: acpi_pm_device_sleep_state() cleanup
Get rid of a superfluous acpi_pm_device_sleep_state() parameter.  The
only legitimate value of that parameter must be derived from the first
parameter, which is what all the callers already do.  (However, this
does not address the fact that ACPI still doesn't set up those flags.)

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:02 +02:00
Vegard Nossum 47c00d2bc2 ACPICA: fix mutex names in debug code.
Reorder the mutex names to match the preceding #defines

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:01 +02:00
Bob Moore e38e8a0743 Make GPE disable more robust
Implemented another change for the GPE disable. We now perform a
read-change-write of the enable register instead of simply writing out the
cached enable mask. This will prevent inadvertent enabling of GPEs if a rogue
GPE is received during initialization (before GPE handlers are installed.)

http://bugzilla.kernel.org/show_bug.cgi?id=6217

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:01 +02:00
Mike Travis 706546d023 ACPI: change processors from array to per_cpu variable
Change processors from an array sized by NR_CPUS to a per_cpu variable.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:01 +02:00
Roland McGrath 380fdd7585 x86 ptrace: user-sets-TF nits
This closes some arcane holes in single-step handling that can arise
only when user programs set TF directly (via popf or sigreturn) and
then use vDSO (syscall/sysenter) system call entry.  In those entry
paths, the clear_TF_reenable case hits and we must check TIF_SINGLESTEP
to be sure our bookkeeping stays correct wrt the user's view of TF.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:17 -07:00
Roland McGrath d4d6715016 x86 ptrace: unify syscall tracing
This unifies and cleans up the syscall tracing code on i386 and x86_64.

Using a single function for entry and exit tracing on 32-bit made the
do_syscall_trace() into some terrible spaghetti.  The logic is clear and
simple using separate syscall_trace_enter() and syscall_trace_leave()
functions as on 64-bit.

The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.

Changing syscall_trace_enter() to return the syscall number shortens
all the assembly paths, while adding the SYSEMU feature in a simple way.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:17 -07:00
Roland McGrath 64f0973319 x86 ptrace: unify TIF_SINGLESTEP
This unifies the treatment of TIF_SINGLESTEP on i386 and x86_64.
The bit is now excluded from _TIF_WORK_MASK on i386 as it has been
on x86_64.  This means the do_notify_resume() path using it is never
used, so TIF_SINGLESTEP is not cleared on returning to user mode.

Both now leave TIF_SINGLESTEP set when returning to user, so that
it's already set on an int $0x80 system call entry.  This removes
the need for testing TF on the system_call path.  Doing it this way
fixes the regression for PTRACE_SINGLESTEP into a sigreturn syscall,
introduced by commit 1e2e99f0e4.

The clear_TF_reenable case that sets TIF_SINGLESTEP can only happen
on a non-exception kernel entry, i.e. sysenter/syscall instruction.
That will always get to the syscall exit tracing path.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:16 -07:00
Elias Oltmanns 3ef5eb424e IDE: Remove unused code
Remove some code which has been made obsolete and hasn't worked properly
before anyway.  Part of the infrastructure may be reintroduced in a
follow up patch to implement a working command aborting facility.

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:48 +02:00
Elias Oltmanns 79e36a9f54 IDE: Fix HDIO_DRIVE_RESET handling
Currently, the code path executing an HDIO_DRIVE_RESET ioctl is broken
in various ways.  Most importantly, it is treated as an out of band
request in an illegal way which may very likely lead to system lock ups.
Use the drive's request queue to avoid this problem (and fix a locking
issue for free along the way).

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:48 +02:00
Bartlomiej Zolnierkiewicz e6d95bd149 ide: ->port_init_devs -> ->init_dev
Change ->port_init_devs method to take 'ide_drive_t *' as an argument
instead of 'ide_hwif_t *' and rename it to ->init_dev.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:42 +02:00
Bartlomiej Zolnierkiewicz c56c5648a3 ide: set hwif->dev in ide_init_port_hw() (take 2)
* Add 'parent' field to hw_regs_t for optional parent device pointer (needed
  by macio PMAC IDE controllers) and set hwif->dev in ide_init_port_hw().

* Update au1xxx-ide.c, sgiioc4.c, pmac.c and setup-pci.c accordingly.

v2:

* Update scc_pata.c.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz 63b51c6d1d ide: make ide_hwifs[] static
Move ide_hwifs[] from ide.c to ide-probe.c and make it static.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz 9ad5409375 ide: move PIO blacklist to ide-pio-blacklist.c
Move PIO blacklist to ide-pio-blacklist.c.

While at it:

- fix comment

- fix whitespace damage

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz 3e153cfb5e ide: remove no longer used ide_pio_timings[]
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz c9d6c1a237 ide: move ide_pio_cycle_time() to ide-timings.c
All ide_pio_cycle_time() users already select CONFIG_IDE_TIMINGS
so move the function from ide-lib.c to ide-timings.c.

While at it:

- convert ide_pio_cycle_time() to use ide_timing_find_mode()

- cleanup ide_pio_cycle_time() a bit

There should be no functional changes caused by this patch.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz f06ab3402a ide: convert ide-timing.h to ide-timings.c library (take 2)
* Don't include ide-timing.h in cs5535 and sis5513 host drivers
  (they don't need it currently).

* Convert ide-timing.h to ide-timings.c library and add CONFIG_IDE_TIMINGS
  config option to be selected by host drivers using the library.

While at it:

- fix ide_timing_find_mode() placement

v2:
* Add missing EXPORT_SYMBOLs. (Stephen Rothwell <sfr@canb.auug.org.au>)

There should be no functional changes caused by this patch.

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:37 +02:00
Bartlomiej Zolnierkiewicz 3be53f3f21 ide: move some bits from ide-timing.h to <linux/ide.h>
Move struct ide_timing and IDE_TIMING_* defines to <linux/ide.h>
from drivers/ide/ide-timing.h.

While at it:

- use u8/u16 instead of short for struct ide_timing fields

- use enum for IDE_TIMING_*

There should be no functional changes caused by this patch.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:36 +02:00
Ingo Molnar 4bb689eee1 x86: paravirt spinlocks, !CONFIG_SMP build fixes
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge 2d9e1e2f58 xen: implement Xen-specific spinlocks
The standard ticket spinlocks are very expensive in a virtual
environment, because their performance depends on Xen's scheduler
giving vcpus time in the order that they're supposed to take the
spinlock.

This implements a Xen-specific spinlock, which should be much more
efficient.

The fast-path is essentially the old Linux-x86 locks, using a single
lock byte.  The locker decrements the byte; if the result is 0, then
they have the lock.  If the lock is negative, then locker must spin
until the lock is positive again.

When there's contention, the locker spin for 2^16[*] iterations waiting
to get the lock.  If it fails to get the lock in that time, it adds
itself to the contention count in the lock and blocks on a per-cpu
event channel.

When unlocking the spinlock, the locker looks to see if there's anyone
blocked waiting for the lock by checking for a non-zero waiter count.
If there's a waiter, it traverses the per-cpu "lock_spinners"
variable, which contains which lock each CPU is waiting on.  It picks
one CPU waiting on the lock and sends it an event to wake it up.

This allows efficient fast-path spinlock operation, while allowing
spinning vcpus to give up their processor time while waiting for a
contended lock.

[*] 2^16 iterations is threshold at which 98% locks have been taken
according to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around".  Therefore, we'd expect the lock and unlock slow
paths will only be entered 2% of the time.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge 8efcbab674 paravirt: introduce a "lock-byte" spinlock implementation
Implement a version of the old spinlock algorithm, in which everyone
spins waiting for a lock byte.  In order to be compatible with the
ticket-lock's use of a zero initializer, this uses the convention of
'0' for unlocked and '1' for locked.

This algorithm is much better than ticket locks in a virtual
envionment, because it doesn't interact badly with the vcpu scheduler.
If there are multiple vcpus spinning on a lock and the lock is
released, the next vcpu to be scheduled will take the lock, rather
than cycling around until the next ticketed vcpu gets it.

To use this, you must call paravirt_use_bytelocks() very early, before
any spinlocks have been taken.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge 74d4affde8 x86/paravirt: add hooks for spinlock operations
Ticket spinlocks have absolutely ghastly worst-case performance
characteristics in a virtual environment.  If there is any contention
for physical CPUs (ie, there are more runnable vcpus than cpus), then
ticket locks can cause the system to end up spending 90+% of its time
spinning.

The problem is that (v)cpus waiting on a ticket spinlock will be
granted access to the lock in strict order they got their tickets.  If
the hypervisor scheduler doesn't give the vcpus time in that order,
they will burn timeslices waiting for the scheduler to give the right
vcpu some time.  In the worst case it could take O(n^2) vcpu scheduler
timeslices for everyone waiting on the lock to get it, not counting
new cpus trying to take the lock while the log-jam is sorted out.

These hooks allow a paravirt backend to replace the spinlock
implementation.

At the very least, this could revert the implementation back to the
old lock algorithm, which allows the next scheduled vcpu to take the
lock, and has basically fairly good performance.

It also allows the spinlocks to take advantages of the hypervisor
features to make locks more efficient (spin and block, for example).

The cost to native execution is an extra direct call when using a
spinlock function.  There's no overhead if CONFIG_PARAVIRT is turned
off.

The lock structure is fixed at a single "unsigned int", initialized to
zero, but the spinlock implementation can use it as it wishes.

Thanks to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around" for pointing out this problem.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:52 +02:00
Jeremy Fitzhardinge 6a52e4b1cd x86_64: further cleanup of 32-bit compat syscall mechanisms
AMD only supports "syscall" from 32-bit compat usermode.
Intel and Centaur(?) only support "sysenter" from 32-bit compat usermode.

Set the X86 feature bits accordingly, and set up the vdso in
accordance with those bits.  On the offchance we run on in a 64-bit
environment which supports neither syscall nor sysenter from 32-bit
mode, then fall back to the int $0x80 vdso.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-16 11:08:27 +02:00
Ingo Molnar 9c8a442044 xen64: fix !HVC_XEN build dependency
fix:

arch/x86/xen/built-in.o: In function `set_page_prot':
enlighten.c:(.text+0x111d): undefined reference to `xen_raw_printk'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:06:48 +02:00
Jeremy Fitzhardinge c24481e9da xen64: save lots of registers
The Xen hypercall interface is allowed to trash any or all of the
argument registers, so we need to be careful that the kernel state
isn't damaged.  On 32-bit kernels, the hypercall parameter registers
same as a regparm function call, so we've got away without explicit
clobbering so far.  The 64-bit ABI defines lots of caller-save
registers, so save them all for safety.  We can trim this set later by
re-distributing the responsibility for saving all these registers.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:05:23 +02:00
Jeremy Fitzhardinge c05f1cfaba xen64: implement 64-bit update_descriptor
64-bit hypercall interface can pass a maddr in one argument rather
than splitting it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:05:09 +02:00
Eduardo Habkost 45eb0d8898 Xen64: HYPERVISOR_set_segment_base() implementation
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:03:31 +02:00
Jeremy Fitzhardinge 88459d4c7e xen64: register callbacks in arch-independent way
Use callback_op hypercall to register callbacks in a 32/64-bit
independent way (64-bit doesn't need a code segment, but that detail
is hidden in XEN_CALLBACK).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:03:01 +02:00
Jeremy Fitzhardinge ce803e705f xen64: use arbitrary_virt_to_machine for xen_set_pmd
When building initial pagetables in 64-bit kernel the pud/pmd pointer may
be in ioremap/fixmap space, so we need to walk the pagetable to look up the
physical address.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:01:17 +02:00
Jeremy Fitzhardinge 084a2a4e76 xen64: early mapping setup
Set up the initial pagetables to map the kernel mapping into the
physical mapping space.  This makes __va() usable, since it requires
physical mappings.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:00:07 +02:00
Jeremy Fitzhardinge 5b09b2876e x86_64: add workaround for no %gs-based percpu
As a stopgap until Mike Travis's x86-64 gs-based percpu patches are
ready, provide workaround functions for x86_read/write_percpu for
Xen's use.

Specifically, this means that we can't really make use of vcpu
placement, because we can't use a single gs-based memory access to get
to vcpu fields.  So disable all that for now.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:58:13 +02:00
Jeremy Fitzhardinge f6e587325b xen64: add extra pv_mmu_ops
We need extra pv_mmu_ops for 64-bit, to deal with the extra level of
pagetable.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:57:16 +02:00
Jeremy Fitzhardinge e74359028d xen64: fix calls into hypercall page
The 64-bit calling convention for hypercalls uses different registers
from 32-bit.  Annoyingly, gcc's asm syntax doesn't have a way to
specify one of the extra numeric reigisters in a constraint, so we
must use explicitly placed register variables.  Given that we have to
do it for some args, may as well do it for all.

Also fix syntax gcc generates for the call instruction itself.  We
need a plain direct call, but the asm expansion which works on 32-bit
generates a rip-relative addressing mode in 64-bit, which is treated
as an indirect call.  The alternative is to pass the hypercall page
offset into the asm, and have it add it to the hypercall page start
address to generate the call.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:57:00 +02:00
Jeremy Fitzhardinge ca15f20f11 xen: fix 64-bit hypercall variants
64-bit guests can pass 64-bit quantities in a single argument,
so fix up the hypercalls.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:56:46 +02:00