Commit Graph

21487 Commits

Author SHA1 Message Date
Alexey Starikovskiy 4655c7deca x86: remove mpc_apic_id()
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:07 +02:00
Alexey Starikovskiy ce3fe6b2bf x86: use get_bios_ebda in mpparse_64.c
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:05 +02:00
Glauber de Oliveira Costa 9f3734f631 x86: introduce smpboot_clear_io_apic
x86_64 has two nr_ioapics = 0 statements. In 32-bit, it can be done
too. We do it through the smpboot_clear_io_apic() inline function,
to cope with subarchitectures (visws) that do not compile mpparse in.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:04 +02:00
Glauber de Oliveira Costa 3fa7b3487a x86: assign nr_ioapics = 0 in smpboot_hooks.h
change smpboot_setup_io_apic() to match x86_64 behaviour

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:04 +02:00
Glauber de Oliveira Costa cb3c8b9003 x86: integrate do_boot_cpu
This is a very large patch, because it depends on a lot
of auxiliary static functions. But they all have been modified
to the point that they're sufficiently close now. So they're just
merged in smpboot.c

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:03 +02:00
Glauber de Oliveira Costa c70dcb7430 x86: change boot_cpu_id to boot_cpu_physical_apicid
This is to match i386. The former name was cuter,
but the current is more meaningful and more general,
since cpu_id can be a logical id.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:02 +02:00
Glauber de Oliveira Costa 9d97d0da71 x86: move stack_start to smp.h
voyager would conflict with it, but the types are ultimately
compatible. So remove the extern definition from voyager_smp.c
in favour of the common one

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:02 +02:00
Glauber de Oliveira Costa f6bc402909 x86: include mach_apic.h in smpboot_64.c and smpboot.c
After the inclusion, a lot of files need fixing for conflicts,
some of them in the headers themselves, to accommodate both
i386 and x86_64 versions.

[ mingo@elte.hu: build fix ]

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:02 +02:00
Glauber de Oliveira Costa 50e440aa53 x86: call nmi_watchdog_default in i386
this does not exist, so it will be an empty macro

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:01 +02:00
Glauber de Oliveira Costa 6d60cd5359 x86: unify nmi_32.h and nmi_64.h
Two more files go away. nmi_64.h and nmi_32.h give birth
to nmi.h

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:01 +02:00
Glauber de Oliveira Costa e32ede19ac x86: wipe get_nmi_reason out of nmi_64.h
use mach_traps when it is supposed to be used.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:01 +02:00
Glauber de Oliveira Costa fbac7fcbad x86: fix alloc_bootmem_pages_node macro
missing a semicolon

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:00 +02:00
Glauber de Oliveira Costa 04d1dd20f6 x86: make node to apic mapping declarations unconditional
Instead of declaring them inside of X86_64 ifdef, do it
unconditionally

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:00 +02:00
Glauber de Oliveira Costa cbe879fc6c x86: define bios to apicid mapping
This mapping already exists in x86_64, just provide it for
i386

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:00 +02:00
Glauber de Oliveira Costa 7e1efc0cde x86: unify extern masks declaration
take them off smp_{32,64}.h and move to smp.h

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:00 +02:00
Glauber de Oliveira Costa ac56ef61a1 x86: provide APIC_INTEGRATED definition for x86_64
it is always integrated, so define as 1.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:00 +02:00
Glauber de Oliveira Costa 1d89a7f072 x86: merge smp_store_cpu_info
now that it is the same between arches, put it into smpboot.c

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:41:00 +02:00
Glauber de Oliveira Costa d0173aeac4 x86: use start_ipi_hook in x86_64
It is used to match i386. The definition for the non-paravirt
case is moved to smp.h instead of smp_32.h

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:59 +02:00
Alexey Starikovskiy 037cab07e9 x86: move mp_bus_id_to_node to numa.c
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:59 +02:00
Alexey Starikovskiy e129cb490e x86: move mp_bus_id_to_local to numa.c
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:59 +02:00
Alexey Starikovskiy c0a282c251 x86: make mp_bus_id_to_type optional
[ mingo@elte.hu: fix boot regression. ]

Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:59 +02:00
Alexey Starikovskiy a6333c3ccb x86: add mp_bus_not_pci bitmap to mpparse_32.c
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Yinghai Lu 8643f9d02a x86: get boot_cpu_id as early for k8_scan_nodes
When acpi=off or there is no SRAT defined, apicid_to_node is obtained from the
K8 Northbridge PCI configuration space in k8_scan_nodes() in
arch/x86_64/mm/k8topology.c.

The problem is that it assumes the BSP APIC ID is 0 at that point.

For a four-socket system with quad-core CPUs installed, every CPU's APIC ID
is offset by 4, and the BSP APIC ID is 4.

For an eight-socket system with dual-core CPUs installed, every CPU's APIC ID
is offset by 2, and the BSP APIC ID is 2.

We need to get boot_cpu_id (the BSP APIC ID) before k8_scan_nodes is called.

So create early_acpi_boot_init and early_get_smp_config to get boot_cpu_id.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Alexey Starikovskiy 6079d2d5d1 x86: move quad_local_to_mp_bus_id to numa.c
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Mikael Pettersson 5d570cbbf2 x86: correct/clarify comment in nops.h
<asm-x86/nops.h> describes certain multibyte instructions as
"generic" nops when in fact they aren't nops at all in 64-bit
mode (missing REX.W causing truncation of a register).

Update the comment to state that K8 or P6 style nops should be
used in 64-bit mode. This matches what the alternatives code does.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Jan Beulich 5b0e508415 x86: prevent unconditional writes to DebugCtl MSR
Otherwise, enabling (or better, subsequent disabling) of single
stepping would cause a kernel oops on CPUs not having this MSR.

The patch could have added a conditional to the MSR write in
user_disable_single_step(), but centralizing the updates seems safer
and (looking forward) more manageable.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Markus Metzger <markus.t.metzger@intel.com>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
stephane eranian 12db648c15 x86: add AMD Northbridge MSR definition
adds AMD Northbridge config MSR definition

Signed-off-by: Stephane Eranian <eranian@gmail.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
stephane eranian 86975101e4 x86: add cpu_has_arch_perfmon
adds cpu_has_arch_perfmon to test presence of architectural perfmon on
Intel x86 processor

Signed-off-by: Stephane Eranian <eranian@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Joe Perches e40c0fe6b0 x86: cleanup duplicate includes
Signed-off-by: Joe Perches <joe@perches.com>

 arch/x86/kernel/reboot.c      |    1 -
 include/asm-x86/elf.h         |    5 ++---
 include/asm-x86/posix_types.h |    8 +-------
 include/asm-x86/processor.h   |    3 +--
 include/asm-x86/unistd.h      |    8 +-------
 5 files changed, 5 insertions(+), 20 deletions(-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Yinghai Lu 01aaea1afb x86: introduce initial apicid
store initial_apicid from early identify. it could be different from
phys_proc_id later.

also print it out in /proc/cpuinfo.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Ingo Molnar 459cce7267 x86: remove mach_reboot.h
all reboot details are handled in reboot.c and quirks are handled
via reboot_fixups_32.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Mathieu Desnoyers e587cadd8f x86: enhance DEBUG_RODATA support - alternatives
Fix a memcpy that should be a text_poke (in apply_alternatives).

Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
correctly and so the CPU HOTPLUG special case can be removed.

Add text_poke_early, for alternatives and paravirt boot-time and module load
time patching.

Changelog:

- Fix text_set and text_poke alignment check (mixed up bitwise AND and OR)
- Remove text_set
- Export add_nops, so it can be used by others.
- Document text_poke_early.
- Remove clflush, since it breaks some VIA architectures and is not strictly
  necessary.
- Add kerneldoc to text_poke and text_poke_early.
- Create a second vmap instead of using the WP bit to support Xen and VMI.
- Move local_irq disable within text_poke and text_poke_early to be able to
  be sleepable in these functions.
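
The first changelog item refers to an alignment check that confused bitwise AND
with OR. The check itself is not shown in this log, so the following is only a
sketch of that class of bug, using a hypothetical helper both_aligned() that
verifies an address and a length are both multiples of a power-of-two alignment:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: true when both addr and
 * len are multiples of align (align must be a power of two). */
static bool both_aligned(uintptr_t addr, size_t len, size_t align)
{
	/* OR-ing first keeps any low bit that is set in either value. */
	return ((addr | len) & (align - 1)) == 0;
}

/* The buggy variant: AND-ing first keeps only the bits set in both values,
 * so a misaligned addr is hidden whenever len happens to be aligned. */
static bool both_aligned_buggy(uintptr_t addr, size_t len, size_t align)
{
	return ((addr & len) & (align - 1)) == 0;
}
```

With addr = 0x1001, len = 8 and align = 4, the OR form rejects the pair while
the AND form wrongly accepts it.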

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: pageexec@freemail.hu
CC: H. Peter Anvin <hpa@zytor.com>
CC: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:58 +02:00
Ingo Molnar eee6dd1572 x86: move extern declaration to vdso.h
Before:
   total: 0 errors, 3 warnings, 685 lines checked
After:
   total: 0 errors, 1 warnings, 678 lines checked

No code changed:

arch/x86/kernel/signal_32.o:

   text	   data	    bss	    dec	    hex	filename
   5333	      0	      4	   5337	   14d9	signal_32.o.before
   5333	      0	      4	   5337	   14d9	signal_32.o.after

md5:
   c279e98012a2808e90cfa2a7787e42a4  signal_32.o.before.asm
   c279e98012a2808e90cfa2a7787e42a4  signal_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:57 +02:00
Joe Perches 16281a998d x86: include/asm-x86/mutex_32.h - use angle brackets for include
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:57 +02:00
Ingo Molnar ca9cda2f7b x86: add comments to processor.h
add comments to the FPU structures of processor.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:57 +02:00
Glauber Costa 91718e8d13 x86: unify setup_trampoline
setup_trampoline() looks very similar between architectures, and this
patch unifies them. The i386 version allocates bootmem memory, while
the x86_64 version uses a fixed address.

In this patch, we initialize the global trampoline_base to the x86_64 version,
and i386 allocation can later override it.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:57 +02:00
Glauber Costa 4206882939 x86: move trampoline arrays extern definition to smp.h
In here, they can serve both architectures

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa 69c18c15d3 x86: merge __cpu_disable and cpu_die
They are now equal, and are moved to a common file

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa 1dbb4726fa x86: move hotplug related extern definitions to smp.h
definitions that are inside CONFIG_HOTPLUG_CPU in
the arch-specific smp*.h files are moved to common
header

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa 1452207689 x86: make set_cpu_sibling_map nonstatic
And move its extern definition to smp.h, the common header

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa a355352b97 x86: move equal types to common file
move definitions that are now equal in type from
smpboot_{32,64}.c to smpboot.c

cpu_callin_map is put temporarily in smp_64.h (already
exists in smp_32.h), and will soon be merged.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa 5382e89670 x86: adjust types in smpcommon_32.c
so they can have the same type as x86_64

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa fe6762030c x86: remove cpu_llc_id from processor.h
it is already defined in smp.h

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa 377d698426 x86: unify smp_send_stop
function definition is moved to common header.
x86_64 version is now called native_smp_send_stop

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:56 +02:00
Glauber Costa 3d3f487c58 x86: provide hlt_works function.
In x86_64, hlt always works. In i386, we'll query the cpuinfo associated
with this cpu

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:55 +02:00
Glauber Costa 68a1c3f8cd x86: move prefill_possible_map to common file
this patch moves prefill_possible_map() to smpboot.c
Right now it is x86_64-specific, but nothing intrinsically
prevents it from being used by i386

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa 93b016f8f3 x86: move disabled_cpus to common header
disabled_cpus is (up to now) an x86_64-only construction.
But its extern declaration can be moved to the common header anyway

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa c559764923 x86: unify smp_cpus_done
definition is moved to common header. x86_64 version is now called
native_smp_cpus_done

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa 7557da6720 x86: unify smp_prepare_cpus
definition is moved to common header. x86_64 version is now called
native_smp_prepare_cpus

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa 1e3fac83da x86: unify prepare_boot_cpu
definition is moved to common header. x86_64 version is now called
native_prepare_boot_cpu

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa 71d195492a x86: unify __cpu_up.
function definition is moved to common header. x86_64 version
is now called native_cpu_up

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa 64b1a21e09 x86: unify smp_call_function_mask
definition is moved to common header, x86_64 function name
now is native_smp_call_function_mask

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa 8678969e60 x86: merge smp_send_reschedule
function definition is moved to common header, x86_64 version is now called
native_smp_send_reschedule

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa c76cb36846 x86: move smp_ops extern declaration to common header
the smp_ops symbol is temporarily defined in smp_64.c, but it will soon
be unified

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:53 +02:00
Glauber Costa 16694024d6 x86: define smp_ops in common header
x86_64 will benefit from it
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Glauber Costa 53ebef4961 x86: merge extern variables definitions
move extern definitions that are the same between smp_{32,64}.h
to smp.h

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Glauber Costa 639acb16e6 x86: merge extern function definitions
move extern function definitions that are the same between smp_{32,64}.h
to smp.h

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Glauber Costa c27cfeffad x86: commonize smp.h
this is the first step of integrating smp.h between x86_64
and i386

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Ingo Molnar 8b6451fe5c x86: fix switch_to() clobbers
Liu Pingfan noticed that switch_to() clobbers more registers than its
asm constraints specify.

We get away with this due to luck mostly - schedule()
by its nature only has 'local' state which gets reloaded
automatically. Fix it nevertheless, we could hit this anytime.

it turns out that with the extra constraints gcc manages to make
schedule() even more compact:

   text	   data	    bss	    dec	    hex	filename
  28626	    684	   2640	  31950	   7cce	sched.o.before
  28613	    684	   2640	  31937	   7cc1	sched.o.after

Reported-by: Liu Pingfan <kernelfans@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Ingo Molnar 23b55bd9f3 x86: clean up switch_to()
Make the code more readable and more hackable:

 - use symbolic asm parameters
 - use readable indentation
 - add comments that explain the details

No code changed:

kernel/sched.o:

   text	   data	    bss	    dec	    hex	filename
  28626	    684	   2640	  31950	   7cce	sched.o.before
  28626	    684	   2640	  31950	   7cce	sched.o.after

md5:
   2823d406c18b781975cdb2e7cfea0059  sched.o.before.asm
   2823d406c18b781975cdb2e7cfea0059  sched.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Pavel Machek 0d7a1819e9 x86: wmb() confusion in system.h
Comment says wmb is a nop, but it is implemented as lock addl
below... Should it be compiled to nop if we know we are running on
"good" Intel cpu?

At least remove confusing comment for now.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Ingo Molnar 9fc34113f6 x86: debug pmd_bad()
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Ingo Molnar 40869cd038 x86: redo cded932b75
redo commit cded932b75.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:52 +02:00
Ingo Molnar 78a9909aab x86, tracing: add notrace to asm-x86/linkage.h
notrace signals that a function should not be traced. Most of the
time this is used by tracers to annotate code that cannot be
traced - it's in a volatile state (such as in user vdso context
or NMI context) or it's in the tracer internals.
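
A userspace sketch of the idea, assuming notrace maps to GCC's
no_instrument_function attribute (the exact definition is not shown in this
log). The function name below is hypothetical:

```c
/* Assumption: notrace expands to GCC's no_instrument_function attribute,
 * which excludes a function from -finstrument-functions entry/exit hooks. */
#define notrace __attribute__((no_instrument_function))

/* Hypothetical tracer-internal helper: it must not be instrumented itself,
 * otherwise the tracing hook would recurse into it. */
static notrace unsigned long tracer_read_counter(const unsigned long *counter)
{
	return *counter;
}
```

The annotation changes nothing about what the function computes; it only tells
the compiler not to emit the tracing hooks around it.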

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:51 +02:00
Ingo Molnar 0f8d2b926d x86: clean up cpu capabilities accesses
introduce test_cpu_cap() for raw access to the real CPU
capabilities as they are present in x86_capability.

(cpu_has() will shortcut certain tests during build-time)
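
A minimal sketch of what a raw capability lookup of this kind could look like,
assuming x86_capability is an array of 32-bit words indexed by feature bit. The
struct, the word count, and the _sketch function names are illustrative, not
taken from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define NCAPINTS_SKETCH 8 /* assumed number of 32-bit capability words */

/* Hypothetical stand-in for the per-CPU structure holding x86_capability. */
struct cpu_caps {
	uint32_t x86_capability[NCAPINTS_SKETCH];
};

/* Raw lookup: feature bit N lives in word N / 32 at position N % 32.
 * No build-time shortcut is applied; the array is always consulted. */
static bool test_cpu_cap_sketch(const struct cpu_caps *c, unsigned int bit)
{
	return (c->x86_capability[bit / 32] >> (bit % 32)) & 1u;
}

/* Companion setter, included only so the lookup can be exercised. */
static void set_cpu_cap_sketch(struct cpu_caps *c, unsigned int bit)
{
	c->x86_capability[bit / 32] |= UINT32_C(1) << (bit % 32);
}
```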

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:50 +02:00
Yinghai Lu f8fffa4583 x86: apic_is_clustered_box for vsmp
a quad-core 8-socket system will have apic id lifting: the apic id range could
be [4, 0x23], and apic_is_clustered_box will think that it needs three clusters,
which is larger than 2. So it is treated as a clustered_box.
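
The cluster arithmetic can be sketched as follows. This assumes a cluster is 16
consecutive APIC IDs (cluster number = APIC ID >> 4); count_clusters() is a
hypothetical helper for illustration, not the kernel function:

```c
#include <stdint.h>

/* Assumption for illustration: a cluster is 16 consecutive APIC IDs,
 * so the cluster number is the APIC ID shifted right by 4. */
static unsigned int count_clusters(unsigned int first_id, unsigned int last_id)
{
	uint32_t seen = 0;	/* one bit per cluster; 8-bit IDs give at most 16 */
	unsigned int id, n = 0;

	for (id = first_id; id <= last_id && id <= 0xff; id++)
		seen |= UINT32_C(1) << (id >> 4);

	for (; seen; seen &= seen - 1)	/* clear the lowest set bit each pass */
		n++;
	return n;
}
```

Under that assumption, the 32 IDs in [4, 0x23] touch clusters 0, 1 and 2, while
the same 32 CPUs numbered [0, 0x1f] would touch only two.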

and will get:

   Marking TSC unstable due to TSCs unsynchronized

even if the CPUs have X86_FEATURE_CONSTANT_TSC set.

this quick fix will check if the cpu is from AMD.

but vsmp still needs that checking...

this patch is a fix to make sure that vsmp does not skip that check.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:50 +02:00
Yinghai Lu 3def3d6ddf x86: clean up e820_reserve_resources on 64-bit
e820_reserve_resources could use insert_resource instead of request_resource.
Also move code_resource, data_resource, bss_resource, and crashk_res
out of e820_reserve_resources.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:49 +02:00
Ingo Molnar 513ad84bf6 x86: de-macro start_thread()
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:49 +02:00
Ingo Molnar 4d46a89e7c x86: clean up include/asm-x86/processor.h
basic style cleanup to flush out years of neglect:

 - consistent indentation
 - whitespace fixes
 - consistent comments

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:49 +02:00
David P. Reed bc0a733fac x86: define outb_pic and inb_pic to stop using outb_p and inb_p
The delay between io port accesses to the PIC is now defined using outb_pic
and inb_pic.  This fix provides the next step, using udelay(2) to define the
*PIC specific* timing requirements, rather than on bus-oriented timing, which
is not well calibrated.

Again, the primary reason for fixing this is to use proper delay strategy,
and in particular to fix crashes that can result from using port 80 writes
on machines that have resources on port 80, such as the ENE chips used by Quanta
in laptops it designs and sells to, e.g., HP.

Signed-off-by: David P. Reed <dpreed@reed.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:48 +02:00
Glauber Costa 2785c8d052 x86: call vsmp_init explicitly
It becomes too early for ioremap, so we use early_ioremap

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 17:40:47 +02:00
Yinghai Lu 04adf11435 x86: remove never used nodenumber in pda
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 17:40:47 +02:00
Harvey Harrison 9902a702c7 x86: make X86_32 pt_regs members unsigned long
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 17:40:45 +02:00
Harvey Harrison 92bc205685 x86: change most X86_32 pt_regs members to unsigned long
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 17:40:45 +02:00
Yinghai Lu 48c508b364 x86: clean up find_e820_area(), 64-bit
Change size to unsigned long, because the caller and user all use unsigned long.
Also make bad_addr take an alignment parameter.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 17:40:45 +02:00
Ingo Molnar 00d1c5e057 x86: add gbpages switches
These new controls toggle experimental support for a new CPU feature,
the straightforward extension of largepages from the pmd level to the
pud level, which allows 1GB (kernel) TLBs instead of 2MB TLBs.
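
The two sizes follow from the x86-64 page-table geometry: 4 KB base pages and 9
index bits per table level. A short sketch of the arithmetic (the _SKETCH names
and entries_needed() are illustrative, not kernel symbols):

```c
#include <stdint.h>

/* x86-64 paging constants: 4 KB base pages, 9 index bits per table level. */
#define PAGE_SHIFT_SKETCH 12
#define BITS_PER_LEVEL    9

#define PMD_SHIFT_SKETCH (PAGE_SHIFT_SKETCH + BITS_PER_LEVEL)	/* 21 */
#define PUD_SHIFT_SKETCH (PMD_SHIFT_SKETCH + BITS_PER_LEVEL)	/* 30 */

/* Bytes mapped by a single large-page entry at each level. */
static const uint64_t pmd_page_size = UINT64_C(1) << PMD_SHIFT_SKETCH; /* 2 MB */
static const uint64_t pud_page_size = UINT64_C(1) << PUD_SHIFT_SKETCH; /* 1 GB */

/* Number of large-page entries needed to map `bytes` of memory. */
static uint64_t entries_needed(uint64_t bytes, uint64_t page_size)
{
	return (bytes + page_size - 1) / page_size;
}
```

Mapping 4 GB takes 2048 entries at the pmd level but only 4 at the pud level,
which is where the TLB savings come from.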

Turn it off by default, as this code has not been tested well enough yet.

Either the CONFIG_DIRECT_GBPAGES=y .config option or gbpages on the
boot line can be used to enable it. If enabled in the .config, then the
nogbpages boot option disables it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 17:40:45 +02:00
Ingo Molnar 85eb69a16a x86: increase the kernel text limit to 512 MB
people sometimes do crazy stuff like building really large static
arrays into their kernels or building allyesconfig kernels. Give
more space to the kernel and push modules up a bit: kernel has
512 MB and modules have 1.5 GB.

Should be enough for a few years ;-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 17:40:45 +02:00
Matthew Wilcox 714493cd54 Improve semaphore documentation
Move documentation from semaphore.h to semaphore.c as requested by
Andrew Morton.  Also reformat to kernel-doc style and add some more
notes about the implementation.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:43:01 -04:00
Matthew Wilcox b17170b2fa Simplify semaphore implementation
By removing the negative values of 'count' and relying on the wait_list to
indicate whether we have any waiters, we can simplify the implementation
by removing the protection against an unlikely race condition.  Thanks to
David Howells for his suggestions.
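
The idea can be modelled in a few lines. This is a single-threaded sketch under
stated assumptions: all names are hypothetical, there is no locking, and waiters
are only queued rather than put to sleep:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model; a real implementation would guard
 * this state with a lock and actually put waiters to sleep. */
struct waiter {
	struct waiter *next;
	bool up;		/* set when the semaphore was handed over */
};

struct sem_model {
	unsigned int count;	/* never negative: free slots only */
	struct waiter *head;	/* FIFO of waiters; empty means nobody waits */
	struct waiter **tail;
};

static void sem_model_init(struct sem_model *s, unsigned int count)
{
	s->count = count;
	s->head = NULL;
	s->tail = &s->head;
}

/* Take the semaphore if a slot is free; otherwise queue the waiter. */
static bool sem_model_down(struct sem_model *s, struct waiter *w)
{
	if (s->count > 0) {
		s->count--;
		return true;
	}
	w->next = NULL;
	w->up = false;
	*s->tail = w;
	s->tail = &w->next;
	return false;		/* caller would sleep until w->up is set */
}

/* Release: hand over directly to the first waiter, else bump count. */
static void sem_model_up(struct sem_model *s)
{
	struct waiter *w = s->head;

	if (!w) {
		s->count++;
		return;
	}
	s->head = w->next;
	if (!s->head)
		s->tail = &s->head;
	w->up = true;
}
```

Because the wait list alone says whether anyone is waiting, count never has to
go negative to encode the number of sleepers.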

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:54 -04:00
Matthew Wilcox f1241c87a1 Add down_timeout and change ACPI to use it
ACPI currently emulates a timeout for semaphores with calls to
down_trylock and sleep.  This produces horrible behaviour in terms of
fairness and excessive wakeups.  Now that we have a unified semaphore
implementation, adding a real down_timeout is almost trivial.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:46 -04:00
Matthew Wilcox f06d968658 Introduce down_killable()
down_killable() is the functional counterpart of mutex_lock_killable.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:40 -04:00
Matthew Wilcox 64ac24e738 Generic semaphore implementation
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility.  Thanks to Peter Zijlstra for fixing the lockdep
warning.  Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 10:42:34 -04:00
Matthew Wilcox 8b91de2e58 Fix quota.h includes
quota.h currently relies on asm/semaphore.h (through some chain; it
doesn't actually include semaphore.h itself) to include wait.h.  As
well as being bad practice to rely on an implicit include, subsequent
patches will break this.  While I'm in this file, add atomic.h and
list.h, and sort the list of includes.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:14 -04:00
Oleg Nesterov 3f3eafc921 locking: remove unused double_spin_lock()
double_spin_lock() has no callers, and it can't be used without additional
lockdep annotations, remove it.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:31 +02:00
Oleg Nesterov 8e60e05fdc hrtimers: simplify lockdep handling
In order to avoid the false positive from lockdep, each per-cpu base->lock has
the separate lock class and migrate_hrtimers() uses double_spin_lock().

This is overcomplicated: except for migrate_hrtimers() we never take 2 locks
at once, and migrate_hrtimers() can use spin_lock_nested().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:31 +02:00
Thomas Gleixner a332d86d3c hrtimer: add nanosleep specific restart_block member
The back and forth typecasting of restart_block->args is horrible. We
added a separate union member for futex already. Do the same for
nanosleep.
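
A sketch of the pattern, with a hypothetical layout: the struct name and the
nanosleep field names below are assumptions for illustration and are not copied
from the patch. It relies on C11 anonymous unions:

```c
#include <stdint.h>

/* Hypothetical layout for illustration; field names are assumptions. */
struct restart_block_sketch {
	long (*fn)(struct restart_block_sketch *);
	union {
		/* Generic slots: every user has to cast to and from these. */
		unsigned long args[4];
		/* Typed view for nanosleep: no casts at the call sites. */
		struct {
			int clock_index;
			void *remaining;	/* where to report the unslept time */
			uint64_t expires;	/* absolute expiry, in nanoseconds */
		} nanosleep;
	};
};

/* Hypothetical restart handler: fields are read with their real types. */
static long nanosleep_restart_sketch(struct restart_block_sketch *rb)
{
	return rb->nanosleep.expires > 0 ? 0 : -1;
}
```

The union keeps the block the same size for every user while letting each one
name and type its own fields.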

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:30 +02:00
Martin Schwidefsky ca68305bf3 [S390] Remove code duplication from monreader / dcssblk.
Move the function that prints the segment warning messages found in the
monreader driver and the dcssblk driver to the extmem base code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:07 +02:00
Christian Borntraeger 9e74a6b898 [S390] kernel: show last breaking-event-address on oops
Newer s390 models have a breaking-event-address-recording register.
Each time an instruction causes a break in the sequential instruction
execution, the address is saved in that hardware register. On a program
interrupt the address is copied to the lowcore address 272-279, which
makes it software accessible.

This patch changes the program check handler and the stack overflow
checker to copy the value into the pt_regs argument.
The oops output is enhanced to show the last known breaking address.
It might give additional information if the stack trace is corrupted.

The feature is only available on 64 bit.

The new oops output looks like:

[---------snip----------]
Modules linked in: vmcp sunrpc qeth_l2 dm_mod qeth ccwgroup
CPU: 2 Not tainted 2.6.24zlive-host #8
Process modprobe (pid: 4788, task: 00000000bf3d8718, ksp: 00000000b2b0b8e0)
Krnl PSW : 0704200180000000 000003e000020028 (vmcp_init+0x28/0xe4 [vmcp])
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 0000000004000002 000003e000020000 0000000000000000 0000000000000001
           000000000015734c ffffffffffffffff 000003e0000b3b00 0000000000000000
           000003e00007ca30 00000000b5bb5d40 00000000b5bb5800 000003e0000b3b00
           000003e0000a2000 00000000003ecf50 00000000b2b0bd50 00000000b2b0bcb0
Krnl Code: 000003e000020018: c0c000040ff4       larl    %r12,3e0000a2000
           000003e00002001e: e3e0f0000024       stg     %r14,0(%r15)
           000003e000020024: a7f40001           brc     15,3e000020026
          >000003e000020028: e310c0100004       lg      %r1,16(%r12)
           000003e00002002e: c020000413dc       larl    %r2,3e0000a27e6
           000003e000020034: c0a00004aee6       larl    %r10,3e0000b5e00
           000003e00002003a: a7490001           lghi    %r4,1
           000003e00002003e: a75900f0           lghi    %r5,240
Call Trace:
([<000000000014b300>] blocking_notifier_call_chain+0x2c/0x40)
 [<000000000015735c>] sys_init_module+0x19d8/0x1b08
 [<0000000000110afc>] sysc_noemu+0x10/0x16
 [<000002000011cda2>] 0x2000011cda2
Last Breaking-Event-Address:
 [<000003e000020024>] vmcp_init+0x24/0xe4 [vmcp]
[---------snip----------]

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:07 +02:00
Heiko Carstens 1a5debaaac [S390] lowcore: Change type of lowcores softirq_pending to __u32.
As noted by akpm:

> kernel/time/tick-sched.c: In function 'tick_nohz_stop_sched_tick':
> kernel/time/tick-sched.c:229: warning: format '%02x' expects type 'unsigned int', but argument 2 has type '__u64'
>
> I don't think the architecture's local_softirq_pending() should return u64.
> This is the sort of thing which should be consistent across architectures.
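
The mismatch behind the quoted warning can be sketched in userspace C. This
is only an illustration: softirq_pending_model, local_softirq_pending_model()
and format_pending() are hypothetical stand-ins, not the s390 lowcore code.
With a 32-bit field the value matches what '%02x' expects.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the lowcore field; 32 bits wide after the change. */
static uint32_t softirq_pending_model = 0x0a;

/* Hypothetical stand-in for the architecture's local_softirq_pending(). */
static uint32_t local_softirq_pending_model(void)
{
    return softirq_pending_model;
}

/* Format the pending mask the way the quoted printk does, using %02x. */
static int format_pending(char *buf, size_t len)
{
    /* The cast makes the argument exactly 'unsigned int', as %x requires. */
    return snprintf(buf, len, "%02x",
                    (unsigned int)local_softirq_pending_model());
}
```

Had the field stayed 64 bits wide, the same format string would need '%llx'
or an explicit narrowing cast to avoid the warning.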

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:07 +02:00
Heiko Carstens a806170e29 [S390] Fix a lot of sparse warnings.
The most notable part of this commit is the new local header file entry.h,
which contains the declarations of all functions that are only called from
asm code or are arch internal. That way we can avoid extern declarations
in C files.
This is more or less the same as what was done for sparc64.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:06 +02:00
Heiko Carstens 5a62b19219 [S390] Convert s390 to GENERIC_CLOCKEVENTS.
This way we get rid of s390's NO_IDLE_HZ and use the generic dynticks
variant instead. In addition we get high resolution timers for free.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:05 +02:00
Russell King d7b906897e [S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
> Generic code is not supposed to include irq.h. Replace this include
> by linux/hardirq.h instead and add/replace an include of linux/irq.h
> in asm header files where necessary.
> This change should only matter for architectures that make use of
> GENERIC_CLOCKEVENTS.
> Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
>
> I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
> This patch fixes also build breakages caused by the include replacement in
> tick-common.h.

I generally dislike adding optional linux/* includes in asm/* includes -
I'm nervous about this causing include loops.

However, there's a separate point to be discussed here.

That is, what interfaces are expected of every architecture in the kernel.
If generic code wants to be able to set the affinity of interrupts, then
that needs to become part of the interfaces listed in linux/interrupt.h
rather than linux/irq.h.

So what I suggest is this approach instead (against Linus' tree of a
couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
rarely used include since not much touches the stacked parent context
registers.)

Build tested on ARM PXA family kernels and ARM's Realview platform
kernels which both use genirq.

[ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:05 +02:00
Heiko Carstens 43ca5c3a1c [S390] Convert monitor calls to function calls.
Remove the program check generating monitor calls and use function
calls instead. There is no real advantage in using monitor calls,
but they do make debugging harder, because of all the program checks
they generate.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:05 +02:00
Michael Holzheu 9637c3f318 [S390] Add debug_register_mode() function to debug feature API
The new function supports setting of permissions for the debugfs files
created by the debug feature. In addition to that, the function provides
uid and gid as parameters for future use. Currently only root is allowed
for uid and gid.

Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:03 +02:00
Jan Glauber c0015f91d8 [S390] switch sched_clock to store-clock-extended.
Add get_clock_xt to read an 8 byte clock value using store clock
extended (STCKE) and use get_clock_xt for sched_clock. STCKE should
be faster than STCK on newer machines.

Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:02 +02:00
Heiko Carstens c10fde0d9e [S390] Vertical cpu management.
If vertical cpu polarization is active then the hypervisor will
dispatch certain cpus for a longer time than other cpus for maximum
performance. For example if a guest would have three virtual cpus,
each of them with a share of 33 percent, then in case of vertical
cpu polarization all of the processing time would be combined to a
single cpu which would run all the time, while the other two cpus
would get nearly no cpu time.

There are three different types of vertical cpus: high, medium and
low. Low cpus hardly get any real cpu time, while high cpus get a
full real cpu. Medium cpus get something in between.

In order to switch between the two possible modes (default is
horizontal) a 0 for horizontal polarization or a 1 for vertical
polarization must be written to the dispatching sysfs attribute:

/sys/devices/system/cpu/dispatching

The polarization of each single cpu can be figured out by the
polarization sysfs attribute of each cpu:

/sys/devices/system/cpu/cpuX/polarization

Possible values are horizontal, vertical:high, vertical:medium,
vertical:low or unknown.
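
A userspace tool reading the polarization attribute might classify its
contents as in the following sketch. The enum and parse_polarization() are
hypothetical names for illustration; only the five strings come from the
description above. A trailing newline is tolerated on the assumption that
the attribute, like most sysfs attributes, ends its value with one.

```c
#include <string.h>

/* Hypothetical classification of the per-cpu polarization attribute. */
enum polarization {
    POL_HORIZONTAL,
    POL_VERTICAL_HIGH,
    POL_VERTICAL_MEDIUM,
    POL_VERTICAL_LOW,
    POL_UNKNOWN
};

/* Map the attribute text to an enum value; anything unrecognized is unknown. */
static enum polarization parse_polarization(const char *s)
{
    static const struct {
        const char *name;
        enum polarization pol;
    } table[] = {
        { "horizontal",      POL_HORIZONTAL },
        { "vertical:high",   POL_VERTICAL_HIGH },
        { "vertical:medium", POL_VERTICAL_MEDIUM },
        { "vertical:low",    POL_VERTICAL_LOW },
    };
    size_t n = strcspn(s, "\n");   /* length without a trailing newline */
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strlen(table[i].name) == n && strncmp(s, table[i].name, n) == 0)
            return table[i].pol;
    }
    return POL_UNKNOWN;
}
```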

When switching polarization the polarization attribute may contain
the value unknown until the configuration change is done and the
kernel has figured out the new polarization of each cpu.

Note that running a system with different types of vertical cpus may
result in significant performance regressions. If possible only one
type of vertical cpus should be used. All other cpus should be
offlined.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:01 +02:00
Heiko Carstens dbd70fb499 [S390] cpu topology support for s390.
Add s390 backend so we can give the scheduler some hints about the
cpu topology.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:01 +02:00
Heiko Carstens 7b758389a2 [S390] Export stfle.
Make stfle visible so other code can call this.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:01 +02:00
Martin Schwidefsky cbce70e687 [S390] Add new fields for System z10 to /proc/sysinfo
Add permanent and temporary model capacity and the corresponding
capacity value fields for the three capacity identifiers to the
output of /proc/sysinfo.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:01 +02:00
Christian Borntraeger aa24f7f08b [S390] KVM preparation: split sysinfo definitions for kvm use
drivers/s390/sysinfo.c uses the store system information instruction to query
the system about information of the machine, the LPAR and additional
hypervisors. KVM has to implement the host part for this instruction.

To avoid code duplication, this patch splits the common definitions from
sysinfo.c into a separate header file include/asm-s390/sysinfo.h for KVM use.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:00 +02:00
Martin Schwidefsky 374b8f45f1 [S390] allnoconfig build error.
Fix the following link error with allnoconfig:

vmem.c:(.text+0x175c): undefined reference to `smp_ptlb_all'
vmem.c:(.text+0x1b24): undefined reference to `smp_ptlb_all'
fork.c:(.text+0x4190): undefined reference to `smp_ptlb_all'
: undefined reference to `smp_ptlb_all'
: undefined reference to `smp_ptlb_all'
mm/built-in.o:: more undefined references to `smp_ptlb_all' follow
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [sub-make] Error 2

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:46:56 +02:00
Vladimir Sokolovsky bbf8eed1a0 IB/mlx4: Add support for resizing CQs
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:33 -07:00
Eli Cohen 3fdcb97f0b IB/mlx4: Add support for modifying CQ moderation parameters
Signed-off-by: Eli Cohen <eli@mellnaox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:33 -07:00
Eli Cohen 2dd5716227 IB/core: Add support for modify CQ
Add support for modifying CQ parameters for controlling event
generation moderation.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:33 -07:00
Roland Dreier 0f39cf3d54 IB/core: Add support for "send with invalidate" work requests
Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a
"send with invalidate" work request as defined in the iWARP verbs and
the InfiniBand base memory management extensions.  Also put "imm_data"
and a new "invalidate_rkey" member in a new "ex" union in struct
ib_send_wr. The invalidate_rkey member can be used to pass in an
R_Key/STag to be invalidated.  Add this new union to struct
ib_uverbs_send_wr.  Add code to copy the invalidate_rkey field in
ib_uverbs_post_send().

Fix up low-level drivers to deal with the change to struct ib_send_wr,
and just remove the imm_data initialization from net/sunrpc/xprtrdma/,
since that code never does any send with immediate operations.

Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since
the iWARP drivers currently in the tree set the bit.  The amso1100
driver at least will silently fail to honor the IB_SEND_INVALIDATE bit
if passed in as part of userspace send requests (since it does not
implement kernel bypass work request queueing).  Remove the flag from
all existing drivers that set it until we know which ones are OK.

The value chosen for the new flag is not consecutive to avoid clashing
with flags defined in the XRC patches, which are not merged yet but
which are already in use and are likely to be merged soon.

This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:32 -07:00
Eli Cohen b832be1e40 IB/mlx4: Add IPoIB LSO support
Add TSO support to the mlx4_ib driver.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:27 -07:00
Eli Cohen c93570f23a IB/core: Add IPoIB UD LSO support
LSO (large send offload) allows the networking stack to pass SKBs with
data size larger than the MTU to the IPoIB driver and have the HCA HW
fragment the data to multiple MSS-sized packets.  Add a device
capability flag IB_DEVICE_UD_TSO for devices that can perform TCP
segmentation offload, a new send work request opcode IB_WR_LSO,
header, hlen and mss fields for the work request structure, and a new
IB_WC_LSO completion type.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:27 -07:00
Eli Cohen b846f25aa2 IB/core: Add creation flags to struct ib_qp_init_attr
Add a create_flags member to struct ib_qp_init_attr that will allow a
kernel verbs consumer to pass special flags when creating a QP.
Add a flag value for telling low-level drivers that a QP will be used
for IPoIB UD LSO.  The create_flags member will also be useful for XRC
and ehca low-latency QP support.

Since no create_flags handling is implemented yet, add code to all
low-level drivers to return -EINVAL if create_flags is non-zero.
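
The check described here can be sketched as follows. This is a simplified
userspace model, not driver code: struct qp_init_attr_model and
model_create_qp() are hypothetical names, and the real structure has many
more members.

```c
#include <errno.h>

/* Hypothetical, cut-down stand-in for struct ib_qp_init_attr. */
struct qp_init_attr_model {
    unsigned int create_flags;
};

/* Reject any creation flag until a driver actually implements it. */
static int model_create_qp(const struct qp_init_attr_model *attr)
{
    if (attr->create_flags)
        return -EINVAL;

    /* ... normal QP creation would continue here ... */
    return 0;
}
```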

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:27 -07:00
Eli Cohen 8ff095ec4b IB/mlx4: Add IPoIB checksum offload support
ConnectX devices support checksum generation and verification of TCP
and UDP packets for UD IPoIB messages.  This patch checks if the HCA
supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it
does.  It implements support for handling the IB_SEND_IP_CSUM send
flag and setting the csum_ok field in receive work completions.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Ali Ayub <ali@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:10 -07:00
Roland Dreier 37608eea86 mlx4_core: Fix confusion between mlx4_event and mlx4_dev_event enums
The struct mlx4_interface.event() method was supposed to get an enum
mlx4_dev_event, but the driver code was actually passing in the
hardware enum mlx4_event values.  Fix up the callers of
mlx4_dispatch_event() so that they pass in the right type of value,
and fix up the event method in mlx4_ib so that it can handle the enum
mlx4_dev_event values.

This eliminates the need for the subtype parameter to the event
method, so remove it.

This also fixes the sparse warning

    drivers/net/mlx4/intf.c:127:48: warning: mixing different enum types
    drivers/net/mlx4/intf.c:127:48:     int enum mlx4_event  versus
    drivers/net/mlx4/intf.c:127:48:     int enum mlx4_dev_event

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:08 -07:00
Roland Dreier b3d636b0d1 IB: Make struct ib_uobject.id a signed int
IDR IDs are signed, so struct ib_uobject.id should be signed.  This
avoids some sparse pointer signedness warnings.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:06 -07:00
Linus Torvalds c970d5a32a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  it821x: do not describe noraid parameter with its value
  Pb1200/DBAu1200: fix bad IDE resource size
  Au1200: IDE driver build fix
  Au1200: kill IDE driver function prototypes
  avr32 mustn't select HAVE_IDE
2008-04-16 18:58:37 -07:00
Andy Fleming c5e38a949b phy: Clean up header style
Multi-line comments weren't all CodingStyle compliant

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-16 20:09:35 -04:00
Andy Fleming 9d9326d3bc phy: Change mii_bus id field to a string
Having the id field be an int was making more complex bus topologies
excessively difficult.  For now, just convert it to a string, and
change all instances of "bus->id = val" to
snprintf(id, MII_BUS_ID_LEN, "%x", val).
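
The conversion amounts to the following sketch. struct mii_bus_model and
set_bus_id() are hypothetical, and the value given to MII_BUS_ID_LEN here is
an assumption made so that the example compiles on its own.

```c
#include <stdio.h>

#define MII_BUS_ID_LEN 17   /* assumed value, for illustration only */

/* Hypothetical, cut-down stand-in for struct mii_bus. */
struct mii_bus_model {
    char id[MII_BUS_ID_LEN];
};

/* Replacement for the old "bus->id = val" assignment. */
static void set_bus_id(struct mii_bus_model *bus, unsigned int val)
{
    /* snprintf always NUL-terminates within MII_BUS_ID_LEN bytes. */
    snprintf(bus->id, MII_BUS_ID_LEN, "%x", val);
}
```

Note that "%x" prints the value in hexadecimal, so a former integer id of
31 becomes the string "1f".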

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-16 20:09:35 -04:00
Sergei Shtylyov b4dcaea36b Pb1200/DBAu1200: fix bad IDE resource size
The header files for the Pb1200/DBAu1200 boards have the wrong definition for
the IDE interface's decoded range length -- it should be 512 bytes according
to what the IDE driver does.  In addition, the IDE platform device claims 1
byte too many for its memory resource -- fix the platform code and the IDE
driver accordingly.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-17 01:14:33 +02:00
Sergei Shtylyov 09a77441f2 Au1200: kill IDE driver function prototypes
Fix these warnings emitted when compiling drivers/ide/mips/au1xxx-ide.c:

include/asm/mach-au1x00/au1xxx_ide.h:137: warning: 'auide_tune_drive' declared 
`static' but never defined
include/asm/mach-au1x00/au1xxx_ide.h:138: warning: 'auide_tune_chipset' declared
 `static' but never defined

by wiping out the whole "function prototyping" section from the header file
<asm-mips/mach-au1x00/au1xxx_ide.h> as it mostly declared functions that are
already dead in the IDE driver; move the only useful prototype into the driver.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-17 01:14:33 +02:00
Reinette Chatre d18ef29f34 mac80211: no BSS changes to driver from beacons processed during scanning
There is no need to send BSS changes to driver from beacons processed
during scanning. We are more interested in beacons from an AP with which
we are associated - these will still be used to send updates to driver as
the beacons are received without scanning.

This change removes the requirement that bss_info_changed needs to be atomic.
The beacons received during scanning are processed from a tasklet, but if we
do not call bss_info_changed for these beacons there is no need for it to be
atomic. This function (bss_info_changed) is called either from workqueue or
ioctl in all other instances.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-16 15:59:56 -04:00
Linus Torvalds 6af74b03e0 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: update git url for blktrace
  io context: increment task attachment count in ioc_task_link()
2008-04-16 07:45:45 -07:00
Linus Torvalds b4b8f57965 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [TCP]: Add return value indication to tcp_prune_ofo_queue().
  PS3: gelic: fix the oops on the broken IE returned from the hypervisor
  b43legacy: fix DMA mapping leakage
  mac80211: remove message on receiving unexpected unencrypted frames
  Update rt2x00 MAINTAINERS entry
  Add rfkill to MAINTAINERS file
  rfkill: Fix device type check when toggling states
  b43legacy: Fix usage of struct device used for DMAing
  ssb: Fix usage of struct device used for DMAing
  MAINTAINERS: move to generic repository for iwlwifi
  b43legacy: fix initvals loading on bcm4303
  rtl8187: Add missing priv->vif assignments
  netconsole: only set CON_PRINTBUFFER if the user specifies a netconsole
  [CAN]: Update documentation of struct sockaddr_can
  MAINTAINERS: isdn4linux@listserv.isdn4linux.de is subscribers-only
  [TCP]: Fix never pruned tcp out-of-order queue.
  [NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop
2008-04-16 07:44:27 -07:00
John Heffner dd9e0dda66 [TCP]: Increase the max_burst threshold from 3 to tp->reordering.
This change is necessary to allow cwnd to grow during persistent
reordering.  Cwnd moderation is applied when in the disorder state
and an ack that fills the hole comes in.  If the hole was greater
than 3 packets, but less than tp->reordering, cwnd will shrink when
it should not have.
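
The effect can be shown with a simplified model, assuming that cwnd
moderation clamps cwnd to the number of packets in flight plus max_burst.
The structure and function names below are hypothetical and only mirror the
quantities named above.

```c
/* Hypothetical, simplified model of the sender state. */
struct tcp_model {
    unsigned int snd_cwnd;           /* congestion window, in packets */
    unsigned int packets_in_flight;  /* packets not yet acknowledged */
    unsigned int reordering;         /* measured reordering metric */
};

/* Old threshold: a fixed burst of 3 packets. */
static unsigned int max_burst_old(const struct tcp_model *tp)
{
    (void)tp;
    return 3;
}

/* New threshold: follow the measured reordering. */
static unsigned int max_burst_new(const struct tcp_model *tp)
{
    return tp->reordering;
}

/* Assumed moderation rule: cwnd = min(cwnd, in_flight + max_burst). */
static unsigned int moderate_cwnd(const struct tcp_model *tp,
                                  unsigned int max_burst)
{
    unsigned int limit = tp->packets_in_flight + max_burst;

    return tp->snd_cwnd < limit ? tp->snd_cwnd : limit;
}
```

With cwnd at 20, an ack filling a 5 packet hole leaves 15 packets in flight.
If tp->reordering is 8, the old threshold clamps cwnd to 18, while the new
one leaves it at 20.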

Signed-off-by: John Heffner <jheffner@napa.(none)>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 02:29:56 -07:00
Denis V. Lunev f3005d7f4a [NETNS]: Add netns refcnt debug for network devices.
dev_set_net is called for
- just allocated devices
- devices moving from one namespace to another
release_net has proper check inside to distinguish these cases.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 02:02:18 -07:00
Denis V. Lunev 3661a91083 [NETNS]: Add netns refcnt debug to fib rules.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 02:01:56 -07:00
Denis V. Lunev 65a18ec58e [NETNS]: Add netns refcnt debug for kernel sockets.
Protocol control sockets and netlink kernel sockets should not prevent the
namespace stop request. They are initialized and disposed in a special way by
sk_change_net/sk_release_kernel.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 01:59:46 -07:00
Denis V. Lunev 5d1e4468a7 [NETNS]: Make netns refcounting debug like a socket one.
Make release_net/hold_net a no-op for performance-hungry people. This is
debugging code and should be used in debug mode only.

Add check for net != NULL in hold/release calls. This will be required
later on.

[ Added minor simplifications suggested by Brian Haley. -DaveM ]

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 01:58:04 -07:00
Pavel Emelyanov a9fde26078 [VLAN]: Tag vlan_group_device with net device, not ifindex.
Currently vlan group is searched using one key - the ifindex.
We'll have to lookup the vlan_group by two keys - ifindex and
net. Turning the vlan_group lookup key to struct net_device
pointer will make this process easier.

Besides, this will eliminate one more place in the networking,
that assumes that indexes are unique in the kernel.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:48:04 -07:00
Pavel Emelyanov 669f87baab [RTNL]: Introduce the rtnl_kill_links helper.
This one is responsible for calling ->dellink on each net
device found in net, to help with the vlan net_exit hook in the
near future.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:46:52 -07:00
Krzysztof Helt 5f1a3f2ac4 acpi thermal trip points increased to 12
The THERMAL_MAX_TRIPS value is set to 10.  It is too few for the Compaq AP550
machine which has 12 trip points.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
Ben Dooks d1e7780638 spi: spi_s3c24xx must initialize num_chipselect
The SPI core now expects num_chipselect to be set correctly, because of added
checks on the chip being selected before a transfer is allowed.  This patch
adds a num_cs field to the platform data which needs to be set correctly
before adding the SPI platform device.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
Jan Kara 335e92e8a5 vfs: fix possible deadlock in ext2, ext3, ext4 when using xattrs
mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL.  But
filesystems are calling this function while holding xattr_sem so possible
recursion into the fs violates locking ordering of xattr_sem and transaction
start / i_mutex for ext2-4.  Change mb_cache_entry_alloc() so that filesystems
can specify desired gfp mask and use GFP_NOFS from all of them.

Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Dave Jones <davej@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
WANG Cong 1f4deba80a uml: compile error fix
This patch fixes this error:

In file included from /home/wangcong/projects/linux-2.6/arch/um/kernel/smp.c:9:
include2/asm/tlb.h: In function `tlb_remove_page':
include2/asm/tlb.h:101: error: implicit declaration of function `page_cache_release'

And since including <linux/pagemap.h> in <linux/swap.h> will break sparc,
we add this #include in uml's own header.

Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:40 -07:00
Michael Buesch 4ac58469f1 ssb: Fix usage of struct device used for DMAing
This fixes DMA on architectures where DMA is nontrivial, like PPC64.
We must use the host-device's (PCI) struct device for any DMA
operation instead of the SSB device. For this we add a new
struct device pointer to the SSB device structure that will always
point to the right device for DMAing.

Without this patch b43 and b44 drivers won't work on complex-DMA
architectures, that for example need dev->archdata for DMA operations.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-15 15:04:35 -04:00
Pavel Emelyanov dec827d174 [NETNS]: The generic per-net pointers.
Add an elastic array of void * pointers to the struct net.
The access rules are simple:

 1. register the ops with register_pernet_gen_device to get
    the id of your private pointer
 2. call net_assign_generic() to put the private data on the
    struct net (most preferably this should be done in the
    ->init callback of the ops registered)
 3. do not store any private reference on the net_generic array;
 4. do not change this pointer while the net is alive;
 5. use the net_generic() to get the pointer.

When adding a new pointer, I copy the old array, replace it
with a new one and schedule the old for kfree after an RCU
grace period.
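
The copy-and-replace scheme can be modelled in userspace as below. This is
a sketch under stated assumptions, not the kernel code: struct gen_array,
struct net_model, model_assign_generic() and model_net_generic() are
hypothetical names, ids are assumed to start at 1, and a plain free() takes
the place of the kfree deferred by an RCU grace period.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the elastic array hanging off struct net. */
struct gen_array {
    unsigned int len;
    void *ptr[];
};

struct net_model {
    struct gen_array *gen;
};

/* Store data under id, growing the array by copy-and-replace if needed. */
static int model_assign_generic(struct net_model *net, unsigned int id,
                                void *data)
{
    struct gen_array *old = net->gen;
    struct gen_array *ng;

    if (id == 0)
        return -1;
    if (old && id <= old->len) {
        old->ptr[id - 1] = data;
        return 0;
    }

    ng = calloc(1, sizeof(*ng) + id * sizeof(void *));
    if (!ng)
        return -1;
    ng->len = id;
    if (old)
        memcpy(ng->ptr, old->ptr, old->len * sizeof(void *));
    ng->ptr[id - 1] = data;

    net->gen = ng;
    free(old);   /* the kernel would defer this past an RCU grace period */
    return 0;
}

/* Look up the private pointer for id; NULL if nothing was assigned. */
static void *model_net_generic(const struct net_model *net, unsigned int id)
{
    if (!net->gen || id == 0 || id > net->gen->len)
        return NULL;
    return net->gen->ptr[id - 1];
}
```

Because readers in this model have no RCU protection, it is only safe when
used from a single thread.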

Since net_generic() reads the net->gen array inside an RCU read-side
section and, once set, the net->gen->ptr[x] pointer never changes,
this gives us safe access to generic pointers.

Quoting Paul: "... RCU is protecting -only- the net_generic 
structure that net_generic() is traversing, and the [pointer]
returned by net_generic() is protected by a reference counter 
in the upper-level struct net."

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:36:08 -07:00
Pavel Emelyanov c93cf61fd1 [NETNS]: The net-subsys IDs generator.
To make some per-net generic pointers, we need some way to address
them, i.e. IDs. This is a simple IDA-based ID generator for pernet
subsystems.

Addressing questions about potential checkpoint/restart problems: 
these IDs are "lite-offsets" within the net structure and are by no 
means supposed to be exported to the userspace.

Since it will be used in the near future by devices only (tun,
vlan, tunnels, bridge, etc), I make it resemble the functionality
of register_pernet_device().

The new id is stored in the *id pointer _before_ calling the init
callback to make this id available in this callback.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:35:23 -07:00
Adrian Bunk 31efdf0530 [ISDN] include/linux/isdn.h: remove dead code
This patch removes the usage of a nonexistent kconfig variable.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:30:16 -07:00
Adrian Bunk 7ef3abd210 [IRDA]: Remove irlan_eth_send_gratuitous_arp()
Even kernel 2.2.26 (sic) already contains the
  #undef CONFIG_IRLAN_SEND_GRATUITOUS_ARP
with the comment "but for some reason the machine crashes if you use DHCP".

Either someone finally looks into this or it's simply time to remove 
this dead code.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:29:24 -07:00
Adrian Bunk 99971e70fd [WANPIPE]: Forgotten bits of Sangoma drivers removal.
Robert P. J. Day spotted that my removal of the Sangoma drivers missed
a few bits.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:27:58 -07:00
Jens Axboe d237e5c7ce io context: increment task attachment count in ioc_task_link()
Thanks to Nikanth Karthikesan <knikanth@suse.de> for reporting this.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-15 09:25:33 +02:00
Allan Stephens 0c3141e910 [TIPC]: Overhaul of socket locking logic
This patch modifies TIPC's socket code to follow the same approach
used by other protocols.  This change eliminates the need for a
mutex in the TIPC-specific portion of the socket protocol data
structure -- in its place, the standard Linux socket backlog queue
and associated locking routines are utilized.  These changes fix
a long-standing receive queue bug on SMP systems, and also enable
individual read and write threads to utilize a socket without
unnecessarily interfering with each other.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:22:02 -07:00
Christoph Lameter 0f389ec630 slub: No need for per node slab counters if !SLUB_DEBUG
The per node counters are used mainly for showing data through the sysfs API.
If that API is not compiled in then there is no point in keeping track of this
data. Disable counters for the number of slabs and the number of total slabs
if !SLUB_DEBUG. Incrementing the per node counters is also accessing a
potentially contended cacheline so this could actually be a performance
benefit to embedded systems.

SLABINFO support is also affected. It must now depend on SLUB_DEBUG (which
is on by default).

Patch also avoids a check for a NULL kmem_cache_node pointer in new_slab()
if the system is not compiled with NUMA support.

[penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-04-14 18:53:02 +03:00
Linus Torvalds 533bb8a4d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits)
  [BRIDGE]: Fix crash in __ip_route_output_key with bridge netfilter
  [NETFILTER]: ipt_CLUSTERIP: fix race between clusterip_config_find_get and _entry_put
  [IPV6] ADDRCONF: Don't generate temporary address for ip6-ip6 interface.
  [IPV6] ADDRCONF: Ensure disabling multicast RS even if privacy extensions are disabled.
  [IPV6]: Use appropriate sock tclass setting for routing lookup.
  [IPV6]: IPv6 extension header structures need to be packed.
  [IPV6]: Fix ipv6 address fetching in raw6_icmp_error().
  [NET]: Return more appropriate error from eth_validate_addr().
  [ISDN]: Do not validate ISDN net device address prior to interface-up
  [NET]: Fix kernel-doc for skb_segment
  [SOCK] sk_stamp: should be initialized to ktime_set(-1L, 0)
  net: check for underlength tap writes
  net: make struct tun_struct private to tun.c
  [SCTP]: IPv4 vs IPv6 addresses mess in sctp_inet[6]addr_event.
  [SCTP]: Fix compiler warning about const qualifiers
  [SCTP]: Fix protocol violation when receiving an error lenght INIT-ACK
  [SCTP]: Add check for hmac_algo parameter in sctp_verify_param()
  [NET_SCHED] cls_u32: refcounting fix for u32_delete()
  [DCCP]: Fix skb->cb conflicts with IP
  [AX25]: Potential ax25_uid_assoc-s leaks on module unload.
  ...
2008-04-14 07:56:24 -07:00
David S. Miller 334f8b2afd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6.26
2008-04-14 03:50:43 -07:00
David S. Miller df39e8ba56 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ehea/ehea_main.c
	drivers/net/wireless/iwlwifi/Kconfig
	drivers/net/wireless/rt2x00/rt61pci.c
	net/ipv4/inet_timewait_sock.c
	net/ipv6/raw.c
	net/mac80211/ieee80211_sta.c
2008-04-14 02:30:23 -07:00
Peter Warasin e7bfd0a1a6 [NETFILTER]: bridge: add ebt_nflog watcher
This patch adds the ebtables nflog watcher to the kernel in order to
allow ebtables to log through the nfnetlink_log backend.

Signed-off-by: Peter Warasin <peter@endian.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Jan Engelhardt 3c9fba656a [NETFILTER]: nf_conntrack: replace NF_CT_DUMP_TUPLE macro indirection by function call
Directly call IPv4 and IPv6 variants where the address family is
easily known.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Jan Engelhardt f2ea825f48 [NETFILTER]: nf_nat: use bool type in nf_nat_proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt 5f2b4c9006 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_tuple.h
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt 09f263cd39 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_l4proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt 8ce8439a31 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_l3proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Jan Engelhardt 9dbae79178 [NETFILTER]: Remove unused callbacks in nf_conntrack_l3proto
These functions are never called.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy 5e8fbe2ac8 [NETFILTER]: nf_conntrack: add tuplehash l3num/protonum accessors
Add accessors for l3num and protonum and get rid of some overly long
expressions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy dd13b01036 [NETFILTER]: nf_nat: kill helper and seq_adjust hooks
Connection tracking helpers (specifically FTP) need to be called
before NAT sequence numbers adjustments are performed to be able
to compare them against previously seen ones. We've introduced
two new hooks around 2.6.11 to maintain this ordering when NAT
modules were changed to get called from conntrack helpers directly.

The cost of netfilter hooks is quite high and sequence number
adjustments are only rarely needed however. Add a RCU-protected
sequence number adjustment function pointer and call it from
IPv4 conntrack after calling the helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy 55871d0479 [NETFILTER]: nf_conntrack_extend: warn on confirmed conntracks
New extensions may only be added to unconfirmed conntracks to avoid races
when reallocating the storage.

Also change NF_CT_ASSERT to use WARN_ON to get backtraces.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:51 +02:00
Patrick McHardy 8c87238b72 [NETFILTER]: nf_nat: don't add NAT extension for confirmed conntracks
Adding extensions to confirmed conntracks is not allowed to avoid races
on reallocation. Don't setup NAT for confirmed conntracks in case NAT
module is loaded late.

This has one side effect: connections that existed before the NAT module
was loaded won't enter the bysource hash. The only case where this actually
makes a difference is in case of SNAT to a multirange where the IP before
NAT is also part of the range. Since old connections don't enter the
bysource hash the first new connection from the IP will have a new address
selected. This shouldn't matter at all.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:51 +02:00
Patrick McHardy 2bc780499a [NETFILTER]: nf_conntrack: add DCCP protocol support
Add DCCP conntrack helper. Thanks to Gerrit Renker <gerrit@erg.abdn.ac.uk>
for review and testing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:49 +02:00
Patrick McHardy d63a650736 [NETFILTER]: Add partial checksum validation helper
Move the UDP-Lite conntrack checksum validation to a generic helper
similar to nf_checksum() and make it fall back to nf_checksum()
in case the full packet is to be checksummed and hardware checksums
are available. This is to be used by DCCP conntrack, which also
needs to verify partial checksums.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:49 +02:00
Patrick McHardy 2d2d84c40e [NETFILTER]: nf_nat: remove unused name from struct nf_nat_protocol
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:48 +02:00
Patrick McHardy 535b57c7c1 [NETFILTER]: nf_nat: move NAT ctnetlink helpers to nf_nat_proto_common
Move to nf_nat_proto_common and rename to nf_nat_proto_... since they're
also used by protocols that don't have port numbers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:47 +02:00
Patrick McHardy 937e0dfd87 [NETFILTER]: nf_nat: add helpers for common NAT protocol operations
Add generic ->in_range and ->unique_tuple ops to avoid duplicating them
again and again for future NAT modules and save a few bytes of text:

net/ipv4/netfilter/nf_nat_proto_tcp.c:
  tcp_in_range     |  -62 (removed)
  tcp_unique_tuple | -259 # 271 -> 12, # inlines: 1 -> 0, size inlines: 7 -> 0
 2 functions changed, 321 bytes removed

net/ipv4/netfilter/nf_nat_proto_udp.c:
  udp_in_range     |  -62 (removed)
  udp_unique_tuple | -259 # 271 -> 12, # inlines: 1 -> 0, size inlines: 7 -> 0
 2 functions changed, 321 bytes removed

net/ipv4/netfilter/nf_nat_proto_gre.c:
  gre_in_range |  -62 (removed)
 1 function changed, 62 bytes removed

vmlinux:
 5 functions changed, 704 bytes removed

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:46 +02:00
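The generic ->in_range operation described in the commit above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the names `proto_in_range`, `struct tuple` and `enum manip` are hypothetical, and ports are kept in host byte order, which the real code does not do.

```c
#include <stdint.h>

/* Hypothetical: which side of the connection is being rewritten. */
enum manip { MANIP_SRC, MANIP_DST };

/* Hypothetical tuple; ports are in host byte order in this sketch. */
struct tuple {
	uint16_t src_port;
	uint16_t dst_port;
};

/*
 * One shared range check for every port-based protocol: pick the port
 * that the manipulation touches and test it against [min, max].
 */
static inline int proto_in_range(const struct tuple *t, enum manip m,
				 uint16_t min, uint16_t max)
{
	uint16_t port = (m == MANIP_SRC) ? t->src_port : t->dst_port;

	return port >= min && port <= max;
}
```

Because the check depends only on the selected port, TCP and UDP can share it instead of each carrying a private copy, which is where the text-size savings listed above would come from.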
Jan Engelhardt 3bb0362d2f [NETFILTER]: remove arpt_(un)register_target indirection macros
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:44 +02:00
Jan Engelhardt 95eea855af [NETFILTER]: remove arpt_target indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:43 +02:00
Jan Engelhardt 4abff0775d [NETFILTER]: remove arpt_table indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:43 +02:00
Jan Engelhardt 5452e425ad [NETFILTER]: annotate {arp,ip,ip6,x}tables with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:35 +02:00
Jan Engelhardt b9f61b1603 [NETFILTER]: xt_sctp: simplify xt_sctp.h
The use of xt_sctp.h flagged up -Wshadow warnings in userspace, which
prompted me to look at it and clean it up. Basic operations have been
directly replaced by library calls (memcpy and memset are both available
in the kernel and userspace, and are usually faster than a self-made
loop). The is_set and is_clear functions now use a processing time
shortcut, too.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 09:56:04 +02:00
Alexey Dobriyan 666953df35 [NETFILTER]: ip_tables: per-netns FILTER/MANGLE/RAW tables for real
Commit 9335f047fe aka
"[NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW"
added per-netns _view_ of iptables rules. They were shown to user, but
ignored by filtering code. Now that it's possible to at least ping loopback,
per-netns tables can affect filtering decisions.

netns is taken in case of
	PRE_ROUTING, LOCAL_IN -- from in device,
	POST_ROUTING, LOCAL_OUT -- from out device,
	FORWARD -- from in device which should be equal to out device's netns.
		   This code is relatively new, so BUG_ON was plugged.

Wrappers were added to a) keep code the same for CONFIG_NET_NS=n users
(overwhelming majority), b) consolidate code in one place -- similar
changes will be done in ipv6 and arp netfilter code.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 09:56:02 +02:00
Gerrit Renker f5572855ec [SKB]: __skb_queue_tail = __skb_insert before
This expresses __skb_queue_tail() in terms of __skb_insert(),
using __skb_insert_before() as auxiliary function.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:05:28 -07:00
Gerrit Renker 7de6c03336 [SKB]: __skb_append = __skb_queue_after
This expresses __skb_append in terms of __skb_queue_after, exploiting that

  __skb_append(old, new, list) = __skb_queue_after(list, old, new).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:05:09 -07:00
Gerrit Renker bf29927588 [SKB]: __skb_queue_after(prev) = __skb_insert(prev, prev->next)
By reordering, __skb_queue_after() is expressed in terms of __skb_insert().

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:04:51 -07:00
Gerrit Renker f525c06d12 [SKB]: __skb_dequeue = skb_peek + __skb_unlink
By rearranging the order of declarations, __skb_dequeue() is expressed in terms of

 * skb_peek() and
 * __skb_unlink(),

thus in effect mirroring the analogous implementation of __skb_dequeue_tail().

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:04:12 -07:00
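The four [SKB] commits above reduce the queue helpers to one insertion primitive. A minimal sketch of that layering follows. It assumes a circular doubly-linked list with a sentinel head, and `struct node` and `struct queue` are hypothetical stand-ins for the kernel's sk_buff types rather than the real definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: just the list links. */
struct node {
	struct node *next, *prev;
};

/* Hypothetical stand-in for struct sk_buff_head, with a sentinel node. */
struct queue {
	struct node head;	/* head.next is first, head.prev is last */
	unsigned int qlen;
};

static inline void queue_init(struct queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
}

/* The single primitive: link newn between two adjacent nodes. */
static inline void insert(struct node *newn, struct node *prev,
			  struct node *next, struct queue *q)
{
	newn->next = next;
	newn->prev = prev;
	next->prev = newn;
	prev->next = newn;
	q->qlen++;
}

/* queue_after(prev) = insert(prev, prev->next) */
static inline void queue_after(struct queue *q, struct node *prev,
			       struct node *newn)
{
	insert(newn, prev, prev->next, q);
}

/* append(old, new, list) = queue_after(list, old, new) */
static inline void append(struct node *old, struct node *newn, struct queue *q)
{
	queue_after(q, old, newn);
}

static inline void insert_before(struct node *newn, struct node *next,
				 struct queue *q)
{
	insert(newn, next->prev, next, q);
}

/* queue_tail = insert before the sentinel */
static inline void queue_tail(struct queue *q, struct node *newn)
{
	insert_before(newn, &q->head, q);
}

static inline struct node *peek(struct queue *q)
{
	return q->head.next == &q->head ? NULL : q->head.next;
}

static inline void unlink_node(struct node *n, struct queue *q)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
	q->qlen--;
}

/* dequeue = peek + unlink */
static inline struct node *dequeue(struct queue *q)
{
	struct node *n = peek(q);

	if (n)
		unlink_node(n, q);
	return n;
}
```

With this layering only `insert()` and `unlink_node()` touch the links directly, so the other helpers cannot drift out of sync with them.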
YOSHIFUJI Hideaki e9df2e8fd8 [IPV6]: Use appropriate sock tclass setting for routing lookup.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:40:51 -07:00
YOSHIFUJI Hideaki 7cd636fe9c [IPV6]: IPv6 extension header structures need to be packed.
struct ipv6_opt_hdr is the common structure for IPv6 extension
headers, and it is common to increment the pointer to get
the real content.  On the other hand, since the structure
consists only of 1-byte next-header field and 1-byte length
field, size of that structure depends on architecture; 2 or 4.
Add "packed" attribute to get 2.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:33:52 -07:00
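The effect of the "packed" attribute in the commit above can be shown with a small sketch. It assumes a GCC-compatible compiler, and `struct opt_hdr` and `opt_payload()` are hypothetical names that only mirror the two one-byte fields described in the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the two-byte extension header described above. */
struct opt_hdr {
	uint8_t nexthdr;	/* 1-byte next-header field */
	uint8_t hdrlen;		/* 1-byte length field */
} __attribute__((packed));	/* GCC/Clang extension: no padding added */

/* Step over the fixed header to reach the content, as in "hdr + 1". */
static inline const uint8_t *opt_payload(const struct opt_hdr *hdr)
{
	return (const uint8_t *)(hdr + 1);
}

/* With the attribute, the size is 2 regardless of the target ABI. */
_Static_assert(sizeof(struct opt_hdr) == 2, "header must be exactly 2 bytes");
```

The pointer increment in `opt_payload()` is only correct when `sizeof(struct opt_hdr)` is 2, which is why an ABI that pads the structure to 4 bytes would skip past the start of the content.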
YOSHIFUJI Hideaki cee8947338 [IPV6] MROUTE: Do not call ipv6_find_idev() directly.
Since NETDEV_REGISTER notifier chain is responsible for creating
inet6_dev{}, we do not need to call ipv6_find_idev() directly here.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:21:16 -07:00
Pavel Emelyanov 0204774191 [NETNS][DCCPV6]: Move the dccp_v6_ctl_sk on the struct net.
And replace all its usage with init_net's socket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:32:25 -07:00
Pavel Emelyanov 7b1cffa8c9 [NETNS][DCCPV4]: Move the dccp_v4_ctl_sk on the struct net.
And replace all its usage with init_net's socket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:29:37 -07:00
Pavel Emelyanov 67019cc9ee [NETNS]: Add an empty netns_dccp structure on struct net.
According to the overall struct net design, it will be
filled with DCCP-related members.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:28:42 -07:00
Denis V. Lunev 5f4472c5a6 [TCP]: Remove owner from tcp_seq_afinfo.
Move it to tcp_seq_afinfo->seq_fops as should be.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:13:53 -07:00
Denis V. Lunev 68fcadd16c [TCP]: Place file operations directly into tcp_seq_afinfo.
No need to have separate never-used variable.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:13:30 -07:00
Denis V. Lunev 9427c4b36b [TCP]: Move seq_ops from tcp_iter_state to tcp_seq_afinfo.
No need to create seq_operations for each instance of 'netstat'.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:12:13 -07:00
Denis V. Lunev a4146b1b2c [TCP]: Replace struct net on tcp_iter_state with seq_net_private.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:11:14 -07:00
David S. Miller 6fb9114e4b Merge branch 'net-2.6.26-misc-20080412b' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-04-12 19:19:46 -07:00
Paul Moore 03e1ad7b5d LSM: Make the Labeled IPsec hooks more stack friendly
The xfrm_get_policy() and xfrm_add_pol_expire() put some rather large structs
on the stack to work around the LSM API.  This patch attempts to fix that
problem by changing the LSM API to require only the relevant "security"
pointers instead of the entire SPD entry; we do this for all of the
security_xfrm_policy*() functions to keep things consistent.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 19:07:52 -07:00
Paul Moore 00447872a6 NetLabel: Allow passing the LSM domain as a shared pointer
Smack doesn't have the need to create a private copy of the LSM "domain" when
Unlike SELinux, Smack does not need to create a private copy of the LSM "domain"
when setting NetLabel security attributes. However, the current NetLabel code
requires a private copy of the LSM "domain".  This patch fixes

 * NETLBL_SECATTR_DOMAIN_CPY
   The current behavior, NetLabel assumes that the domain value is a copy and
   frees it when done

 * NETLBL_SECATTR_DOMAIN
   New, Smack-friendly behavior, NetLabel assumes that the domain value is a
   reference to a string managed by the LSM and does not free it when done

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 19:06:42 -07:00
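The two ownership modes in the commit above can be sketched as a flag that records whether the domain string must be freed. This is an illustration under assumptions: the flag values, `struct secattr` and the helper names are hypothetical and are not the NetLabel definitions.

```c
#include <stdlib.h>
#include <string.h>

#define SECATTR_DOMAIN		0x1u	/* hypothetical: domain pointer is set */
#define SECATTR_DOMAIN_CPY	0x2u	/* hypothetical: domain is a private copy */

struct secattr {
	unsigned int flags;
	char *domain;
};

/* Copy mode: the attribute owns a private copy and frees it later. */
static inline int secattr_set_copy(struct secattr *s, const char *domain)
{
	size_t n = strlen(domain) + 1;
	char *copy = malloc(n);

	if (!copy)
		return -1;
	memcpy(copy, domain, n);
	s->domain = copy;
	s->flags = SECATTR_DOMAIN | SECATTR_DOMAIN_CPY;
	return 0;
}

/* Shared mode: the caller keeps ownership of the string. */
static inline void secattr_set_shared(struct secattr *s, char *domain)
{
	s->domain = domain;
	s->flags = SECATTR_DOMAIN;
}

/* Returns 1 if the domain string was freed, 0 if it was left alone. */
static inline int secattr_destroy(struct secattr *s)
{
	int freed = 0;

	if (s->flags & SECATTR_DOMAIN_CPY) {
		free(s->domain);
		freed = 1;
	}
	s->domain = NULL;
	s->flags = 0;
	return freed;
}
```

Recording ownership in the flags lets a single destroy path serve both kinds of caller without knowing which LSM set the attribute.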
Rusty Russell 14daa02139 net: make struct tun_struct private to tun.c
There's no reason for this to be in the header, and it just hurts
recompile time.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Max Krasnyanskiy <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:48:58 -07:00
Vlad Yasevich ab38fb04c9 [SCTP]: Fix compiler warning about const qualifiers
Fix 3 warnings about discarding const qualifiers:

net/sctp/ulpevent.c:862: warning: passing argument 1 of 'sctp_event2skb' discards qualifiers from pointer target type
net/sctp/sm_statefuns.c:4393: warning: passing argument 1 of 'SCTP_ASOC' discards qualifiers from pointer target type
net/sctp/socket.c:5874: warning: passing argument 1 of 'cmsg_nxthdr' discards qualifiers from pointer target type

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:40:06 -07:00
Gui Jianfeng f4ad85ca3e [SCTP]: Fix protocol violation when receiving an error length INIT-ACK
When an INIT-ACK with an invalid length is received during COOKIE-WAIT,
an ABORT with a zero vtag is sent in response. This violates the
protocol. This patch does the following:
1. If the INIT-ACK contains all the fixed parameters, use the init-tag
   recorded from the INIT-ACK as the vtag.
2. If the INIT-ACK doesn't contain all the fixed parameters,
   just reflect its vtag.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:39:34 -07:00
YOSHIFUJI Hideaki 7f1eced8b0 [IPV6] MIP6: Use our standard definitions for paddings.
MIP6_OPT_PAD_X are actually for paddings in destination
option header.  Replace them with our standard IPV6_TLV_PADX.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:22 +09:00
YOSHIFUJI Hideaki f3ee4010e8 [IPV6]: Define constants for link-local multicast addresses.
- Define link-local all-node / all-router multicast addresses.
- Remove ipv6_addr_all_nodes() and ipv6_addr_all_routers().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:19 +09:00
YOSHIFUJI Hideaki 9acd9f3ae9 [IPV6]: Make address arguments const.
- net/ipv6/addrconf.c:
	ipv6_get_ifaddr(), ipv6_dev_get_saddr()
- net/ipv6/mcast.c:
	ipv6_sock_mc_join(), ipv6_sock_mc_drop(),
	inet6_mc_check(),
	ipv6_dev_mc_inc(), __ipv6_dev_mc_dec(), ipv6_dev_mc_dec(),
	ipv6_chk_mcast_addr()
- net/ipv6/route.c:
	rt6_lookup(), icmp6_dst_alloc()
- net/ipv6/ip6_output.c:
	ip6_nd_hdr()
- net/ipv6/ndisc.c:
	ndisc_send_ns(), ndisc_send_rs(), ndisc_send_redirect(),
	ndisc_get_neigh(), __ndisc_send()

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:18 +09:00
YOSHIFUJI Hideaki dfd982baff [IPV6] ADDRCONF: Uninline ipv6_isatap_eui64().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:17 +09:00
YOSHIFUJI Hideaki 3eb84f4929 [IPV6] ADDRCONF: Uninline ipv6_addr_hash().
The function is only used in net/ipv6/addrconf.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:15 +09:00
YOSHIFUJI Hideaki fed85383ac [IPV6]: Use XOR and OR rather than multiple ands for ipv6 address comparisons.
ipv6_addr_equal(), ipv6_addr_v4mapped(),
ipv6_addr_is_ll_all_{nodes,routers}(),
ipv6_masked_addr_cmp()

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:14 +09:00
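The XOR/OR technique named in the commit above can be sketched for an equality test as follows. `struct addr128` and both function names are hypothetical. The sketch shows only the general idea, not the kernel's implementation.

```c
#include <stdint.h>

/* Hypothetical 128-bit address stored as four 32-bit words. */
struct addr128 {
	uint32_t w[4];
};

/* Compare-per-word form: up to four comparisons joined by &&. */
static inline int addr_equal_and(const struct addr128 *a,
				 const struct addr128 *b)
{
	return a->w[0] == b->w[0] && a->w[1] == b->w[1] &&
	       a->w[2] == b->w[2] && a->w[3] == b->w[3];
}

/*
 * XOR/OR form: XOR yields zero for equal words, OR accumulates any
 * difference, and a single comparison against zero decides the result.
 */
static inline int addr_equal_xor(const struct addr128 *a,
				 const struct addr128 *b)
{
	return ((a->w[0] ^ b->w[0]) | (a->w[1] ^ b->w[1]) |
		(a->w[2] ^ b->w[2]) | (a->w[3] ^ b->w[3])) == 0;
}
```

Both forms return the same answer. The second has no short-circuit evaluation, so it gives the compiler straight-line code with one final test.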
Zoltan Menyhart 98075d245a [IA64] Fix NUMA configuration issue
There is a NUMA memory configuration issue in 2.6.24:

A 2-node machine of ours has got the following memory layout:

Node 0:	0 - 2 Gbytes
Node 0:	4 - 8 Gbytes
Node 1:	8 - 16 Gbytes
Node 0:	16 - 18 Gbytes

"efi_memmap_init()" merges the three last ranges into one.

"register_active_ranges()" is called as follows:

efi_memmap_walk(register_active_ranges, NULL);

i.e. once for the 4 - 18 Gbytes range. It picks up the node
number from the start address, and registers all the memory for
the node #0.

"register_active_ranges()" should be called as follows to
make sure there is no merged address range at its entry:

efi_memmap_walk(filter_memory, register_active_ranges);

"filter_memory()" is similar to "filter_rsvd_memory()",
but the reserved memory ranges are not filtered out.

Signed-off-by: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-11 15:21:35 -07:00
Linus Torvalds 14897e35fd Merge branch 'docs' of git://git.lwn.net/linux-2.6
* 'docs' of git://git.lwn.net/linux-2.6:
  Add additional examples in Documentation/spinlocks.txt
  Move sched-rt-group.txt to scheduler/
  Documentation: move rpc-cache.txt to filesystems/
  Documentation: move nfsroot.txt to filesystems/
  Spell out behavior of atomic_dec_and_lock() in kerneldoc
  Fix a typo in highres.txt
  Fixes to the seq_file document
  Fill out information on patch tags in SubmittingPatches
  Add the seq_file documentation
2008-04-11 13:24:16 -07:00
J. Bruce Fields dc07e721a2 Spell out behavior of atomic_dec_and_lock() in kerneldoc
A little more detail here wouldn't hurt.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-04-11 13:17:46 -06:00
Heiko Carstens b0fac02370 Fix "$(AS) -traditional" compile breakage caused by asmlinkage_protect
git commit 54a0151041 ("asmlinkage_protect
replaces prevent_tail_call") causes this build failure on s390:

    AS      arch/s390/kernel/entry64.o
  In file included from arch/s390/kernel/entry64.S:14:
  include/linux/linkage.h:34: error: syntax error in macro parameter list
  make[1]: *** [arch/s390/kernel/entry64.o] Error 1
  make: *** [arch/s390/kernel] Error 2

and some other architectures.  The reason is that some architectures add
the "-traditional" flag to the invocation of $(AS), which disables
variadic macro argument support.

So just surround the new define with an #ifndef __ASSEMBLY__ to prevent
any side effects on asm code.

Cc: Roland McGrath <roland@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:29:13 -07:00
Linus Torvalds 90768c09bc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [NETNS][IPV6] tcp - assign the netns for timewait sockets
  [IPV4]: Fix byte value boundary check in do_ip_getsockopt().
  BNX2X: Correct bringing chip out of reset
  [NETFILTER]: nf_nat: autoload IPv4 connection tracking
  [NETFILTER]: xt_hashlimit: fix mask calculation
  [XFRM]: xfrm_user: fix selector family initialization
  rt61pci: rt61pci_beacon_update do not free skb twice
  ssb-mipscore: Fix interrupt vectors
  ssb-pcicore: Fix IRQ TPS flag handling
  mac80211: use short_preamble mode from capability if ERP IE not present
  [NET]: Undo code bloat in hot paths due to print_mac().
  [TCP]: Don't allow FRTO to take place while MTU is being probed
  [TCP]: tcp_simple_retransmit can cause S+L
  [TCP]: Fix NewReno's fast rexmit/recovery problems with GSOed skb
  [TCP]: Restore 2.6.24 mark_head_lost behavior for newreno/fack
  nl80211: fix STA AID bug
  b43legacy: fix bcm4303 crash
  iwlwifi: fix n-band association problem
  ipw2200: set MAC address on radiotap interface
  libertas: fix mode initialization problem
2008-04-11 08:10:24 -07:00
Bjorn Helgaas 544451a1a3 pnp: increase number of devices supported per protocol
Increase the PNP "number of devices" limit.  We currently use an unsigned
char, which limits us to 256 devices per protocol.  This patch changes that to
an unsigned int.

Not all backends can take advantage of this: we limit ISAPNP to 10 devices in
isapnp_cfg_begin(), and PNPBIOS is limited to 256 devices because the BIOS
interfaces use a one-byte device node number.

But there is no limit on the number of PNPACPI devices we may have.  Large HP
Integrity machines have more than 256, which causes the current "unsigned char
number" to wrap around.  This causes errors like this:

    pnp: PnP ACPI init
    kobject_add failed for 00:00 with -EEXIST, don't try to register things with the same name in the same directory.

    Call Trace:
     [<a000000100010720>] show_stack+0x40/0xa0
     [<a0000001000107b0>] dump_stack+0x30/0x60
     [<a0000001001dbdf0>] kobject_add+0x290/0x2c0
     [<a0000001002bfd40>] device_add+0x160/0x860
     [<a0000001002c0470>] device_register+0x30/0x60
     [<a00000010026ba70>] __pnp_add_device+0x130/0x180
     [<a00000010026bb70>] pnp_add_device+0xb0/0xe0
     [<a0000001007f2730>] pnpacpi_add_device+0x510/0x5a0
     [<a0000001007f2810>] pnpacpi_add_device_handler+0x50/0x80

This patch increases the limit to fix this PNPACPI problem.  It should not
have any adverse effect on ISAPNP or PNPBIOS because their limits are still
enforced in the backends.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:06:44 -07:00
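The wrap-around described in the commit above can be reproduced in a few lines. The "%02x:%02x" name format is an assumption modeled on the "00:00" name in the error message, and the helper names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Device name built from an 8-bit counter: device 256 wraps back to 0. */
static inline void name_from_u8(char *buf, size_t len,
				unsigned int proto, unsigned int n)
{
	unsigned char number = (unsigned char)n;	/* 256 becomes 0 */

	snprintf(buf, len, "%02x:%02x", proto, (unsigned int)number);
}

/* Device name built from an unsigned int counter: no wrap at 256. */
static inline void name_from_uint(char *buf, size_t len,
				  unsigned int proto, unsigned int n)
{
	snprintf(buf, len, "%02x:%02x", proto, n);
}

/* Two devices collide when they end up with the same name. */
static inline int names_collide(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

With the 8-bit counter, device 0 and device 256 receive the same name, which matches the duplicate-name failure in the trace. With the wider counter, device 256 gets a distinct name.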
Linus Torvalds d10d89ec78 Add commentary about the new "asmlinkage_protect()" macro
It's really a pretty ugly thing to need, and some day it will hopefully
be obviated by teaching gcc about the magic calling conventions for the
low-level system call code, but in the meantime we can at least add big
honking comments about why we need these insane and strange macros.

I took my comments from my version of the macro, but I ended up deciding
to just pick Roland's version of the actual code instead (with his
prettier syntax that uses vararg macros).  Thus the previous two commits
that actually implement it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:35:23 -07:00
Roland McGrath 54a0151041 asmlinkage_protect replaces prevent_tail_call
The prevent_tail_call() macro works around the problem of the compiler
clobbering argument words on the stack, which for asmlinkage functions
is the caller's (user's) struct pt_regs.  The tail/sibling-call
optimization is not the only way that the compiler can decide to use
stack argument words as scratch space, which we have to prevent.
Other optimizations can do it too.

Until we have new compiler support to make "asmlinkage" binding on the
compiler's own use of the stack argument frame, we have to work around all
the manifestations of this issue that crop up.

More cases seem to be prevented by also keeping the incoming argument
variables live at the end of the function.  This makes their original
stack slots attractive places to leave those variables, so the compiler
tends not to clobber them for something else.  It's still no guarantee, but
it handles some observed cases that prevent_tail_call() did not.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:28:26 -07:00
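The idea of keeping incoming arguments live at the end of a function, as described in the commit above, can be sketched with an empty inline-asm statement. This assumes a GCC-compatible compiler. `protect_args2()` and `add_then_protect()` are hypothetical names, and the macro is a fixed two-argument illustration rather than the kernel's definition.

```c
/*
 * Hypothetical two-argument variant. The asm body is empty, but naming
 * arg1 and arg2 as inputs forces the compiler to treat them as still in
 * use at this point. Tying ret to itself ("0") orders the statement
 * after the computation of the return value.
 */
#define protect_args2(ret, arg1, arg2) \
	__asm__ __volatile__("" : "=r" (ret) \
			     : "0" (ret), "g" (arg1), "g" (arg2))

static inline long add_then_protect(long a, long b)
{
	long ret = a + b;

	/* a and b stay live here, so their slots are less likely reused. */
	protect_args2(ret, a, b);
	return ret;
}
```

As the commit message says, this is a heuristic and not a guarantee: it makes the original argument slots an attractive place to leave the values, but does not forbid the compiler from overwriting them.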
David Howells f17520e1f1 FRV: Don't make smp_{r, w, }mb() interpolate MEMBAR when CONFIG_SMP=n [try #2]
Don't make smp_{r,w,}mb() interpolate a MEMBAR instruction when CONFIG_SMP=n as
SMP memory barriers on UP systems should interpolate a compiler barrier only.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 13:41:29 -07:00
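The distinction drawn in the commit above, a hardware barrier on SMP builds versus a compiler-only barrier on UP builds, can be sketched as follows. This assumes a GCC-compatible compiler. `__sync_synchronize()` stands in for the architecture's MEMBAR instruction, and `publish()` is a hypothetical usage example.

```c
/* Compiler-only barrier: emits no instruction, blocks compiler reordering. */
#define barrier()	__asm__ __volatile__("" ::: "memory")

/* Stand-in for a hardware memory barrier (GCC/Clang builtin). */
#define mb()		__sync_synchronize()

#ifdef CONFIG_SMP
#define smp_mb()	mb()		/* other CPUs can observe the order */
#else
#define smp_mb()	barrier()	/* one CPU: compiler ordering suffices */
#endif

/* Hypothetical usage: store the data, then raise the ready flag. */
static inline int publish(int *data, volatile int *ready, int value)
{
	*data = value;
	smp_mb();	/* keep the data store ahead of the flag store */
	*ready = 1;
	return *data;
}
```

On a uniprocessor build no other CPU can observe the stores out of order, so only the compiler needs to be constrained and no barrier instruction is emitted.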
David Howells e31c243f98 FRV: Add support for emulation of userspace atomic ops [try #2]
Use traps 120-126 to emulate atomic cmpxchg32, xchg32, and XOR-, OR-, AND-, SUB-
and ADD-to-memory operations for userspace.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 13:41:29 -07:00
David Howells 0c93d8e4d3 FRV: Move STACK_TOP_MAX up [try #2]
Move STACK_TOP_MAX up so that we don't try moving the stack above it as that
causes setup_arg_pages() to malfunction.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 13:41:28 -07:00
David Howells a31b9dd8ed FRV: Handle update_mmu_cache() being called when current->mm is NULL [try #2]
Handle update_mmu_cache() being called when current->mm is NULL.

We cache static TLB mappings for the current page table in DAMPR4 and DAMPR5
on the theory that the next data lookup is likely to be in the same general
region, and thus is likely to be mapped by the same page table.  However, we
can't get this information if we can't access the appropriate mm_struct.

If current->mm is NULL, we just clear the cache in the knowledge that the TLB
miss handlers will load it.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 13:41:28 -07:00
Florian Westphal 4dfc281702 [Syncookies]: Add support for TCP options via timestamps.
Allow the use of SACK and window scaling when syncookies are used
and the client supports tcp timestamps. Options are encoded into
the timestamp sent in the syn-ack and restored from the timestamp
echo when the ack is received.

Based on earlier work by Glenn Griffin.
This patch avoids increasing the size of structs by encoding TCP
options into the least significant bits of the timestamp and
by not using any 'timestamp offset'.

The downside is that the timestamp sent in the packet after the synack
will increase by several seconds.

changes since v1:
 don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c
 and have ipv6/syncookies.c use it.
 Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if ()

Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 03:12:40 -07:00
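Encoding options into the least significant bits of the timestamp, as the commit above describes, can be sketched like this. The bit layout (4 bits of window scale plus 1 SACK bit, with 0xf meaning "no window scaling") is an assumption for illustration, and the sketch does not model how the real code keeps the timestamp consistent with the clock.

```c
#include <stdint.h>

#define TS_OPT_BITS	5u				/* assumed layout */
#define TS_OPT_MASK	((1u << TS_OPT_BITS) - 1u)
#define WSCALE_NONE	0xfu				/* assumed marker */

/* Hypothetical subset of the negotiated TCP options. */
struct opts {
	unsigned int wscale_ok;
	unsigned int wscale;
	unsigned int sack_ok;
};

/* Replace the low bits of the timestamp with the encoded options. */
static inline uint32_t ts_encode(uint32_t now, const struct opts *o)
{
	uint32_t bits = o->wscale_ok ? (o->wscale & 0xfu) : WSCALE_NONE;

	bits |= (o->sack_ok ? 1u : 0u) << 4;
	return (now & ~TS_OPT_MASK) | bits;
}

/* Recover the options from the timestamp echoed back by the client. */
static inline void ts_decode(uint32_t tsecr, struct opts *o)
{
	uint32_t bits = tsecr & TS_OPT_MASK;

	o->sack_ok = (bits >> 4) & 1u;
	o->wscale_ok = (bits & 0xfu) != WSCALE_NONE;
	o->wscale = o->wscale_ok ? (bits & 0xfu) : 0;
}
```

Because the client echoes the timestamp unchanged, the server can recover the options from the final ACK without storing any per-connection state, which is the point of syncookies.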
Rami Rosen 5c06f510a2 [IPV6]: Remove unused declarations in include/net/ip6_route.h.
1) Standalone ip6_null_entry is no longer needed as it is replaced by
   the ip6_null_entry member of ipv6 (instance of struct netns_ipv6) in
   struct net (as a result of Network Namespaces patches).


2) These 3 methods from this same header are not defined anywhere:
   ip6_rt_addr_add(), ip6_rt_addr_del(), rt6_sndmsg()

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 02:31:20 -07:00
Patrick McHardy 4738c1db15 [SKFILTER]: Add SKF_AD_NLATTR instruction
SKF_AD_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.

A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):

	...
	{
		/* A = offset of first attribute */
		.code	= BPF_LD | BPF_IMM,
		.k	= sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
	},
	{
		/* X = CTA_PROTOINFO */
		.code	= BPF_LDX | BPF_IMM,
		.k	= CTA_PROTOINFO,
	},
	{
		/* A = netlink attribute offset */
		.code	= BPF_LD | BPF_B | BPF_ABS,
		.k	= SKF_AD_OFF + SKF_AD_NLATTR
	},
	{
		/* Exit if not found */
		.code   = BPF_JMP | BPF_JEQ | BPF_K,
		.k	= 0,
		.jt	= <error>
	},
	...

A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:

	...
	{
		/* A += sizeof(struct nlattr) */
		.code	= BPF_ALU | BPF_ADD | BPF_K,
		.k	= sizeof(struct nlattr),
	},
	{
		/* X = CTA_PROTOINFO_TCP */
		.code	= BPF_LDX | BPF_IMM,
		.k	= CTA_PROTOINFO_TCP,
	},
	{
		/* A = netlink attribute offset */
		.code	= BPF_LD | BPF_B | BPF_ABS,
		.k	= SKF_AD_OFF + SKF_AD_NLATTR
	},
	...

The data of an attribute can be loaded into 'A' like this:

	...
	{
		/* X = A (attribute offset) */
		.code	= BPF_MISC | BPF_TAX,
	},
	{
		/* A = skb->data[X + k] */
		.code 	= BPF_LD | BPF_B | BPF_IND,
		.k	= sizeof(struct nlattr),
	},
	...

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 02:02:28 -07:00
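The lookup semantics described in the commit above (start offset in, attribute type in, attribute offset or zero out) can be mirrored in a userspace sketch. `struct attr_hdr`, `ATTR_ALIGN` and `find_attr()` are hypothetical. The 4-byte header with a length field followed by a type field, and the 4-byte alignment, follow the usual netlink attribute layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the netlink attribute header. */
struct attr_hdr {
	uint16_t len;	/* header plus payload, in bytes */
	uint16_t type;
};

/* Attributes start on 4-byte boundaries. */
#define ATTR_ALIGN(n)	(((n) + 3) & ~(size_t)3)

/*
 * Search buf[start..buflen) for an attribute of the given type.
 * Returns the offset of its header, or 0 when it is not found.
 */
static inline size_t find_attr(const uint8_t *buf, size_t buflen,
			       size_t start, uint16_t type)
{
	size_t off = start;

	while (off + sizeof(struct attr_hdr) <= buflen) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));	/* avoid unaligned reads */
		if (h.len < sizeof(h) || h.len > buflen - off)
			break;				/* malformed attribute */
		if (h.type == type)
			return off;
		off += ATTR_ALIGN((size_t)h.len);
	}
	return 0;
}
```

A nested attribute would be searched the same way, by calling `find_attr()` again with `start` set to the parent's offset plus `sizeof(struct attr_hdr)`, which corresponds to the "A += sizeof(struct nlattr)" step in the filter above.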
Rami Rosen 3cccd60784 [IPV6] Remove three method declarations in include/net/ndisc.h.
This patch removes two unused method declarations in
include/net/ndisc.h: ndisc_forwarding_on(void) and
ndisc_forwarding_off(void);

Also igmp6_cleanup(void) appears twice in this header, so one
igmp6_cleanup(void) declaration is removed.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 02:01:21 -07:00
Stephen Hemminger 43db6d65e0 socket: sk_filter deinline
The sk_filter function is too big to be inlined. This saves 2296 bytes
of text on allyesconfig.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:43:09 -07:00
Stephen Hemminger b715631fad socket: sk_filter minor cleanups
Some minor style cleanups:
  * Move __KERNEL__ definitions to one place in filter.h
  * Use const for sk_filter_len
  * Line wrapping
  * Put EXPORT_SYMBOL next to function definition

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:33:47 -07:00
Russ Anderson c19b2930df [IA64] Itanium Spec updates
Updates based on the "Intel® Itanium® Architecture Software Developer's Manual
Specification Update October 2007".

http://download.intel.com/design/itanium/specupdt/24869911.pdf

Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-09 13:05:54 -07:00
Masami Hiramatsu 34e1ceb188 [IA64] kprobes: kprobe-booster for ia64
Add kprobe-booster support on ia64.

Kprobe-booster improves the performance of kprobes by eliminating single-step,
where possible.  Currently, kprobe-booster is implemented on x86 and x86-64.
This is an ia64 port.

On ia64, kprobe-booster executes a copied bundle directly, instead of single
stepping.  Bundles which have B or X unit and which may cause an exception
(including break) are not executed directly.  And also, to prevent hitting
break exceptions on the copied bundle, only the hindmost kprobe is executed
directly if several kprobes share a bundle and are placed in different slots.
Note: set_brl_inst() is used for preparing an instruction buffer (it does not
modify any active code), so it does not need any atomic operation.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: bibo,mao <bibo.mao@intel.com>
Cc: Rusty Lynch <rusty.lynch@intel.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-09 10:36:43 -07:00
KOSAKI Motohiro e4b05d4097 [IA64] pgd_offset() constification.
When compiling 2.6.25-rc8-mm1, the warning below appears,
because walk_page_range passes its argument as "const struct mm*",
but pgd_offset() receives it as "struct mm*".

  CC      mm/pagewalk.o
mm/pagewalk.c: In function 'walk_page_range':
mm/pagewalk.c:111: warning: passing argument 1 of 'pgd_offset' discards qualifiers from pointer target type

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-09 10:10:18 -07:00
holt@sgi.com 2c6e6db41f [IA64] Minimize per_cpu reservations.
This attached patch significantly shrinks boot memory allocation on ia64.
It does this by not allocating per_cpu areas for cpus that can never
exist.

In the case where acpi does not have any numa node description of the
cpus, I defaulted to assigning the first 32 round-robin on the known
nodes.  For the !CONFIG_ACPI case, I used for_each_possible_cpu().

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-08 13:51:35 -07:00
Mohamed Abbas 84363e6e07 mac80211: notify mac from low level driver (iwlwifi)
Add a new API to mac80211 that allows the low-level driver to
notify the MAC of driver status.

Signed-off-by: Mohamed Abbas <mabbas@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 16:44:43 -04:00
Michael Buesch d625a29ba6 ssb: Add support for block-I/O
This adds support for block based I/O to SSB.
This is needed in order to efficiently support PIO data
transfers to the card.
The block-I/O support is only compiled, if it's selected by the
weird driver that needs it. So there's no overhead for sane devices.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 16:44:40 -04:00
Chr fff7710937 mac80211: add station aid into ieee80211_tx_control
This patch is necessary for the upcoming Accesspoint patch for p54.

Signed-off-by: Christian Lamparter <chunkeey@web.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:57 -04:00
Michael Buesch 8fe2b65a18 ssb: Turn suspend/resume upside down
Turn the SSB bus suspend mechanism upside down.
Instead of deciding by an internal reference count when to suspend/resume,
let the parent bus call us in their suspend/resume routine.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:57 -04:00
Tomas Winkler 21c0cbe760 mac80211: add association capability and timing info into bss_conf
This patch adds association capability, timestamp (tsf) and beacon interval
to bss_conf. This is required for successful association of iwlwifi drivers.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:56 -04:00
Tomas Winkler 38668c059f mac80211: eliminate conf_ht
This patch eliminates the use of conf_ht, replacing it with
bss_info_changed.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:56 -04:00
David S. Miller 8eefca4888 Merge branch 'net-2.6.26-isatap-20080403' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev
2008-04-08 02:33:36 -07:00
Ilpo Järvinen 882bebaaca [TCP]: tcp_simple_retransmit can cause S+L
This fixes Bugzilla #10384

tcp_simple_retransmit does L increment without any checking
whatsoever for overflowing S+L when Reno is in use.

The simplest scenario I can currently think of is rather
complex in practice (there might be some more straightforward
cases though). I.e., if mss is reduced during mtu probing, it
may end up marking everything lost, and if some duplicate ACKs
arrived prior to that, sacked_out will be non-zero as well,
leading to S+L > packets_out. tcp_clean_rtx_queue on the next
cumulative ACK or tcp_fastretrans_alert on the next duplicate
ACK will fix the S counter.

More straightforward (but questionable) solution would be to
just call tcp_reset_reno_sack() in tcp_simple_retransmit but
it would negatively impact the probe's retransmission, ie.,
the retransmissions would not occur if some duplicate ACKs
had arrived.

So I had to add reno sacked_out resetting to CA_Loss state
when the first cumulative ACK arrives (this stale sacked_out
might actually be the explanation for the reports of left_out
overflows in kernel prior to 2.6.23 and S+L overflow reports
of 2.6.24). However, this alone won't be enough to fix kernel
before 2.6.24 because it is building on top of the commit
1b6d427bb7 ([TCP]: Reduce sacked_out with reno when purging
write_queue) to keep the sacked_out from overflowing.
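
As a rough illustration of the S+L bookkeeping discussed above, the following userspace sketch models the three counters and the invariant S+L <= packets_out. The struct and function names are hypothetical stand-ins for this example, not the kernel's tcp_sock fields or helpers, and the sketch only assumes what the message states: marking everything lost while a stale Reno sacked_out is non-zero breaks the invariant, and resetting sacked_out restores it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the counters named in the message. */
struct tcp_counters_model {
    unsigned packets_out;   /* packets currently outstanding */
    unsigned sacked_out;    /* "S": with Reno, an estimate built from duplicate ACKs */
    unsigned lost_out;      /* "L": packets marked lost */
};

/* The invariant the message is about: S + L must not exceed packets_out. */
static bool counters_consistent(const struct tcp_counters_model *tp)
{
    return tp->sacked_out + tp->lost_out <= tp->packets_out;
}

/* Models an unchecked "mark everything lost" step that leaves S untouched. */
static void mark_all_lost(struct tcp_counters_model *tp)
{
    assert(tp != 0);
    tp->lost_out = tp->packets_out;
}

/* Models dropping the stale Reno estimate of S. */
static void reset_reno_sack(struct tcp_counters_model *tp)
{
    assert(tp != 0);
    tp->sacked_out = 0;
}
```

With packets_out = 10 and sacked_out = 3, calling mark_all_lost() yields S+L = 13 > 10, and reset_reno_sack() brings the counters back within bounds.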

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-07 22:33:07 -07:00
Ralf Baechle 9c5a3d729c [MIPS] Handle aliases in vmalloc correctly.
flush_cache_vmap / flush_cache_vunmap were calling flush_cache_all which -
having been deprecated - turned into a nop ...

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-04-07 22:31:04 +01:00
Linus Torvalds 950b0d2837 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: fix 64-bit asm NOPS for CONFIG_GENERIC_CPU
  x86: fix call to set_cyc2ns_scale() from time_cpufreq_notifier()
  revert "x86: tsc prevent time going backwards"
2008-04-07 13:14:37 -07:00
Rusty Russell 2557a933b7 virtio: remove overzealous BUG_ON.
The 'disable_cb' callback is designed as an optimization to tell the host
we don't need callbacks now.  As it is not reliable, the debug check is
overzealous: it can happen on two CPUs at the same time.  Document this.

Even if it were reliable, the virtio_net driver doesn't disable
callbacks on transmit so the START_USE/END_USE debugging reentrance
protection can be easily tripped even on UP.

Thanks to Balaji Rao for the bug report and testing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CC: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-07 13:14:22 -07:00
Suresh Siddha 871de93903 x86: fix 64-bit asm NOPS for CONFIG_GENERIC_CPU
ASM_NOPs for the 64-bit kernel with CONFIG_GENERIC_CPU are broken
with the recent x86 nops merge. They were using GENERIC_NOPS,
which will truncate the upper 32 bits of %rsi because of the missing
64-bit REX prefix.

For now, fall back ASM NOPS for generic cpu to K8 NOPS, similar
to the code before the wrong x86 nop merge.

This should resolve the crash seen by Ingo on a test-system:

BUG: unable to handle kernel paging request at 00000000d80d8ee8
IP: [<ffffffff802121af>] save_i387_ia32+0x61/0xd8
PGD b8e0067 PUD 51490067 PMD 0
Oops: 0000 [1] SMP
CPU 2
Modules linked in:
Pid: 3871, comm: distcc Not tainted 2.6.25-rc7-sched-devel.git-x86-latest.git #359
RIP: 0010:[<ffffffff802121af>]  [<ffffffff802121af>] save_i387_ia32+0x61/0xd8
RSP: 0000:ffff81003abd3cb8  EFLAGS: 00010246
RAX: ffff810082e93400 RBX: 00000000ffc37f84 RCX: ffff8100d80d8ee0
RDX: 0000000000000000 RSI: 00000000d80d8ee0 RDI: ffff810082e93400
RBP: 00000000ffc37fdc R08: 00000000ffc37f88 R09: 0000000000000008
R10: ffff81003abd2000 R11: 0000000000000000 R12: ffff810082e93400
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff81011fb12dc0(0063) knlGS:00000000f7f1a6c0
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000d80d8ee8 CR3: 0000000076922000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process distcc (pid: 3871, threadinfo ffff81003abd2000, task ffff8100d80d8ee0)
Stack:  ffff8100bb670380 ffffffff8026de50 0000000000000118 0000000000000002
 0000000000000002 ffff81003abd3e68 ffff81003abd3ed8 ffff81003abd3de8
 ffff81003abd3d18 ffffffff80229785 ffff8100d80d8ee0 ffff810001041280
Call Trace:
 [<ffffffff8026de50>] ? __generic_file_aio_write_nolock+0x343/0x377
 [<ffffffff80229785>] ? update_curr+0x54/0x64
 [<ffffffff80227cd3>] ? ia32_setup_sigcontext+0x125/0x1d2
 [<ffffffff8022839f>] ? ia32_setup_frame+0x73/0x1a5
 [<ffffffff8020b2a5>] ? do_notify_resume+0x1aa/0x7db
 [<ffffffff8024ae8c>] ? getnstimeofday+0x31/0x85
 [<ffffffff80249858>] ? ktime_get_ts+0x17/0x48
 [<ffffffff80249933>] ? ktime_get+0xc/0x41
 [<ffffffff8024973e>] ? hrtimer_nanosleep+0x75/0xd5
 [<ffffffff80249261>] ? hrtimer_wakeup+0x0/0x21
 [<ffffffff8020bfbc>] ? int_signal+0x12/0x17
 [<ffffffff8030e6b3>] ? dummy_file_free_security+0x0/0x1

Code: a6 08 05 00 00 f6 40 14 01 74 34 4c 89 e7 48 0f ae 07 48 8b 86 08 05 00 00 80 78 02 00 79 02 db e2 90 8d b4 26 00 00 00 00 89 f6 <48> 8b 46 08 83 60 14 fe 0f 20 c0 48 83 c8 08 0f 22 c0 eb 07 c6 
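
The register dump above is consistent with the truncation described in the message: the task pointer is ffff8100d80d8ee0, RSI holds 00000000d80d8ee0, and the faulting address in CR2 is 00000000d80d8ee8. In the Code bytes, `89 f6` decodes as `mov %esi,%esi`, and the marked bytes `48 8b 46 08` as `mov 0x8(%rsi),%rax`. The sketch below models that arithmetic in plain C. The function names are made up for this example, and reading `89 f6` as the NOP that did the damage is an interpretation of the dump, not something the message states.

```c
#include <assert.h>
#include <stdint.h>

/*
 * In 64-bit mode, writing a 32-bit register zero-extends into the full
 * 64-bit register. A "mov %esi,%esi" without a REX.W prefix therefore
 * clears the upper half of %rsi instead of doing nothing.
 */
static uint64_t mov_esi_esi(uint64_t rsi)
{
    return (uint64_t)(uint32_t)rsi;
}

/* Effective address of a load from disp(%rsi). */
static uint64_t effective_address(uint64_t rsi, uint64_t disp)
{
    assert(disp <= UINT64_MAX - rsi);   /* keep the model free of wraparound */
    return rsi + disp;
}
```

Feeding in ffff8100d80d8ee0 gives 00000000d80d8ee0 after the 32-bit move, and adding the displacement 8 gives 00000000d80d8ee8, which matches RSI and CR2 in the oops.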

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-07 21:09:14 +02:00
James Bottomley 79bc14813c [SCSI] libsas: fix missing inlines in header file
Two functions in include/scsi/sas_ata.h don't have static inlines
leading to problems if they're built in:

On Thu, 2008-04-03 at 14:06 +0200, Toralf Förster wrote:
> drivers/scsi/mvsas.o: In function `sas_ata_init_host_and_port':
> mvsas.c:(.text+0x0): multiple definition of `sas_ata_init_host_and_port'
> drivers/scsi/libsas/built-in.o:(.text+0x37f4): first defined here
> drivers/scsi/mvsas.o: In function `sas_ata_task_abort':
> mvsas.c:(.text+0x7): multiple definition of `sas_ata_task_abort'
> drivers/scsi/libsas/built-in.o:(.text+0x37fb): first defined here
> make[2]: *** [drivers/scsi/built-in.o] Error 1
> make[1]: *** [drivers/scsi] Error 2
> make: *** [drivers] Error 2

Add the correct static inline modifiers.

Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:10 -05:00
James Bottomley 2f3edc6936 [SCSI] transport_class: BUG if we can't release the attribute container
Every current transport class calls transport_container_release but
ignores the return value.  This is catastrophic if it returns an error
because the containers are part of a global list and the next action of
almost every transport class is to free the memory used by the
container.

Fix this by making transport_container_release a void, but making it BUG
if attribute_container_release returns an error ... this catches the
root cause of a system panic much earlier.  If we don't do this, we get
an eventual BUG when the attribute container list notices the corruption
caused by the freed memory it's still referencing.

Also made attribute_container_release __must_check as a reminder.

Cc: Greg KH <greg@kroah.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:10 -05:00
FUJITA Tomonori 3bc6a26192 [SCSI] add scsi_build_sense_buffer helper function
This adds scsi_build_sense_buffer, a simple helper function to build
sense data in a buffer.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:01 -05:00
James Bottomley 1c353f7d61 [SCSI] export command allocation and freeing functions independently of the host
This is needed by things like USB storage that want to set up static
commands for later use at start of day.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:18:57 -05:00
FUJITA Tomonori 9ac16b616a [SCSI] scsi: add wrapper functions for sg buffer copy helper functions
LLDs need to copy data between the SG table in struct scsi_cmnd and
a linear buffer. So they use the helper functions like

sg_copy_from_buffer(scsi_sglist(sc), scsi_sg_count(sc), buf, buflen)
sg_copy_to_buffer(scsi_sglist(sc), scsi_sg_count(sc), buf, buflen)

This patch just adds wrapper functions:

scsi_sg_copy_from_buffer(sc, buf, buflen)
scsi_sg_copy_to_buffer(sc, buf, buflen)

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:45 -05:00
FUJITA Tomonori b1adaf65ba [SCSI] block: add sg buffer copy helper functions
This patch adds three new helper functions to copy data between an SG
list and a linear buffer.

- sg_copy_from_buffer copies data from linear buffer to an SG list

- sg_copy_to_buffer copies data from an SG list to a linear buffer

When the APIs copy data from a linear buffer to an SG list,
flush_kernel_dcache_page is called. It's not necessary for everyone
but it's a no-op on most architectures and in general the API is not
used in performance critical path.
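
To make the copy direction of the two helpers concrete, here is a userspace sketch that treats a scatter-gather list as an array of (pointer, length) segments. The `struct seg` type and the `seg_copy_*` names are hypothetical and exist only for this example. The kernel helpers operate on struct scatterlist and pages, and this model leaves out page mapping and the flush_kernel_dcache_page call mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatter-gather segment. */
struct seg {
    unsigned char *ptr;
    size_t len;
};

/* Linear buffer -> segments, filled in order. Returns bytes copied. */
static size_t seg_copy_from_buffer(struct seg *sgl, size_t nents,
                                   const void *buf, size_t buflen)
{
    const unsigned char *src = buf;
    size_t done = 0;

    assert(sgl != NULL && buf != NULL);
    for (size_t i = 0; i < nents && done < buflen; i++) {
        size_t room = buflen - done;
        size_t n = sgl[i].len < room ? sgl[i].len : room;

        memcpy(sgl[i].ptr, src + done, n);
        done += n;
    }
    return done;
}

/* Segments -> linear buffer, read in order. Returns bytes copied. */
static size_t seg_copy_to_buffer(const struct seg *sgl, size_t nents,
                                 void *buf, size_t buflen)
{
    unsigned char *dst = buf;
    size_t done = 0;

    assert(sgl != NULL && buf != NULL);
    for (size_t i = 0; i < nents && done < buflen; i++) {
        size_t room = buflen - done;
        size_t n = sgl[i].len < room ? sgl[i].len : room;

        memcpy(dst + done, sgl[i].ptr, n);
        done += n;
    }
    return done;
}
```

Both functions stop at whichever runs out first, the segment list or the linear buffer, so a short buffer simply results in a short copy.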

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:45 -05:00
Mike Christie 30bd7df8ce [SCSI] scsi_error: add target reset handler
The problem is that several drivers are sending a target reset from the
device reset handler, and if we have multiple devices a target reset gets
sent for each device when only one would be sufficient. And if we do a target
reset it affects all the commands on the target, so the device reset handler
code only cleaning up one device's commands makes programming the driver a
little more difficult than it should be.

This patch adds a target reset handler, which drivers can use to send
a target reset. If successful it cleans up the commands for the devices
accessed through that starget.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:41 -05:00
Kai Makisara 40f6b36c62 [SCSI] st: add option to use SILI in variable block reads
Add new option MT_ST_SILI to enable setting the SILI bit in reads in variable
block mode. If SILI is set, reading a block shorter than the byte count does
not result in CHECK CONDITION. The length of the block is determined using the
residual count from the HBA. Avoiding the REQUEST SENSE command for every
block speeds up some real applications considerably.

Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:39 -05:00
Darrick J. Wong 45e6cdf414 [SCSI] libsas: Provide a transport-level facility to request SAS addrs
Provide a facility to use the request_firmware() interface to get a SAS
address from userspace.  This can be used by SAS LLDDs that cannot
obtain the address from the host adapter.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:38 -05:00
YOSHIFUJI Hideaki 12802d058a [IPV6]: Comment MRT6_xxx sockopts in include/linux/in6.h.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:40 +09:00
YOSHIFUJI Hideaki 14fb64e1f4 [IPV6] MROUTE: Support PIM-SM (SSM).
Based on ancient patch by Mickael Hoerdt
<hoerdt@clarinet.u-strasbg.fr>, which is available at
<http://www-r2.u-strasbg.fr/~hoerdt/dev/linux_ipv6_mforwarding/patch-linux-ipv6-mforwarding-0.1a>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:39 +09:00
YOSHIFUJI Hideaki 7bc570c8b4 [IPV6] MROUTE: Support multicast forwarding.
Based on ancient patch by Mickael Hoerdt
<hoerdt@clarinet.u-strasbg.fr>, which is available at
<http://www-r2.u-strasbg.fr/~hoerdt/dev/linux_ipv6_mforwarding/patch-linux-ipv6-mforwarding-0.1a>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:38 +09:00
Linus Torvalds 6fdf5e67fe Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/upstream-linus
* 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/upstream-linus:
  [MIPS] Make KGDB compile on UP
  [MIPS] Pb1200: Fix header breakage
2008-04-04 15:09:44 -07:00
Paul Menage 8bab8dded6 cgroups: add cgroup support for enabling controllers at boot time
The effects of cgroup_disable=foo are:

- foo isn't auto-mounted if you mount all cgroups in a single hierarchy
- foo isn't visible as an individually mountable subsystem

As a result there will only ever be one call to foo->create(), at init time;
all processes will stay in this group, and the group will never be mounted on
a visible hierarchy.  Any additional effects (e.g.  not allocating metadata)
are up to the foo subsystem.
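
The two effects listed above can be modelled with a per-subsystem "disabled" flag that both the mount-everything path and the mount-by-name path consult. The sketch below is a userspace model under that assumption. The table contents and function names are hypothetical, and it handles a single subsystem name, as in the cgroup_disable=foo form used in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subsystem table; the names are for illustration only. */
struct subsys_model {
    const char *name;
    bool disabled;
};

static struct subsys_model subsystems[] = {
    { "cpu", false },
    { "memory", false },
};

#define NSUBSYS (sizeof subsystems / sizeof subsystems[0])

/* Models handling of "cgroup_disable=<name>"; returns false for unknown names. */
static bool disable_subsys(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < NSUBSYS; i++) {
        if (strcmp(subsystems[i].name, name) == 0) {
            subsystems[i].disabled = true;
            return true;
        }
    }
    return false;
}

/* Number of subsystems a "mount all in one hierarchy" request would bind. */
static size_t count_automounted(void)
{
    size_t n = 0;

    for (size_t i = 0; i < NSUBSYS; i++)
        if (!subsystems[i].disabled)
            n++;
    return n;
}

/* Whether a subsystem is still visible for mounting on its own. */
static bool individually_mountable(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < NSUBSYS; i++)
        if (strcmp(subsystems[i].name, name) == 0)
            return !subsystems[i].disabled;
    return false;
}
```

After disable_subsys("memory"), the model reports one auto-mounted subsystem instead of two, and "memory" is no longer individually mountable.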

This doesn't handle early_init subsystems (their "disabled" bit isn't set),
but it could easily be extended to do so if any of the early_init systems
wanted it - I think it would just involve some nastier parameter processing
since it would occur before the command-line argument parser had been run.

Hugh said:

  Ballpark figures, I'm trying to get this question out rather than
  processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
  to the affected paths, booting with cgroup_disable=memory cuts that back to
  1% overhead (due to slightly bigger struct page).

  I'm no expert on distros, they may have no interest whatever in
  CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
  without it, or apply the cgroup_disable=memory patches.

Unix bench's execl test result on x86_64 was

== just after boot without mounting any cgroup fs.==
mem_cgroup=off : Execl Throughput       43.0     3150.1      732.6
mem_cgroup=on  : Execl Throughput       43.0     2932.6      682.0
==

[lizf@cn.fujitsu.com: fix boot option parsing]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04 14:46:26 -07:00
Sergei Shtylyov 865ab87538 [MIPS] Pb1200: Fix header breakage
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-04-04 22:43:47 +01:00
Linus Torvalds 3a143125dd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: revert assign IRQs to hpet timer
  x86: tsc prevent time going backwards
  xen: Clear PG_pinned in release_{pt,pd}()
  xen: Do not pin/unpin PMD pages
  xen: refactor xen_{alloc,release}_{pt,pd}()
  x86, agpgart: scary messages are fortunately obsolete
  xen: fix grant table bug
  x86: fix breakage of vSMP irq operations
  x86: print message if nmi_watchdog=2 cannot be enabled
  x86: fix nmi_watchdog=2 on Pentium-D CPUs
2008-04-04 14:42:58 -07:00
Fenghua Yu a6c75b86ce [IA64] Kernel parameter for max number of concurrent global TLB purges
The patch defines kernel parameter "nptcg=". The parameter overrides max number
of concurrent global TLB purges which is reported from either PAL_VM_SUMMARY or
SAL PALO.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-04 11:06:38 -07:00
Fenghua Yu 2046b94e7c [IA64] Multiple outstanding ptc.g instruction support
According to SDM2.2, Itanium supports multiple outstanding ptc.g instructions.
But the current kernel function ia64_global_tlb_purge() uses a spinlock to serialize
ptc.g instructions issued by multiple processors. This serialization might cause a
scalability issue on a big SMP machine where many processors could purge the TLB
in parallel.

The patch fixes this problem by issuing multiple ptc.g instructions in
ia64_global_tlb_purge(). It also adds support for the "PALO" table to get
a platform view of the max number of outstanding ptc.g instructions (which
may be different from the processor view found from PAL_VM_SUMMARY).

PALO specification can be found at: http://www.dig64.org/home/DIG64_PALO_R1_0.pdf

spinaphore implementation by Matthew Wilcox.
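
For readers unfamiliar with the term, a "spinaphore" can be thought of as a counting semaphore whose waiters spin instead of sleeping, so that up to N callers may hold it at once. The sketch below uses C11 atomics to show that idea in userspace. It is not the implementation credited above, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counting lock: at most 'max' holders at any time. */
struct spinaphore_model {
    atomic_int free_slots;
};

static void spinaphore_init(struct spinaphore_model *s, int max)
{
    assert(max > 0);
    atomic_init(&s->free_slots, max);
}

/* Take one slot if any is free; never lets the count go below zero. */
static bool spinaphore_try_down(struct spinaphore_model *s)
{
    int cur = atomic_load(&s->free_slots);

    while (cur > 0) {
        /* On failure, 'cur' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&s->free_slots, &cur, cur - 1))
            return true;
    }
    return false;
}

/* Spin until a slot becomes free, then take it. */
static void spinaphore_down(struct spinaphore_model *s)
{
    while (!spinaphore_try_down(s))
        ;   /* busy-wait; a kernel version would relax the CPU here */
}

/* Give a slot back. */
static void spinaphore_up(struct spinaphore_model *s)
{
    atomic_fetch_add(&s->free_slots, 1);
}
```

Initialised with the maximum number of outstanding operations the platform allows, such a lock admits that many concurrent holders and makes further callers spin until one of them releases a slot.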

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-04 11:05:59 -07:00
Thomas Gleixner 5761d64b27 x86: revert assign IRQs to hpet timer
The commits:

commit 37a47db8d7
Author: Balaji Rao <balajirrao@gmail.com>
Date:   Wed Jan 30 13:30:03 2008 +0100

    x86: assign IRQs to HPET timers, fix

and

commit e3f37a54f6
Author: Balaji Rao <balajirrao@gmail.com>
Date:   Wed Jan 30 13:30:03 2008 +0100

    x86: assign IRQs to HPET timers

have been identified to cause a regression on some platforms due to
the assignment of legacy IRQs, which makes the legacy devices
connected to those IRQs dysfunctional.

Revert them.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10382

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04 18:36:49 +02:00
Ravikiran G Thirumalai bae1d2507e x86: fix breakage of vSMP irq operations
25-rc* stopped working with CONFIG_X86_VSMP on vSMP machines.

Looks like the vsmp irq ops got accidentally removed during merge of x86_64
pvops in 2.6.25. -- commit 6abcd98ffa removed
vsmp irq ops.

Tested with both CONFIG_X86_VSMP and without CONFIG_X86_VSMP, on vSMP and non
vSMP x86_64 machines.

Please apply.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04 18:36:46 +02:00
Tejun Heo e52dcc4899 libata: ATA_12/16 doesn't fall into ATAPI_MISC
SAT passthrus don't really fit into ATAPI_MISC class.  SAT passthru
commands always transfer multiple of 512 bytes and variable length
response is not allowed.  This patch creates a separate category -
ATAPI_PASS_THRU - for these.

This fixes HSM violation on "hdparm -I".
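
The classification change can be pictured as a switch on the SCSI opcode. The sketch below covers only the two pass-through opcodes and lumps everything else into the misc class. The enum and function names are hypothetical and the real atapi_cmd_type() distinguishes more classes. The opcode values are the standard ones for ATA PASS-THROUGH(12) and ATA PASS-THROUGH(16).

```c
#include <assert.h>

/* Hypothetical, reduced set of command classes for this example. */
enum atapi_class_model {
    MODEL_ATAPI_MISC,
    MODEL_ATAPI_PASS_THRU
};

#define OPCODE_ATA_16 0x85   /* ATA PASS-THROUGH(16) */
#define OPCODE_ATA_12 0xa1   /* ATA PASS-THROUGH(12) */

/* Classify a command by the first byte of its CDB. */
static enum atapi_class_model classify_cdb(const unsigned char *cdb)
{
    assert(cdb != 0);
    switch (cdb[0]) {
    case OPCODE_ATA_16:
    case OPCODE_ATA_12:
        return MODEL_ATAPI_PASS_THRU;
    default:
        return MODEL_ATAPI_MISC;
    }
}
```

A CDB starting with 0x85 or 0xa1 lands in the pass-through class, while, for example, INQUIRY (0x12) stays in the misc class.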

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04 02:43:36 -04:00
Tejun Heo 436d34b362 libata: uninline atapi_cmd_type()
Uninline atapi_cmd_type().  It doesn't really have to be inline, and
more cases will be added which need to access an unexported libata
variable.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04 02:43:35 -04:00
YOSHIFUJI Hideaki 80a9492a33 [IPV4] MROUTE: Adjust include files for user-space.
<linux/mroute.h> needs <linux/types.h>.
Avoid including <linux/in.h> in user-space, which conflicts with
standard <netinet/in.h>.
Add basic struct and constant in <linux/pim.h>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-04 10:44:42 +09:00
YOSHIFUJI Hideaki 2e8046271f [IPV4] MROUTE: Move PIM definitions to <linux/pim.h>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-04 10:44:42 +09:00
David S. Miller 3bb5da3837 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
2008-04-03 14:33:42 -07:00
Denis V. Lunev 046ee90235 [NETNS]: Create tcp control socket in each namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:31:33 -07:00
Denis V. Lunev 5677242f43 [NETNS]: Inet control socket should not hold a namespace.
This is a generic requirement, so make inet_ctl_sock_create namespace
aware and create an inet_ctl_sock_destroy wrapper around
sk_release_kernel.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:28:30 -07:00
Denis V. Lunev eee4fe4ded [INET]: Let inet_ctl_sock_create return sock rather than socket.
All upper protocol layers already use sock internally.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:27:58 -07:00
Denis V. Lunev 3d58b5fa8e [INET]: Rename inet_csk_ctl_sock_create to inet_ctl_sock_create.
This call has nothing in common with the INET connection sockets code. It
simply creates an unhashed kernel socket for protocol messages.

Move the new call into af_inet.c after the rename.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:22:32 -07:00
Denis V. Lunev a4aa834a91 [NETNS]: Declare init_net even without CONFIG_NET defined.
This does not look good, but there is no other choice. The compilation
without CONFIG_NET is broken and can not be fixed with ease.

After that there is no need for the following commits:
1567ca7eec
3edf8fa5cc
2d38f9a4f8

Revert them.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 13:04:33 -07:00
Xiantao Zhang 31a6b11fed [IA64] Implement smp_call_function_mask for ia64
This interface provides more flexible functionality for smp
infrastructure ... e.g. KVM frequently needs to operate on
a subset of cpus.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-03 11:39:43 -07:00
Xiantao Zhang 96651896b8 [IA64] Add API for allocating Dynamic TR resource.
Dynamic TR resources should be managed in a uniform way.
Add two interfaces for the kernel:
ia64_itr_entry: Allocate a (pair of) TR for the caller.
ia64_ptr_entry: Purge a (pair of) TR by the caller.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-03 11:02:58 -07:00
David S. Miller e1ec1b8ccd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/s2io.c
2008-04-02 22:35:23 -07:00
YOSHIFUJI Hideaki de357cc013 [IPV6] NDISC: Don't rely on node-type hint from L2 unless required.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:06:01 +09:00
YOSHIFUJI Hideaki 52eeeb8481 [IPV6]: Unify ip6_onlink() and ipip6_onlink().
Both are identical, let's create ipv6_chk_prefix() and use it
in both places.
2008-04-03 10:06:00 +09:00
YOSHIFUJI Hideaki 300aaeeaab [IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:06:00 +09:00
Templin, Fred L fadf6bf060 [IPV6] SIT: Add PRL management for ISATAP.
This patch updates the Linux Intra-Site Automatic Tunnel Addressing
Protocol (ISATAP) implementation. It places the ISATAP potential router
list (PRL) in the kernel and adds three new private ioctls for PRL
management.

[Add several changes of structure name, constant names etc. - yoshfuji]

Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:05:58 +09:00
Andrew Morton 06f11f37aa alpha: get_current(): don't add zero to current_thread_info()->task
A nasty compile error:

In file included from security/keys/internal.h:16,
                 from security/keys/sysctl.c:14:
include/linux/key-ui.h: In function 'key_permission':
include/linux/key-ui.h:51: error: invalid use of undefined type 'struct task_struct'

apparently the compiler has decided that it needs to know sizeof(task_struct)
so that it can add zero to a task_struct* (which is rather dumb of it).

Getting task_struct in scope in these deeply-nested headers is scary-looking,
so let's just remove the "+ 0".
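
The failure mode can be reproduced in a few lines: pointer arithmetic needs the size of the pointed-to type, so even adding zero to a pointer to an incomplete struct is rejected, while returning the pointer itself is fine. The sketch below is a standalone model. `thread_info_model` and `get_current_model` are hypothetical names, not the alpha headers' definitions.

```c
#include <assert.h>

struct task_struct;   /* declared but deliberately never defined here */

/* Hypothetical stand-in for the structure holding the task pointer. */
struct thread_info_model {
    struct task_struct *task;
};

static struct task_struct *get_current_model(struct thread_info_model *ti)
{
    assert(ti != 0);
    /*
     * "return ti->task + 0;" would not compile in this translation unit:
     * the compiler needs sizeof(struct task_struct) to scale the offset,
     * and the type is incomplete. Returning the pointer unchanged avoids
     * that requirement.
     */
    return ti->task;
}
```

Callers that only pass the pointer around never need the full definition of struct task_struct, which is why dropping the addition is enough to fix the build.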

Cc: David Howells <dhowells@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:20 -07:00
Ivan Kokshaysky c143d43aa3 alpha: fix ALSA DMA mmap crash
Make dma_alloc_coherent respect gfp flags (__GFP_COMP is one that
matters).

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:19 -07:00
Christian Borntraeger dd135ebbd2 kvm: provide kvm.h for all architecture: fixes headers_install
Currently include/linux/kvm.h is not considered by make headers_install,
because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h".  This problem
was introduced by

commit fb56dbb31c
Author: Avi Kivity <avi@qumranet.com>
Date:   Sun Dec 2 10:50:06 2007 +0200

    KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

    Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
    includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export kvm.h
    only if the arch actually supports it.

    Signed-off-by: Avi Kivity <avi@qumranet.com>

which makes this a 2.6.25 regression.

One way of solving the issue is to enhance Kbuild, but Avi and David convinced
me that changing headers_install is not the way to go.  This patch changes
the definition for linux/kvm.h to unifdef-y.

If unifdef-y is used for linux/kvm.h, "make headers_check" will fail on all
architectures without asm/kvm.h.  Therefore, this patch also provides
asm/kvm.h on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@qumranet.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:18 -07:00
Linus Torvalds 2f819ae881 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits)
  [VLAN]: Proc entry is not renamed when vlan device name changes.
  [IPV6]: Fix ICMP relookup error path dst leak
  [ATM] drivers/atm/iphase.c: compilation warning fix
  IPv6: do not create temporary adresses with too short preferred lifetime
  IPv6: only update the lifetime of the relevant temporary address
  bluetooth : __rfcomm_dlc_close lock fix
  bluetooth : use lockdep sub-classes for diffrent bluetooth protocol
  [ROSE/AX25] af_rose: rose_release() fix
  mac80211: correct use_short_preamble handling
  b43: Fix PCMCIA IRQ routing
  b43: Add DMA mapping failure messages
  mac80211: trigger ieee80211_sta_work after opening interface
  [LLC]: skb allocation size for responses
  [IP] UDP: Use SEQ_START_TOKEN.
  [NET]: Remove Documentation/networking/sk98lin.txt
  [ATM] atm/idt77252.c: Make 2 functions static
  [ATM]: Make atm/he.c:read_prom_byte() static
  [IPV6] MCAST: Ensure to check multicast listener(s).
  [LLC]: Kill llc_station_mac_sa symbol export.
  forcedeth: fix locking bug with netconsole
  ...
2008-04-02 07:46:18 -07:00
Fabio Checconi 34e6bbf23c cfq-iosched: fix rcu freeing of cfq io contexts
SLAB_DESTROY_BY_RCU is not a direct substitute for normal call_rcu()
freeing, since it'll defer page freeing but NOT object freeing. So change
cfq to do the freeing on its own.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-02 15:42:20 +02:00
Denis V. Lunev c0f39322c3 [NETNS]: Do not include net/net_namespace.h from seq_file.h
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-02 00:10:28 -07:00
Denis V. Lunev 225c0a0107 [NETNS]: Merge ifdef CONFIG_NET in include/net/net_namespace.h.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-02 00:09:29 -07:00
Linus Torvalds 10027471a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6.25
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6.25:
  sh: Fix up uImage compression type
  remove include/asm-sh/floppy.h
  sh: Fix TIF_USEDFPU clearing under FPU emulation.
  sh: Fix occasional FPU register corruption under preempt.
2008-04-01 11:45:48 -07:00
Linus Torvalds 61434392f7 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/upstream-linus
* 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/upstream-linus:
  [MIPS] XSS1500: Fix compilation
  [MIPS] Bigsur: make defconfig more useful.
  [MIPS] Alchemy: work around clock misdetection on early Au1000
  [MIPS] Add missing 4KEC TLB refill handler
  [MIPS] BCM1480: Fix PCI/HT IO access
  [MIPS] Fix the installation condition of MIPS clocksource
  [MIPS] Check for GCC r10k-cache-barrier support
  [MIPS] I8253: Export i8253_lock to modules.
  [MIPS] VPE loader: Check result of memory allocation.
2008-04-01 11:31:31 -07:00
Sergei Shtylyov 758e285fac [MIPS] Alchemy: work around clock misdetection on early Au1000
Work around the CPU clock miscalculation on Au1000DA/HA/HB due to the
sys_cpupll register being write-only, i.e. actually do what the comment
before cal_r4off() function advertised for years but the code failed at.
This is achieved by just giving the user a chance to define the clock
explicitly in the board config via the CONFIG_SOC_AU1000_FREQUENCY option,
defaulting to 396 MHz if the option is not given...

The patch is based on AMD's big unpublished patch; the issue seems to
be an undocumented erratum (or feature :-)...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-04-01 15:46:34 +01:00
Dmitry Torokhov a7097ff89c Input: make sure input interfaces pin parent input devices
A recent driver core change causes references to parent devices to be
dropped early, at device_del() time, as opposed to when all children
are freed. This causes an oops in evdev with grabbed devices. Take the
reference to the parent input device ourselves to ensure that it
stays around long enough.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-04-01 00:22:53 -04:00
Joonwoo Park f83f1768f8 [LLC]: skb allocation size for responses
Allocate the skb for llc responses with the received packet size by
using the size adjustable llc_frame_alloc.
Don't allocate useless extra payload.
Cleanup magic numbers.

This fixes an oops.
Reported by Jim Westfall:
kernel: skb_over_panic: text:c0541fc7 len:1000 put:997 head:c166ac00 data:c166ac2f tail:0xc166b017 end:0xc166ac80 dev:eth0
kernel: ------------[ cut here ]------------
kernel: kernel BUG at net/core/skbuff.c:95!

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 21:02:47 -07:00
Pavel Emelyanov 70ee115942 [SOCK][NETNS]: Add the percpu prot_inuse counter in the struct net.
Such an accounting would cost us two more dereferences to get the
percpu variable from the struct net, so I make sock_prot_inuse_get
and _add calls work differently depending on CONFIG_NET_NS - without
it old optimized routines are used.

The per-cpu counter for init_net is prepared in core_initcall, so
that even af_inet, that starts as fs_initcall, will already have the
init_net prepared.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:42:16 -07:00
Pavel Emelyanov c29a0bc4df [SOCK][NETNS]: Add a struct net argument to sock_prot_inuse_add and _get.
This counter is about to become per-proto-and-per-net, so we'll need 
two arguments to determine which cell in this "table" to work with.

All the places except proc already pass the proper net to it; proc will be
tuned a bit later.

Some indentation with spaces in proc files is done to keep the file
coding style consistent.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:41:46 -07:00
Pavel Emelyanov 8efa6e93cb [NETNS]: Introduce a netns_core structure.
There's already some stuff on the struct net, that should better
be folded into netns_core structure. I'm making the per-proto inuse 
counter be per-net also, which is also a candidate for this, so 
introduce this structure and populate it a bit.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:41:14 -07:00
Benjamin Marzinski 58e9fee13e [GFS2] Invalidate cache at correct point
GFS2 wasn't invalidating its cache before it called into the lock manager
with a request that could potentially drop a lock.  This was leaving a
window where the lock could actually be held by another node, but the
file's page cache would still appear valid, causing coherency problems.
This patch moves the cache invalidation to before the lock manager call
when dropping a lock. It also adds the option to the lock_dlm lock
manager to not use conversion mode deadlock avoidance, which, on a
conversion from shared to exclusive, could internally drop the lock, and
then reacquire it. GFS2 now asks lock_dlm to not do this.  Instead, GFS2
manually drops the lock and reacquires it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:44 +01:00
David S. Miller 3edf8fa5cc [NET]: Fix allnoconfig build on powerpc and avr32
As reported by Haavard Skinnemoen and Stephen Rothwell:

> allnoconfig fails with
>
> include/linux/netdevice.h:843: error: implicit declaration of function 'dev_net'
>
> which seems to be because the definition of dev_net is inside #ifdef
> CONFIG_NET, while next_net_device, which calls it, is not.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 00:28:14 -07:00
Adrian Bunk 7d7f7c3ed2 remove include/asm-sh/floppy.h
This patch removes the unused include/asm-sh/floppy.h
(ARCH_MAY_HAVE_PC_FDC was not enabled).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-03-31 16:16:02 +09:00
Linus Torvalds a77df5cd1c Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: ATA_EHI_LPM should be ATA_EH_LPM
  pata_sil680: only enable MMIO on Cell blades
2008-03-30 14:26:27 -07:00
Linus Torvalds 62ad36a8a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide: fix defining SUPPORT_VLB_SYNC
  Revert "ide: change master/slave IDENTIFY order"
2008-03-30 14:24:32 -07:00
Al Viro b2ddb9019e dma_page_list ->base_address is a userland pointer
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:20:23 -07:00
Al Viro 7d61c4596d compat_sys_wait4() prototype misannotation
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:20:23 -07:00
Al Viro 7c43f2b888 NULL noise: frv cmpxchg()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:20:23 -07:00
Bartlomiej Zolnierkiewicz 729d4de96a ide: fix defining SUPPORT_VLB_SYNC
We need to check for CONFIG_{CRIS,FRV} not {CRIS,FRV}.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-03-29 19:55:17 +01:00
Tejun Heo 3ec25ebd69 libata: ATA_EHI_LPM should be ATA_EH_LPM
EH actions are ATA_EH_* not ATA_EHI_*.  Rename ATA_EHI_LPM to
ATA_EH_LPM.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-29 12:21:31 -04:00
David S. Miller 17eed24953 Merge branch 'upstream-net26' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
2008-03-28 19:48:26 -07:00
S.Caglar Onur 9307b570a7 drivers/net/arcnet/arcnet.c: use time_* macros
The functions time_before, time_before_eq, time_after, and time_after_eq are
more robust for comparing jiffies against other values.

So use the time_after() macro, defined in linux/jiffies.h, which deals with
wrapping correctly.
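
The wraparound problem can be illustrated with a small user-space sketch. It is not the kernel's macro: `tick_t`, `naive_after()` and `wrap_safe_after()` are hypothetical names, and a 32-bit counter is assumed in place of jiffies. The idea is the same as time_after(): subtract in unsigned arithmetic and look at the sign of the difference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit tick counter standing in for jiffies. */
typedef uint32_t tick_t;

/* Naive comparison: gives the wrong answer once the counter has wrapped. */
static bool naive_after(tick_t a, tick_t b)
{
	return a > b;
}

/*
 * Wrap-safe "a is after b" in the spirit of time_after(a, b):
 * the unsigned subtraction wraps modulo 2^32, and the result is then
 * interpreted as a signed value. This assumes the usual two's-complement
 * conversion and that a and b are less than 2^31 ticks apart.
 */
static bool wrap_safe_after(tick_t a, tick_t b)
{
	return (int32_t)(b - a) < 0;
}
```

With a deadline of 0xFFFFFFF0 and a current tick of 5 (the counter has wrapped), `naive_after(5, 0xFFFFFFF0)` reports that the deadline has not passed, while `wrap_safe_after(5, 0xFFFFFFF0)` reports that it has.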

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-28 22:14:15 -04:00
Denis V. Lunev 4ad96d39a2 [UDP]: Remove owner from udp_seq_afinfo.
Move it to udp_seq_afinfo->seq_fops as should be.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:25:53 -07:00
Denis V. Lunev 3ba9441bdf [UDP]: Place file operations directly into udp_seq_afinfo.
No need to have separate never-used variable.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:25:32 -07:00
Denis V. Lunev dda61925f8 [UDP]: Move seq_ops from udp_iter_state to udp_seq_afinfo.
No need to create seq_operations for each instance of 'netstat'.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:24:26 -07:00
Denis V. Lunev 6f191efe48 [UDP]: Replace struct net on udp_iter_state with seq_net_private.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:23:33 -07:00
Pavel Emelyanov 095d911201 [LIB]: Drop the pcounter itself.
The knock-out. The pcounter abstraction is not used any longer in the
kernel.

Not sure whether this should go via netdev tree, but as far as I
remember it was added via this one, and besides Eric thinks that
Andrew shouldn't mind this.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:58 -07:00
Pavel Emelyanov bdcde3d71a [SOCK]: Drop inuse pcounter from struct proto (v2).
An uppercut - do not use the pcounter on struct proto.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:33 -07:00
Pavel Emelyanov 60e7663d46 [SOCK]: Drop per-proto inuse init and free functions (v2).
Constructive part of the set is finished here. We have to remove the
pcounter, so start with its init and free functions.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:10 -07:00
Pavel Emelyanov 1338d466d9 [SOCK]: Introduce a percpu inuse counters array (v2).
And redirect sock_prot_inuse_add and _get to use one.

As far as the dereferences are concerned: before the patch we made
1 dereference for the proto->inuse.add call, the call itself, and then
called __get_cpu_var() on a static variable. After the patch we
make a direct call, then one dereference to proto->inuse_idx, and
then the same __get_cpu_var() on a still static variable. So this
patch doesn't seem to produce a performance penalty on SMP.

This is not per-net yet, but I will deliberately make NET_NS=y case
separated from NET_NS=n one, since it'll cost us one-or-two more 
dereferences to get the struct net and the inuse counter.
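
A rough user-space sketch of the per-cpu array idea follows. It is only an illustration: `SIM_CPUS`, `SIM_PROTOS`, `inuse_add()` and `inuse_get()` are hypothetical names, the CPU is passed explicitly instead of being obtained via __get_cpu_var(), and no preemption or concurrency handling is modelled.

```c
#include <assert.h>

#define SIM_CPUS   4	/* assumed number of CPUs in this sketch */
#define SIM_PROTOS 64	/* assumed maximum number of registered protos */

/* One row of counters per CPU, indexed by the proto's inuse index. */
static int inuse[SIM_CPUS][SIM_PROTOS];

/* Adjust the counter for proto index idx on the given CPU only. */
static void inuse_add(unsigned int cpu, unsigned int idx, int val)
{
	assert(cpu < SIM_CPUS && idx < SIM_PROTOS);
	inuse[cpu][idx] += val;
}

/* The reported value is the sum of that proto's counter over all CPUs. */
static int inuse_get(unsigned int idx)
{
	int sum = 0;

	assert(idx < SIM_PROTOS);
	for (unsigned int cpu = 0; cpu < SIM_CPUS; cpu++)
		sum += inuse[cpu][idx];
	return sum;
}
```

An individual per-cpu cell may go negative (a socket created on one CPU and released on another); only the sum is meaningful.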

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:38:43 -07:00
Pavel Emelyanov 13ff3d6fa4 [SOCK]: Enumerate struct proto-s to facilitate percpu inuse accounting (v2).
The inuse counters are going to become a per-cpu array.  Introduce an
index for this array on the struct proto.

To handle the case of a proto register-unregister-register loop, a
bitmap is used. All its bit manipulations are protected with
proto_list_lock, and a sanity check for the bitmap being exhausted is
also added.
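
The index bookkeeping can be sketched in user space as follows. This is a minimal illustration, not the patch itself: `PROTO_SLOTS`, `slot_alloc()` and `slot_free()` are hypothetical names, the slot count of 64 is an assumption, and the locking that the real code does with proto_list_lock is left out (the sketch is single-threaded).

```c
#include <assert.h>
#include <limits.h>

#define PROTO_SLOTS   64u	/* assumed capacity of the index space */
#define BITS_PER_WORD ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define SLOT_WORDS    ((PROTO_SLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per index; a set bit means the index is taken. */
static unsigned long slot_map[SLOT_WORDS];

/* Claim the lowest free index, or return -1 when the bitmap is exhausted. */
static int slot_alloc(void)
{
	for (unsigned int i = 0; i < PROTO_SLOTS; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (!(slot_map[i / BITS_PER_WORD] & mask)) {
			slot_map[i / BITS_PER_WORD] |= mask;
			return (int)i;
		}
	}
	return -1;
}

/* Release an index so that a later registration can reuse it. */
static void slot_free(int idx)
{
	unsigned int i = (unsigned int)idx;

	assert(idx >= 0 && i < PROTO_SLOTS);
	slot_map[i / BITS_PER_WORD] &= ~(1UL << (i % BITS_PER_WORD));
}
```

Because freed indices are reused, a register-unregister-register loop does not consume the index space.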

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:38:17 -07:00
Joe Perches bc578a54f0 [NET]: Rename inet_frag.h identifiers COMPLETE, FIRST_IN, LAST_IN to INET_FRAG_*
On Fri, 2008-03-28 at 03:24 -0700, Andrew Morton wrote:
> they should all be renamed.

Done for include/net and net

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:35:27 -07:00
Matti Linnanvuori 0ef4730927 net: Comment dev_kfree_skb_irq and dev_kfree_skb_any better
Comment dev_kfree_skb_irq and dev_kfree_skb_any better.

Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:33:00 -07:00
Joonwoo Park a5a04819c5 [LLC]: station source mac address
kill unnecessary llc_station_mac_sa.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:28:36 -07:00
Rami Rosen be2ce06b49 [IPV6]: Remove unused method declaration in include/net/addrconf.h.
This patch removes the unused declaration of the addrconf_forwarding_on()
method in include/net/addrconf.h.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:26:45 -07:00
David S. Miller 1567ca7eec [NET]: Protect device namespace inlines with CONFIG_NET
Include sites should not be bothered by whether
CONFIG_NET is set or not when trying to include
benign files like linux/etherdevice.h et al.

From a report by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 15:53:11 -07:00
Linus Torvalds af8be4e4b3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] mnt_expire is protected by namespace_sem, no need for vfsmount_lock
  [PATCH] do shrink_submounts() for all fs types
  [PATCH] sanitize locking in mark_mounts_for_expiry() and shrink_submounts()
  [PATCH] count ghost references to vfsmounts
  [PATCH] reduce stack footprint in namespace.c
2008-03-28 15:23:01 -07:00
Christoph Hellwig 5ac7ec85bc ext3: don't export ext3_fs.h and jbd.h
Neither of the headers actually compiles when included from userspace, nor
should it be made available, as userspace tools should be using the libraries
or at least headers from e2fsprogs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:22 -07:00
Harvey Harrison 3afe392598 kernel: add bit rotation helpers for 16 and 8 bit
Will replace open-coded variants elsewhere.  Done in the same
style as the 32-bit versions.
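
For illustration, 16-bit and 8-bit rotations can be written in portable C as below. This is a sketch rather than the code added by the patch: it uses `uint16_t`/`uint8_t` instead of kernel types, and it masks the shift count (an addition made here) so that a shift of 0 or of the full width stays well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by shift bits (shift taken modulo 16). */
static inline uint16_t rol16(uint16_t word, unsigned int shift)
{
	shift &= 15;
	return (uint16_t)((word << shift) | (word >> ((16 - shift) & 15)));
}

/* Rotate a 16-bit value right by shift bits (shift taken modulo 16). */
static inline uint16_t ror16(uint16_t word, unsigned int shift)
{
	shift &= 15;
	return (uint16_t)((word >> shift) | (word << ((16 - shift) & 15)));
}

/* Rotate an 8-bit value left by shift bits (shift taken modulo 8). */
static inline uint8_t rol8(uint8_t word, unsigned int shift)
{
	shift &= 7;
	return (uint8_t)((word << shift) | (word >> ((8 - shift) & 7)));
}

/* Rotate an 8-bit value right by shift bits (shift taken modulo 8). */
static inline uint8_t ror8(uint8_t word, unsigned int shift)
{
	shift &= 7;
	return (uint8_t)((word >> shift) | (word << ((8 - shift) & 7)));
}
```

For example, `rol16(0x8001, 1)` yields 0x0003 and `ror8(0x03, 1)` yields 0x81: the bit shifted out at one end reappears at the other.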

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Joe Perches <joe@perches.com>
Cc: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:22 -07:00
Jonathan Corbet 8c703d35fa in_atomic(): document why it is unsuitable for general use
Discourage people from inappropriately using in_atomic()

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:21 -07:00
David S. Miller 8e8e43843b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/usb/rndis_host.c
	drivers/net/wireless/b43/dma.c
	net/ipv6/ndisc.c
2008-03-27 18:48:56 -07:00
David S. Miller ed85f2c3b2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-27 18:01:13 -07:00
Ilpo Järvinen bc09dff198 [SCTP]: Remove sctp_add_cmd_sf wrapper bloat
With a vast number of callsites, the sctp_add_cmd_sf wrapper bloats
the kernel by some amount. With allyesconfig, due to unlikely tracking,
the initial result was around ~7kB (which caught my
attention), while a non-debug config produced only a ~2.3kB effect.

I (ij) proposed first a patch to uninline it but Vlad responded
with a patch that removed the only sctp_add_cmd call which is
wrapped by sctp_add_cmd_sf (I wasn't sure if I could do that).
I did minor cleanup to Vlad's patch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:54:29 -07:00
Ilpo Järvinen 419ae74ecc [NET]: uninline skb_trim, de-bloats
Allyesconfig (v2.6.24-mm1):
-10976  209 funcs, 123 +, 11099 -, diff: -10976 --- skb_trim

Without a number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7360  192 funcs, 131 +, 7491 -, diff: -7360 --- skb_trim
skb_trim                      |  +42

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:54:01 -07:00
Ilpo Järvinen 8d3308687f [NET]: uninline dst_release
Codiff stats (allyesconfig, v2.6.24-mm1):
-16420  187 funcs, 103 +, 16523 -, diff: -16420 --- dst_release

Without a number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7257  186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release
dst_release                   |  +40

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:53:31 -07:00
Ilpo Järvinen c2aa270ad7 [NET]: uninline skb_push, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-21593  356 funcs, 2418 +, 24011 -, diff: -21593 --- skb_push

Without many debug related CONFIGs (v2.6.25-rc2-mm1):

-13890  341 funcs, 189 +, 14079 -, diff: -13890 --- skb_push
skb_push                      |  +46

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:52:40 -07:00
Ilpo Järvinen f58518e678 [NET]: uninline dev_alloc_skb, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-23668  392 funcs, 104 +, 23772 -, diff: -23668 --- dev_alloc_skb

Without many debug CONFIGs (v2.6.25-rc2-mm1):

-12178  382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb
dev_alloc_skb                 |  +37

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:51:31 -07:00
Al Viro c35038beca [PATCH] do shrink_submounts() for all fs types
... and take it out of ->umount_begin() instances.  Call with all locks
already taken (by do_umount()) and leave calling release_mounts() to
caller (it will do release_mounts() anyway, so we can just put into
the same list).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:58 -04:00
Al Viro 7c4b93d826 [PATCH] count ghost references to vfsmounts
make propagate_mount_busy() exclude references from the vfsmounts
that had been isolated by umount_tree() and are just waiting for
release_mounts() to dispose of their ->mnt_parent/->mnt_mountpoint.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:46 -04:00
Ilpo Järvinen 6be8ac2fdc [NET]: uninline skb_pull, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-28162  354 funcs, 3005 +, 31167 -, diff: -28162 --- skb_pull

Without a number of debug related CONFIGs (v2.6.25-rc2-mm1):

-9697  338 funcs, 221 +, 9918 -, diff: -9697 --- skb_pull
skb_pull                      |  +44

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:47:24 -07:00
Ilpo Järvinen 0dde3e1648 [NET]: uninline skb_put, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

~500 files changed
...
 869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
  skb_put                       | +104

Without a number of debug related CONFIGs (v2.6.25-rc2-mm1):

-60744  855 funcs, 861 +, 61605 -, diff: -60744 --- skb_put
  skb_put                       |  +57

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:43:41 -07:00
David S. Miller 50fd4407b8 [NET]: Use local_irq_{save,restore}() in napi_complete().
Based upon a lockdep report.

Since ->poll() can be invoked from netpoll with interrupts
disabled, we must not unconditionally enable interrupts
in napi_complete().

Instead we must use local_irq_{save,restore}().

Noticed by Peter Zijlstra:

<irqs disabled>

  netpoll_poll()
    poll_napi()
      spin_trylock(&napi->poll_lock)
      poll_one_napi()
        napi->poll() := sky2_poll()
          napi_complete()
            local_irq_disable()
            local_irq_enable() <--- *BUG*

  <irq>
    irq_exit()
      do_softirq()
        net_rx_action()
          spin_lock(&napi->poll_lock) <--- Deadlock!

Because we still hold the lock....
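
The difference between the two patterns can be shown with a toy model. Everything here is simulated: `sim_irqs_enabled` stands in for the CPU interrupt flag, and the `sim_*` and `complete_*` functions are hypothetical user-space stand-ins, not the kernel's local_irq_*() primitives or napi_complete() itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated CPU interrupt-enable flag; not a real kernel interface. */
static bool sim_irqs_enabled = true;

static void sim_irq_disable(void) { sim_irqs_enabled = false; }
static void sim_irq_enable(void)  { sim_irqs_enabled = true; }

/* Remember the current state, then disable (models local_irq_save()). */
static bool sim_irq_save(void)
{
	bool flags = sim_irqs_enabled;

	sim_irqs_enabled = false;
	return flags;
}

/* Put back whatever state the caller had (models local_irq_restore()). */
static void sim_irq_restore(bool flags)
{
	sim_irqs_enabled = flags;
}

/* Old pattern: always leaves interrupts enabled on return. */
static void complete_unconditional(void)
{
	sim_irq_disable();
	/* ... critical section ... */
	sim_irq_enable();
}

/* Fixed pattern: returns with the caller's interrupt state intact. */
static void complete_save_restore(void)
{
	bool flags = sim_irq_save();

	/* ... critical section ... */
	sim_irq_restore(flags);
}
```

When the caller enters with interrupts disabled, as netpoll does in the trace above, `complete_unconditional()` returns with them enabled, while `complete_save_restore()` leaves them disabled.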

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:42:50 -07:00
Rami Rosen 4f95165d4b [IPV6]: Remove three unused method declarations in include/net/ipv6.h
This patch removes three unused method declarations in include/net/ipv6.h:
inet_getfrag_t(), ipv6_build_nfrag_opts() and ipv6_build_frag_opts().

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:39:19 -07:00
Rusty Russell a6bd8e1303 lguest: comment documentation update.
Took some cycles to re-read the Lguest Journey end-to-end, fix some
rot and tighten some phrases.

Only comments change.  No new jokes, but a couple of recycled old jokes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-28 11:05:54 +11:00
Denis V. Lunev 09382bac66 [PKT_SCHED]: Pass real namespace in net scheduler classifiers.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 16:53:37 -07:00
Denis V. Lunev 0e5f8be138 [NETNS]: Compile NET /proc support only if CONFIG_NET is set.
This fixes broken compilation for 'allnoconfig', which was
introduced by commit 1218854afa ("[NET]
NETNS: Omit seq_net_private->net without CONFIG_NET_NS.")

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 14:25:53 -07:00
Linus Torvalds fb8c7fb25d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  xen: fix UP setup of shared_info
  xen: fix RMW when unmasking events
  x86, documentation: nmi_watchdog=2 works on x86_64
  x86: stricter check in follow_huge_addr()
  rdc321x: GPIO routines bugfixes
  x86: ptrace.c: fix defined-but-unused warnings
  x86: fix prefetch workaround
2008-03-27 13:20:47 -07:00
Johannes Berg 6c507cd040 cfg80211: don't export ieee80211_get_channel
This patch makes ieee80211_get_channel a static inline defined in
cfg80211's header file which simply calls __ieee80211_get_channel
to avoid symbol clashes with the ieee80211 code.

The problem was pointed out by David Miller, thanks!

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-27 16:03:20 -04:00
Linus Torvalds 074fcab574 Merge branch 'avr32-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
* 'avr32-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
  avr32: Fix bug in early resource allocation code
  avr32: Build fix for CONFIG_BUG=n
  avr32: Work around byteswap bug in gcc < 4.2
2008-03-27 09:14:07 -07:00
Florian Fainelli b2ef749720 rdc321x: GPIO routines bugfixes
This patch fixes the use of GPIO routines which are in the PCI
configuration space of the RDC321x, therefore reading/writing
to this space without spinlock protection can be problematic.

We also now request and free GPIOs and support the MGB100
board, previous code was very AR525W-centric.

Signed-off-by: Volker Weiss <volker@tintuc.de>
Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27 16:08:45 +01:00
Linus Torvalds c94b4321eb Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI: drivers/acpi: elide a non-zero test on a result that is never 0
  pnpacpi: reduce printk severity for "pnpacpi: exceeded the max number of ..."
  cpuidle: fix 100% C0 statistics regression
  cpuidle: fix cpuidle time and usage overflow
  ACPI: fix mis-merge -- invoke acpi_unlazy_tlb() only on C3 entry
  ACPI: fix a regression of ACPI device driver autoloading
  ACPI: SBS: remove typo from sbchc.c
2008-03-27 08:03:22 -07:00
Len Brown 86d9fc1293 Merge branches 'release', 'idle', 'redhat-bugzilla-436589', 'sbs' and 'video' into release 2008-03-26 22:50:09 -04:00
Linus Torvalds ee20a0dd54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits)
  [IPSEC]: Fix BEET output
  [ICMP]: Dst entry leak in icmp_send host re-lookup code (v2).
  [AX25]: Remove obsolete references to BKL from TODO file.
  [NET]: Fix multicast device ioctl checks
  [IRDA]: Store irnet_socket termios properly.
  [UML]: uml-net: don't set IFF_ALLMULTI in set_multicast_list
  [VLAN]: Don't copy ALLMULTI/PROMISC flags from underlying device
  netxen, phy/marvell, skge: minor checkpatch fixes
  S2io: Handle TX completions on the same CPU as the sender for MIS-X interrupts
  b44: Truncate PHY address
  skge napi->poll() locking bug
  rndis_host: fix oops when query for OID_GEN_PHYSICAL_MEDIUM fails
  cxgb3: Fix lockdep problems with sge.reg_lock
  ehea: Fix IPv6 support
  dm9000: Support promisc and all-multi modes
  dm9601: configure MAC to drop invalid (crc/length) packets
  dm9601: add Hirose USB-100 device ID
  Marvell PHY m88e1111 driver fix
  netxen: fix rx dropped stats
  netxen: remove low level tx lock
  ...
2008-03-26 18:35:50 -07:00
Linus Torvalds d55a4528f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Define TASK_SIZE_OF()
  [SPARC64]: flush_ptrace_access() needs preemption disable.
  [SPARC64]: Update defconfig.
  [SPARC64]: Fix allnoconfig build, ptrace.c missing CONFIG_COMPAT checks.
  [SPARC64]: Fix __get_cpu_var in preemption-enabled area.
  [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/signal.c
  [SPARC64]: Fix most sparse warnings in arch/sparc64/kernel/sys_sparc.c
  [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/time.c
  [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/ptrace.c
  [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/irq.c
  [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/iommu.c
  [SPARC64]: Fix sparse errors in arch/sparc64/kernel/traps.c
  [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/{cpu,setup}.c
  [SPARC64]: Adjust {TLBTEMP,TSBMAP}_BASE.
  [SPARC64]: Make save_stack_trace() more efficient.
2008-03-26 18:35:22 -07:00
David S. Miller c101b088ba [SPARC64]: Define TASK_SIZE_OF()
This makes "cat /proc/${PID}/pagemap" more efficient for
32-bit tasks.

Based upon a report by Mariusz Kozlowski.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 17:32:33 -07:00
Benjamin Thery 60e8fbc4c5 [NETNS][IPV6] flowlabels - make flowlabels per namespace
This patch introduces a new member, fl_net, in struct ip6_flowlabel.
This allows labels with the same value to be created in different namespaces.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:53:08 -07:00
Daniel Lezcano 6ab57e7e7f [NETNS][IPV6] anycast - handle several network namespace
Make use of the network namespace information so that this protocol can
handle several network namespaces.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:52:32 -07:00
Herbert Xu 732c8bd590 [IPSEC]: Fix BEET output
The IPv6 BEET output function is incorrectly including the inner
header in the payload to be protected.  This causes a crash as
the packet doesn't actually have that many bytes for a second
header.

The IPv4 BEET output on the other hand is broken when it comes
to handling an inner IPv6 header since it always assumes an
inner IPv4 header.

This patch fixes both by making sure that neither BEET output
function touches the inner header at all.  All access is now
done through the protocol-independent cb structure.  Two new
attributes are added to make this work, the IP header length
and the IPv4 option length.  They're filled in by the inner
mode's output function.

Thanks to Joakim Koskela for finding this problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:51:09 -07:00
Pavel Emelyanov 67727184f2 [VLAN]: Reduce memory consumed by vlan_groups
Currently each vlan_group contains 8 pointers to arrays with 512
pointers to struct net_device each  :)  Such a construction "in many
cases ... wastes memory".

My proposal is to allow for some of these arrays pointers be NULL,
meaning that there are no devices in it. When a new device is added
to the vlan_group, the appropriate array is allocated.

The check in vlan_group_get_device() is safe, since the pointer
vg->vlan_devices_arrays[x] can only switch from NULL to not-NULL.
The vlan_group_prealloc_vid() is guarded with rtnl lock and is
also safe.
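
The lazy allocation scheme can be sketched in user space like this. The names are hypothetical (`struct group`, `group_get()`, `group_prealloc()`, `group_set()`), `calloc()` stands in for the kernel allocator, and the rtnl locking is not modelled; only the 8 x 512 layout is taken from the description above.

```c
#include <assert.h>
#include <stdlib.h>

#define GROUP_LEN   4096			/* number of possible VLAN ids */
#define SPLIT_PARTS 8				/* sub-arrays per group */
#define PER_PART    (GROUP_LEN / SPLIT_PARTS)	/* 512 pointers each */

struct dev;	/* opaque stand-in for struct net_device */

struct group {
	/* A NULL entry means no device has a vid in that range yet. */
	struct dev **parts[SPLIT_PARTS];
};

/* Lookup tolerates unallocated sub-arrays and simply reports "no device". */
static struct dev *group_get(const struct group *g, unsigned int vid)
{
	struct dev **part;

	assert(vid < GROUP_LEN);
	part = g->parts[vid / PER_PART];
	return part ? part[vid % PER_PART] : NULL;
}

/* Allocate the sub-array covering vid if needed; 0 on success, -1 on failure. */
static int group_prealloc(struct group *g, unsigned int vid)
{
	assert(vid < GROUP_LEN);
	if (g->parts[vid / PER_PART])
		return 0;
	g->parts[vid / PER_PART] = calloc(PER_PART, sizeof(struct dev *));
	return g->parts[vid / PER_PART] ? 0 : -1;
}

/* Store a device; the caller must have preallocated the sub-array. */
static void group_set(struct group *g, unsigned int vid, struct dev *d)
{
	assert(vid < GROUP_LEN && g->parts[vid / PER_PART]);
	g->parts[vid / PER_PART][vid % PER_PART] = d;
}
```

A group holding a single device, or up to 512 devices whose vids fall into one range, allocates one sub-array instead of eight.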

I've checked (I hope that) all the places, that use these arrays
and found, that the register_vlan_dev is the only place, that can
put a vlan device on an empty vlan_group.

Rough calculations show that after the patch a setup with a
single vlan dev (or up to 512 vlans with sequential vids) will
occupy approximately 8 times less memory.

The question I have is - does this patch make sense, or are totally
new structures required to store the vlan_devs?

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-03-26 16:27:22 -07:00
Suresh Siddha d546b67a94 x86: fix performance drop for glx
fix the 3D performance drop reported at:

   http://bugzilla.kernel.org/show_bug.cgi?id=10328

fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
WC attribute. Recent changes in page attribute code made both
ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
breaks the graphics performance, as the effective memory type is UC instead
of expected WC.

The correct way to fix this is to add ioremap_wc() (which uses UC- in the
absence of PAT kernel support and WC with PAT) and change all the
fb drivers to use this new ioremap_wc() API.

We can take this correct and longer route for post 2.6.25. For now,
revert back to the UC- behavior for ioremap/ioremap_nocache.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26 22:23:41 +01:00
Paul Mundt 138bed154e sh: Fix TIF_USEDFPU clearing under FPU emulation.
The unlazy_fpu() path calls in to save_fpu() if the task has
TIF_USEDFPU set. save_fpu() being the crap API that it is has the side
effect of clearing the flag itself, which presently doesn't happen
if we're using FPU emulation. Fix this up for now, pending an overhaul
in 2.6.26.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-03-26 19:09:21 +09:00
Paul Mundt 9bbafce2ee sh: Fix occasional FPU register corruption under preempt.
Presently with preempt enabled there's the possibility to be preempted
after the TIF_USEDFPU test and the register save, leading to bogus
state post-__switch_to(). Use an explicit preempt_disable()/enable()
pair around unlazy_fpu()/clear_fpu() to avoid this. Follows the x86
change.

Reported-by: Takuo Koguchi <takuo.koguchi.sw@hitachi.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-03-26 19:02:47 +09:00
Pavel Emelyanov 68528f0998 [NETNS][ICMP]: Make ctl tables for ICMP sysctls per-net.
Add some flesh to ipv4_sysctl_init_net and ipv4_sysctl_exit_net,
i.e. copy the table, alter .data pointers and register it per-net.

Other ipv4_table's sysctls are now global, but this is going to
change once the sysctl permissions patches migrate from the -mm tree
to mainline in the 2.6.26 merge window :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:56:24 -07:00
Pavel Emelyanov a24022e188 [NETNS][ICMP]: Move ICMP sysctls on struct net.
Initialization is moved to icmp_sk_init; all the places that
refer to them use init_net for now.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:55:37 -07:00
David S. Miller cf3d7c1ef4 [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/time.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:11:55 -07:00
Denis V. Lunev f5aa23fd49 [NETNS]: Compilation warnings under CONFIG_NET_NS.
Recent commits from YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
have introduced several compilation warnings
'assignment discards qualifiers from pointer target type'
due to an extra const modifier in the inline call parameters of
{dev|sock|twsk}_net_set.

Drop it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:48:17 -07:00
Denis V. Lunev 9c2f5746b9 [NETNS]: Compilation fix for include/linux/netdevice.h.
Commit c346dca108
([NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS)
breaks compilation with CONFIG_NET_NS set.

Fix the typo.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:47:14 -07:00
David S. Miller d91aa123b4 [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/irq.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:37:51 -07:00
Thomas Gleixner 06d8308c61 NOHZ: reevaluate idle sleep length after add_timer_on()
add_timer_on() can add a timer on a CPU which is currently in a long
idle sleep, but the timer wheel is not reevaluated by the nohz code on
that CPU. So a timer can be delayed for quite a long time. This
triggered a false positive in the clocksource watchdog code.

To avoid this we need to wake up the idle CPU and enforce the
reevaluation of the timer wheel for the next timer event.

Add a function, which checks a given CPU for idle state, marks the
idle task with NEED_RESCHED and sends a reschedule IPI to notify the
other CPU of the change in the timer wheel.

Call this function from add_timer_on().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org

--
 include/linux/sched.h |    6 ++++++
 kernel/sched.c        |   43 +++++++++++++++++++++++++++++++++++++++++++
 kernel/timer.c        |   10 +++++++++-
 3 files changed, 58 insertions(+), 1 deletion(-)
2008-03-26 08:28:55 +01:00
David S. Miller 99cd220133 [SPARC64]: Fix sparse errors in arch/sparc64/kernel/traps.c
Add 'UL' markers to DCU_* macros.

Declare C functions called from assembler in entry.h

Declare C functions called from within the sparc64 arch
code in include/asm-sparc64/*.h headers as appropriate.

Remove unused routines in traps.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:19:43 -07:00
David S. Miller 3d5ae6b69e [SPARC64]: Fix sparse warnings in arch/sparc64/kernel/{cpu,setup}.c
We create a local header file entry.h, under arch/sparc64/kernel/,
that we can use to declare routines either defined in assembler
or only invoked from assembler, as well as other data objects
which are private to the inner sparc64 kernel arch code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 21:51:40 -07:00
Yi Yang 8b78cf602f cpuidle: fix cpuidle time and usage overflow
The cpuidle C-state sysfs nodes time and usage overflow very easily because
both are of type unsigned int. time will overflow within about two hours;
usage will take longer to overflow, but both keep increasing forever.

This patch converts them to unsigned long long.

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-03-26 00:45:26 -04:00
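The overflow described in the cpuidle commit above can be illustrated with a small userspace sketch. It assumes the time counter accumulates microseconds (the commit message does not state the unit) and uses uint32_t and uint64_t as stand-ins for unsigned int and unsigned long long. The function names are hypothetical and do not come from the cpuidle code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accumulator with a 32-bit total: wraps modulo 2^32. */
static uint32_t add_narrow(uint32_t total, uint32_t delta)
{
    return (uint32_t)(total + delta);
}

/* The same accumulator with a 64-bit total: no wrap at 2^32. */
static uint64_t add_wide(uint64_t total, uint32_t delta)
{
    return total + delta;
}

/* Whole seconds of accumulated microseconds (assumed unit) that fit
 * in a 32-bit counter before it wraps. */
static uint64_t narrow_wrap_seconds(void)
{
    return (UINT64_C(1) << 32) / UINT64_C(1000000);
}

/* Self-check: one more tick past the 32-bit maximum wraps the narrow
 * counter to zero, while the wide counter keeps counting. */
static void check_wrap(void)
{
    assert(add_narrow(UINT32_MAX, 1) == 0);
    assert(add_wide(UINT32_MAX, 1) == (UINT64_C(1) << 32));
}
```

Under the microsecond assumption, a 32-bit counter holds a little over an hour of accumulated idle residency, so a CPU that is idle a good part of the time would wrap it within roughly the window the commit describes.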
David S. Miller 606d5b1939 [SPARC64]: Adjust {TLBTEMP,TSBMAP}_BASE.
Move them further from the main kernel image area
to facilitate larger kernel sizes.

Adjust comments to match.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 21:13:22 -07:00
Patrick McHardy c7f485abd6 [NETFILTER]: nf_conntrack_sip: RTP routing optimization
Optimize call routing between NATed endpoints: when an external
registrar sends a media description that contains an existing RTP
expectation from a different SNATed connection, the gatekeeper
is trying to route the call directly between the two endpoints.

We assume both endpoints can reach each other directly and
"un-NAT" the addresses, which makes the media stream go between
the two endpoints directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:43 -07:00