The discussion about using "access_ok()" in get_user_pages_fast() (see
commit 7f8189068726492950bf1a2dcfd9b51314560abf: "x86: don't use
'access_ok()' as a range check in get_user_pages_fast()" for details and
end result), made us notice that x86-64 was really being very sloppy
about virtual address checking.
So be way more careful and straightforward about masking x86-64 virtual
addresses:
- All the VIRTUAL_MASK* variants now cover half of the address
space, it's not like we can use the full mask on a signed
integer, and the larger mask just invites mistakes when
applying it to either half of the 48-bit address space.
- /proc/kcore's kc_offset_to_vaddr() becomes a lot more
obvious when it transforms a file offset into a
(kernel-half) virtual address.
- Unify/simplify the 32-bit and 64-bit USER_DS definition to
be based on TASK_SIZE_MAX.
This cleanup and more careful/obvious user virtual address checking also
uncovered a buglet in the x86-64 implementation of strnlen_user(): it
would do an "access_ok()" check on the whole potential area, even if the
string itself was much shorter, and thus return an error even for valid
strings. Our sloppy checking had hidden this.
So this fixes 'strnlen_user()' to do this properly, the same way we
already handled user strings in 'strncpy_from_user()'. Namely by just
checking the first byte, and then relying on fault handling for the
rest. That always works, since we impose a guard page that cannot be
mapped at the end of the user space address space (and even if we
didn't, we'd have the address space hole).
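A minimal sketch of that scheme (illustrative only; do_strnlen_user here is a
hypothetical faulting worker, not the actual x86-64 code):

    long strnlen_user_sketch(const char __user *s, long n)
    {
        /* Only the starting address needs an explicit range check... */
        if (!access_ok(VERIFY_READ, s, 1))
            return 0;

        /*
         * ...because a string that runs off the end of user space hits the
         * guard page (or the address-space hole) and faults, and the fault
         * handler turns that into the error return.
         */
        return do_strnlen_user(s, n);   /* hypothetical worker that may fault */
    }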
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide for concurrent MSR writes on all the CPUs in the cpumask. Also,
add a temporary workaround for smp_call_function_many which skips the
CPU we're executing on.
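Roughly, the pattern looks like this (a sketch, not the driver's exact code;
'fn' stands in for the rdmsr/wrmsr helper that runs on each CPU):

    static void msr_on_cpus_sketch(const cpumask_t *mask, void (*fn)(void *),
                                   void *info)
    {
        int this_cpu;

        preempt_disable();
        this_cpu = smp_processor_id();

        /* Workaround: smp_call_function_many() skips the CPU we run on,
         * so handle our own CPU directly if it is part of the mask. */
        if (cpu_isset(this_cpu, *mask))
            fn(info);

        /* All the other CPUs in the mask run the accessor concurrently. */
        smp_call_function_many(mask, fn, info, 1);
        preempt_enable();
    }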
Bart: zero out rv struct which is allocated on stack.
CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add a struct representing a 64bit MSR pair consisting of a low and high
register part and convert msr_info to use it. Also, rename msr-on-cpu.c
to msr.c.
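The pair looks roughly like this (sketch; field names may differ from the
final header):

    struct msr_sketch {
        union {
            struct {
                u32 l;  /* low 32 bits (EAX half of rdmsr/wrmsr) */
                u32 h;  /* high 32 bits (EDX half) */
            };
            u64 q;      /* the same value as one 64-bit quantity */
        };
    };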
Side note: Put the cpumask.h include in __KERNEL__ space thus fixing an
allmodconfig build failure in the headers_check target.
CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Impact: micro-optimization
This should slightly improve its performance.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B8F641.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In general, the only definitions that assembly files can use
are in _types.h headers (where available), so convert them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: fix rare (but currently harmless) miscompile with certain configs and gcc versions
Hugh Dickins noticed that strncpy_from_user() was miscompiled
in some circumstances with gcc 4.3.
Thanks to Hugh's excellent analysis it was easy to track down.
Hugh writes:
> Try building an x86_64 defconfig 2.6.29-rc1 kernel tree,
> except not quite defconfig, switch CONFIG_PREEMPT_NONE=y
> and CONFIG_PREEMPT_VOLUNTARY off (because it expands a
> might_fault() there, which hides the issue): using a
> gcc 4.3.2 (I've checked both openSUSE 11.1 and Fedora 10).
>
> It generates the following:
>
> 0000000000000000 <__strncpy_from_user>:
> 0: 48 89 d1 mov %rdx,%rcx
> 3: 48 85 c9 test %rcx,%rcx
> 6: 74 0e je 16 <__strncpy_from_user+0x16>
> 8: ac lods %ds:(%rsi),%al
> 9: aa stos %al,%es:(%rdi)
> a: 84 c0 test %al,%al
> c: 74 05 je 13 <__strncpy_from_user+0x13>
> e: 48 ff c9 dec %rcx
> 11: 75 f5 jne 8 <__strncpy_from_user+0x8>
> 13: 48 29 c9 sub %rcx,%rcx
> 16: 48 89 c8 mov %rcx,%rax
> 19: c3 retq
>
> Observe that "sub %rcx,%rcx; mov %rcx,%rax", whereas gcc 4.2.1
> (and many other configs) say "sub %rcx,%rdx; mov %rdx,%rax".
> Isn't it returning 0 when it ought to be returning strlen?
The asm constraints for the strncpy_from_user() result were missing an
early clobber, which tells gcc that the last output arguments
are written before all input arguments are read.
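For illustration, a generic early-clobber example (not the kernel's actual
asm): the output is written before the input is dead, so it must be "=&r";
with a plain "=r", gcc may pick the same register for both operands and the
input gets clobbered mid-asm.

    static inline unsigned long early_clobber_demo(unsigned long src)
    {
        unsigned long res;

        asm("mov $0, %0\n\t"    /* writes the output early... */
            "add %1, %0"        /* ...while the input is still needed */
            : "=&r" (res)       /* early clobber keeps %0 and %1 distinct */
            : "r" (src));
        return res;
    }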
Also add more early clobbers in the rest of the file and fix 32-bit
usercopy.c in the same way.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
[ since this API is rarely used and no in-kernel user relies on a 'len'
return value (they only rely on negative return values) this miscompile
was never noticed in the field. But it's worth fixing it nevertheless. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
copy_to/from_user and all its variants (except the atomic ones) can take a
page fault and perform non-trivial work like taking mmap_sem and entering
the filesystem/pagecache.
Unfortunately, this often escapes lockdep because a common pattern is to
use it to read in some arguments just set up from userspace, or write data
back to a hot buffer. In those cases, it will be unlikely for page reclaim
to get a window in which to cause copy_*_user to fault.
With the new might_lock primitives, add some annotations to x86. I don't
know if I caught all possible faulting points (it's a bit of a maze, and I
didn't really look at 32-bit). But this is a starting point.
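The annotation itself is small; a sketch (simplified from the helper this
patch relies on):

    static inline void might_fault_sketch(void)
    {
        /* A user copy may sleep and may take mmap_sem to fault the page in,
         * so tell lockdep about it even on paths that never actually fault. */
        might_sleep();
        might_lock_read(&current->mm->mmap_sem);
    }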
Boots and runs OK so far.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: performance optimization
I did some rebenchmarking with modern compilers and dropping
-funroll-loops makes the function consistently go faster by a few
percent. So drop that flag.
Thanks to Richard Guenther for a hint.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Remove an unnecessary level of abstraction in the msr-on-cpu library.
Although this duplicates some code, the duplicated code is less than
the additional code, and this way should be faster.
Additionally, change the order of the functions to make the regular
structure of this file more obvious.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Propagate error (-ENXIO) from smp_call_function_single(). These
errors can happen when a CPU is unplugged while the MSR driver is
open.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
movsl_mask is currently defined in arch/x86/kernel/cpu/intel.c, which
contains code specific to Intel CPUs. However, movsl_mask is used in
the non-CPU specific code in arch/x86/lib/usercopy_32.c, which breaks
the compilation when support for Intel CPUs is compiled out.
This patch solves this problem by moving movsl_mask's definition close
to its users in arch/x86/lib/usercopy_32.c.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new ALIGN_DESTINATION macro has a sad typo: the r8d register was used
instead of ecx in the fixup section. This can be considered a regression.
Register ecx was also wrongly loaded with the value in r8d in the
copy_user_nocache routine.
Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gas 2.15 complains about 32-bit registers being used in lea.
AS arch/x86/lib/copy_user_64.o
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_64.S: Assembler messages:
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_64.S:188: Error: `(%edx,%ecx,8)' is not a valid 64 bit base/index expression
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_64.S:257: Error: `(%edx,%ecx,8)' is not a valid 64 bit base/index expression
AS arch/x86/lib/copy_user_nocache_64.o
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_nocache_64.S: Assembler messages:
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_nocache_64.S:107: Error: `(%edx,%ecx,8)' is not a valid 64 bit base/index expression
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
putuser_32.S and putuser_64.S are merged into putuser.S.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In putuser_32.S and putuser_64.S, replace things like .quad, .long,
and explicit references to [r|e]ax with the appropriate macros
from asm/asm.h.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove them where unambiguous.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In putuser_64.S, do it the i386 way, and replace the code
in beginning and end of functions with macros, since it's
always the same thing. Save lines.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of operating on a register that we would need to put back
into its normal state afterwards (the memory position), just
sub from rbx, which is trashed anyway. This saves a few instructions.
Also, this is the i386 way.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is consistent with i386 usage.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of clobbering r8, clobber rbx, which is the i386 way.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Clobber it in the inline asm macros, and let the compiler do this for us.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
getuser_32.S and getuser_64.S are merged into getuser.S.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Switch .long and .quad with _ASM_PTR in getuser*.S.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are situations in which the architecture wants to use the
register that represents its word-size, whatever it is. For those,
introduce __ASM_REG in asm.h, along with the first users _ASM_AX
and _ASM_DX. They have users waiting for it, namely the getuser
functions.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The instructions access registers, so the size is unambiguous.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is for consistency with i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of doing a sub after the addition, use the
offset directly in the memory operand of the mov instructions.
This is the way i386 does it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the instructions refer to registers, they'll be able
to figure it out.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There's really no reason to clobber r8 or pass the address in rcx.
We can safely use only two registers (which we already have to touch anyway)
to do the job.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
delay_32.c, delay_64.c are now equal, and are integrated into delay.c.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For x86_64, we can't just use %0, as it would
generate a mul against rdx, which is not really what we
want (note the ">> 32" in the x86_64 version).
Using a u64 variable with a shift in i386 generates bad code,
so the solution is to explicitly use %%edx in inline assembly
for both.
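The resulting idiom looks roughly like this (sketch): a 32-bit mul against
%edx leaves the high half of the 64-bit product in %edx, which is exactly
the ">> 32" we want on both architectures.

    static inline unsigned int mul_high32(unsigned int a, unsigned int b)
    {
        unsigned int hi, lo;

        /* edx:eax = eax * edx; the high half is the ">> 32" result */
        asm("mull %%edx"
            : "=d" (hi), "=&a" (lo)
            : "1" (a), "0" (b));
        return hi;
    }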
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This way we achieve the same code for both arches.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is for consistency with i386. We call use_tsc_delay()
at TSC initialization for x86_64, so we'll always be using it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove the "l" from inline asm at arch/x86/lib/delay_32.c.
It is not needed.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It's never used and the comments refer to nonatomic and retry
interchangeably. So get rid of it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Most users by far do not care about the exact return value (they only
really care about whether the copy succeeded in its entirety or not),
but a few special core routines actually care deeply about exactly how
many bytes were copied from user space.
And the unrolled versions of the x86-64 user copy routines would
sometimes report that they had copied more bytes than they actually had.
Very few uses actually have partial copies to begin with, but to make
this bug even harder to trigger, most x86 CPUs use the "rep string"
instructions for normal user copies, and that version didn't have this
issue.
To make it even harder to hit, the one user of this that really cared
about the return value (and used the uncached version of the copy that
doesn't use the "rep string" instructions) was the generic write
routine, which pre-populated its source, once more hiding the problem by
avoiding the exception case that triggers the bug.
In other words, very special thanks to Bron Gondwana who not only
triggered this, but created a test-program to show it, and bisected the
behavior down to commit 08291429cf ("mm:
fix pagecache write deadlocks") which changed the access pattern just
enough that you can now trigger it with 'writev()' with multiple
iovec's.
That commit itself was not the cause of the bug, it just allowed all the
stars to align just right that you could trigger the problem.
[ Side note: this is just the minimal fix to make the copy routines
(with __copy_from_user_inatomic_nocache as the particular version that
was involved in showing this) have the right return values.
We really should improve on the exceptional case further - to make the
copy do a byte-accurate copy up to the exact page limit that causes it
to fail. As it is, the callers have to do extra work to handle the
limit case gracefully. ]
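For reference, a sketch of the contract those few callers rely on (not a
specific kernel path): the routines return the number of bytes *not*
copied, and the caller derives the amount that really arrived from that.

    static ssize_t consume_user_buf(void *dst, const void __user *src, size_t len)
    {
        size_t left = copy_from_user(dst, src, len);
        size_t copied = len - left;

        /* If 'left' is understated (the copy over-reports), 'copied'
         * includes bytes that never arrived -- the bug described above. */
        return copied ? copied : -EFAULT;
    }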
Reported-by: Bron Gondwana <brong@fastmail.fm>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While trying to understand how BogoMIPS are implemented, I found a
bug in the arch/i386/lib/delay.c file, in the delay_loop function.
The function fails for loops > 2^31+1: SF is set when dec produces
values > 2^31, so the loop terminates early.
The fix is to use the jnz instruction instead of jns (and to add one decl
instruction at the end so that the number of loops is exactly the same as
in the original version).
Martin Mares observed:
> It is a long time since I have hacked that file, but you should definitely
> make sure that the function is never called with a zero argument. In such
> case, the original version made just a single pass, but your version
> makes 2^32 of them.
This has been fixed as well.
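A sketch of the fixed loop (close to the i386 delay_loop, simplified):
dec/jnz counts over the full unsigned range, the loops==0 case is guarded
up front, and the trailing dec keeps the iteration count identical to the
old jns version.

    static void delay_loop_sketch(unsigned long loops)
    {
        asm volatile(
            "    test %0,%0   \n"
            "    jz 3f        \n"
            "    jmp 1f       \n"
            ".align 16        \n"
            "1:  jmp 2f       \n"
            ".align 16        \n"
            "2:  dec %0       \n"
            "    jnz 2b       \n"
            "3:  dec %0       \n"
            : "+a" (loops));
    }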
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The RT team has been searching for a nasty latency. This latency shows
up out of the blue and has been seen to be as big as 5ms!
Using ftrace I found the cause of the latency.
pcscd-2995 3dNh1 52360300us : irq_exit (smp_apic_timer_interrupt)
pcscd-2995 3dN.2 52360301us : idle_cpu (irq_exit)
pcscd-2995 3dN.2 52360301us : rcu_irq_exit (irq_exit)
pcscd-2995 3dN.1 52360771us : smp_apic_timer_interrupt (apic_timer_interrupt)
pcscd-2995 3dN.1 52360771us : exit_idle (smp_apic_timer_interrupt)
Here's an example of a 400 us latency. pcscd took a timer interrupt and
returned with "need resched" enabled, but did not reschedule until after
the next interrupt came in at 52360771us, 400us later!
At first I thought we somehow missed a preemption check in entry.S. But
I also noticed that this always seemed to happen during a __delay call.
pcscd-2995 3dN.2 52360836us : rcu_irq_exit (irq_exit)
pcscd-2995 3.N.. 52361265us : preempt_schedule (__delay)
Looking at the x86 delay, I found my problem.
In git commit 35d5d08a08, Andrew Morton
placed preempt_disable around the entire delay due to TSCs not working
nicely on SMP. Unfortunately for those that care about latencies this
is devastating! Especially when we have callers of mdelay(8).
Here I enable preemption during the loop and account for any time the task
migrates to a new CPU. The delay asked for may be extended a bit by
the migration, but delay only guarantees that it will delay for that minimum
time. Delaying longer should not be an issue.
[
Thanks to Thomas Gleixner for spotting that cpu wasn't updated,
and for suggesting to place the rep_nop between preempt_enable/disable.
]
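A sketch of the approach (simplified from the resulting delay_tsc):
preemption is only disabled around the TSC reads, and a migration rebases
the remaining delay on the new CPU's TSC.

    static void delay_tsc_sketch(unsigned long loops)
    {
        unsigned long bclock, now;
        int cpu;

        preempt_disable();
        cpu = smp_processor_id();
        rdtscl(bclock);
        for (;;) {
            rdtscl(now);
            if ((now - bclock) >= loops)
                break;

            preempt_enable();   /* give RT tasks a chance to run */
            rep_nop();
            preempt_disable();

            if (unlikely(cpu != smp_processor_id())) {
                /* migrated: charge what we already waited, then
                 * rebase on the new CPU's (possibly skewed) TSC */
                loops -= (now - bclock);
                cpu = smp_processor_id();
                rdtscl(bclock);
            }
        }
        preempt_enable();
    }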
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: akpm@osdl.org
Cc: Clark Williams <clark.williams@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi-suse@firstfloor.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix this symbol export problem:
Building modules, stage 2.
MODPOST 193 modules
ERROR: "csum_partial" [fs/reiserfs/reiserfs.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
This is due to a known weakness of symbol exports: if a symbol's
only in-core user is an EXPORT_SYMBOL from a lib-y section, the
symbol is not linked in.
The solution is to move the export to x8664_ksyms_64.c - but the real
solution would be to fix kbuild.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 has been switched to the generic versions of find_first_bit
and find_first_zero_bit, but the original versions were retained.
This patch just removes the now unused x86-specific versions.
Also update UML.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Switch x86_64 to generic find_first_bit. The x86_64-specific
implementation is not removed.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The versions with inline assembly are in fact slower on the machines I
tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
crashing the benchmark.
Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
1...512, for each possible bitmap with one bit set, for each possible
offset: find the position of the first bit starting at offset. If you
follow ;). Times include setup of the bitmap and checking of the
results.
                Athlon       Xeon         Opteron 32/64bit
x86-specific:   0m3.692s     0m2.820s     0m3.196s / 0m2.480s
generic:        0m2.622s     0m1.662s     0m2.100s / 0m1.572s
If the bitmap size is not a multiple of BITS_PER_LONG, and no set
(cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
value outside of the range [0, size]. The generic version always returns
exactly size. The generic version also uses unsigned long everywhere,
while the x86 versions use a mishmash of int, unsigned (int), long and
unsigned long.
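A quick illustration of the guaranteed semantics (sketch): returning exactly
'size' when nothing is found keeps iteration loops in bounds even for
bitmaps whose size is not a multiple of BITS_PER_LONG.

    static void for_each_set_bit_sketch(const unsigned long *bitmap,
                                        unsigned long size)
    {
        unsigned long i;

        for (i = find_first_bit(bitmap, size);
             i < size;
             i = find_next_bit(bitmap, size, i + 1)) {
            /* bit i is set */
        }
    }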
Using the generic version does give a slightly bigger kernel, though.
defconfig:        text     data     bss      dec     hex filename
x86-specific:  4738555   481232  626688  5846475  5935cb vmlinux (32 bit)
generic:       4738621   481232  626688  5846541  59360d vmlinux (32 bit)
x86-specific:  5392395   846568  724424  6963387  6a40bb vmlinux (64 bit)
generic:       5392458   846568  724424  6963450  6a40fa vmlinux (64 bit)
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch kills 45 errors and a few warnings.
The file is now error/warning free:
total: 0 errors, 0 warnings, 237 lines checked
arch/x86/lib/string_32.c has no obvious style problems and is ready for submission.
no code changed:
arch/x86/lib/string_32.o:
   text    data     bss     dec     hex filename
    639       0       0     639     27f string_32.o.before
    639       0       0     639     27f string_32.o.after
md5:
2db1c48187cf5113bb595153ee1fc73d string_32.o.before.asm
2db1c48187cf5113bb595153ee1fc73d string_32.o.after.asm
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility. Thanks to Peter Zijlstra for fixing the lockdep
warning. Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
This simple patch makes the file error free (according to
checkpatch.pl)
no code changed:
arch/x86/lib/io_64.o:
   text    data     bss     dec     hex filename
    308       0       0     308     134 io_64.o.before
    308       0       0     308     134 io_64.o.after
md5:
3c64f9ed83d091678e849b36ca27bee3 io_64.o.before.asm
3c64f9ed83d091678e849b36ca27bee3 io_64.o.after.asm
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- All implementations can be __devinit
- The function prototypes were in asm/timex.h but they all must be the same,
so create a single declaration in linux/timex.h.
- uninline the sparc64 version to match the other architectures
- Don't bother #defining ARCH_HAS_READ_CURRENT_TIMER to a particular value.
[ezk@cs.sunysb.edu: fix build]
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This kills unused __clear_bit_string and find_next_zero_string (they
were used by only gart and calgary IOMMUs).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Muli Ben-Yehuda <mulix@mulix.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the _ASM_EXTABLE macro from <asm/asm.h>, instead of open-coding
__ex_table entries in arch/x86/lib/usercopy_64.c.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the _ASM_EXTABLE macro from <asm/asm.h>, instead of open-coding
__ex_table entries in arch/x86/lib/usercopy_32.c.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the _ASM_EXTABLE macro from <asm/asm.h>, instead of open-coding
__ex_table entries in arch/x86/lib/mmx_32.c.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Correct the obvious "copy and paste" errors explaining some of the
"find_next" routines.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Without this patch the linker will generate a section
named .sched.text.1, which is unexpected.
This is because the gcc-generated section has the "ax" flags but the
assembler usage of .sched.text lacks the "ax" specifier.
It would be better to have a definition we could use from
assembler code but I did not find a suitable header
file for it.
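For reference, a sketch of the two sides: the C side gets the flags from the
section attribute, and the assembler side has to spell them out.

    /* C side: gcc emits scheduler functions into .sched.text with "ax"
     * (alloc + execute) flags via the __sched attribute. */
    #define __sched __attribute__((__section__(".sched.text")))

    /* asm side: the section declaration must carry the same flags, i.e.
     *     .section .sched.text, "ax"
     * otherwise the linker keeps two separate sections, .sched.text and
     * .sched.text.1. */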
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The ENDPROCs() were not used everywhere. Some code used just END() instead,
while other code used nothing. um/sys-i386/checksum.S didn't #include
<linux/linkage.h>. I also got confused because gcc puts the
.type near the ENTRY, while ENDPROC puts it on the opposite end.
Signed-off-by: John Reiser <jreiser@BitWagon.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Trivial unification of Makefiles for the
x86 specific library part.
Linking order is slightly modified but should be harmless.
Tested doing a defconfig build before and after and saw
no build changes.
It adds almost as many lines as it deletes - because
I broke a few lines up for readability in the Makefile.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Marin Mitov points out that delay_tsc() can misbehave if it is preempted and
rescheduled on a different CPU which has a skewed TSC. Fix it by disabling
preemption.
(I assume that the worst-case behaviour here is a stall of 2^32 cycles)
Cc: Andi Kleen <ak@suse.de>
Cc: Marin Mitov <mitov@issp.bas.bg>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (33 commits)
x86: convert cpuinfo_x86 array to a per_cpu array
x86: introduce frame_pointer() and stack_pointer()
x86 & generic: change to __builtin_prefetch()
i386: do not BUG_ON() when MSR is unknown
x86: acpi use cpu_physical_id
x86: convert cpu_llc_id to be a per cpu variable
x86: convert cpu_to_apicid to be a per cpu variable
i386: introduce "used_vectors" bitmap which can be used to reserve vectors.
x86: use raw locks during oopses
x86: honor _PAGE_PSE bit on page walks
i386: do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.
x86: implement missing x86_64 function smp_call_function_mask()
x86: use descriptor's functions instead of inline assembly
i386: consolidate show_regs and show_registers for i386
i386: make callgraph use dump_trace() on i386/x86_64
x86: enable iommu_merge by default
i386: i386 add AMD64 Barcelona PMU MSR definitions to msr.h
x86: Unify i386 and x86-64 early quirks
x86: enable HPET on ICH3 and ICH4
x86: force enable HPET on VT8235/8237 chipsets
...
Manually fix trivial conflict with task pid container helper changes in
arch/x86/kernel/process_32.c
is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().
A cgroup init has its tsk->pid == 1.
A global init also has its tsk->pid == 1 and its active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
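A sketch of the global-init check described above (illustrative, not the
exact helper):

    static inline int is_global_init_sketch(struct task_struct *tsk)
    {
        /* The boot-time /sbin/init is the child reaper of the initial pid
         * namespace; the pointer is set once during boot and never changes. */
        return tsk == init_pid_ns.child_reaper;
    }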
Changelog:
2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().
2.6.21-mm2-pidns2:
- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.
[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Herbert Poetzel <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cpu_data is currently an array defined using NR_CPUS. This means that
we overallocate since we will rarely really use maximum configured cpus.
When the NR_CPUS count is raised to 4096 the size of cpu_data becomes
3,145,728 bytes.
These changes were adopted from the sparc64 (and ia64) code. An
additional field was added to cpuinfo_x86 to be a non-ambiguous cpu
index. This corresponds to the index into a cpumask_t as well as the
per_cpu index. It's used in various places like show_cpuinfo().
cpu_data is defined to be the boot_cpu_data structure for the NON-SMP
case.
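The shape of the conversion (sketch; the exact macro names follow the
per_cpu conventions):

    /* One cpuinfo_x86 per possible CPU instead of a static NR_CPUS array. */
    DEFINE_PER_CPU(struct cpuinfo_x86, cpu_info);

    #define cpu_data(cpu)        per_cpu(cpu_info, cpu)
    #define current_cpu_data     __get_cpu_var(cpu_info)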
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
.i is an ending used for preprocessed stuff.
This patch therefore renames assembler include files to .h and guards
the contents with an #ifdef __ASSEMBLY__.
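The pattern, on a hypothetical header (foo.h is a made-up name for
illustration):

    /* foo.h (formerly foo.i): safe to include from C and from assembler */
    #ifndef _ASM_FOO_H
    #define _ASM_FOO_H

    #define FOO_FLAG 0x01                   /* visible to both C and asm */

    #ifdef __ASSEMBLY__
    /* assembler-only macros are hidden from the C compiler */
    #define FOO_SAVE_FRAME pushq %rbp; movq %rsp, %rbp
    #endif

    #endif /* _ASM_FOO_H */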
[ tglx: arch/x86 adaptation ]
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The constraints in the inline assembler implementation of i386
strrchr() were incorrect and break the build with recent gcc 4.3.
Since there are only very few callers of strrchr() and none of them
are performance relevant just remove the assembler implementation
and use the C fallback instead.
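The generic C fallback is along these lines (a sketch; lib/string.c has the
canonical version): scan once and remember the last match, including the
terminating NUL.

    char *strrchr(const char *s, int c)
    {
        const char *last = NULL;

        do {
            if (*s == (char)c)
                last = s;
        } while (*s++);

        return (char *)last;
    }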
[ tglx: arch/x86 adaptation ]
Cc: rguenther@suse.de
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
smp_call_function_single() now knows how to call the function on the
current cpu.
[ tglx: arch/x86 adaptation ]
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix an off-by-one error in find_next_zero_string which prevents
allocating the last bit.
[ tglx: arch/x86 adaptation ]
Signed-off-by: Andrew Hastings <abh@cray.com> on behalf of Cray Inc.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Run the lockdep_sys_exit hook after all other C code on the syscall
return path.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
movnt* instructions are not strongly ordered with respect to other stores,
so if we are to assume stores are strongly ordered in the rest of the 64
bit code, we must fence these off (see similar examples in 32 bit code).
[ The AMD memory ordering document seems to say that nontemporal stores can
also pass earlier regular stores, so maybe we need sfences _before_
movnt* everywhere too? ]
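Concretely, the fence boils down to this (sketch):

    static inline void nt_store_fence(void)
    {
        /* Make preceding non-temporal (movnt*) stores globally visible
         * before any later stores that rely on normal store ordering. */
        asm volatile("sfence" ::: "memory");
    }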
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>