Checkin b3ac891b67bd4b1fc728d1c784cad1212dea433d:
x86: Add support for lock prefix in alternatives
... did not define LOCK_PREFIX_HERE in the case of a uniprocessor
build. As a result, it would cause any of the usages of this macro to
fail on a uniprocessor build. Fix this by defining LOCK_PREFIX_HERE
as a null string.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-2-git-send-email-luca@luca-barbieri.com>
The x86-64 implementation of the atomics is totally different from the
i586+ implementation, which makes it quite confusing to call it
"586+". Also fix indentation, and add "i" for "i386" and "i586" as
used elsewhere in the kernel.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-4-git-send-email-luca@luca-barbieri.com>
atomic64_inc_not_zero must return 1 if it perfomed the add and 0 otherwise.
It was doing the opposite thing.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-6-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
atomic64_inc_not_zero must return 1 if it perfomed the add and 0 otherwise.
The test assumed the opposite convention.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-5-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
atomic64_add_unless must return 1 if it perfomed the add and 0 otherwise.
The generic implementation did the opposite thing.
Reported-by: H. Peter Anvin <hpa@zytor.com>
Confirmed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-4-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
atomic64_add_unless must return 1 if it perfomed the add and 0 otherwise.
The implementation did the opposite thing.
Reported-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-3-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
atomic64_add_unless must return 1 if it perfomed the add and 0 otherwise.
The test assumed the opposite convention.
Reported-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-2-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add support for atomic_dec_if_positive(), and
atomic64_dec_if_positive() for x86-64.
atomic64_dec_if_positive() for x86-32 was already implemented in a previous patch.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267183361-20775-2-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Currently atomic64_dec_if_positive() is only supported by PowerPC,
MIPS and x86-32.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267183361-20775-1-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This patch replaces atomic64_32.c with two assembly implementations,
one for 386/486 machines using pushf/cli/popf and one for 586+ machines
using cmpxchg8b.
The cmpxchg8b implementation provides the following advantages over the
current one:
1. Implements atomic64_add_unless, atomic64_dec_if_positive and
atomic64_inc_not_zero
2. Uses the ZF flag changed by cmpxchg8b instead of doing a comparison
3. Uses custom register calling conventions that reduce or eliminate
register moves to suit cmpxchg8b
4. Reads the initial value instead of using cmpxchg8b to do that.
Currently we use lock xaddl and movl, which seems the fastest.
5. Does not use the lock prefix for atomic64_set
64-bit writes are already atomic, so we don't need that.
We still need it for atomic64_read to avoid restoring a value
changed in the meantime.
6. Allocates registers as well or better than gcc
The 386 implementation provides support for 386 and 486 machines.
386/486 SMP is not supported (we dropped it), but such support can be
added easily if desired.
A pure assembly implementation is required due to the custom calling
conventions, and desire to use %ebp in atomic64_add_return (we need
7 registers...), as well as the ability to use pushf/popf in the 386
code without an intermediate pop/push.
The parameter names are changed to match the convention in atomic_64.h
Changes in v3 (due to rebasing to tip/x86/asm):
- Patches atomic64_32.h instead of atomic_32.h
- Uses the CALL alternative mechanism from commit
1b1d925818
Changes in v2:
- Merged 386 and cx8 support in the same patch
- 386 support now done in assembly, C code no longer used at all
- cmpxchg64 is used for atomic64_cmpxchg
- stop using macros, use one-line inline functions instead
- miscellanous changes and improvements
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-5-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This patch adds self-test on boot code for atomic64_t.
This has been used to test the later changes in this patchset.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-4-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Use the functionality just introduced in the previous patch: mark the
lock prefixes in cmpxchg64 alternatives for UP removal.
Changes in v2:
- Naming change
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-3-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The current lock prefix UP/SMP alternative code doesn't allow
LOCK_PREFIX to be used in alternatives code.
This patch solves the problem by adding a new LOCK_PREFIX_ALTERNATIVE_PATCH
macro that only records the lock prefix location but does not emit
the prefix.
The user of this macro can then start any alternative sequence with
"lock" and have it UP/SMP patched.
To make this work, the UP/SMP alternative code is changed to do the
lock/DS prefix switching only if the byte actually contains a lock or
DS prefix.
Thus, if an alternative without the "lock" is selected, it will now do
nothing instead of clobbering the code.
Changes in v2:
- Naming change
- Change label to not conflict with alternatives
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-2-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The atomic ops emulation for 32bit legacy CPUs floods the tracer with
irq off/on entries. The irq disabled regions are short and therefor
not interesting when chasing long irq disabled latencies. Mark them
raw and keep them out of the trace.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Using kernel_stack_pointer() allows 32-bit and 64-bit versions to
be merged. This is more correct for 64-bit, since the old %rsp is
always saved on the stack.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1263397555-27695-1-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Use a macro to define the cache sizes when cachesize > 1 MB.
This is less typing, and less prone to introducing bugs like we
saw in e02e0e1a13, and means we
don't have to do maths when adding new non-power-of-2 updates
like those seen recently.
Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <20100104144735.GA18390@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This makes gcc use the right register names and instruction operand sizes
automatically for the rwsem inline asm statements.
So instead of using "(%%eax)" to specify the memory address that is the
semaphore, we use "(%1)" or similar. And instead of forcing the operation
to always be 32-bit, we use "%z0", taking the size from the actual
semaphore data structure itself.
This doesn't actually matter on x86-32, but if we want to use the same
inline asm for x86-64, we'll need to have the compiler generate the proper
64-bit names for the registers (%rax instead of %eax), and if we want to
use a 64-bit counter too (in order to avoid the 15-bit limit on the
write counter that limits concurrent users to 32767 threads), we'll need
to be able to generate instructions with "q" accesses rather than "l".
Since this header currently isn't enabled on x86-64, none of that matters,
but we do want to use the xadd version of the semaphores rather than have
to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended
when you have lots of threads all taking page faults, and the fallback
rwsem code that uses a spinlock performs abysmally badly in that case.
[ hpa: modified the patch to skip size suffixes entirely when they are
redundant due to register operands. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Merge the now identical code from asm/atomic_32.h and asm/atomic_64.h
into asm/atomic.h.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-4-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Prepare for merging into asm/atomic.h.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-3-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Split atomic64_t functions out into separate headers, since they will
not be practical to merge between 32 and 64 bits.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-2-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In order to avoid unnecessary chains of branches, rather than
implementing memcpy()/memset()'s access to their alternative
implementations via a jump, patch the (larger) original function
directly.
The memcpy() part of this is slightly subtle: while alternative
instruction patching does itself use memcpy(), with the
replacement block being less than 64-bytes in size the main loop
of the original function doesn't get used for copying memcpy_c()
over memcpy(), and hence we can safely write over its beginning.
Also note that the CFI annotations are fine for both variants of
each of the functions.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB8D30200007800026AF2@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In order to avoid unnecessary chains of branches, rather than
implementing copy_user_generic() as a function consisting of
just a single (possibly patched) branch, instead properly deal
with patching call instructions in the alternative instructions
framework, and move the patching into the callers.
As a follow-on, one could also introduce something like
__EXPORT_SYMBOL_ALT() to avoid patching call sites in modules.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The early ioremap fixmap entries cover half (or for 32-bit
non-PAE, a quarter) of a page table, yet they got
uncondtitionally aligned so far to a 256-entry boundary. This is
not necessary if the range of page table entries anyway falls
into a single page table.
This buys back, for (theoretically) 50% of all configurations
(25% of all non-PAE ones), at least some of the lowmem
necessarily lost with commit e621bd1895.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB66F0200007800026AD6@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Optimize hweight32 by using the same technique in hweight64.
The proof of this technique can be found in the commit log for
f9b4192923 ("bitops: hweight()
speedup").
The userspace benchmark on x86_32 showed 20% speedup with
bitmap_weight() which uses hweight32 to count bits for each
unsigned long on 32bit architectures.
int main(void)
{
#define SZ (1024 * 1024 * 512)
static DECLARE_BITMAP(bitmap, SZ) = {
[0 ... 100] = 1,
};
return bitmap_weight(bitmap, SZ);
}
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1258603932-4590-1-git-send-email-akinobu.mita@gmail.com>
[ only x86 sets ARCH_HAS_FAST_MULTIPLIER so we do this via the x86 tree]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6:
SYSCTL: Add a mutex to the page_alloc zone order sysctl
SYSCTL: Print binary sysctl warnings (nearly) only once
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (34 commits)
classmate-laptop: add support for Classmate PC ACPI devices
hp-wmi: Fix two memleaks
acer-wmi, msi-wmi: Remove needless DMI MODULE_ALIAS
dell-wmi: do not keep driver loaded on unsupported boxes
wmi: Free the allocated acpi objects through wmi_get_event_data
drivers/platform/x86/acerhdf.c: check BIOS information whether it begins with string of table
acerhdf: add new BIOS versions
acerhdf: limit modalias matching to supported
toshiba_acpi: convert to seq_file
asus_acpi: convert to seq_file
ACPI: do not select ACPI_DOCK from ATA_ACPI
sony-laptop: enumerate rfkill devices using SN06
sony-laptop: rfkill support for newer models
ACPI: fix OSC regression that caused aer and pciehp not to load
MAINTAINERS: add maintainer for msi-wmi driver
fujitu-laptop: fix tests of acpi_evaluate_integer() return value
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: avoid cross-CPU interrupts by using smp_call_function_any()
ACPI: processor: remove _PDC object list from struct acpi_processor
ACPI: processor: change acpi_processor_set_pdc() interface
ACPI: processor: open code acpi_processor_cleanup_pdc
...
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c
ocfs2/trivial: Use proper mask for 2 places in hearbeat.c
Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink.
Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode?
ocfs2: Use FIEMAP_EXTENT_SHARED
fiemap: Add new extent flag FIEMAP_EXTENT_SHARED
ocfs2: replace u8 by __u8 in ocfs2_fs.h
ocfs2: explicit declare uninitialized var in user_cluster_connect()
ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock()
ocfs2: return -EAGAIN instead of EAGAIN in dlm
ocfs2/cluster: Make fence method configurable - v2
ocfs2: Set MS_POSIXACL on remount
ocfs2: Make acl use the default
ocfs2: Always include ACL support
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
VIDEO: cyberpro: pci_request_regions needs a persistent name
ARM: dma-isa: request cascade channel after registering it
ARM: footbridge: trim down old ISA rtc setup
ARM: fix PAGE_KERNEL
ARM: Fix wrong shared bit for CPU write buffer bug test
ARM: 5857/1: ARM: dmabounce: fix build
ARM: 5856/1: Fix bug of uart0 platfrom data for nuc900
ARM: 5855/1: putc support for nuc900
ARM: 5854/1: fix compiling error for NUC900
ARM: 5849/1: ARMv7: fix Oprofile events count
ARM: add missing include to nwflash.c
ARM: Kill CONFIG_CPU_32
ARM: Convert VFP/Crunch/XscaleCP thread_release() to exit_thread()
ARM: 5853/1: ARM: Fix build break on ARM v6 and v7
* 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Ensure all PG_dcache_dirty pages are written back.
sh: mach-ecovec24: setup.c detailed correction
serial: sh-sci: Convert tremaining ctrl_xxx I/O routines to __raw_xxx.
serial: sh-sci: earlyprintk zero uartclk fix
sh: Only use bl bit toggling for sleeping idle.
sh: Restore bl bit toggling in idle loop.
sh: Fix up MAX_DMA_CHANNELS definition when DMA is disabled.
sh: dmaengine support for SH7785
sh: dmaengine support for sh7724.
Don't pass a name pointer from the kernel stack, it will not survive
and will result in corrupted /proc/iomem output.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This fixes a "start_kernel(): bug: interrupts were enabled early".
rtc_cmos now takes care of initializing the ISA RTC and reading the
current time and date from it; there's no need to repeat that here,
thereby causing interrupts to be enabled too early.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
PAGE_KERNEL should not be executable; any area marked executable can
be prefetched into the instruction cache. We don't want vmalloc areas
to be read in this way.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Do not spam the logs needlessly with the sole info that
edac_pci_dev_parity_clear is being called.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Currently, the module does not initialize fully when the DIMMs aren't
ECC but remains still loaded. Propagate the error when no instance of
the driver is properly initialized and prevent further loading.
Reorganize and polish error handling in amd64_edac_init() while at it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Fix use-after-free errors by pushing all memory-freeing calls to the end
of amd64_remove_one_instance().
Reported-by: Darren Jenkins <darrenrjenkins@gmail.com>
LKML-Reference: <1261370306.11354.52.camel@ICE-BOX>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Fix the case when amd64_debug_display_dimm_sizes() reports only half the
amount of DRAM on it because it doesn't account for when the single DCT
operates in 128-bit mode and merges chip selects from different DIMMs.
Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
LKML-Reference: <200912112202.48173.johannes.hirte@fem.tu-ilmenau.de>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
This add supports for devices like keyboard, backlight, tablet and
accelerometer.
This work is supported by International Syst S/A.
[randy.dunlap@oracle.com: cmpc_acpi: depends on ACPI]
[akpm@linux-foundation.org: readability tweaks]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>