Commit Graph

1231 Commits

Author SHA1 Message Date
Thomas Gleixner dea1078e1a ia64: iosapic: Cleanup irq_desc access
Use irq_to_desc() and use accessors for setting chip and handler.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:03 +02:00
Thomas Gleixner 8fac171f72 ia64: Convert iosapic to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:02 +02:00
Thomas Gleixner 5c217b60fe ia64: Convert lsapic to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:02 +02:00
Thomas Gleixner f1f701e937 ia64: Convert msi to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:02 +02:00
Thomas Gleixner 3d373ce82a ia64: Remove stale irq_chip.end
irq_chip.end got obsolete with the removal of __do_IRQ().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <20110203004210.143127544@linutronix.de>
2011-03-29 14:48:00 +02:00
Thomas Gleixner 428a40c591 ia64: Cleanup migrate_irqs()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:00 +02:00
Thomas Gleixner 097e98b4fc ia64: Convert migrate_platform_irqs() to new irq chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:47:59 +02:00
Linus Torvalds 6d1e9a42e7 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  pstore: cleanups to pstore_dump()
  [IA64] New syscalls for 2.6.39
2011-03-24 10:05:23 -07:00
Linus Torvalds 047f61c5d1 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (42 commits)
  ACPI: minor printk format change in acpi_pad
  ACPI: make acpi_pad /sys output more readable
  ACPICA: Update version to 20110316
  ACPICA: Header support for SLIC table
  ACPI: Make sure the FADT is at least rev 2 before using the reset register
  ACPI: Bug compatibility for Windows on the ACPI reboot vector
  ACPICA: Fix access width for reset vector
  ACPI battery: fribble sysfs files from a resume notifier
  ACPI button: remove unused procfs I/F
  ACPI, APEI, Add PCIe AER error information printing support
  PCIe, AER, use pre-generated prefix in error information printing
  ACPI, APEI, Add ERST record ID cache
  ACPI: Use syscore_ops instead of sysdev class and sysdev
  ACPI: Remove the unused EC sysdev class
  ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree
  ACPI: use __init where possible in processor driver
  Thermal_Framework-Fix_crash_during_hwmon_unregister
  ACPICA: Update version to 20110211.
  ACPICA: Add mechanism to defer _REG methods for some installed handlers
  ACPICA: Add support for FunctionalFixedHW in acpi_ut_get_region_name
  ...
2011-03-24 08:25:15 -07:00
Olaf Hering 93a72052be crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state.  To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.

Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.

Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
to early_param().  It adds an address range check from x86 also on ia64
and powerpc.

[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:19 -07:00
Len Brown 02e2407858 Merge branch 'linus' into release
Conflicts:
	arch/x86/kernel/acpi/sleep.c

Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-23 02:34:54 -04:00
Len Brown 8a9026d2e9 Merge branch 'misc' into release 2011-03-23 02:19:58 -04:00
Tony Luck 9298168d16 [IA64] New syscalls for 2.6.39
Four new syscalls:
	sys_name_to_handle_at
	sys_open_by_handle_at
	sys_clock_adjtime
	sys_syncfs

Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-03-22 10:54:24 -07:00
Linus Torvalds 242e5d06be Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] tioca: Fix assignment from incompatible pointer warnings
  [IA64] mca.c: Fix cast from integer to pointer warning
  [IA64] setup.c Typo fix "Architechtuallly"
  [IA64] Add CONFIG_MISC_DEVICES=y to configs that need it.
  [IA64] disable interrupts at end of ia64_mca_cpe_int_handler()
  [IA64] Add DMA_ERROR_CODE define.
  pstore: fix build warning for unused return value from sysfs_create_file
  pstore: X86 platform interface using ACPI/APEI/ERST
  pstore: new filesystem interface to platform persistent storage
2011-03-16 19:01:29 -07:00
Tony Luck 4897313a62 Pull misc-2.6.39 into release branch 2011-03-16 09:57:50 -07:00
Linus Torvalds 79d8a8f736 Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
  percpu: Generic support for this_cpu_cmpxchg_double()
  alpha: use L1_CACHE_BYTES for cacheline size in the linker script
  percpu: align percpu readmostly subsection to cacheline

Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
2011-03-16 08:22:41 -07:00
Jan Beulich af10f941ab ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree
Once acpi_map_lsapic() in ia64 follows how x86 treats it wrt section
placement, the whole tree from acpi_processor_set_pdc() can become
__cpuinit.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-02 20:58:20 -05:00
Jeff Mahoney c1d036c4d1 [IA64] mca.c: Fix cast from integer to pointer warning
ia64_mca_cpu_init has a void *data local variable that is assigned
the value from either __get_free_pages() or mca_bootmem(). The problem
is that __get_free_pages returns an unsigned long and mca_bootmem, via
alloc_bootmem(), returns a void *. format_mca_init_stack takes the void *,
and it's also used with __pa(), but that casts it to long anyway.

This results in the following build warning:

arch/ia64/kernel/mca.c:1898: warning: assignment makes pointer from
integer without a cast

Cast the return of __get_free_pages to a void * to avoid
the warning.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-03-02 14:02:50 -08:00
Tony Luck a396768574 [IA64] disable interrupts at end of ia64_mca_cpe_int_handler()
SAL requires that interrupts be enabled when making some calls
to it to pick up error records, so we enable interrupts inside
this handler.  We should disable them again at the end.

Found by a new WARN_ONCE that tglx added to handle_irq_event_percpu()

Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-02-24 15:22:05 -08:00
Rafael J. Wysocki f1a2003e22 ACPI / PM: Merge do_suspend_lowlevel() into acpi_save_state_mem()
The function do_suspend_lowlevel() is specific to x86 and defined in
assembly code, so it should be called from the x86 low-level suspend
code rather than from acpi_suspend_enter().

Merge do_suspend_lowlevel() into the x86's acpi_save_state_mem() and
change the name of the latter to acpi_suspend_lowlevel(), so that the
function's purpose is better reflected by its name.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-02-24 19:58:54 +01:00
Rafael J. Wysocki c41b93fb85 ACPI / PM: Drop acpi_restore_state_mem()
The function acpi_restore_state_mem() has never been and most likely
never will be used, so remove it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-02-24 19:58:54 +01:00
Torben Hohn 1aabd67d2e ia64: Switch do_timer() to xtime_update()
local_cpu_data->itm_next = new_itm; does not need to be protected by
xtime_lock. xtime_update() takes the lock itself.

Signed-off-by: Torben Hohn <torbenh@gmx.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: johnstul@us.ibm.com
Cc: hch@infradead.org
Cc: yong.zhang0@gmail.com
LKML-Reference: <20110127145956.23248.49107.stgit@localhost>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-01-31 14:55:45 +01:00
Tejun Heo 19df0c2fef percpu: align percpu readmostly subsection to cacheline
Currently percpu readmostly subsection may share cachelines with other
percpu subsections which may result in unnecessary cacheline bounce
and performance degradation.

This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
linker macros, makes each arch linker scripts specify its cacheline
size and use it to align percpu subsections.

This is based on Shaohua's x86 only patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
2011-01-25 14:26:50 +01:00
Linus Torvalds dc8e7e3ec6 Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle events from the cpuidle layer
  intel_idle: open broadcast clock event
  cpuidle: CPUIDLE_FLAG_CHECK_BM is omap3_idle specific
  cpuidle: CPUIDLE_FLAG_TLB_FLUSHED is specific to intel_idle
  cpuidle: delete unused CPUIDLE_FLAG_SHALLOW, BALANCED, DEEP definitions
  SH, cpuidle: delete use of NOP CPUIDLE_FLAGS_SHALLOW
  cpuidle: delete NOP CPUIDLE_FLAG_POLL
  ACPI: processor_idle: delete use of NOP CPUIDLE_FLAGs
  cpuidle: Rename X86 specific idle poll state[0] from C0 to POLL
  ACPI, intel_idle: Cleanup idle= internal variables
  cpuidle: Make cpuidle_enable_device() call poll_idle_init()
  intel_idle: update Sandy Bridge core C-state residency targets
2011-01-13 20:15:18 -08:00
Tony Luck 09579770dc [IA64] fix build error - arch/ia64/kernel/perfmon.c
arch/ia64/kernel/perfmon.c:621: error: duplicate 'static'

Introduced by commit c74a1cbb3c

    pass default dentry_operations to mount_pseudo()

Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-01-13 14:49:56 -08:00
Linus Torvalds 581548db3b Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Fix format warning in arch/ia64/kernel/acpi.c
2011-01-13 11:02:55 -08:00
Al Viro c74a1cbb3c pass default dentry_operations to mount_pseudo()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-01-12 20:03:43 -05:00
Len Brown 56dbed129d Merge branch 'linus' into idle-test 2011-01-12 18:06:06 -05:00
Tony Luck dff0092bcd [IA64] Fix format warning in arch/ia64/kernel/acpi.c
arch/ia64/kernel/acpi.c:481: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long unsigned int’

Introduced by commit 05f2f274c8
    [IA64] Avoid array overflow if there are too many cpus in SRAT table

Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-01-12 11:02:43 -08:00
Thomas Renninger d18960494f ACPI, intel_idle: Cleanup idle= internal variables
Having four variables for the same thing:
  idle_halt, idle_nomwait, force_mwait and boot_option_idle_overrides
is rather confusing and unnecessary complex.

if idle= boot param is passed, only set up one variable:
boot_option_idle_overrides

Introduces following functional changes/fixes:
  - intel_idle driver does not register if any idle=xy
    boot param is passed.
  - processor_idle.c will also not register a cpuidle driver
    and get active if idle=halt is passed.
    Before a cpuidle driver with one (C1, halt) state got registered
    Now the default_idle function will be used which finally uses
    the same idle call to enter sleep state (safe_halt()), but
    without registering a whole cpuidle driver.

That means idle= param will always avoid cpuidle drivers to register
with one exception (same behavior as before):
idle=nomwait
may still register acpi_idle cpuidle driver, but C1 will not use
mwait, but hlt. This can be a workaround for IO based deeper sleep
states where C1 mwait causes problems.

Signed-off-by: Thomas Renninger <trenn@suse.de>
cc: x86@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:30 -05:00
Linus Torvalds ecacc6c70c Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Avoid array overflow if there are too many cpus in SRAT table
  [IA64] Remove unlikely from cpu_is_offline
  [IA64] irq_ia64, use set_irq_chip
  [IA64] perfmon: Change vmalloc to vzalloc and drop memset.
  [IA64] eliminate race condition in smp_flush_tlb_mm
2011-01-10 14:52:44 -08:00
Tony Luck 05f2f274c8 [IA64] Avoid array overflow if there are too many cpus in SRAT table
acpi_numa_init() has to parse the whole SRAT table, even if the
kernel wants to limit the number of cpus it will use (because the
ones it is going to use may be described by entries at the end of
the SRAT table).  Avoid overflowing the node_cpuid array.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-01-07 09:11:55 -08:00
Nick Piggin b3e19d924b fs: scale mntget/mntput
The problem that this patch aims to fix is vfsmount refcounting scalability.
We need to take a reference on the vfsmount for every successful path lookup,
which often go to the same mount point.

The fundamental difficulty is that a "simple" reference count can never be made
scalable, because any time a reference is dropped, we must check whether that
was the last reference. To do that requires communication with all other CPUs
that may have taken a reference count.

We can make refcounts more scalable in a couple of ways, involving keeping
distributed counters, and checking for the global-zero condition less
frequently.

- check the global sum once every interval (this will delay zero detection
  for some interval, so it's probably a showstopper for vfsmounts).

- keep a local count and only taking the global sum when local reaches 0 (this
  is difficult for vfsmounts, because we can't hold preempt off for the life of
  a reference, so a counter would need to be per-thread or tied strongly to a
  particular CPU which requires more locking).

- keep a local difference of increments and decrements, which allows us to sum
  the total difference and hence find the refcount when summing all CPUs. Then,
  keep a single integer "long" refcount for slow and long lasting references,
  and only take the global sum of local counters when the long refcount is 0.

This last scheme is what I implemented here. Attached mounts and process root
and working directory references are "long" references, and everything else is
a short reference.

This allows scalable vfsmount references during path walking over mounted
subtrees and unattached (lazy umounted) mounts with processes still running
in them.

This results in one fewer atomic op in the fastpath: mntget is now just a
per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock
and non-atomic decrement in the common case. However code is otherwise bigger
and heavier, so single threaded performance is basically a wash.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:33 +11:00
Nick Piggin fb045adb99 fs: dcache reduce branches in lookup path
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.

Patched with:

git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:28 +11:00
Nick Piggin fe15ce446b fs: change d_delete semantics
Change d_delete from a dentry deletion notification to a dentry caching
advise, more like ->drop_inode. Require it to be constant and idempotent,
and not take d_lock. This is how all existing filesystems use the callback
anyway.

This makes fine grained dentry locking of dput and dentry lru scanning
much simpler.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:18 +11:00
Joe Perches e7d282535c [IA64] Remove unlikely from cpu_is_offline
cpu_is_offline already uses unlikely internally.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-12-28 14:20:42 -08:00
Jiri Slaby 409e590572 [IA64] irq_ia64, use set_irq_chip
Don't access desc->chip directly, because them chip member will
disappear some time later.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-12-28 14:19:20 -08:00
Jesper Juhl e21763dbce [IA64] perfmon: Change vmalloc to vzalloc and drop memset.
vzalloc() nicely zeroes memory for us, so we don't have to do a vmalloc()
and then manually memset() the returned memory when all we want is for it
to be zero. Patch changes this for pfm_rvmalloc().

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-12-28 14:13:36 -08:00
Dimitri Sivanich 75c1c91cb9 [IA64] eliminate race condition in smp_flush_tlb_mm
A race condition exists within smp_call_function_many() when called from
smp_flush_tlb_mm().  On rare occasions the cpu_vm_mask can be cleared
while smp_call_function_many is executing, occasionally resulting in a
hung process.

Make a copy of the mask prior to calling smp_call_function_many().

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-12-28 14:06:21 -08:00
Al Viro 51139adac9 convert get_sb_pseudo() users
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-29 04:16:33 -04:00
Namhyung Kim 9b05a69e05 ptrace: change signature of arch_ptrace()
Fix up the arguments to arch_ptrace() to take account of the fact that
@addr and @data are now unsigned long rather than long as of a preceding
patch in this series.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:10 -07:00
Linus Torvalds 092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Linus Torvalds b22793f7fd Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Cannot use register_percpu_irq() from ia64_mca_init()
  [IA64] Initialize interrupts later (from init_IRQ())
  [IA64] enable ARCH_DMA_ADDR_T_64BIT
  [IA64] ioc3_serial: release resources in error return path
  [IA64] Stop using the deprecated __do_IRQ() code path
  [IA64] Remove unnecessary casts of private_data in perfmon.c
  [IA64] Fix missing iounmap in error path in cyclone.c
  [IA64] salinfo: sema_init instead of init_MUTEX
  [IA64] xen: use ARRAY_SIZE macro in xen_pv_ops.c
  [IA64] Use static const char * const in palinfo.c
  [IA64] remove asm/compat.h
  [IA64] Add CONFIG_STACKTRACE_SUPPORT
  [IA64] Move local_softirq_pending() definition
  [IA64] iommu: Add a dummy iommu_table.h file in IA64.
  [IA64] unwind - optimise linked-list searches for modules
  [IA64] unwind: remove preprocesser noise, and correct comment
2010-10-21 14:27:18 -07:00
Arnd Bergmann 6038f373a3 llseek: automatically add .llseek fop
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15 15:53:27 +02:00
Tony Luck c0f37d2ac3 Merge branches 'release', 'drop_do_IRQ', 'fix_early_irq', 'misc-2.6.37', 'next-fixes', 'optimize-unwind', 'remove-compat-h' and 'stack_trace' into release 2010-10-12 15:06:59 -07:00
Thomas Gleixner 5c2837fbaa dmar: Convert to new irq chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <dwmw2@infradead.org>
2010-10-12 16:53:37 +02:00
Thomas Gleixner 1c9db52534 pci: Convert msi to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
2010-10-12 16:53:34 +02:00
Tony Luck c75f2aa13f [IA64] Cannot use register_percpu_irq() from ia64_mca_init()
This is called before early_irq_init() which will clobber any
registrations made too early.  Move the calls to ia64_mca_late_init().

Signed-off-by: Tony Luck <tomy.luck@intel.com>
2010-10-07 16:23:34 -07:00
Tony Luck 4de0a75948 [IA64] Initialize interrupts later (from init_IRQ())
Thomas Gleixner is cleaning up the generic irq code, and ia64 ran
into problems because it calls register_intr() before early_irq_init()
is called.  Move the call to acpi_boot_init() from setup_arch() to
init_IRQ().

As a bonus - moving the call later means we no longer need the
hacks in iosapic.c to switch between the bootmem and regular
allocator - we can just used kzalloc() for allocation.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-10-05 15:41:25 -07:00
Tony Luck 5d4bff94f9 [IA64] Stop using the deprecated __do_IRQ() code path
Thomas Gleixner <tglx@linutronix.de> wrote:
>__do_IRQ() has been deprecated after a two years migration phase in
>commit 0e57aa1. Since then another 18 months have gone by ...

Mostly trivial stuff for this. The only tricky part was realizing
that the new handler_*_irq() paths do not use desc->chip->end(irq).
Not a problem for the edge case as the ia64 iosapic routine for
that was nop(). But the "level" case handled interrupt migration
there.  Just use a slightly modified version of the "end" routine
as "unmask" for the level triggered case.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-09-27 13:58:14 -07:00