Commit Graph

18800 Commits

Author SHA1 Message Date
Guennadi Liakhovetski ee81152ff0 V4L/DVB (13647): v4l: Add a 10-bit monochrome and missing 8- and 10-bit Bayer fourcc codes
The 16-bit monochrome fourcc code has been previously abused for a 10-bit
format, add a new 10-bit code instead. Also add missing 8- and 10-bit Bayer
fourcc codes for completeness.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-12-16 09:27:16 -02:00
Andi Kleen facb6011f3 HWPOISON: Add soft page offline support
This is a simpler, gentler variant of memory_failure() for soft page
offlining controlled from user space.  It doesn't kill anything, just
tries to invalidate and if that doesn't work migrate the
page away.

This is useful for predictive failure analysis, where a page has
a high rate of corrected errors, but hasn't gone bad yet. Instead
it can be offlined early and avoided.

The offlining is controlled from sysfs, including a new generic
entry point for hard page offlining for symmetry too.

We use the page isolate facility to prevent re-allocation
race. Normally this is only used by memory hotplug. To avoid
races with memory allocation I am using lock_system_sleep().
This avoids the situation where memory hotplug is about
to isolate a page range and then hwpoison undoes that work.
This is a big hammer currently, but the simplest solution
currently.

When the page is not free or LRU we try to free pages
from slab and other caches. The slab freeing is currently
quite dumb and does not try to focus on the specific slab
cache which might own the page. This could be potentially
improved later.

Thanks to Fengguang Wu and Haicheng Li for some fixes.

[Added fix from Andrew Morton to adapt to new migrate_pages prototype]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:20:00 +01:00
Wu Fengguang d324236b33 memcg: add accessor to mem_cgroup.css
So that an outside user can free the reference count grabbed by
try_get_mem_cgroup_from_page().

CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:59 +01:00
Wu Fengguang e42d9d5d47 memcg: rename and export try_get_mem_cgroup_from_page()
So that the hwpoison injector can get mem_cgroup for arbitrary page
and thus know whether it is owned by some mem_cgroup task(s).

[AK: Merged with latest git tree]

CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:59 +01:00
Wu Fengguang 1a9b5b7fe0 mm: export stable page flags
Rename get_uflags() to stable_page_flags() and make it a global function
for use in the hwpoison page flags filter, which need to compare user
page flags with the value provided by user space.

Also move KPF_* to kernel-page-flags.h for use by user space tools.

Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:59 +01:00
Wu Fengguang 847ce401df HWPOISON: Add unpoisoning support
The unpoisoning interface is useful for stress testing tools to
reclaim poisoned pages (to prevent OOM)

There is no hardware level unpoisioning, so this
cannot be used for real memory errors, only for software injected errors.

Note that it may leak pages silently - those who have been removed from
LRU cache, but not isolated from page cache/swap cache at hwpoison time.
Especially the stress test of dirty swap cache pages shall reboot system
before exhausting memory.

AK: Fix comments, add documentation, add printks, rename symbol

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Andi Kleen 82ba011b90 HWPOISON: Turn ref argument into flags argument
Now that "ref" is just a boolean turn it into
a flags argument. First step is only a single flag
that makes the code's intention more clear, but more
may follow.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Andi Kleen 588f9ce6ca HWPOISON: Be more aggressive at freeing non LRU caches
shake_page handles more types of page caches than lru_drain_all()

- per cpu page allocator pages
- per CPU LRU

Stops early when the page became free.

Used in followon patches.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Len Brown 1a544d28dd Merge branch 'ipmi' into release 2009-12-16 02:20:58 -05:00
David S. Miller bb5b7c1126 tcp: Revert per-route SACK/DSACK/TIMESTAMP changes.
It creates a regression, triggering badness for SYN_RECV
sockets, for example:

[19148.022102] Badness at net/ipv4/inet_connection_sock.c:293
[19148.022570] NIP: c02a0914 LR: c02a0904 CTR: 00000000
[19148.023035] REGS: eeecbd30 TRAP: 0700   Not tainted  (2.6.32)
[19148.023496] MSR: 00029032 <EE,ME,CE,IR,DR>  CR: 24002442  XER: 00000000
[19148.024012] TASK = eee9a820[1756] 'privoxy' THREAD: eeeca000

This is likely caused by the change in the 'estab' parameter
passed to tcp_parse_options() when invoked by the functions
in net/ipv4/tcp_minisocks.c

But even if that is fixed, the ->conn_request() changes made in
this patch series is fundamentally wrong.  They try to use the
listening socket's 'dst' to probe the route settings.  The
listening socket doesn't even have a route, and you can't
get the right route (the child request one) until much later
after we setup all of the state, and it must be done by hand.

This stuff really isn't ready, so the best thing to do is a
full revert.  This reverts the following commits:

f55017a93f
022c3f7d82
1aba721eba
cda42ebd67
345cda2fd6
dc343475ed
05eaade278
6a2a2d6bf8

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-15 20:56:42 -08:00
Muralidharan Karicheri b6456c0cfe V4L/DVB (13571): v4l: Adding Digital Video Timings APIs
This adds the above APIs to the v4l2 core. This is based on version v1.2
of the RFC titled "V4L - Support for video timings at the input/output interface"
Following new ioctls are added:-

        - VIDIOC_ENUM_DV_PRESETS
        - VIDIOC_S_DV_PRESET
        - VIDIOC_G_DV_PRESET
        - VIDIOC_QUERY_DV_PRESET
        - VIDIOC_S_DV_TIMINGS
        - VIDIOC_G_DV_TIMINGS

Please refer to the RFC for the details. This code was tested using vpfe
capture driver on TI's DM365. Following is the test configuration used :-

Blu-Ray HD DVD source -> TVP7002 -> DM365 (VPFE) ->DDR

A draft version of the TVP7002 driver (currently being reviewed in the mailing
list) was used that supports V4L2_DV_1080I60 & V4L2_DV_720P60 presets.

A loopback video capture application was used for testing these APIs. This calls
following IOCTLS :-

 -  verify the new v4l2_input capabilities flag added
 -  Enumerate available presets using VIDIOC_ENUM_DV_PRESETS
 -  Set one of the supported preset using VIDIOC_S_DV_PRESET
 -  Get current preset using VIDIOC_G_DV_PRESET
 -  Detect current preset using VIDIOC_QUERY_DV_PRESET
 -  Using stub functions in tvp7002, verify VIDIOC_S_DV_TIMINGS
    and VIDIOC_G_DV_TIMINGS ioctls are received at the sub device.
 -  Tested on 64bit platform by Hans Verkuil

Signed-off-by: Muralidharan Karicheri <m-karicheri2@ti.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-12-16 00:18:03 -02:00
Bjorn Helgaas 9065ce4500 PNP: add interface to retrieve ACPI device from a PNPACPI device
Add pnp_acpi_device(pnp_dev), which takes a PNP device and returns the
associated ACPI device (or NULL, if the device is not a PNPACPI device).

This allows us to write a PNP driver that can manage both traditional
PNPBIOS and ACPI devices, treating ACPI-only functionality as an optional
extension.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-15 17:35:26 -05:00
Phillip Lougher c1e7c3ae59 bzip2/lzma/gzip: pre-boot malloc doesn't return NULL on failure
The trivial malloc implementation used in the pre-boot environment by the
decompressors returns a bad pointer on failure (falling through after
calling error).  This is doubly wrong - the callers expect malloc to
return NULL on failure, second the error function is intended to be
used by the decompressors to propagate errors to *their* callers.  The
decompressors have no access to any state set by the error function.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
LKML-Reference: <4b26b1ef.hIInb2AYPMtImAJO%phillip@lougher.demon.co.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-15 14:04:12 -08:00
J. Bruce Fields 1557aca790 nfsd: move most of nfsfh.h to fs/nfsd
Most of this can be trivially moved to a private header as well.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:46 -05:00
J. Bruce Fields c7af6b0895 nfsd: remove unused field rq_reffh
This field is never referenced anywhere else.  I don't know what it was
intended for.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:46 -05:00
J. Bruce Fields 3d8986c758 nfsd: enable V4ROOT exports
With the v4root option now enforced everywhere it should be, it is safe
to advertise support for it to mountd.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:45 -05:00
Arjan van de Ven f251177486 PM: Add initcall_debug style timing for suspend/resume
In order to diagnose overall suspend/resume times, we need
basic instrumentation to break down the total time into per
device timing, similar to initcall_debug.

This patch adds the basic timing instrumentation, needed
for a scritps/bootgraph.pl equivalent or humans.
The bootgraph.pl program is still a work in progress, but
is far enough along to know that this patch is sufficient.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-15 20:42:06 +01:00
Peter Zijlstra f13c12c634 perf_events: Fix perf_event_attr layout
The miss-alignment of bp_addr created a 32bit hole, causing
different structure packings on 32 and 64 bit machines.

Fix that by moving __reserve_2 into that hole.

Further, remove the useless struct and redundant __bp_reserve
muck.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1260902591.8023.781.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 20:12:20 +01:00
Steve Dickson eb4c86c6a5 nfsd: introduce export flag for v4 pseudoroot
NFSv4 differs from v2 and v3 in that it presents a single unified
filesystem tree, whereas v2 and v3 exported multiple filesystem (whose
roots could be found using a separate mount protocol).

Our original NFSv4 server implementation asked the administrator to
designate a single filesystem as the NFSv4 root, then to mount
filesystems they wished to export underneath.  (Often using bind mounts
of already-existing filesystems.)

This was conceptually simple, and allowed easy implementation, but
created a serious obstacle to upgrading between v2/v3: since the paths
to v4 filesystems were different, administrators would have to adjust
all the paths in client-side mount commands when switching to v4.

Various workarounds are possible.  For example, the administrator could
export "/" and designate it as the v4 root.  However, the security risks
of that approach are obvious, and in any case we shouldn't be requiring
the administrator to take extra steps to fix this problem; instead, the
server should present consistent paths across different versions by
default.

These patches take a modified version of that approach: we provide a new
export option which exports only a subset of a filesystem.  With this
flag, it becomes safe for mountd to export "/" by default, with no need
for additional configuration.

We begin just by defining the new flag.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 14:00:40 -05:00
Alexandros Batsakis cf3b01b548 rpc: add a new priority in RPC task
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:53:54 -05:00
Alexandros Batsakis 48f1861242 rpc: add rpc_queue_empty function
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:51:17 -05:00
Linus Torvalds 53365383c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (80 commits)
  dm snapshot: use merge origin if snapshot invalid
  dm snapshot: report merge failure in status
  dm snapshot: merge consecutive chunks together
  dm snapshot: trigger exceptions in remaining snapshots during merge
  dm snapshot: delay merging a chunk until writes to it complete
  dm snapshot: queue writes to chunks being merged
  dm snapshot: add merging
  dm snapshot: permit only one merge at once
  dm snapshot: support barriers in snapshot merge target
  dm snapshot: avoid allocating exceptions in merge
  dm snapshot: rework writing to origin
  dm snapshot: add merge target
  dm exception store: add merge specific methods
  dm snapshot: create function for chunk_is_tracked wait
  dm snapshot: make bio optional in __origin_write
  dm mpath: reject messages when device is suspended
  dm: export suspended state to targets
  dm: rename dm_suspended to dm_suspended_md
  dm: swap target postsuspend call and setting suspended flag
  dm crypt: add plain64 iv
  ...
2009-12-15 09:12:01 -08:00
Linus Torvalds 8f0ddf91f2 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  clockevents: Convert to raw_spinlock
  clockevents: Make tick_device_lock static
  debugobjects: Convert to raw_spinlocks
  perf_event: Convert to raw_spinlock
  hrtimers: Convert to raw_spinlocks
  genirq: Convert irq_desc.lock to raw_spinlock
  smp: Convert smplocks to raw_spinlocks
  rtmutes: Convert rtmutex.lock to raw_spinlock
  sched: Convert pi_lock to raw_spinlock
  sched: Convert cpupri lock to raw_spinlock
  sched: Convert rt_runtime_lock to raw_spinlock
  sched: Convert rq->lock to raw_spinlock
  plist: Make plist debugging raw_spinlock aware
  bkl: Fixup core_lock fallout
  locking: Cleanup the name space completely
  locking: Further name space cleanups
  alpha: Fix fallout from locking changes
  locking: Implement new raw_spinlock
  locking: Convert raw_rwlock functions to arch_rwlock
  locking: Convert raw_rwlock to arch_rwlock
  ...
2009-12-15 09:02:01 -08:00
Linus Torvalds 48e902f0a3 Merge git://git.infradead.org/battery-2.6
* git://git.infradead.org/battery-2.6:
  power_supply_sysfs: Handle -ENODATA in a special way
  wm831x_backup: Remove unused variables
  gta02: Set pcf50633 charger_reference_current_ma
  pcf50633: Query charger status directly
  pcf50633: Properly reenable charging when the supply conditions change
  pcf50633: Get rid of charging restart software auto-triggering
  pcf50633: introduces battery charging current control
  pcf50633: Add ac power supply class to the charger
  wm831x: Factor out WM831x backup battery charger
2009-12-15 08:59:33 -08:00
Samu Onkalo 2db4a76d5f lis3: selftest support
Implement selftest feature as specified by chip manufacturer.  Control:
read selftest sysfs entry

Response: "OK x y z" or "FAIL x y z"

where x, y, and z are difference between selftest mode and normal mode.
Test is passed when values are within acceptance limit values.

Acceptance limits are provided via platform data.  See chip spesifications
for acceptance limits.  If limits are not properly set, OK / FAIL decision
is meaningless.  However, userspace application can still make decision
based on the numeric x, y, z values.

Selftest is meant for HW diagnostic purposes.  It is not meant to be
called during normal use of the chip.  It may cause false interrupt
events.  Selftest mode delays polling of the normal results but it doesn't
cause wrong values.  Chip must be in static state during selftest.  Any
acceration during the test causes most probably failure.

Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com>
Acked-by: Éric Piel <Eric.Piel@tremplin-utc.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:36 -08:00
Samu Onkalo e40d6eaa79 lis3lv02d: axis remap and resource setup/release
Add the possibility to remap axes via platform data.  Function pointers
for resource setup and release purposes

Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com>
Acked-by: Éric Piel <eric.piel@tremplin-utc.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Jean Delvare <khali@linux-fr.org>
Cc:  "Trisal, Kalhan" <kalhan.trisal@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:35 -08:00
Nicolas Ferre 2635d1ba71 atmel-mci: change use of dma slave interface
Allow the use of another DMA controller driver in atmel-mci sd/mmc driver.
 This adds a generic dma_slave pointer to the mci platform structure where
we can store DMA controller information.  In atmel-mci we use information
provided by this structure to initialize the driver (with new helper
functions that are architecture dependant).

This also adds at32/avr32 chip modifications to cope with this new access
method.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:35 -08:00
KOSAKI Motohiro ca54cb8c9e Subject: Re: [PATCH] strstrip incorrectly marked __must_check
Recently, We marked strstrip() as must_check.  because it was frequently
misused and it should be checked.  However, we found one exception.
scsi/ipr.c intentionally ignore return value of strstrip.  Because it
wishes to keep the whitespace at the beginning.

Thus we need to keep with and without checked whitespace trim function.
This patch adds a new strim() and changes ipr.c to use it.

[akpm@linux-foundation.org: coding-style fixes]
Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:34 -08:00
Joe Perches 925ede0bf4 efi.h: use %pUl to print UUIDs
Shrinks vmlinux

without:
$ size vmlinux
   text    data     bss     dec     hex filename
6975863  679652 1359668 9015183  898f8f vmlinux

with:
$ size vmlinux
   text    data     bss     dec     hex filename
6975639 679652 1359668 9014959 898eaf vmlinux

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Jeff Garzik <jgarzik@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:33 -08:00
André Goddard Rosa f653398c86 string: factorize skip_spaces and export it to be generally available
On the following sentence:
    while (*s && isspace(*s))
        s++;

If *s == 0, isspace() evaluates to ((_ctype[*s] & 0x20) != 0), which
evaluates to ((0x08 & 0x20) != 0) which equals to 0 as well.
If *s == 1, we depend on isspace() result anyway. In other words,
"a char equals zero is never a space", so remove this check.

Also, *s != 0 is most common case (non-null string).

Fixed const return as noticed by Jan Engelhardt and James Bottomley.
Fixed unnecessary extra cast on strstrip() as noticed by Jan Engelhardt.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:32 -08:00
André Goddard Rosa 7707e61c70 ctype: constify read-only _ctype string
While at it, use tabs to indent the comments.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:32 -08:00
Bernhard Walle 5ada918b82 vt: introduce and use vt_kmsg_redirect() function
The kernel offers with TIOCL_GETKMSGREDIRECT ioctl() the possibility to
redirect the kernel messages to a specific console.

However, since it's not possible to switch to the kernel message console
after a panic(), it would be nice if the kernel would print the panic
message on the current console.

This patch series adds a new interface to access the global kmsg_redirect
variable by a function to be able to use it in code where
CONFIG_VT_CONSOLE is not set (kernel/panic.c).

This patch:

Instead of using and exporting a global value kmsg_redirect, introduce a
function vt_kmsg_redirect() that both can set and return the console where
messages are printed.

Change all users of kmsg_redirect (the VT code itself and kernel/power.c)
to the new interface.

The main advantage is that vt_kmsg_redirect() can also be used when
CONFIG_VT_CONSOLE is not set.

Signed-off-by: Bernhard Walle <bernhard@bwalle.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon f3a57a60d3 cs5535: define lxfb/gxfb MSRs in linux/cs5535.h
..and include them in the lxfb/gxfb drivers rather than asm/geode.h (where
possible).

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon f060f27007 cs5535: move VSA2 checks into linux/cs5535.h
Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon 2e8c12436f cs5535: move the DIVIL MSR definition into linux/cs5535.h
The only thing that uses this is the reboot_fixups code.

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon 82dca611bb cs5535: add a generic MFGPT driver
This is based on the old code on arch/x86/kernel/mfgpt_32.c, except it's
not x86 specific, it's modular, and it makes use of a PCI BAR rather than
a random MSR.  Currently module unloading is not supported; it's uncertain
whether or not it can be made work with the hardware.

[akpm@linux-foundation.org: add X86 dependency]
Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon 5f0a96b044 cs5535-gpio: add AMD CS5535/CS5536 GPIO driver support
This creates a CS5535/CS5536 GPIO driver which uses a gpio_chip backend
(allowing GPIO users to use the generic GPIO API if desired) while also
allowing architecture-specific users directly (via the cs5535_gpio_*
functions).

Tested on an OLPC machine.  Some Leemotes also use CS5536 (with a mips
cpu), which is why this is in drivers/gpio rather than arch/x86.
Currently, it conflicts with older geode GPIO support; once MFGPT support
is reworked to also be more generic, the older geode code will be removed.

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: David Brownell <david-b@pacbell.net>
Reviewed-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:27 -08:00
Phil Carmody 603c4ba96b err.h: add helper function to simplify pointer error checking
There are quite a few instances in the kernel of checks of pointers both
against NULL and against the errno range, handling both cases identically.
This additional helper function would simplify such code.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:27 -08:00
Hiroshi Shimamoto e4c570c4cb task_struct: make journal_info conditional
journal_info in task_struct is used in journaling file system only.  So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:27 -08:00
Joe Perches 8a64f336bc kernel.h: add printk_ratelimited and pr_<level>_rl
Add a printk_ratelimited statement expression macro that uses a per-call
ratelimit_state so that multiple subsystems output messages are not
suppressed by a global __ratelimit state.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: s/_rl/_ratelimited/g]
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Naohiro Ooiwa <nooiwa@miraclelinux.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:26 -08:00
Amerigo Wang 948c1e2521 kallsyms: remove deprecated print_fn_descriptor_symbol()
According to feature-removal-schedule.txt, it is the time to remove
print_fn_descriptor_symbol().

And a quick grep shows that it no longer has any callers.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:26 -08:00
Amerigo Wang 29671f22a8 rwsem: fix rwsem_is_locked() bugs
rwsem_is_locked() tests ->activity without locks, so we should always keep
->activity consistent.  However, the code in __rwsem_do_wake() breaks this
rule, it updates ->activity after _all_ readers waken up, this may give
some reader a wrong ->activity value, thus cause rwsem_is_locked() behaves
wrong.

Quote from Andrew:

"
- we have one or more processes sleeping in down_read(), waiting for access.

- we wake one or more processes up without altering ->activity

- they start to run and they do rwsem_is_locked().  This incorrectly
  returns "false", because the waker process is still crunching away in
  __rwsem_do_wake().

- the waker now alters ->activity, but it was too late.
"

So we need get a spinlock to protect this.  And rwsem_is_locked() should
not block, thus we use spin_trylock_irqsave().

[akpm@linux-foundation.org: simplify code]
Reported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Cc: Ben Woodard <bwoodard@llnl.gov>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:26 -08:00
Joe Perches 0b2749aa6c kernel.h: remove initialization of bool in printk_once
Don't initialize __print_once.  Invert the test to reduce initialized
data.

defconfig before:	$size vmlinux
   text	   data	    bss	    dec	    hex	filename
6976022	 679572	1359668	9015262	 898fde	vmlinux

defconfig after:	$size vmlinux
   text	   data	    bss	    dec	    hex	filename
6976006	 679508	1359700	9015214	 898fae	vmlinux

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:26 -08:00
Joe Perches 00b55864bb dynamic_debug.h/kernel.h: Remove KBUILD_MODNAME from dynamic_pr_debug
If CONFIG_DYNAMIC_DEBUG is enabled and a source file has:

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/kernel.h>

dynamic_debug.h will duplicate KBUILD_MODNAME
in the output string.

Remove the use of KBUILD_MODNAME from the
output format string generated by dynamic_debug.h

If CONFIG_DYNAMIC_DEBUG is not enabled, no compile-time
check is done to printk/dev_printk arguments.

Add it.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:25 -08:00
Alexey Dobriyan 471452104b const: constify remaining dev_pm_ops
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:25 -08:00
Naoya Horiguchi 5dc37642cb mm hugetlb: add hugepage support to pagemap
This patch enables extraction of the pfn of a hugepage from
/proc/pid/pagemap in an architecture independent manner.

Details
-------
My test program (leak_pagemap) works as follows:
 - creat() and mmap() a file on hugetlbfs (file size is 200MB == 100 hugepages,)
 - read()/write() something on it,
 - call page-types with option -p,
 - munmap() and unlink() the file on hugetlbfs

Without my patches
------------------
$ ./leak_pagemap
             flags page-count       MB  symbolic-flags                     long-symbolic-flags
0x0000000000000000          1        0  __________________________________
0x0000000000000804          1        0  __R________M______________________ referenced,mmap
0x000000000000086c         81        0  __RU_lA____M______________________ referenced,uptodate,lru,active,mmap
0x0000000000005808          5        0  ___U_______Ma_b___________________ uptodate,mmap,anonymous,swapbacked
0x0000000000005868         12        0  ___U_lA____Ma_b___________________ uptodate,lru,active,mmap,anonymous,swapbacked
0x000000000000586c          1        0  __RU_lA____Ma_b___________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
             total        101        0

The output of page-types don't show any hugepage.

With my patches
---------------
$ ./leak_pagemap
             flags page-count       MB  symbolic-flags                     long-symbolic-flags
0x0000000000000000          1        0  __________________________________
0x0000000000030000      51100      199  ________________TG________________ compound_tail,huge
0x0000000000028018        100        0  ___UD__________H_G________________ uptodate,dirty,compound_head,huge
0x0000000000000804          1        0  __R________M______________________ referenced,mmap
0x000000000000080c          1        0  __RU_______M______________________ referenced,uptodate,mmap
0x000000000000086c         80        0  __RU_lA____M______________________ referenced,uptodate,lru,active,mmap
0x0000000000005808          4        0  ___U_______Ma_b___________________ uptodate,mmap,anonymous,swapbacked
0x0000000000005868         12        0  ___U_lA____Ma_b___________________ uptodate,lru,active,mmap,anonymous,swapbacked
0x000000000000586c          1        0  __RU_lA____Ma_b___________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
             total      51300      200

The output of page-types shows 51200 pages contributing to hugepages,
containing 100 head pages and 51100 tail pages as expected.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:24 -08:00
Huang Shijie f096e59e84 include/linux/mm.h: remove unneeded ifdef
The check code for CONFIG_SWAP is redundant, because there is a
non-CONFIG_SWAP version for PageSwapCache() which just returns 0.

Signed-off-by: Huang Shijie <shijie8@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:22 -08:00
Andrew Morton b4e655a4aa mm: memory_hotplug: make offline_pages() static
It has no references outside memory_hotplug.c.

Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:20 -08:00
Hugh Dickins 62b61f611e ksm: memory hotremove migration only
The previous patch enables page migration of ksm pages, but that soon gets
into trouble: not surprising, since we're using the ksm page lock to lock
operations on its stable_node, but page migration switches the page whose
lock is to be used for that.  Another layer of locking would fix it, but
do we need that yet?

Do we actually need page migration of ksm pages?  Yes, memory hotremove
needs to offline sections of memory: and since we stopped allocating ksm
pages with GFP_HIGHUSER, they will tend to be GFP_HIGHUSER_MOVABLE
candidates for migration.

But KSM is currently unconscious of NUMA issues, happily merging pages
from different NUMA nodes: at present the rule must be, not to use
MADV_MERGEABLE where you care about NUMA.  So no, NUMA page migration of
ksm pages does not make sense yet.

So, to complete support for ksm swapping we need to make hotremove safe.
ksm_memory_callback() take ksm_thread_mutex when MEM_GOING_OFFLINE and
release it when MEM_OFFLINE or MEM_CANCEL_OFFLINE.  But if mapped pages
are freed before migration reaches them, stable_nodes may be left still
pointing to struct pages which have been removed from the system: the
stable_node needs to identify a page by pfn rather than page pointer, then
it can safely prune them when MEM_OFFLINE.

And make NUMA migration skip PageKsm pages where it skips PageReserved.
But it's only when we reach unmap_and_move() that the page lock is taken
and we can be sure that raised pagecount has prevented a PageAnon from
being upgraded: so add offlining arg to migrate_pages(), to migrate ksm
page when offlining (has sufficient locking) but reject it otherwise.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wright <chrisw@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:20 -08:00
Hugh Dickins e9995ef978 ksm: rmap_walk to remove_migation_ptes
A side-effect of making ksm pages swappable is that they have to be placed
on the LRUs: which then exposes them to isolate_lru_page() and hence to
page migration.

Add rmap_walk() for remove_migration_ptes() to use: rmap_walk_anon() and
rmap_walk_file() in rmap.c, but rmap_walk_ksm() in ksm.c.  Perhaps some
consolidation with existing code is possible, but don't attempt that yet
(try_to_unmap needs to handle nonlinears, but migration pte removal does
not).

rmap_walk() is sadly less general than it appears: rmap_walk_anon(), like
remove_anon_migration_ptes() which it replaces, avoids calling
page_lock_anon_vma(), because that includes a page_mapped() test which
fails when all migration ptes are in place.  That was valid when NUMA page
migration was introduced (holding mmap_sem provided the missing guarantee
that anon_vma's slab had not already been destroyed), but I believe not
valid in the memory hotremove case added since.

For now do the same as before, and consider the best way to fix that
unlikely race later on.  When fixed, we can probably use rmap_walk() on
hwpoisoned ksm pages too: for now, they remain among hwpoison's various
exceptions (its PageKsm test comes before the page is locked, but its
page_lock_anon_vma fails safely if an anon gets upgraded).

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wright <chrisw@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:20 -08:00
Hugh Dickins db114b83ab ksm: hold anon_vma in rmap_item
For full functionality, page_referenced_one() and try_to_unmap_one() need
to know the vma: to pass vma down to arch-dependent flushes, or to observe
VM_LOCKED or VM_EXEC.  But KSM keeps no record of vma: nor can it, since
vmas get split and merged without its knowledge.

Instead, note page's anon_vma in its rmap_item when adding to stable tree:
all the vmas which might map that page are listed by its anon_vma.

page_referenced_ksm() and try_to_unmap_ksm() then traverse the anon_vma,
first to find the probable vma, that which matches rmap_item's mm; but if
that is not enough to locate all instances, traverse again to try the
others.  This catches those occasions when fork has duplicated a pte of a
ksm page, but ksmd has not yet come around to assign it an rmap_item.

But each rmap_item in the stable tree which refers to an anon_vma needs to
take a reference to it.  Andrea's anon_vma design cleverly avoided a
reference count (an anon_vma was free when its list of vmas was empty),
but KSM now needs to add that.  Is a 32-bit count sufficient?  I believe
so - the anon_vma is only free when both count is 0 and list is empty.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wright <chrisw@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:19 -08:00
Hugh Dickins 5ad6468801 ksm: let shared pages be swappable
Initial implementation for swapping out KSM's shared pages: add
page_referenced_ksm() and try_to_unmap_ksm(), which rmap.c calls when
faced with a PageKsm page.

Most of what's needed can be got from the rmap_items listed from the
stable_node of the ksm page, without discovering the actual vma: so in
this patch just fake up a struct vma for page_referenced_one() or
try_to_unmap_one(), then refine that in the next patch.

Add VM_NONLINEAR to ksm_madvise()'s list of exclusions: it has always been
implicit there (being only set with VM_SHARED, already excluded), but
let's make it explicit, to help justify the lack of nonlinear unmap.

Rely on the page lock to protect against concurrent modifications to that
page's node of the stable tree.

The awkward part is not swapout but swapin: do_swap_page() and
page_add_anon_rmap() now have to allow for new possibilities - perhaps a
ksm page still in swapcache, perhaps a swapcache page associated with one
location in one anon_vma now needed for another location or anon_vma.
(And the vma might even be no longer VM_MERGEABLE when that happens.)

ksm_might_need_to_copy() checks for that case, and supplies a duplicate
page when necessary, simply leaving it to a subsequent pass of ksmd to
rediscover the identity and merge them back into one ksm page.
Disappointingly primitive: but the alternative would have to accumulate
unswappable info about the swapped out ksm pages, limiting swappability.

Remove page_add_ksm_rmap(): page_add_anon_rmap() now has to allow for the
particular case it was handling, so just use it instead.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wright <chrisw@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:19 -08:00
Hugh Dickins 08beca44df ksm: stable_node point to page and back
Add a pointer to the ksm page into struct stable_node, holding a reference
to the page while the node exists.  Put a pointer to the stable_node into
the ksm page's ->mapping.

Then we don't need get_ksm_page() while traversing the stable tree: the
page to compare against is sure to be present and correct, even if it's no
longer visible through any of its existing rmap_items.

And we can handle the forked ksm page case more efficiently: no need to
memcmp our way through the tree to find its match.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:19 -08:00
Hugh Dickins af8e3354b4 mm: CONFIG_MMU for PG_mlocked
Remove three degrees of obfuscation, left over from when we had
CONFIG_UNEVICTABLE_LRU.  MLOCK_PAGES is CONFIG_HAVE_MLOCKED_PAGE_BIT is
CONFIG_HAVE_MLOCK is CONFIG_MMU.  rmap.o (and memory-failure.o) are only
built when CONFIG_MMU, so don't need such conditions at all.

Somehow, I feel no compulsion to remove the CONFIG_HAVE_MLOCK* lines from
169 defconfigs: leave those to evolve in due course.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:17 -08:00
Hugh Dickins 3ca7b3c5b6 mm: define PAGE_MAPPING_FLAGS
At present we define PageAnon(page) by the low PAGE_MAPPING_ANON bit set
in page->mapping, with the higher bits a pointer to the anon_vma; and have
defined PageKsm(page) as that with NULL anon_vma.

But KSM swapping will need to store a pointer there: so in preparation for
that, now define PAGE_MAPPING_FLAGS as the low two bits, including
PAGE_MAPPING_KSM (always set along with PAGE_MAPPING_ANON, until some
other use for the bit emerges).

Declare page_rmapping(page) to return the pointer part of page->mapping,
and page_anon_vma(page) to return the anon_vma pointer when that's what it
is.  Use these in a few appropriate places: notably, unuse_vma() has been
testing page->mapping, but is better to be testing page_anon_vma() (cases
may be added in which flag bits are set without any pointer).

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:17 -08:00
KOSAKI Motohiro bb3ab59683 vmscan: stop kswapd waiting on congestion when the min watermark is not being met
If reclaim fails to make sufficient progress, the priority is raised.
Once the priority is higher, kswapd starts waiting on congestion.
However, if the zone is below the min watermark then kswapd needs to
continue working without delay as there is a danger of an increased rate
of GFP_ATOMIC allocation failure.

This patch changes the conditions under which kswapd waits on congestion
by only going to sleep if the min watermarks are being met.

[mel@csn.ul.ie: add stats to track how relevant the logic is]
[mel@csn.ul.ie: make kswapd only check its own zones and rename the relevant counters]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:16 -08:00
Mel Gorman f50de2d381 vmscan: have kswapd sleep for a short interval and double check it should be asleep
After kswapd balances all zones in a pgdat, it goes to sleep.  In the
event of no IO congestion, kswapd can go to sleep very shortly after the
high watermark was reached.  If there are a constant stream of allocations
from parallel processes, it can mean that kswapd went to sleep too quickly
and the high watermark is not being maintained for sufficient length time.

This patch makes kswapd go to sleep as a two-stage process.  It first
tries to sleep for HZ/10.  If it is woken up by another process or the
high watermark is no longer met, it's considered a premature sleep and
kswapd continues work.  Otherwise it goes fully to sleep.

This adds more counters to distinguish between fast and slow breaches of
watermarks.  A "fast" premature sleep is one where the low watermark was
hit in a very short time after kswapd going to sleep.  A "slow" premature
sleep indicates that the high watermark was breached after a very short
interval.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Frans Pop <elendil@planet.nl>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:16 -08:00
Lee Schermerhorn d4906e1aa5 swap: rework map_swap_page() again
Seems that page_io.c doesn't really need to know that page_private(page)
is the swp_entry 'val'.  Rework map_swap_page() to do what its name says
and map a page to a page offset in the swap space.

The only other caller of map_swap_page() is internal to mm/swapfile.c and
it does want to map a swap entry to the 'sector'.  So rename
map_swap_page() to map_swap_entry(), make it 'static' and and implement
map_swap_page() as a wrapper around that.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:16 -08:00
Hugh Dickins 7509765a29 swap_info: reorder its fields
Reorder (and comment) the fields of swap_info_struct, to make better
use of its cachelines: it's good for swap_duplicate() in particular
if unsigned int max and swap_map are near the start.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:16 -08:00
Hugh Dickins aaa468653b swap_info: note SWAP_MAP_SHMEM
While we're fiddling with the swap_map values, let's assign a particular
value to shmem/tmpfs swap pages: their swap counts are never incremented,
and it helps swapoff's try_to_unuse() a little if it can immediately
distinguish those pages from process pages.

Since we've no use for SWAP_MAP_BAD | COUNT_CONTINUED,
we might as well use that 0xbf value for SWAP_MAP_SHMEM.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:16 -08:00
Hugh Dickins 570a335b8e swap_info: swap count continuations
Swap is duplicated (reference count incremented by one) whenever the same
swap page is inserted into another mm (when forking finds a swap entry in
place of a pte, or when reclaim unmaps a pte to insert the swap entry).

swap_info_struct's vmalloc'ed swap_map is the array of these reference
counts: but what happens when the unsigned short (or unsigned char since
the preceding patch) is full? (and its high bit is kept for a cache flag)

We then lose track of it, never freeing, leaving it in use until swapoff:
at which point we _hope_ that a single pass will have found all instances,
assume there are no more, and will lose user data if we're wrong.

Swapping of KSM pages has not yet been enabled; but it is implemented,
and makes it very easy for a user to overflow the maximum swap count:
possible with ordinary process pages, but unlikely, even when pid_max
has been raised from PID_MAX_DEFAULT.

This patch implements swap count continuations: when the count overflows,
a continuation page is allocated and linked to the original vmalloc'ed
map page, and this used to hold the continuation counts for that entry
and its neighbours.  These continuation pages are seldom referenced:
the common paths all work on the original swap_map, only referring to
a continuation page when the low "digit" of a count is incremented or
decremented through SWAP_MAP_MAX.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:15 -08:00
Hugh Dickins 8d69aaee80 swap_info: swap_map of chars not shorts
Halve the vmalloc'ed swap_map array from unsigned shorts to unsigned
chars: it's still very unusual to reach a swap count of 126, and the
next patch allows it to be extended indefinitely.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:15 -08:00
Hugh Dickins 253d553ba7 swap_info: SWAP_HAS_CACHE cleanups
Though swap_count() is useful, I'm finding that swap_has_cache() and
encode_swapmap() obscure what happens in the swap_map entry, just at
those points where I need to understand it.  Remove them, and pass
more usable "usage" values to scan_swap_map(), swap_entry_free() and
__swap_duplicate(), instead of the SWAP_MAP and SWAP_CACHE enum.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:15 -08:00
Hugh Dickins 9625a5f289 swap_info: include first_swap_extent
Make better use of the space by folding first swap_extent into its
swap_info_struct, instead of just the list_head: swap partitions need
only that one, and for others it's used as a circular list anyway.

[jirislaby@gmail.com: fix crash on double swapon]
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:15 -08:00
Hugh Dickins efa90a981b swap_info: change to array of pointers
The swap_info_struct is only 76 or 104 bytes, but it does seem wrong
to reserve an array of about 30 of them in bss, when most people will
want only one.  Change swap_info[] to an array of pointers.

That does need a "type" field in the structure: pack it as a char with
next type and short prio (aha, char is unsigned by default on PowerPC).
Use the (admittedly peculiar) name "type" throughout for this index.

/proc/swaps does not take swap_lock: I wouldn't want it to, but do take
care with barriers when adding a new item to the array (never removed).

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:15 -08:00
Hugh Dickins f29ad6a99b swap_info: private to swapfile.c
The swap_info_struct is mostly private to mm/swapfile.c, with only
one other in-tree user: get_swap_bio().  Adjust its interface to
map_swap_page(), so that we can then remove get_swap_info_struct().

But there is a popular user out-of-tree, TuxOnIce: so leave the
declaration of swap_info_struct in linux/swap.h.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Nigel Cunningham <ncunningham@crca.org.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:13 -08:00
David Rientjes bad44b5be8 mm: add gfp flags for NODEMASK_ALLOC slab allocations
Objects passed to NODEMASK_ALLOC() are relatively small in size and are
backed by slab caches that are not of large order, traditionally never
greater than PAGE_ALLOC_COSTLY_ORDER.

Thus, using GFP_KERNEL for these allocations on large machines when
CONFIG_NODES_SHIFT > 8 will cause the page allocator to loop endlessly in
the allocation attempt, each time invoking both direct reclaim or the oom
killer.

This is of particular interest when using NODEMASK_ALLOC() from a
mempolicy context (either directly in mm/mempolicy.c or the mempolicy
constrained hugetlb allocations) since the oom killer always kills current
when allocations are constrained by mempolicies.  So for all present use
cases in the kernel, current would end up being oom killed when direct
reclaim fails.  That would allow the NODEMASK_ALLOC() to succeed but
current would have sacrificed itself upon returning.

This patch adds gfp flags to NODEMASK_ALLOC() to pass to kmalloc() on
CONFIG_NODES_SHIFT > 8; this parameter is a nop on other configurations.
All current use cases either directly from hugetlb code or indirectly via
NODEMASK_SCRATCH() union __GFP_NORETRY to avoid direct reclaim and the oom
killer when the slab allocator needs to allocate additional pages.

The side-effect of this change is that all current use cases of either
NODEMASK_ALLOC() or NODEMASK_SCRATCH() need appropriate -ENOMEM handling
when the allocation fails (never for CONFIG_NODES_SHIFT <= 8).  All
current use cases were audited and do have appropriate error handling at
this time.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:13 -08:00
Lee Schermerhorn 39da08cb07 hugetlb: offload per node attribute registrations
Offload the registration and unregistration of per node hstate sysfs
attributes to a worker thread rather than attempt the
allocation/attachment or detachment/freeing of the attributes in the
context of the memory hotplug handler.

I don't know that this is absolutely required, but the registration can
sleep in allocations and other mem hot plug handlers do it this way.  If
it turns out this is NOT required, we can drop this patch.

N.B.,  Only tested build, boot, libhugetlbfs regression.
       i.e., no memory hotplug testing.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:13 -08:00
David Rientjes 8fe23e0571 mm: clear node in N_HIGH_MEMORY and stop kswapd when all memory is offlined
When memory is hot-removed, its node must be cleared in N_HIGH_MEMORY if
there are no present pages left.

In such a situation, kswapd must also be stopped since it has nothing left
to do.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:13 -08:00
Lee Schermerhorn 9a30523066 hugetlb: add per node hstate attributes
Add the per huge page size control/query attributes to the per node
sysdevs:

/sys/devices/system/node/node<ID>/hugepages/hugepages-<size>/
	nr_hugepages       - r/w
	free_huge_pages    - r/o
	surplus_huge_pages - r/o

The patch attempts to re-use/share as much of the existing global hstate
attribute initialization and handling, and the "nodes_allowed" constraint
processing as possible.

Calling set_max_huge_pages() with no node indicates a change to global
hstate parameters.  In this case, any non-default task mempolicy will be
used to generate the nodes_allowed mask.  A valid node id indicates an
update to that node's hstate parameters, and the count argument specifies
the target count for the specified node.  From this info, we compute the
target global count for the hstate and construct a nodes_allowed node mask
contain only the specified node.

Setting the node specific nr_hugepages via the per node attribute
effectively ignores any task mempolicy or cpuset constraints.

With this patch:

(me):ls /sys/devices/system/node/node0/hugepages/hugepages-2048kB
./  ../  free_hugepages  nr_hugepages  surplus_hugepages

Starting from:
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:     0
Node 2 HugePages_Free:      0
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0
vm.nr_hugepages = 0

Allocate 16 persistent huge pages on node 2:
(me):echo 16 >/sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages

[Note that this is equivalent to:
	numactl -m 2 hugeadmin --pool-pages-min 2M:+16
]

Yields:
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:    16
Node 2 HugePages_Free:     16
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0
vm.nr_hugepages = 16

Global controls work as expected--reduce pool to 8 persistent huge pages:
(me):echo 8 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:     8
Node 2 HugePages_Free:      8
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:12 -08:00
Lee Schermerhorn 4e25b2576e hugetlb: add generic definition of NUMA_NO_NODE
Move definition of NUMA_NO_NODE from ia64 and x86_64 arch specific headers
to generic header 'linux/numa.h' for use in generic code.  NUMA_NO_NODE
replaces bare '-1' where it's used in this series to indicate "no node id
specified".  Ultimately, it can be used to replace the -1 elsewhere where
it is used similarly.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:12 -08:00
Lee Schermerhorn 06808b0827 hugetlb: derive huge pages nodes allowed from task mempolicy
This patch derives a "nodes_allowed" node mask from the numa mempolicy of
the task modifying the number of persistent huge pages to control the
allocation, freeing and adjusting of surplus huge pages when the pool page
count is modified via the new sysctl or sysfs attribute
"nr_hugepages_mempolicy".  The nodes_allowed mask is derived as follows:

* For "default" [NULL] task mempolicy, a NULL nodemask_t pointer
  is produced.  This will cause the hugetlb subsystem to use
  node_online_map as the "nodes_allowed".  This preserves the
  behavior before this patch.
* For "preferred" mempolicy, including explicit local allocation,
  a nodemask with the single preferred node will be produced.
  "local" policy will NOT track any internode migrations of the
  task adjusting nr_hugepages.
* For "bind" and "interleave" policy, the mempolicy's nodemask
  will be used.
* Other than to inform the construction of the nodes_allowed node
  mask, the actual mempolicy mode is ignored.  That is, all modes
  behave like interleave over the resulting nodes_allowed mask
  with no "fallback".

See the updated documentation [next patch] for more information
about the implications of this patch.

Examples:

Starting with:

	Node 0 HugePages_Total:     0
	Node 1 HugePages_Total:     0
	Node 2 HugePages_Total:     0
	Node 3 HugePages_Total:     0

Default behavior [with or without this patch] balances persistent
hugepage allocation across nodes [with sufficient contiguous memory]:

	sysctl vm.nr_hugepages[_mempolicy]=32

yields:

	Node 0 HugePages_Total:     8
	Node 1 HugePages_Total:     8
	Node 2 HugePages_Total:     8
	Node 3 HugePages_Total:     8

Of course, we only have nr_hugepages_mempolicy with the patch,
but with default mempolicy, nr_hugepages_mempolicy behaves the
same as nr_hugepages.

Applying mempolicy--e.g., with numactl [using '-m' a.k.a.
'--membind' because it allows multiple nodes to be specified
and it's easy to type]--we can allocate huge pages on
individual nodes or sets of nodes.  So, starting from the
condition above, with 8 huge pages per node, add 8 more to
node 2 using:

	numactl -m 2 sysctl vm.nr_hugepages_mempolicy=40

This yields:

	Node 0 HugePages_Total:     8
	Node 1 HugePages_Total:     8
	Node 2 HugePages_Total:    16
	Node 3 HugePages_Total:     8

The incremental 8 huge pages were restricted to node 2 by the
specified mempolicy.

Similarly, we can use mempolicy to free persistent huge pages
from specified nodes:

	numactl -m 0,1 sysctl vm.nr_hugepages_mempolicy=32

yields:

	Node 0 HugePages_Total:     4
	Node 1 HugePages_Total:     4
	Node 2 HugePages_Total:    16
	Node 3 HugePages_Total:     8

The 8 huge pages freed were balanced over nodes 0 and 1.

[rientjes@google.com: accomodate reworked NODEMASK_ALLOC]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:12 -08:00
Lee Schermerhorn c1e6c8d074 hugetlb: factor init_nodemask_of_node()
Factor init_nodemask_of_node() out of the nodemask_of_node() macro.

This will be used to populate the huge pages "nodes_allowed" nodemask for
a single node when basing nodes_allowed on a preferred/local mempolicy or
when a persistent huge page pool page count is modified via a per node
sysfs attribute.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:12 -08:00
David Rientjes 4e7b8a6cef nodemask: make NODEMASK_ALLOC more general
This is a series of patches to provide control over the location of the
allocation and freeing of persistent huge pages on a NUMA platform.
Please consider for merging into mmotm.

This series uses two mechanisms to constrain the nodes from which
persistent huge pages are allocated: 1) the task NUMA mempolicy of the
task modifying a new sysctl "nr_hugepages_mempolicy", based on a
suggestion by Mel Gorman; and 2) a subset of the hugepages hstate sysfs
attributes have been added [in V4] to each node system device under:

	/sys/devices/node/node[0-9]*/hugepages

The per node attibutes allow direct assignment of a huge page count on a
specific node, regardless of the task's mempolicy or cpuset constraints.

This patch:

NODEMASK_ALLOC(x, m) assumes x is a type of struct, which is unnecessary.
It's perfectly reasonable to use this macro to allocate a nodemask_t,
which is anonymous, either dynamically or on the stack depending on
NODES_SHIFT.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:12 -08:00
Dmitry Torokhov 7547a3e8a4 Merge commit 'linus' into next 2009-12-15 08:49:32 -08:00
Steven Rostedt e36c54582c tracing: Fix return of trace_dump_stack()
The trace_dump_stack() returned a value for a void function.

Also, added the missing stub for trace_dump_stack() when tracing is
not configured.

Reported-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20091214162713.GA31060@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 08:36:11 +01:00
Alan Jenkins 9e1b9b8072 module: make MODULE_SYMBOL_PREFIX into a CONFIG option
The next commit will require the use of MODULE_SYMBOL_PREFIX in
.tmp_exports-asm.S.  Currently it is mixed in with C structure
definitions in "asm/module.h".  Move the definition of this arch option
into Kconfig, so it can be easily accessed by any code.

This also lets modpost.c use the same definition.  Previously modpost
relied on a hardcoded list of architectures in mk_elfconfig.c.

A build test for blackfin, one of the two MODULE_SYMBOL_PREFIX archs,
showed the generated code was unchanged.  vmlinux was identical save
for build ids, and an apparently randomized suffix on a single "__key"
symbol in the kallsyms data).

Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Mike Frysinger <vapier@gentoo.org> (blackfin)
CC: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-12-15 16:28:26 +10:30
J. Bruce Fields 12045a6ee9 nfsd: let "insecure" flag vary by pseudoflavor
This was an oversight; it should be among the export flags that can be
allowed to vary by pseudoflavor.  This allows an administrator to (for
example) allow auth_sys mounts only from low ports, but allow auth_krb5
mounts to use any port.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 19:08:58 -05:00
J. Bruce Fields e8e8753f7a nfsd: new interface to advertise export features
Soon we will add the new V4ROOT flag, and allow the INSECURE flag to
vary by pseudoflavor.  It would be useful for nfs-utils (for example,
for improved exportfs error reporting) to be able to know when this
happens.  Use this new interface for that purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:51:29 -05:00
Boaz Harrosh 9a74af2133 nfsd: Move private headers to source directory
Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:12 -05:00
Boaz Harrosh 72579ac9cd nfsd: Headers Independence and include cleanups
* Add includes that are directly used by headers
* Remove includes that are not needed

These are the changes made:

[xdr.h]
struct nfsd_readdirres has an embedded struct readdir_cd from nfsd.h
fixing that we can drop other includes

[xdr4.h]
embedded types defined both at state.h and nfsd.h

[syscall.h]
After export.h fix none of these stuff is needed.
fix extra space in # include <> statement

[stats.h]
does not need <linux/nfs4.h> but was export to user-mode
so I don't touch it

[state.h]
embedded types from nfsfh.h like struct knfsd_fh. bringing that
eliminates the need for all other includes

[nfsfh.h]
directly manipulating types from sunrpc/svc.h.
Removed Other unused headers.

[nfsd.h]
removed unused headers include

[export.h]
lots of sunrpc/svc.h types and a single prototype declaration
with pointer from nfsfh.h, but all users of export.h do need
nfsfh.h any way. remove now un-needed include.

[const.h]
Unfixed (not independent)

[cache.h]
could do with a forward declaration of "struct svc_rqst;"
from sunrpc/svc.h but all users absolutely will need
sunrpc/svc.h it is easier overall this way.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:09 -05:00
Boaz Harrosh d703158229 nfsd: Fix independence of a few nfsd related headers
An header should be compilation independent, .i.e pull in
any header who's declarations are directly used by this header.
And not let users re-include all it's dependencies all over
again.

[At the end of the day what's the use of a header if it does
 not have more then one user?]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:08 -05:00
Boaz Harrosh a600ffcbb3 sunrpc: Clean never used include files
Remove include of two headers never used by this file.
Doing so exposed a missing #include <linux/types.h> in
include/linux/sunrpc/rpc_rdma.h.

I did not see any other users dependency but if exist they
should be fixed since these headers are totally irrelevant
to here.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:08 -05:00
Boaz Harrosh 4056c9a344 nfsd: Remove unused dprintk
This doesn't appear to be useful.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:08 -05:00
Thomas Gleixner e625cce1b7 perf_event: Convert to raw_spinlock
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:34 +01:00
Thomas Gleixner ecb49d1a63 hrtimers: Convert to raw_spinlocks
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:34 +01:00
Thomas Gleixner 239007b844 genirq: Convert irq_desc.lock to raw_spinlock
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner d209d74d52 rtmutes: Convert rtmutex.lock to raw_spinlock
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner 1d61548254 sched: Convert pi_lock to raw_spinlock
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner a26724591e plist: Make plist debugging raw_spinlock aware
plists are used with spinlocks and raw_spinlocks. Change the plist
debugging to handle both types.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner 9c1721aa49 locking: Cleanup the name space completely
Make the name space hierarchy of locking functions consistent:
     raw_spin* -> _raw_spin* -> __raw_spin*

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner 9828ea9d75 locking: Further name space cleanups
The name space hierarchy for the internal lock functions is now a bit
backwards. raw_spin* functions map to _spin* which use __spin*, while
we would like to have _raw_spin* and __raw_spin*.

_raw_spin* is already used by lock debugging, so rename those funtions
to do_raw_spin* to free up the _raw_spin* name space.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner c2f21ce2e3 locking: Implement new raw_spinlock
Now that the raw_spin name space is freed up, we can implement
raw_spinlock and the related functions which are used to annotate the
locks which are not converted to sleeping spinlocks in preempt-rt.

A side effect is that only such locks can be used with the low level
lock fsunctions which circumvent lockdep.

For !rt spin_* functions are mapped to the raw_spin* implementations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:32 +01:00
Thomas Gleixner e5931943d0 locking: Convert raw_rwlock functions to arch_rwlock
Name space cleanup for rwlock functions. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner fb3a6bbc91 locking: Convert raw_rwlock to arch_rwlock
Not strictly necessary for -rt as -rt does not have non sleeping
rwlocks, but it's odd to not have a consistent naming convention.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner 0199c4e68d locking: Convert __raw_spin* functions to arch_spin*
Name space cleanup. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner edc35bd72e locking: Rename __RAW_SPIN_LOCK_UNLOCKED to __ARCH_SPIN_LOCK_UNLOCKED
Further name space cleanup. No functional change

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner 445c89514b locking: Convert raw_spinlock to arch_spinlock
The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.

Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner 6b6b4792f8 locking: Separate rwlock api from spinlock api
Move the rwlock smp api defines and functions into a separate header
file. Makes the -rt selection simpler and less intrusive.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:32 +01:00
Thomas Gleixner ef12f10994 locking: Split rwlock from spinlock headers
Move the rwlock defines and inlines into separate header files. This
makes the selection for -rt easier.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:32 +01:00
Linus Torvalds 5443040754 Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  i2c-core: i2c bus should support PM entries in struct dev_pm_ops
  i2c: Get rid of I2C_CLIENT_MODULE_PARM
  i2c: Drop I2C_CLIENT_INSMOD_2 to 8
  i2c: Drop I2C_CLIENT_INSMOD_1
  i2c: Get rid of struct i2c_client_address_data
  i2c: Drop the kind parameter from detect callbacks
2009-12-14 14:11:56 -08:00
Jean Delvare 7f508118b1 i2c: Get rid of I2C_CLIENT_MODULE_PARM
There is no user left of I2C_CLIENT_MODULE_PARM, so we can finally
get rid of this ugly macro.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
2009-12-14 21:17:29 +01:00
Jean Delvare e5e9f44c24 i2c: Drop I2C_CLIENT_INSMOD_2 to 8
These macros simply declare an enum, so drivers might as well declare
it themselves. This puts an end to the arbitrary limit of 8 chip types
per i2c driver.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
2009-12-14 21:17:27 +01:00
Jean Delvare 1f86df49dd i2c: Drop I2C_CLIENT_INSMOD_1
This macro simply declares an enum, so drivers might as well declare
it themselves.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
2009-12-14 21:17:26 +01:00
Jean Delvare c3813d6af1 i2c: Get rid of struct i2c_client_address_data
Struct i2c_client_address_data only contains one field at this point,
which makes its usefulness questionable. Get rid of it and pass simple
address lists around instead.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
2009-12-14 21:17:25 +01:00
Jean Delvare 310ec79210 i2c: Drop the kind parameter from detect callbacks
The "kind" parameter always has value -1, and nobody is using it any
longer, so we can remove it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
2009-12-14 21:17:23 +01:00
Linus Torvalds 478e4e9d7a Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6
* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (23 commits)
  spi: fix probe/remove section markings
  Add OMAP spi100k driver
  spi-imx: don't access struct device directly but use dev_get_platdata
  spi-imx: Add mx25 support
  spi-imx: use positive logic to distinguish cpu variants
  spi-imx: correct check for platform_get_irq failing
  ARM: NUC900: Add spi driver support for nuc900
  spi: SuperH MSIOF SPI Master driver V2
  spi: fix spidev compilation failure when VERBOSE is defined
  spi/au1550_spi: fix setupxfer not to override cfg with zeros
  spi/mpc8xxx: don't use __exit_p to wrap plat_mpc8xxx_spi_remove
  spi/i.MX: fix broken error handling for gpio_request
  spi/i.mx: drain MXC SPI transfer buffer when probing device
  MAINTAINERS: add SPI co-maintainer.
  spi/xilinx_spi: fix incorrect casting
  spi/mpc52xx-spi: minor cleanups
  xilinx_spi: add a platform driver using the xilinx_spi common module.
  xilinx_spi: add support for the DS570 IP.
  xilinx_spi: Switch to iomem functions and support little endian.
  xilinx_spi: Split into of driver and generic part.
  ...
2009-12-14 10:22:11 -08:00
Linus Torvalds 2205afa7d1 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf sched: Fix build failure on sparc
  perf bench: Add "all" pseudo subsystem and "all" pseudo suite
  perf tools: Introduce perf_session class
  perf symbols: Ditch dso->find_symbol
  perf symbols: Allow lookups by symbol name too
  perf symbols: Add missing "Variables" entry to map_type__name
  perf symbols: Add support for 'variable' symtabs
  perf symbols: Introduce ELF counterparts to symbol_type__is_a
  perf symbols: Introduce symbol_type__is_a
  perf symbols: Rename kthreads to kmaps, using another abstraction for it
  perf tools: Allow building for ARM
  hw-breakpoints: Handle bad modify_user_hw_breakpoint off-case return value
  perf tools: Allow cross compiling
  tracing, slab: Fix no callsite ifndef CONFIG_KMEMTRACE
  tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING

Trivial conflict due to different fixes to modify_user_hw_breakpoint()
in include/linux/hw_breakpoint.h
2009-12-14 10:13:22 -08:00
David Howells 491424c0f4 PCI: Global variable decls must match the defs in section attributes
Global variable declarations must match the definitions in section attributes
as the compiler is at liberty to vary the method it uses to access a variable,
depending on the section it is in.

When building the FRV arch, I now see:

  drivers/built-in.o: In function `pci_apply_final_quirks':
  drivers/pci/quirks.c:2606: relocation truncated to fit: R_FRV_GPREL12 against symbol `pci_dfl_cache_line_size' defined in .devinit.data section in drivers/built-in.o
  drivers/pci/quirks.c:2623: relocation truncated to fit: R_FRV_GPREL12 against symbol `pci_dfl_cache_line_size' defined in .devinit.data section in drivers/built-in.o
  drivers/pci/quirks.c:2630: relocation truncated to fit: R_FRV_GPREL12 against symbol `pci_dfl_cache_line_size' defined in .devinit.data section in drivers/built-in.o

because the declaration of pci_dfl_cache_line_size in linux/pci.h does not
match the definition in drivers/pci/pci.c.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-14 10:11:34 -08:00
David Howells 5185fb0699 FRV: Fix no-hardware-breakpoint case
If there is no hardware breakpoint support, modify_user_hw_breakpoint()
tries to return a NULL pointer through as an 'int' return value:

  In file included from kernel/exit.c:53:
  include/linux/hw_breakpoint.h: In function 'modify_user_hw_breakpoint':
  include/linux/hw_breakpoint.h:96: warning: return makes integer from pointer without a cast

Return 0 instead.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-14 10:10:55 -08:00
Linus Torvalds 37222e1c9e Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md: (27 commits)
  md: add 'recovery_start' per-device sysfs attribute
  md: rcu_read_lock() walk of mddev->disks in md_do_sync()
  md: integrate spares into array at earliest opportunity.
  md: move compat_ioctl handling into md.c
  md: revise Kconfig help for MD_MULTIPATH
  md: add MODULE_DESCRIPTION for all md related modules.
  raid: improve MD/raid10 handling of correctable read errors.
  md/raid10: print more useful messages on device failure.
  md/bitmap: update dirty flag when bitmap bits are explicitly set.
  md: Support write-intent bitmaps with externally managed metadata.
  md/bitmap: move setting of daemon_lastrun out of bitmap_read_sb
  md: support updating bitmap parameters via sysfs.
  md: factor out parsing of fixed-point numbers
  md: support bitmap offset appropriate for external-metadata arrays.
  md: remove needless setting of thread->timeout in raid10_quiesce
  md: change daemon_sleep to be in 'jiffies' rather than 'seconds'.
  md: move offset, daemon_sleep and chunksize out of bitmap structure
  md: collect bitmap-specific fields into one structure.
  md/raid1: add takeover support for raid5->raid1
  md: add honouring of suspend_{lo,hi} to raid1.
  ...
2009-12-14 10:03:36 -08:00
Linus Torvalds 76b8f82cde Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (58 commits)
  mfd: Add twl6030 regulator subdevices
  regulator: Add support for twl6030 regulators
  rtc: Add twl6030 RTC support
  mfd: Add support for twl6030 irq framework
  mfd: Rename twl4030_ routines in twl-regulator.c
  mfd: Rename twl4030_ routines in rtc-twl.c
  mfd: Rename all twl4030_i2c*
  mfd: Rename twl4030* driver files to enable re-use
  mfd: Clarify twl4030 return value for read and write
  mfd: Add all twl4030 regulators to the twl4030 mfd driver
  mfd: Don't set mc13783 ADREFMODE for touch conversions
  mfd: Remove ezx-pcap defines for custom led gpio encoding
  mfd: Near complete mc13783 rewrite
  mfd: Remove build time warning for WM835x register default tables
  mfd: Force I2C to be built in when building WM831x
  mfd: Don't allow wm831x to be built as a module
  mfd: Fix incorrect error check for wm8350-core
  mfd: Fix twl4030 warning
  gpiolib: Implement gpio_to_irq() for wm831x
  mfd: Remove default selection of AB4500
  ...
2009-12-14 10:02:35 -08:00
Linus Torvalds 3f86ce72cf Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (75 commits)
  NFS: Fix nfs_migrate_page()
  rpc: remove unneeded function parameter in gss_add_msg()
  nfs41: Invoke RECLAIM_COMPLETE on all new client ids
  SUNRPC: IS_ERR/PTR_ERR confusion
  NFSv41: Fix a potential state leakage when restarting nfs4_close_prepare
  nfs41: Handle NFSv4.1 session errors in the delegation recall code
  nfs41: Retry delegation return if it failed with session error
  nfs41: Handle session errors during delegation return
  nfs41: Mark stateids in need of reclaim if state manager gets stale clientid
  NFS: Fix up the declaration of nfs4_restart_rpc when NFSv4 not configured
  nfs41: Don't clear DRAINING flag on NFS4ERR_STALE_CLIENTID
  nfs41: nfs41_setup_state_renewal
  NFSv41: More cleanups
  NFSv41: Fix up some bugs in the NFS4CLNT_SESSION_DRAINING code
  NFSv41: Clean up slot table management
  NFSv41: Fix nfs4_proc_create_session
  nfs41: Invoke RECLAIM_COMPLETE
  nfs41: RECLAIM_COMPLETE functionality
  nfs41: RECLAIM_COMPLETE XDR functionality
  Cleanup some NFSv4 XDR decode comments
  ...
2009-12-14 10:00:24 -08:00
Linus Torvalds d0316554d3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c
2009-12-14 09:58:24 -08:00
Ingo Molnar 0087aabd6a Merge branch 'tip/tracing/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent 2009-12-14 17:12:37 +01:00
Ingo Molnar cc0104e877 Merge branch 'linus' into tracing/urgent
Conflicts:
	kernel/trace/trace_kprobe.c

Merge reason: resolve the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 09:16:49 +01:00
Oliver Hartkopp c7cd606f60 can: Fix data length code handling in rx path
A valid CAN dataframe can have a data length code (DLC) of 0 .. 8 data bytes.

When reading the CAN controllers register the 4-bit value may contain values
from 0 .. 15 which may exceed the reserved space in the socket buffer!

The ISO 11898-1 Chapter 8.4.2.3 (DLC field) says that register values > 8
should be reduced to 8 without any error reporting or frame drop.

This patch introduces a new helper macro to cast a given 4-bit data length
code (dlc) to __u8 and ensure the DLC value to be max. 8 bytes.

The different handlings in the rx path of the CAN netdevice drivers are fixed.

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13 19:47:42 -08:00
NeilBrown 7820f9e1dd md: remove sparse warning:symbol XXX was not declared.
Signed-off-by: NeilBrown <neilb@suse.de>
2009-12-14 12:49:47 +11:00
Rajendra Nayak 9da6653928 mfd: Add twl6030 regulator subdevices
This patch adds initial support for creating twl6030 PMIC
specific voltage regulators in the twl mfd driver.

Board specific regulator configurations will have to be passed from
respective board files.

Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Balaji T K <balajitk@ti.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-14 00:26:26 +01:00
Rajendra Nayak 441a450554 regulator: Add support for twl6030 regulators
This patch updates the regulator driver to add support
for TWL6030 PMIC specific LDO regulators.
SMPS resources are not yet supported for TWL6030 and
also .set_mode and .get_status for LDO's are yet to
be implemented for TWL6030.

Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Balaji T K <balajitk@ti.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-14 00:26:20 +01:00
Balaji T K e8deb28ca8 mfd: Add support for twl6030 irq framework
This patch adds support for phoenix interrupt framework. New iInterrupt
status register A, B, C are introduced in Phoenix and are cleared on write.
Due to the differences in interrupt handling with respect to TWL4030,
twl6030-irq.c is created for TWL6030 PMIC

Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Balaji T K <balajitk@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-14 00:25:31 +01:00
Balaji T K fc7b92fca4 mfd: Rename all twl4030_i2c*
This patch renames function names like twl4030_i2c_write_u8,
twl4030_i2c_read_u8 to twl_i2c_write_u8, twl_i2c_read_u8
and also common variable in twl-core.c

Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Balaji T K <balajitk@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 21:23:33 +01:00
Santosh Shilimkar b07682b605 mfd: Rename twl4030* driver files to enable re-use
The upcoming TWL6030 is companion chip for OMAP4 like the current TWL4030
for OMAP3. The common modules like RTC, Regulator creates opportunity
to re-use the most of the code from twl4030.

This patch renames few common drivers twl4030* files to twl* to enable
the code re-use.

Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Balaji T K <balajitk@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 20:05:51 +01:00
Trond Myklebust 52c9948b1f Merge branch 'nfs-for-2.6.33' 2009-12-13 13:56:27 -05:00
Juha Keski-Saari ab4abe056d mfd: Add all twl4030 regulators to the twl4030 mfd driver
Add all twl4030 regulators to the twl4030 mfd driver and
twl4030_platform_data

Signed-off-by: Juha Keski-Saari <ext-juha.1.keski-saari@nokia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:59 +01:00
Antonio Ospite ed89a7551c mfd: Remove ezx-pcap defines for custom led gpio encoding
We used these, in a first version of leds-pcap driver, in order to encode gpio
enabling and gpio inversion for a led inside the variable used for the gpio
number. In the new leds-pcap driver we rely on gpio_is_valid() to derive if a
led is gpio enabled and we have a dedicated flag to tell if the gpio value has
to be inverted.

Signed-off-by: Antonio Ospite <ospite@studenti.unina.it>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:56 +01:00
Uwe Kleine-König 9e2726776d mfd: Near complete mc13783 rewrite
This fixes several things while still providing the old API:

 - simplify and fix locking
 - better error handling
 - don't ack all irqs making it impossible to detect a reset of the
   rtc
 - use a timeout variant to wait for completion of ADC conversion
 - provide platform-data to regulator subdevice (This allows making
   struct mc13783 opaque for other drivers after the regulator driver is
   updated to use its platform_data.)
 - expose all interrupts
 - use threaded irq

After all users in mainline are converted to the new API, some things
(e.g. mc13783-private.h) can go away.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:54 +01:00
Mark Brown 5fb4d38b19 mfd: Move WM831x to generic IRQ
Replace the wm831x-local IRQ infrastructure with genirq, allowing access
to the diagnostic infrastructure of genirq and allowing us to implement
interrupt support for the GPIOs.  The switchover is done within the
wm831x specific IRQ API, further patches will convert the individual
drivers to use genirq directly.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:41 +01:00
Ilkka Koskinen 1920a61e20 mfd: Initial support for twl5031
TWL5031 introduces two new interrupts in PIH. Moreover, BCI
has changed remarkably and, thus, it's disabled when TWL5031
is in use.

Signed-off-by: Ilkka Koskinen <ilkka.koskinen@nokia.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:41 +01:00
Mark Brown 5a65edbc12 mfd: Convert wm8350 IRQ handlers to irq_handler_t
This is done as simple code transformation, the semantics of the
IRQ API provided by the core are are still very different to those
of genirq (mainly with regard to masking).

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:39 +01:00
Ben Dooks b45440c33a mfd: Allow configuration of VDCDC2 for tps65010
Add function to allow the configuation fo the VDCDC2 register by
external users, to allow changing of the standard and low-power
running modes.

This is needed, for example, for the Simtec IM2440D20 where we need
to use the low-power mode to shutdown the LDO/DCDC that are not needed
during suspend (saving substantial power) and the runtime use of the
low-power mode to change VCore.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:36 +01:00
Ilkka Koskinen 38a684963f mfd: Enable twl4030 32kHz oscillator low-power mode
Allows TWL's 32kHz oscillator to go in low-power mode when
main battery voltage is running low.

Signed-off-by: Ilkka Koskinen <ilkka.koskinen@nokia.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:33 +01:00
Mark Brown 75b75722b4 mfd: Allow platforms to specify an IRQ base for WM8350
This is currently unused by the wm8350 drivers but getting it merged
now will reduce merge issues in the future when implementing wm8350
genirq support.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:31 +01:00
Aaro Koskinen 56baa66797 mfd: fix undefined twl4030-power resconfig value checks
The code tries to skip values initialized with -1, but since the values
are unsigned the comparison is always true.

The patch eliminates the following compiler warnings:

drivers/mfd/twl4030-power.c: In function 'twl4030_configure_resource':
drivers/mfd/twl4030-power.c:338: warning: comparison is always true due to
limited range of data type
drivers/mfd/twl4030-power.c:358: warning: comparison is always true due to
limited range of data type
drivers/mfd/twl4030-power.c:363: warning: comparison is always true due to
limited range of data type

Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:25 +01:00
Amit Kucheria b4ead61e57 mfd: Add support for remapping twl4030-power power states
The <RESOURCE>_REMAP register allows configuration of the <RESOURCE> in case
of a sleep or off transition.

Allow this property of resources to be configured (through twl4030_resconfig)
and add code to parse these values to program the registers accordingly.

Signed-off-by: Amit Kucheria <amit.kucheria@verdurent.com>
Cc: linux-omap@vger.kernel.org
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:24 +01:00
Lars-Peter Clausen 68d641efd8 mfd: Fix memleak in pcf50633_client_dev_register
Since platform_device_add_data copies the passed data, the allocated
subdev_pdata is never freed. A simple fix would be to either free subdev_pdata
or put it onto the stack. But since the pcf50633 child devices can rely on
beeing children of the pcf50633 core device it's much more elegant to get access
to pcf50633 core structure through that link. This allows to get completly rid
of pcf5033_subdev_pdata.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:17 +01:00
Mark Brown 0c7229f93a mfd: Convert WM835x IRQ handling to use a data table
Rather than open coding individual IRQs in each function which
manipulates them store data for IRQs in a table which is then
referenced in the users.

This is a substantial code shrink and should be a performance win in
cases where only a single IRQ goes off at once since instead of
reading four of the second level IRQ registers for each interrupt
we read only the sub-registers which have had an interrupt flagged.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:21:02 +01:00
Mark Brown e0a3389ab9 mfd: Split wm8350 IRQ code into a separate file
In preparation for refactoring - it's over 700 lines of well-isolated
code and having it in a file by itself makes things more managable.

While we're at it make sure that we clean up the IRQ if we fail after
acquiring it on init.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:20:55 +01:00
Michael Hennerich a5736e0b62 mfd: Add ADP5520/ADP5501 driver
Base driver for Analog Devices ADP5520/ADP5501 MFD PMICs

Subdevs:
LCD Backlight   : drivers/video/backlight/adp5520_bl.c
LEDs            : drivers/led/leds-adp5520.c
GPIO            : drivers/gpio/adp5520-gpio.c (ADP5520 only)
Keys            : drivers/input/keyboard/adp5520-keys.c (ADP5520 only)

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:20:53 +01:00
Mark Brown d4e0a89e3d mfd: Add support for WM8320 PMICs
The WM8320 is an integrated power management subsystem providing
voltage regulators, RTC, watchdog and other functionality. The
WM8320 is derived from the WM831x and therefore shares most of
the driver code with the WM831x.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:20:44 +01:00
Mark Brown 6f2ecaae72 gpiolib: Make WM831x GPIO count dynamic
This supports future devices with fewer GPIOs.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:20:43 +01:00
Srinidhi Kasagar 0c41839e98 mfd: add AB4500 driver
This adds core driver support for AB4500 mixed signal
multimedia & power management chip. This connects to U8500
on the SSP (pl022) and exports read/write functions for
the device to get access to this chip. This also registers
the client devices and sets the parent.

Signed-off-by: srinidhi kasagar <srinidhi.kasagar@stericsson.com>
Acked-by: Andrea Gallo <andrea.gallo@stericsson.com>
Reviewed-by: Mark Brown <broonie@sirena.org.uk>
Reviewed-by: Jean-Christophe <plagnioj@jcrosoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:20:38 +01:00
Haojian Zhuang 4107da2a28 mfd: Add 88PM8607 driver
This adds a core driver for 88PM8607 found in Marvell DKB development
platform. This driver is a proxy for all accesses to 88PM8607
sub-drivers which will be merged on top of this one, RTC, regulators,
battery and so on.

This chip is manufactured by Marvell.

Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-12-13 19:20:37 +01:00
Li Zefan e00bf2ec60 tracing: Change event->profile_count to be int type
Like total_profile_count, struct ftrace_event_call::profile_count
is protected by event_mutex, so it doesn't need to be atomic_t.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4B1DC549.5010705@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13 18:37:28 +01:00
Li Zefan 614a71a26b tracing: Pull up calls to trace_define_common_fields()
Call trace_define_common_fields() in event_create_dir() only.
This avoids trace events to handle it from their define_fields
callbacks and shrinks the kernel code size:

   text    data     bss     dec     hex filename
5346802 1961864 7103260 14411926         dbe896 vmlinux.o.old
5345151 1961864 7103260 14410275         dbe223 vmlinux.o

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <4B1DC49C.8000107@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13 18:34:23 +01:00
Li Zefan 87d9b4e1c5 tracing: Extract duplicate ftrace_raw_init_event_foo()
Use a generic trace_event_raw_init() function for all event's raw_init
callbacks (but kprobes) instead of defining the same version for each
of these.
This shrinks the kernel code:

   text    data     bss     dec     hex filename
5355293 1961928 7103260 14420481         dc0a01 vmlinux.o.old
5346802 1961864 7103260 14411926         dbe896 vmlinux.o

raw_init can't be removed, because ftrace events and kprobe events
use different raw_init callbacks. Though it's possible to totally
remove raw_init, I choose to leave it as it is for now.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <4B1DC48C.7080603@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13 18:34:23 +01:00
Magnus Damm 8051effcbc spi: SuperH MSIOF SPI Master driver V2
This patch is V2 of SPI Master support for the SuperH MSIOF.
Full duplex, spi mode 0-3, active high cs, 3-wire and lsb
first should all be supported, but the driver has so far
only been tested with "mmc_spi".

The MSIOF hardware comes with 32-bit FIFOs for receive and
transmit, and this driver simply breaks the SPI messages
into FIFO-sized chunks. The MSIOF hardware manages the pins
for clock, receive and transmit (sck/miso/mosi), but the chip
select pin is managed by software and must be configured as
a regular GPIO pin by the board code.

Performance wise there is still room for improvement, but
on a Ecovec board with the built-in sh7724 MSIOF0 this driver
gets Mini-sd read speeds of about half a megabyte per second.

Future work include better clock setup and merging of 8-bit
transfers into 32-bit words to reduce interrupt load and
improve throughput.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-12-13 00:48:27 -07:00
Albert Herranz c5df7f7751 powerpc: allow ioremap within reserved memory regions
Add a flag to let a platform ioremap memory regions marked as reserved.

This flag will be used later by the Nintendo Wii support code to allow
ioremapping the I/O region sitting between MEM1 and MEM2 and marked
as reserved RAM in the patch "wii: use both mem1 and mem2 as ram".

This will no longer be needed when proper discontig memory support
for 32-bit PowerPC is added to the kernel.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-12-12 22:24:32 -07:00
Linus Torvalds 09cea96caa Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (151 commits)
  powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
  MAINTAINERS: Add PowerPC patterns
  powerpc/pseries: Track previous CPPR values to correctly EOI interrupts
  powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP
  powerpc: Make "intspec" pointers in irq_host->xlate() const
  powerpc/8xx: DTLB Miss cleanup
  powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
  powerpc/8xx: Start using dcbX instructions in various copy routines
  powerpc/8xx: Restore _PAGE_WRITETHRU
  powerpc/8xx: Add missing Guarded setting in DTLB Error.
  powerpc/8xx: Fixup DAR from buggy dcbX instructions.
  powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
  powerpc/8xx: Invalidate non present TLBs
  powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
  pseries/pseries: Add code to online/offline CPUs of a DLPAR node
  powerpc: stop_this_cpu: remove the cpu from the online map.
  powerpc/pseries: Add kernel based CPU DLPAR handling
  sysfs/cpu: Add probe/release files
  powerpc/pseries: Kernel DLPAR Infrastructure
  ...
2009-12-12 14:27:24 -08:00
Linus Torvalds 702a7c7609 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits)
  sched: Remove forced2_migrations stats
  sched: Fix memory leak in two error corner cases
  sched: Fix build warning in get_update_sysctl_factor()
  sched: Update normalized values on user updates via proc
  sched: Make tunable scaling style configurable
  sched: Fix missing sched tunable recalculation on cpu add/remove
  sched: Fix task priority bug
  sched: cgroup: Implement different treatment for idle shares
  sched: Remove unnecessary RCU exclusion
  sched: Discard some old bits
  sched: Clean up check_preempt_wakeup()
  sched: Move update_curr() in check_preempt_wakeup() to avoid redundant call
  sched: Sanitize fork() handling
  sched: Clean up ttwu() rq locking
  sched: Remove rq->clock coupling from set_task_cpu()
  sched: Consolidate select_task_rq() callers
  sched: Remove sysctl.sched_features
  sched: Protect sched_rr_get_param() access to task->sched_class
  sched: Protect task->cpus_allowed access in sched_getaffinity()
  sched: Fix balance vs hotplug race
  ...

Fixed up conflicts in kernel/sysctl.c (due to sysctl cleanup)
2009-12-12 11:34:10 -08:00
Jie Zhang 7a77080dbe net: add net_tstamp.h to headers_install
include/linux/net_tstamp.h is userspace API for hardware time stamping
of network packets. It should be exported to userspace.

Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Barry Song <barry.song@analog.com>
Signed-off-by: Patrick Ohly <patrick.ohly@gmx.de>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12 13:08:15 +01:00
Sam Ravnborg 273b281fa2 kbuild: move utsrelease.h to include/generated
Fix up all users of utsrelease.h

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12 13:08:15 +01:00
Sam Ravnborg 98b8788ae9 drop explicit include of autoconf.h
kbuild.h forces include of autoconf.h on the
commandline using -include - so we do not need to
include the file explicit.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12 13:08:15 +01:00
Sam Ravnborg 01fc0ac198 kbuild: move bounds.h to include/generated
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12 13:08:14 +01:00
Matthew Garrett 967c9ef9b8 Input: i8042 - allow installing platform filters for incoming data
Some hardware (such as Dell laptops) signal a variety of events through
the i8042 controller, even if these don't map to keyboard events. Add
support for drivers to filter the i8042 event stream in order to respond
to these events and (if appropriate) block them from entering the input
stream.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-12-11 23:55:42 -08:00
Linus Torvalds 053fe57ac2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (75 commits)
  net: Handle NETREG_UNINITIALIZED devices correctly
  can: add the driver for Analog Devices Blackfin on-chip CAN controllers
  xfrm: Fix truncation length of authentication algorithms installed via PF_KEY
  net: use compat helper functions in compat_sys_recvmmsg
  net: fix compat_sys_recvmmsg parameter type
  cxgb3: Fixing EEH handlers
  cnic: Zero out status block and Event Queue indices.
  cnic: Send delete command when shutting down iSCSI ring.
  net: smc91x: Fix up type mismatch in smc_drv_resume().
  smc91x: fix unused flags warnings on UP systems
  MAINTAINERS: Transfering maintainership of cdc-ether
  net: Add missing TST_CFG_WRITE bits around sky2_pci_write
  net: Fix Yukon-2 Optima TCP offload setup
  net: niu uses crc32, so select CRC32
  wireless: update old static regulatory domain rules
  mac80211: Revert 'Use correct sign for mesh active path refresh'
  mac80211: Fixed bug in mesh portal paths
  net/mac80211: Correct size given to memset
  b43: Remove reset after fatal DMA error
  rtl8187: add radio led and fix warnings on suspend
  ...
2009-12-11 20:58:20 -08:00
Linus Torvalds 92340ee319 Merge branch 'compat-ioctl-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground
* 'compat-ioctl-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
  usbdevfs: move compat_ioctl handling to devio.c
  lp: move compat_ioctl handling into lp.c
  compat_ioctl: pass compat pointer directly to handlers
  compat_ioctl: simplify lookup table
  compat_ioctl: simplify calling of handlers
  compat_ioctl: inline all conversion handlers
  compat_ioctl: Remove BKL
  compat_ioctl: remove all VT ioctl handling
2009-12-11 20:57:46 -08:00
Linus Torvalds 3070f27d6e Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  itimer: Fix the itimer trace print format
  hrtimer: move timer stats helper functions to hrtimer.c
  hrtimer: Tune hrtimer_interrupt hang logic
2009-12-11 20:49:09 -08:00
Linus Torvalds df7147b3c3 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tracing: Remove comparing of NULL to va_list in trace_array_vprintk()
  tracing: Fix function graph trace_pipe to properly display failed entries
  tracing: Add full state to trace_seq
  tracing: Buffer the output of seq_file in case of filled buffer
  tracing: Only call pipe_close if pipe_close is defined
  tracing: Add pipe_close interface
2009-12-11 20:47:44 -08:00
Linus Torvalds 6f696eb17b Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (57 commits)
  x86, perf events: Check if we have APIC enabled
  perf_event: Fix variable initialization in other codepaths
  perf kmem: Fix unused argument build warning
  perf symbols: perf_header__read_build_ids() offset'n'size should be u64
  perf symbols: dsos__read_build_ids() should read both user and kernel buildids
  perf tools: Align long options which have no short forms
  perf kmem: Show usage if no option is specified
  sched: Mark sched_clock() as notrace
  perf sched: Add max delay time snapshot
  perf tools: Correct size given to memset
  perf_event: Fix perf_swevent_hrtimer() variable initialization
  perf sched: Fix for getting task's execution time
  tracing/kprobes: Fix field creation's bad error handling
  perf_event: Cleanup for cpu_clock_perf_event_update()
  perf_event: Allocate children's perf_event_ctxp at the right time
  perf_event: Clean up __perf_event_init_context()
  hw-breakpoints: Modify breakpoints without unregistering them
  perf probe: Update perf-probe document
  perf probe: Support --del option
  trace-kprobe: Support delete probe syntax
  ...
2009-12-11 20:47:30 -08:00
David S. Miller 501706565b Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	include/net/tcp.h
2009-12-11 17:12:17 -08:00
Linus Torvalds 2fe77b81c7 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [ACPI/CPUFREQ] Introduce bios_limit per cpu cpufreq sysfs interface
  [CPUFREQ] make internal cpufreq_add_dev_* static
  [CPUFREQ] use an enum for speedstep processor identification
  [CPUFREQ] Document units for transition latency
  [CPUFREQ] Use global sysfs cpufreq structure for conservative governor tunings
  [CPUFREQ] Documentation: ABI: /sys/devices/system/cpu/cpu#/cpufreq/
  [CPUFREQ] powernow-k6: set transition latency value so ondemand governor can be used
  [CPUFREQ] cpumask: don't put a cpumask on the stack in x86...cpufreq/powernow-k8.c
2009-12-11 15:59:23 -08:00
Linus Torvalds 0f4974c439 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (58 commits)
  tty: split the lock up a bit further
  tty: Move the leader test in disassociate
  tty: Push the bkl down a bit in the hangup code
  tty: Push the lock down further into the ldisc code
  tty: push the BKL down into the handlers a bit
  tty: moxa: split open lock
  tty: moxa: Kill the use of lock_kernel
  tty: moxa: Fix modem op locking
  tty: moxa: Kill off the throttle method
  tty: moxa: Locking clean up
  tty: moxa: rework the locking a bit
  tty: moxa: Use more tty_port ops
  tty: isicom: fix deadlock on shutdown
  tty: mxser: Use the new locking rules to fix setserial properly
  tty: mxser: use the tty_port_open method
  tty: isicom: sort out the board init logic
  tty: isicom: switch to the new tty_port_open helper
  tty: tty_port: Add a kref object to the tty port
  tty: istallion: tty port open/close methods
  tty: stallion: Convert to the tty_port_open/close methods
  ...
2009-12-11 15:34:40 -08:00
Linus Torvalds 3126c136bc Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (21 commits)
  ext3: PTR_ERR return of wrong pointer in setup_new_group_blocks()
  ext3: Fix data / filesystem corruption when write fails to copy data
  ext4: Support for 64-bit quota format
  ext3: Support for vfsv1 quota format
  quota: Implement quota format with 64-bit space and inode limits
  quota: Move definition of QFMT_OCFS2 to linux/quota.h
  ext2: fix comment in ext2_find_entry about return values
  ext3: Unify log messages in ext3
  ext2: clear uptodate flag on super block I/O error
  ext2: Unify log messages in ext2
  ext3: make "norecovery" an alias for "noload"
  ext3: Don't update the superblock in ext3_statfs()
  ext3: journal all modifications in ext3_xattr_set_handle
  ext2: Explicitly assign values to on-disk enum of filetypes
  quota: Fix WARN_ON in lookup_one_len
  const: struct quota_format_ops
  ubifs: remove manual O_SYNC handling
  afs: remove manual O_SYNC handling
  kill wait_on_page_writeback_range
  vfs: Implement proper O_SYNC semantics
  ...
2009-12-11 15:31:13 -08:00
Linus Torvalds f58df54a54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (27 commits)
  Driver core: fix race in dev_driver_string
  Driver Core: Early platform driver buffer
  sysfs: sysfs_setattr remove unnecessary permission check.
  sysfs: Factor out sysfs_rename from sysfs_rename_dir and sysfs_move_dir
  sysfs: Propagate renames to the vfs on demand
  sysfs: Gut sysfs_addrm_start and sysfs_addrm_finish
  sysfs: In sysfs_chmod_file lazily propagate the mode change.
  sysfs: Implement sysfs_getattr & sysfs_permission
  sysfs: Nicely indent sysfs_symlink_inode_operations
  sysfs: Update s_iattr on link and unlink.
  sysfs: Fix locking and factor out sysfs_sd_setattr
  sysfs: Simplify iattr time assignments
  sysfs: Simplify sysfs_chmod_file semantics
  sysfs: Use dentry_ops instead of directly playing with the dcache
  sysfs: Rename sysfs_d_iput to sysfs_dentry_iput
  sysfs: Update sysfs_setxattr so it updates secdata under the sysfs_mutex
  debugfs: fix create mutex racy fops and private data
  Driver core: Don't remove kobjects in device_shutdown.
  firmware_class: make request_firmware_nowait more useful
  Driver-Core: devtmpfs - set root directory mode to 0755
  ...
2009-12-11 15:24:56 -08:00
Linus Torvalds 748e566b7e Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (122 commits)
  USB: mos7840: add device IDs for B&B electronics devices
  USB: ftdi_sio: add USB device ID's for B&B Electronics line
  USB: musb: musb_host: fix sparse warning
  USB: musb: musb_gadget: fix sparse warning
  USB: musb: omap2430: fix sparse warning
  USB: core: message: fix sparse warning
  USB: core: hub: fix sparse warning
  USB: core: fix sparse warning for static function
  USB: Added USB_ETH_RNDIS to use instead of CONFIG_USB_ETH_RNDIS
  USB: Check bandwidth when switching alt settings.
  USB: Refactor code to find alternate interface settings.
  USB: xhci: Fix command completion after a drop endpoint.
  USB: xhci: Make reverting an alt setting "unfailable".
  USB: usbtmc: Use usb_clear_halt() instead of custom code.
  USB: xhci: Add correct email and files to MAINTAINERS entry.
  USB: ehci-omap.c: introduce missing kfree
  USB: xhci-mem.c: introduce missing kfree
  USB: add remove_id sysfs attr for usb drivers
  USB: g_multi kconfig: fix depends and help text
  USB: option: add pid for ZTE
  ...
2009-12-11 15:22:55 -08:00
Alan Cox eeb89d918c tty: push the BKL down into the handlers a bit
Start trying to untangle the remaining BKL mess

Updated to fix missing unlock_kernel noted by Dan Carpenter

Signed-off-by: Alan "I must be out of my tree" Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 15:18:08 -08:00
Alan Cox 6ed847d8ef tty: isicom: sort out the board init logic
Split this into two flags - INIT meaning the board is set up and ACTIVE
meaning the board has ports open. Remove the broken HUPCL casing and push
the counts somewhere sensible.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 15:18:07 -08:00
Alan Cox 568aafc627 tty: tty_port: Add a kref object to the tty port
Users of tty port need a way to refcount ports when hotplugging is
involved.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 15:18:07 -08:00
Alan Cox 44e4909e45 tty: tty_port: Change the buffer allocator locking
We want to be able to do this without regard for the activate/own open
method being used which causes a problem using port->mutex. Add another
mutex for now. Once everything uses port_open to do buffer allocs we can
kill it back off

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 15:18:06 -08:00
Alan Cox 82fc594343 usb_serial: Kill port mutex
The tty port has a port mutex used for all the port related locking so we
don't need the one in the USB serial layer any more.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 15:18:04 -08:00
Alan Cox 64bc397914 tty_port: add "tty_port_open" helper
For the moment this just moves the USB logic over and fixes the 'what if
we open and hangup at the same time' race noticed by Oliver Neukum.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 15:18:04 -08:00
Alan Cox f53a2ade0b tty: esp: remove broken driver
The ESP driver has been marked broken for years. It's an old ISA device
that clearly nobody cares about any more. Remove it

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 15:18:03 -08:00
Linus Torvalds aad3bf04dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/mmap
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/mmap:
  Add missing alignment check in arch/score sys_mmap()
  fix broken aliasing checks for MAP_FIXED on sparc32, mips, arm and sh
  Get rid of open-coding in ia64_brk()
  sparc_brk() is not needed anymore
  switch do_brk() to get_unmapped_area()
  Take arch_mmap_check() into get_unmapped_area()
  fix a struct file leak in do_mmap_pgoff()
  Unify sys_mmap*
  Cut hugetlb case early for 32bit on ia64
  arch_mmap_check() on mn10300
  Kill ancient crap in s390 compat mmap
  arm: add arch_mmap_check(), get rid of sys_arm_mremap()
  file ->get_unmapped_area() shouldn't duplicate work of get_unmapped_area()
  kill useless checks in sparc mremap variants
  fix pgoff in "have to relocate" case of mremap()
  fix the arch checks in MREMAP_FIXED case
  fix checks for expand-in-place mremap
  do_mremap() untangling, part 3
  do_mremap() untangling, part 2
  untangling do_mremap(), part 1
2009-12-11 12:23:29 -08:00
Linus Torvalds 11bd04f6f3 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (109 commits)
  PCI: fix coding style issue in pci_save_state()
  PCI: add pci_request_acs
  PCI: fix BUG_ON triggered by logical PCIe root port removal
  PCI: remove ifdefed pci_cleanup_aer_correct_error_status
  PCI: unconditionally clear AER uncorr status register during cleanup
  x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource
  PCI: portdrv: remove redundant definitions
  PCI: portdrv: remove unnecessary struct pcie_port_data
  PCI: portdrv: minor cleanup for pcie_port_device_register
  PCI: portdrv: add missing irq cleanup
  PCI: portdrv: enable device before irq initialization
  PCI: portdrv: cleanup service irqs initialization
  PCI: portdrv: check capabilities first
  PCI: portdrv: move PME capability check
  PCI: portdrv: remove redundant pcie type calculation
  PCI: portdrv: cleanup pcie_device registration
  PCI: portdrv: remove redundant pcie_port_device_probe
  PCI: Always set prefetchable base/limit upper32 registers
  PCI: read-modify-write the pcie device control register when initiating pcie flr
  PCI: show dma_mask bits in /sys
  ...

Fixed up conflicts in:
	arch/x86/kernel/amd_iommu_init.c
	drivers/pci/dmar.c
	drivers/pci/hotplug/acpiphp_glue.c
2009-12-11 12:18:16 -08:00
Sarah Sharp 91017f9cf5 USB: Refactor code to find alternate interface settings.
Refactor out the code to find alternate interface settings into
usb_find_alt_setting().  Print a debugging message and return null if the
alt setting is not found.

While we're at it, correct a bug in the refactored code.  The interfaces
in the configuration's interface cache are not necessarily in numerical
order, so we can't just use the interface number as an array index.  Loop
through the interface caches, looking for the correct interface.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:27 -08:00
Alan Stern a0bb108112 USB: usb-storage: add BAD_SENSE flag
This patch (as1311) fixes a problem in usb-storage: Some devices are
pretty broken when it comes to reporting sense data.  The information
they send back indicates that they have more than 18 bytes of sense
data available, but when the system asks for more than 18 they fail or
hang.  The symptom is that probing fails with multiple resets.

The patch adds a new BAD_SENSE flag to indicate that usb-storage
should never ask for more than 18 bytes of sense data.  The flag can
be set in an unusual_devs entry or via the "quirks=" module parameter,
and it is set automatically whenever a REQUEST SENSE command for more
than 18 bytes fails or times out.

An unusual_devs entry is added for the Agfa photo frame, which uses a
Prolific chip having this bug.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Daniel Kukula <daniel.kuku@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:26 -08:00
Alan Stern 8e4ceb38eb USB: prepare for changover to Runtime PM framework
This patch (as1303) revises the USB Power Management infrastructure to
make it compatible with the new driver-model Runtime PM framework:

	Drivers are no longer allowed to access intf->pm_usage_cnt
	directly; the PM framework manages its own usage counters.

	usb_autopm_set_interface() is eliminated, because it directly
	sets intf->pm_usage_cnt.

	usb_autopm_enable() and usb_autopm_disable() are eliminated,
	because they call usb_autopm_set_interface().

	usb_autopm_get_interface_no_resume() and
	usb_autopm_put_interface_no_suspend() are added.  They
	correspond to pm_runtime_get_noresume() and
	pm_runtime_put_noidle() in the PM framework.

	The power/level attribute no longer accepts "suspend", only
	"on" and "auto".  The PM framework doesn't allow devices to be
	forced into a suspended mode.

The hub driver contains the only code that violates the new
guidelines.  It is updated to use the new interface routines instead.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:25 -08:00
Alan Stern fb34d53752 USB: remove the auto_pm flag
This patch (as1302) removes the auto_pm flag from struct usb_device.
The flag's only purpose was to distinguish between autosuspends and
external suspends, but that information is now available in the
pm_message_t argument passed to suspend methods.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:21 -08:00
Daniel Mack 2d57a95f09 USB OTG: Add generic driver for ULPI OTG transceiver
This adds a minimal generic driver for ULPI connected transceivers,
using the OTG framework functions recently introduced.

The driver got a table to match the ULPI chips, which currently only has
one entry for NXP's ISP 1504 transceiver.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Heikki Krogerus <ext-heikki.krogerus@nokia.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:16 -08:00
Daniel Mack 91c8a5a998 USB OTG: add support for ulpi connected external transceivers
This adds support for OTG transceivers directly connected to the ULPI
interface. In particular, the following details are added

- a struct for low level io functions (read/write)
- a priv field to be used as 'viewport' by low level access functions
- an (*init) and (*shutdown) callbacks, along with static inline helpers
- a (*set_vbus) callback to switch the port power on and off
- a flags field for per-transceiver settings
- some defines for the flags bitmask to configure platform specific
  details

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Heikki Krogerus <ext-heikki.krogerus@nokia.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:16 -08:00
Laurent Pinchart 5242658d1b USB gadget: Handle endpoint requests at the function level
Control requests targeted at an endpoint (that is sent to EP0 but
specifying the target endpoint address in wIndex) are dispatched to the
current configuration's setup callback, requiring all gadget drivers to
dispatch the requests to the correct function driver.

To avoid this, record which endpoints are used by each function in the
composite driver SET CONFIGURATION handler and dispatch requests
targeted at endpoints to the correct function.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:15 -08:00
David Vrabel 4c1bd3d7a7 USB: make urb scatter-gather support more generic
The WHCI HCD will also support urbs with scatter-gather lists.  Add a
usb_bus field to indicated how many sg list elements are supported by
the HCD.  Use this to decide whether to pass the scatter-list to the HCD
or not.

Make the usb-storage driver use this new field.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:55:14 -08:00
Linus Torvalds 4e2ccdb040 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: (49 commits)
  nilfs2: separate wait function from nilfs_segctor_write
  nilfs2: add iterator for segment buffers
  nilfs2: hide nilfs_write_info struct in segment buffer code
  nilfs2: relocate io status variables to segment buffer
  nilfs2: do not return io error for bio allocation failure
  nilfs2: use list_splice_tail or list_splice_tail_init
  nilfs2: replace mark_inode_dirty as nilfs_mark_inode_dirty
  nilfs2: delete mark_inode_dirty in nilfs_delete_entry
  nilfs2: delete mark_inode_dirty in nilfs_commit_chunk
  nilfs2: change return type of nilfs_commit_chunk
  nilfs2: split nilfs_unlink as nilfs_do_unlink and nilfs_unlink
  nilfs2: delete redundant mark_inode_dirty
  nilfs2: expand inode_inc_link_count and inode_dec_link_count
  nilfs2: delete mark_inode_dirty from nilfs_set_link
  nilfs2: delete mark_inode_dirty in nilfs_new_inode
  nilfs2: add norecovery mount option
  nilfs2: add helper to get if volume is in a valid state
  nilfs2: move recovery completion into load_nilfs function
  nilfs2: apply readahead for recovery on mount
  nilfs2: clean up get/put function of a segment usage
  ...
2009-12-11 11:49:18 -08:00
Magnus Damm c60e0504c8 Driver Core: Early platform driver buffer
Add early_platform_init_buffer() support and update the
early platform driver code to allow passing parameters
to the driver on the kernel command line.

early_platform_init_buffer() simply allows early platform
drivers to provide a pointer and length to a memory area
where the remaining part of the kernel command line option
will be stored.

Needed to pass baud rate and other serial port options
to the reworked early serial console code on SuperH.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:55 -08:00
Eric W. Biederman 832b6af198 sysfs: Propagate renames to the vfs on demand
By teaching sysfs_revalidate to hide a dentry for
a sysfs_dirent if the sysfs_dirent has been renamed,
and by teaching sysfs_lookup to return the original
dentry if the sysfs dirent has been renamed.  I can
show the results of renames correctly without having to
update the dcache during the directory rename.

This massively simplifies the rename logic allowing a lot
of weird sysfs special cases to be removed along with
a lot of now unnecesary helper code.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:54 -08:00
Johannes Berg 9ebfbd45f9 firmware_class: make request_firmware_nowait more useful
Unfortunately, one cannot hold on to the struct firmware
that request_firmware_nowait() hands off, which is needed
in some cases. Allow this by requiring the callback to
free it (via release_firmware).

Additionally, give it a gfp_t parameter -- all the current
users call it from a GFP_KERNEL context so the GFP_ATOMIC
isn't necessary. This also marks an API break which is
useful in a sense, although that is obviously not the
primary purpose of this change.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Pavel Roskin <proski@gnu.org>
Cc: Abhay Salunke <abhay_salunke@dell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:52 -08:00
Kay Sievers 073120cc28 Driver Core: devtmpfs: use sys_mount()
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:51 -08:00
Steven Rostedt 03889384ce tracing: Add trace_dump_stack()
I've been asked a few times about how to find out what is calling
some location in the kernel. One way is to use dynamic function tracing
and implement the func_stack_trace. But this only finds out who is
calling a particular function. It does not tell you who is calling
that function and entering a specific if conditional.

I have myself implemented a quick version of trace_dump_stack() for
this purpose a few times, and just needed it now. This is when I realized
that this would be a good tool to have in the kernel like trace_printk().

Using trace_dump_stack() is similar to dump_stack() except that it
writes to the trace buffer instead and can be used in critical locations.

For example:

@@ -5485,8 +5485,12 @@ need_resched_nonpreemptible:
 	if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
 		if (unlikely(signal_pending_state(prev->state, prev)))
 			prev->state = TASK_RUNNING;
-		else
+		else {
 			deactivate_task(rq, prev, 1);
+			trace_printk("Deactivating task %s:%d\n",
+				     prev->comm, prev->pid);
+			trace_dump_stack();
+		}
 		switch_count = &prev->nvcsw;
 	}

Produces:

           <...>-3249  [001]   296.105269: schedule: Deactivating task ntpd:3249
           <...>-3249  [001]   296.105270: <stack trace>
 => schedule
 => schedule_hrtimeout_range
 => poll_schedule_timeout
 => do_select
 => core_sys_select
 => sys_select
 => system_call_fastpath

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-12-11 10:38:47 -05:00
Al Viro f8b7256096 Unify sys_mmap*
New helper - sys_mmap_pgoff(); switch syscalls to using it.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-11 06:44:29 -05:00
Frederic Weisbecker 99ac64c826 hw-breakpoints: Handle bad modify_user_hw_breakpoint off-case return value
While converting modify_user_hw_breakpoint() return value, we
forgot to handle the off-case. It's not returning a pointer
anymore.

This solves the build warning reported by Stephen Rothwell against
linux-next.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1260529122-6260-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Prasad <prasad@linux.vnet.ibm.com>
2009-12-11 12:03:53 +01:00
Li Zefan 0f24f1287a tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING
Define kmem_trace_alloc_{,node}_notrace() if CONFIG_TRACING is
enabled, otherwise perf-kmem will show wrong stats ifndef
CONFIG_KMEM_TRACE, because a kmalloc() memory allocation may
be traced by both trace_kmalloc() and trace_kmem_cache_alloc().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <4B21F89A.7000801@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-11 09:17:02 +01:00
Guennadi Liakhovetski a88f666707 dmaengine: clarify the meaning of the DMA_CTRL_ACK flag
DMA_CTRL_ACK's description applies to its clear state, not to its set state.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 23:43:19 -07:00
Linus Torvalds aa2cf42059 Merge branch 'for-linus' of git://gitorious.org/linux-omap-dss2/linux
* 'for-linus' of git://gitorious.org/linux-omap-dss2/linux:
  MAINTAINERS: Add OMAP2/3 DSS and OMAPFB maintainer
  OMAP: SDP: Enable DSS2 for OMAP3 SDP board
  OMAP: DSS2: Taal DSI command mode panel driver
  OMAP: DSS2: Add generic and Sharp panel drivers
  OMAP: DSS2: omapfb driver
  OMAP: DSS2: DSI driver
  OMAP: DSS2: SDI driver
  OMAP: DSS2: RFBI driver
  OMAP: DSS2: Video encoder driver
  OMAP: DSS2: DPI driver
  OMAP: DSS2: DISPC
  OMAP: DSS2: Add more core files
  OMAP: DSS2: Display Subsystem Driver core
  OMAP: DSS2: Documentation for DSS2
  OMAP: Add support for VRFB rotation engine
  OMAP: Add VRAM manager
  OMAP: OMAPFB: add omapdss device
  OMAP: OMAPFB: split omapfb.h
  OMAP2: Add funcs for writing SMS_ROT_* registers
2009-12-10 21:55:17 -08:00
Kiyoshi Ueda 64dbce580d dm: export suspended state to targets
This patch adds the exported dm_suspended() function so that targets
can check whether or not they are suspended.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:27 +00:00
Kiyoshi Ueda 4f186f8bbf dm: rename dm_suspended to dm_suspended_md
This patch renames dm_suspended() to dm_suspended_md() and
keeps it internal to dm.
No functional change.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:26 +00:00
Alasdair G Kergon 042d2a9bcd dm: keep old table until after resume succeeded
When swapping a new table into place, retain the old table until
its replacement is in place.

An old check for an empty table is removed because this is enforced
in populate_table().

__unbind() becomes redundant when followed by __bind().

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:24 +00:00
Mike Snitzer 1d0f3ce832 dm ioctl: retrieve status from inactive table
Add the flag DM_QUERY_INACTIVE_TABLE_FLAG to the ioctls to return
infomation about the loaded-but-not-yet-active table instead of the live
table.  Prior to this patch it was impossible to obtain this information
until the device had been 'resumed'.

Userspace dmsetup and libdevmapper support the flag as of version 1.02.40.
e.g. dmsetup info --inactive vg1-lv1

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:22 +00:00
Alasdair G Kergon 7c6664114b dm: rename dm_get_table to dm_get_live_table
Rename dm_get_table to dm_get_live_table.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:19 +00:00
Mikulas Patocka c58098be97 dm raid1: remove bio_endio from dm_rh_mark_nosync
Move bio completion out of dm_rh_mark_nosync in preparation for the
next patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:05 +00:00