Commit d7b1956fed ("DMI: Introduce
dmi_first_match to make the interface more flexible") introduced compile
errors like the following when !CONFIG_DMI
drivers/ata/sata_sil.c: In function 'sil_broken_system_poweroff':
drivers/ata/sata_sil.c:713: error: implicit declaration of function 'dmi_first_match'
drivers/ata/sata_sil.c:713: warning: initialization makes pointer from integer without a cast
We just need a dummy version of dmi_first_match() to fix this all up.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Schedule a vblank signal, kill the process, and we'll go walking over freed
memory. Given that no open-source userland exists using this, nor have I
ever heard of a consumer, just let this code die.
Signed-off-by: Eric Anholt <eric@anholt.net>
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit 9db66bdcc8 (net: convert
TCP/DCCP ehash rwlocks to spinlocks), I forgot to change one
occurrence of rwlock_t to spinlock_t
I believe sizeof(raw_spinlock_t) might be > 0 on !CONFIG_SMP if
CONFIG_DEBUG_SPINLOCK while sizeof(raw_rwlock_t) should be 0 in this
case.
Fortunatly, CONFIG_DEBUG_SPINLOCK adds fields to both spinlock_t and
rwlock_t, but at this might change in the future (being able to debug
spinlocks but not rwlocks for example), better to be safe.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (36 commits)
USB: Driver for Freescale QUICC Engine USB Host Controller
USB: option: add QUANTA HSDPA Data Card device ids
USB: storage: Add another unusual_dev for off-by-one bug
USB: unusual_dev: usb-storage needs to ignore a device
USB: GADGET: fix !x & y
USB: new id for ti_usb_3410_5052 driver
USB: cdc-acm: Add another conexant modem to the quirks
USB: 'option' driver - onda device MT503HS has wrong id
USB: Remove ZTE modem from unusual_devices
USB: storage: support of Dane-Elec MediaTouch USB device
USB: usbmon: Implement compat_ioctl
USB: add kernel-doc for wusb_dev in struct usb_device
USB: ftdi_sio driver support of bar code scanner from Diebold
USB: ftdi_sio: added Alti-2 VID and Neptune 3 PID
USB: cp2101 device
USB: usblp.c: add USBLP_QUIRK_BIDIR to Brother HL-1440
USB: remove vernier labpro from ldusb
USB: CDC-ACM quirk for MTK GPS
USB: cdc-acm: support some gps data loggers
USB: composite: Fix bug: low byte of w_index is the usb interface number not the whole 2 bytes of w_index
...
Reported by Randy Dunlap from a warning on the v2.6.29 merge window.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
gcc 3.4.6 doesn't like MODULE_DEVICE_TABLE(dmi, x) expansion enough to
error out. Shut it up in a most simple way.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The SLAB kmalloc with a constant value isn't consistent with the other
implementations because it bails out with __you_cannot_kmalloc_that_much
rather than returning NULL and properly allowing the caller to fall back
to vmalloc or take other action. This doesn't happen with a non-constant
value or with SLOB or SLUB.
Starting with 2.6.28, I've been seeing build failures on s390x. This is
due to init_section_page_cgroup trying to allocate 2.5MB when the max size
for a kmalloc on s390x is 2MB.
It's failing because the value is constant. The workarounds at the call
size are ugly and the caller shouldn't have to change behavior depending
on what the backend of the API is.
So, this patch eliminates the link failure and returns NULL like the other
implementations.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* 'hibern_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
SATA PIIX: Blacklist system that spins off disks during ACPI power off
SATA Sil: Blacklist system that spins off disks during ACPI power off
SATA AHCI: Blacklist system that spins off disks during ACPI power off
SATA: Blacklisting of systems that spin off disks during ACPI power off
DMI: Introduce dmi_first_match to make the interface more flexible
Hibernation: Introduce system_entering_hibernation
Introduce a new ioctl UBI_IOCSETPROP to set properties
on a volume. Also add the first property:
UBI_PROP_DIRECT_WRITE, this property is used to set the
ability to use direct writes in userspace
Signed-off-by: Sidney Amani <seed@uffs.org>
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Introduce new libata flags ATA_FLAG_NO_POWEROFF_SPINDOWN and
ATA_FLAG_NO_HIBERNATE_SPINDOWN that, if set, will prevent disks from
being spun off during system power off and hibernation, respectively
(to handle the hibernation case we need the new system state
SYSTEM_HIBERNATE_ENTER that can be checked against by libata, in
analogy with SYSTEM_POWER_OFF).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Some notebooks from HP have the problem that their BIOSes attempt to
spin down hard drives before entering ACPI system states S4 and S5.
This leads to a yo-yo effect during system power-off shutdown and the
last phase of hibernation when the disk is first spun down by the
kernel and then almost immediately turned on and off by the BIOS.
This, in turn, may result in shortening the disk's life times.
To prevent this from happening we can blacklist the affected systems
using DMI information. However, only the on-board controlles should
be blacklisted and their PCI slot numbers can be used for this
purpose. Unfortunately the existing interface for checking DMI
information of the system is not very convenient for this purpose,
because to use it, we would have to define special callback functions
or create a separate struct dmi_system_id table for each blacklisted
system.
To overcome this difficulty introduce a new function
dmi_first_match() returning a pointer to the first entry in an array
of struct dmi_system_id elements that matches the system DMI
information. Then, we can use this pointer to access the entry's
.driver_data field containing the additional information, such as
the PCI slot number, allowing us to do the desired blacklisting.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Introduce boolean function system_entering_hibernation() returning
'true' during the last phase of hibernation, in which devices are
being put into low power states and the sleep state (for example,
ACPI S4) is finally entered.
Some device drivers need such a function to check if the system is
in the final phase of hibernation. In particular, some SATA drivers
are going to use it for blacklisting systems in which the disks
should not be spun down during the last phase of hibernation (the
BIOS will do that anyway).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Introduced by 8adb711f36 ("debugfs:
introduce stub for debugfs_create_size_t() when DEBUG_FS=n") and due to
a simple missing "static inline".
Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Acked-by: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: Warn on deprecated binding model use
eeprom: More consistent symbol names
eeprom: Move 93cx6 eeprom driver to /drivers/misc/eeprom
spi: Move at25 (for SPI eeproms) to /drivers/misc/eeprom
i2c: Move old eeprom driver to /drivers/misc/eeprom
i2c: Move at24 to drivers/misc/eeprom
i2c: Quilt tree has moved
i2c: Delete many unused adapter IDs
i2c: Delete 10 unused driver IDs
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (92 commits)
gianfar: Revive VLAN support
vlan: Export symbols as non GPL symbols.
bnx2x: tx_has_work should not wait for FW
netxen: reduce memory footprint
netxen: fix vlan tso/checksum offload
net: Fix linux/if_frad.h's suitability for userspace.
net: Move config NET_NS to from net/Kconfig to init/Kconfig
isdn: Fix missing ifdef in isdn_ppp
networking: document "nc" in addition to "netcat" in netconsole.txt
e1000e: workaround hw errata
af_key: initialize xfrm encap_oa
virtio_net: Fix MAX_PACKET_LEN to support 802.1Q VLANs
lcs: fix compilation for !CONFIG_IP_MULTICAST
rtl8187: Add termination packet to prevent stall
iwlwifi: fix rs_get_rate WARN_ON()
p54usb: fix packet loss with first generation devices
sctp: Fix another socket race during accept/peeloff
sctp: Properly timestamp outgoing data chunks for rtx purposes
sctp: Correctly start rtx timer on new packet transmissions.
sctp: Fix crc32c calculations on big-endian arhes.
...
The userspace interfaces are protected by CONFIG_* ifdefs
and that of course can't work.
Reported by Jaswinder Singh Rajput.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let the kernel developers know that i2c_attach_client() and
i2c_detach_client() are deprecated and should no longer be used.
Drivers using these should be converted to the standard device
driver binding model (probe and remove methods.)
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Ben Dooks <ben-linux@fluff.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
klist.c: bit 0 in pointer can't be used as flag
debugfs: introduce stub for debugfs_create_size_t() when DEBUG_FS=n
sysfs: fix problems with binary files
PNP: fix broken pnp lowercasing for acpi module aliases
driver core: Convert '/' to '!' in dev_set_name()
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: Fix cursor physical address choice to match the 2D driver.
drm: stash AGP include under the do-we-have-AGP ifdef
drm: don't whine about not reading EDID data
drm/i915: hook up LVDS DPMS property
drm/i915: remove unnecessary debug output in KMS init
i915: fix freeing path for gem phys objects.
drm: create mode_config idr lock
drm: fix leak of device mappings since multi-master changes.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI hotplug: fix lock imbalance in pciehp
PCI PM: Restore standard config registers of all devices early
PCI/MSI: bugfix/utilize for msi_capability_init()
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
i.MX31: framebuffer driver
i.MX31: Image Processing Unit DMA and IRQ drivers
dmaengine: add async_tx_clear_ack() macro
dmaengine: dma_issue_pending_all == nop when CONFIG_DMA_ENGINE=n
dmaengine: kill some dubious WARN_ONCEs
fsldma: print correct IRQ on mpc83xx
fsldma: check for NO_IRQ in fsl_dma_chan_remove()
dmatest: Use custom map/unmap for destination buffer
fsldma: use a valid 'device' for dma_pool_create
dmaengine: fix dependency chaining
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
debugobjects: add and use INIT_WORK_ON_STACK
rcu: remove duplicate CONFIG_RCU_CPU_STALL_DETECTOR
relay: fix lock imbalance in relay_late_setup_files
oprofile: fix uninitialized use of struct op_entry
rcu: move Kconfig menu
softlock: fix false panic which can occur if softlockup_thresh is reduced
rcu: add __cpuinit to rcu_init_percpu_data()
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: fix inconsistent lock state on resume in hres_timers_resume
time-sched.c: tick_nohz_update_jiffies should be static
locking, hpet: annotate false positive warning
kernel/fork.c: unused variable 'ret'
itimers: remove the per-cpu-ish-ness
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (29 commits)
xen: unitialised return value in xenbus_write_transaction
x86: fix section mismatch warning
x86: unmask CPUID levels on Intel CPUs, fix
x86: work around PAGE_KERNEL_WC not getting WC in iomap_atomic_prot_pfn.
x86: use standard PIT frequency
xen: handle highmem pages correctly when shrinking a domain
x86, mm: fix pte_free()
xen: actually release memory when shrinking domain
x86: unmask CPUID levels on Intel CPUs
x86: add MSR_IA32_MISC_ENABLE bits to <asm/msr-index.h>
x86: fix PTE corruption issue while mapping RAM using /dev/mem
x86: mtrr fix debug boot parameter
x86: fix page attribute corruption with cpa()
Revert "x86: signal: change type of paramter for sys_rt_sigreturn()"
x86: use early clobbers in usercopy*.c
x86: remove kernel_physical_mapping_init() from init section
fix: crash: IP: __bitmap_intersects+0x48/0x73
cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
work_on_cpu: Use our own workqueue.
work_on_cpu: don't try to get_online_cpus() in work_on_cpu.
...
It supports VX855 and future chips whose IDE controller uses PCI ID 0x0571.
Signed-off-by: Joseph Chan <josephchan@via.com.tw>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
crc32c algorithm provides a byteswaped result. On little-endian
arches, the result ends up in big-endian/network byte order.
On big-endinan arches, the result ends up in little-endian
order and needs to be byte swapped again. Thus calling cpu_to_le32
gives the right output.
Tested-by: Jukka Taimisto <jukka.taimisto@mail.suomi.net>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: Fix debugobjects warning
debugobject enabled kernels spit out a warning in hpet code due to a
workqueue which is initialized on stack.
Add INIT_WORK_ON_STACK() which calls init_timer_on_stack() and use it
in hpet.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Create a separate mode_config IDR lock for simplicity. The core DRM
config structures (connector, mode, etc. lists) are still protected by
the mode_config mutex, but the CRTC IDR (used for the various identifier
IDs) is now protected by the mode_config idr_mutex. Simplifies the
locking a bit and removes a warning.
All objects are protected by the config mutex, we may in the future,
split the object further to have reference counts.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Improve usbnet's devdbg to always type-check diagnostic arguments,
like dev_dbg (device.h). This makes no change to the resulting size of
usbnet modules.
This patch also removes an #ifdef DEBUG directive from rndis_wlan so
it's devdbg statements are always type-checked at compile time.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit a1ed5b0cff
(klist: don't iterate over deleted entries) introduces use of the
low bit in a pointer to indicate if the knode is dead or not,
assuming that this bit is always free.
This is not true for all architectures, CRIS for example may align data
on byte borders.
The result is a bunch of warnings on bootup, devices not being
added correctly etc, reported by Hinko Kocevar <hinko.kocevar@cetrtapot.si>:
------------[ cut here ]------------
WARNING: at lib/klist.c:62 ()
Modules linked in:
Stack from c1fe1cf0:
c01cc7f4 c1fe1d11 c000eb4e c000e4de 00000000 00000000 c1f4f78f c1f50c2d
c01d008c c1fdd1a0 c1fdd1a0 c1fe1d38 c0192954 c1fe0000 00000000 c1fe1dc0
00000002 7fffffff c1fe1da8 c0192d50 c1fe1dc0 00000002 7fffffff c1ff9fcc
Call Trace: [<c000eb4e>] [<c000e4de>] [<c0192954>] [<c0192d50>] [<c001d49e>] [<c000b688>] [<c0192a3c>]
[<c000b63e>] [<c000b63e>] [<c001a542>] [<c00b55b0>] [<c00411c0>] [<c00b559c>] [<c01918e6>] [<c0191988>]
[<c01919d0>] [<c00cd9c8>] [<c00cdd6a>] [<c0034178>] [<c000409a>] [<c0015576>] [<c0029130>] [<c0029078>]
[<c0029170>] [<c0012336>] [<c00b4076>] [<c00b4770>] [<c006d6e4>] [<c006d974>] [<c006dca0>] [<c0028d6c>]
[<c0028e12>] [<c0006424>] <4>---[ end trace 4eaa2a86a8e2da22 ]---
------------[ cut here ]------------
Repeat ad nauseam.
Wed, Jan 14, 2009 at 12:11:32AM +0100, Bastien ROUCARIES wrote:
> Perhaps using a pointerhackalign trick on this structure where
> #define pointerhackalign(x) __attribute__ ((aligned (x)))
> and declare
> struct klist_node {
> ...
> } pointerhackalign(2);
>
> Because __attribute__ ((aligned (x))) could only increase alignment
> it will safe to do that and serve as documentation purpose :)
That works, but we need to do it not for the struct klist_node,
but for the struct we insert into the void * in klist_node,
which is struct klist.
Reported-by: Hinko Kocevar <hinko.kocevar@cetrtapot.si
Cc: Bastien ROUCARIES <roucaries.bastien@gmail.com>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Toralf Förster <toralf.foerster@gmx.de> reported a build failure in
the WiMAX stack when CONFIG_DEBUG_FS=n
http://linuxwimax.org/pipermail/wimax/2009-January/000449.html
This is due to debugfs_create_size_t() missing an stub that returns
-ENODEV when the DEBUGFS subsystem is not configured in (like the rest
of the debugfs API).
This patch adds said stub.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
To complete the DMA_CTRL_ACK handling API add a async_tx_clear_ack() macro.
Signed-off-by: Guennadi Liakhovetski <lg@denx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The device list will always be empty in this configuration, so no need
to walk the list.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch adds ioctl to check if an LEB is mapped or not (as a
debugging option so far).
[Re-named ioctl to make it look the same as the other one and made
some minor stylistic changes. Artem Bityutskiy.]
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch adds ioctl for the LEB unmap operation (as a debugging
option so far).
[Re-named ioctl to make it look the same as the other one and made
some minor stylistic changes. Artem Bityutskiy.]
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch adds ioctl for the LEB map operation (as a debugging
option so far).
[Re-named ioctl to make it look the same as the other one and made
some minor stylistic changes. Artem Bityutskiy.]
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Fix (delete) more mac80211 kernel-doc:
Warning(linux-2.6.28-git13//include/net/mac80211.h:375): Excess struct/union/enum/typedef member 'retry_count' description in 'ieee80211_tx_info'
Warning(linux-2.6.28-git13//net/mac80211/sta_info.h:308): Excess struct/union/enum/typedef member 'last_txrate' description in 'sta_info'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There is a problem in our handling of suspend-resume of PCI devices that
many of them have their standard config registers restored with
interrupts enabled and they are put into the full power state with
interrupts enabled as well. This may lead to the following scenario:
* an interrupt vector is shared between two or more devices
* one device is resumed earlier and generates an interrupt
* the interrupt handler of another device tries to handle it and
attempts to access the device the config space of which hasn't been
restored yet and/or which still is in a low power state
* the system crashes as a result
To prevent this from happening we should restore the standard
configuration registers of all devices with interrupts disabled and we
should put them into the D0 power state right after that.
Unfortunately, this cannot be done using the existing
pci_set_power_state(), because it can sleep. Also, to do it we have to
make sure that the config spaces of all devices were actually saved
during suspend.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make the comment for ACPI_FADT_S4_RTC_WAKE match the ACPI spec;
that bit has nothing to do with status bits.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
We implement dqget() and dqput() that need neither dqonoff_mutex nor dqptr_sem.
Then move dqget() and dqput() calls so that they are not called from under
dqptr_sem. This is important because filesystem callbacks aren't called from
under dqptr_sem which used to cause *lots* of problems with lock ranking
(and with OCFS2 they became close to unsolvable).
The patch also removes two functions which were introduced solely because OCFS2
needed them to cope with the old locking scheme. As time showed, they were not
enough for OCFS2 anyway and it would be unnecessary work to adapt them to the
new locking scheme in which they aren't needed. As a result OCFS2 needs the
following patch to compile properly with quotas. Sorry to any bisecters which
hit this in advance.
Signed-off-by: Jan Kara <jack@suse.cz>
Otherwise it can be very confusing to find a "EXT3-fs: " failure in
the middle of EXT4-fs failures, and it makes it harder to track the
source of the failure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_fsl: Return non-zero on error in probe()
drivers/ata/pata_ali.c: s/isa_bridge/ali_isa_bridge/ to fix alpha build
libata: New driver for OCTEON SOC Compact Flash interface (v7).
libata: Add another column to the ata_timing table.
sata_via: Add VT8261 support
pata_atiixp: update port enabledness test handling
[libata] get-identity ioctl: Fix use of invalid memory pointer
The forthcoming OCTEON SOC Compact Flash driver needs an additional
timing value that was not available in the ata_timing table. I add a
new column for dmack_hold time. The values were obtained from the
Compact Flash specification Rev 4.1.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
for SAS drivers.
Caught by Ke Wei (and team?) at Marvell.
Also, move the ata_scsi_ioctl export to libata-scsi.c, as that seems to be the
general trend.
Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Merge header files for m68k and m68knommu to the single location:
arch/m68k/include/asm
The majority of this patch was the result of the
script that is included in the changelog below.
The script was originally written by Arnd Bergman and
exten by me to cover a few more files.
When the header files differed the script uses the following:
The original m68k file is named <file>_mm.h [mm for memory manager]
The m68knommu file is named <file>_no.h [no for no memory manager]
The files uses the following include guard:
This include gaurd works as the m68knommu toolchain set
the __uClinux__ symbol - so this should work in userspace too.
Merging the header files for m68k and m68knommu exposes the
(unexpected?) ABI differences thus it is easier to actually
identify these and thus to fix them.
The commit has been build tested with both a m68k and
a m68knommu toolchain - with success.
The commit has also been tested with "make headers_check"
and this patch fixes make headers_check for m68knommu.
The script used:
TARGET=arch/m68k/include/asm
SOURCE=arch/m68knommu/include/asm
INCLUDE="cachectl.h errno.h fcntl.h hwtest.h ioctls.h ipcbuf.h \
linkage.h math-emu.h md.h mman.h movs.h msgbuf.h openprom.h \
oplib.h poll.h posix_types.h resource.h rtc.h sembuf.h shmbuf.h \
shm.h shmparam.h socket.h sockios.h spinlock.h statfs.h stat.h \
termbits.h termios.h tlb.h types.h user.h"
EQUAL="auxvec.h cputime.h device.h emergency-restart.h futex.h \
ioctl.h irq_regs.h kdebug.h local.h mutex.h percpu.h \
sections.h topology.h"
NOMUUFILES="anchor.h bootstd.h coldfire.h commproc.h dbg.h \
elia.h flat.h m5206sim.h m520xsim.h m523xsim.h m5249sim.h \
m5272sim.h m527xsim.h m528xsim.h m5307sim.h m532xsim.h \
m5407sim.h m68360_enet.h m68360.h m68360_pram.h m68360_quicc.h \
m68360_regs.h MC68328.h MC68332.h MC68EZ328.h MC68VZ328.h \
mcfcache.h mcfdma.h mcfmbus.h mcfne.h mcfpci.h mcfpit.h \
mcfsim.h mcfsmc.h mcftimer.h mcfuart.h mcfwdebug.h \
nettel.h quicc_simple.h smp.h"
FILES="atomic.h bitops.h bootinfo.h bug.h bugs.h byteorder.h cache.h \
cacheflush.h checksum.h current.h delay.h div64.h \
dma-mapping.h dma.h elf.h entry.h fb.h fpu.h hardirq.h hw_irq.h io.h \
irq.h kmap_types.h machdep.h mc146818rtc.h mmu.h mmu_context.h \
module.h page.h page_offset.h param.h pci.h pgalloc.h \
pgtable.h processor.h ptrace.h scatterlist.h segment.h \
setup.h sigcontext.h siginfo.h signal.h string.h system.h swab.h \
thread_info.h timex.h tlbflush.h traps.h uaccess.h ucontext.h \
unaligned.h unistd.h"
mergefile() {
BASE=${1%.h}
git mv ${SOURCE}/$1 ${TARGET}/${BASE}_no.h
git mv ${TARGET}/$1 ${TARGET}/${BASE}_mm.h
cat << EOF > ${TARGET}/$1
EOF
git add ${TARGET}/$1
}
set -e
mkdir -p ${TARGET}
git mv include/asm-m68k/* ${TARGET}
rmdir include/asm-m68k
git rm ${SOURCE}/Kbuild
for F in $INCLUDE $EQUAL; do
git rm ${SOURCE}/$F
done
for F in $NOMUUFILES; do
git mv ${SOURCE}/$F ${TARGET}/$F
done
for F in $FILES ; do
mergefile $F
done
rmdir arch/m68knommu/include/asm
rmdir arch/m68knommu/include
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
When mode setting is first initialized, the driver will call into
drm_helper_initial_config() to set up an initial output and framebuffer
configuration. This routine is responsible for probing the available
connectors, encoders, and crtcs, looking for modes and putting together
something reasonable (where reasonable is defined as "allows kernel
messages to be visible on as many displays as possible").
However, the code was a bit too aggressive in setting default modes when
none were found on a given connector. Even if some connectors had modes,
any connectors found lacking modes would have the default 800x600 mode added
to their mode list, which in some cases could cause problems later down the
line. In my case, the LVDS was perfectly available, but the initial config
code added 800x600 modes to both of the detected but unavailable HDMI
connectors (which are on my non-existent docking station). This ended up
preventing later code from setting a mode on my LVDS, which is bad.
This patch fixes that behavior by making the initial config code walk
through the connectors first, counting the available modes, before it decides
to add any default modes to a possibly connected output. It also fixes the
logic in drm_target_preferred() that was causing zeroed out modes to be set
as the preferred mode for a given connector, even if no modes were available.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (95 commits)
b44: GFP_DMA skb should not escape from driver
korina: do not use IRQF_SHARED with IRQF_DISABLED
korina: do not stop queue here
korina: fix handling tx_chain_tail
korina: do tx at the right position
korina: do schedule napi after testing for it
korina: rework korina_rx() for use with napi
korina: disable napi on close and restart
korina: reset resource buffer size to 1536
korina: fix usage of driver_data
bnx2x: First slow path interrupt race
bnx2x: MTU Filter
bnx2x: Indirection table initialization index
bnx2x: Missing brackets
bnx2x: Fixing the doorbell size
bnx2x: Endianness issues
bnx2x: VLAN tagged packets without VLAN offload
bnx2x: Protecting the link change indication
bnx2x: Flow control updated before reporting the link
bnx2x: Missing mask when calculating flow control
...
Impact: fix 15 make headers_check warnings:
include of <linux/types.h> is preferred over <asm/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unlike other alphas, marvel doesn't have real PC-style CMOS clock hardware
- RTC accesses are emulated via PAL calls. Unfortunately, for unknown
reason these calls work only on CPU #0. So current implementation for
arbitrary CPU makes CMOS_READ/WRITE to be executed on CPU #0 via IPI.
However, for obvious reason this doesn't work with standard
get/set_rtc_time() functions, where a bunch of CMOS accesses is done with
disabled interrupts.
Solved by making the IPI calls for entire get/set_rtc_time() functions,
not for individual CMOS accesses. Which is also a lot more effective
performance-wise.
The patch is largely based on the code from Jay Estabrook.
My changes:
- tweak asm-generic/rtc.h by adding a couple of #defines to
avoid a massive code duplication in arch/alpha/include/asm/rtc.h;
- sys_marvel.c: fix get/set_rtc_time() return values (Jay's FIXMEs).
NOTE: this fixes *only* LIB_RTC drivers. Legacy (CONFIG_RTC) driver
wont't work on marvel. Actually I think that we should just disable
CONFIG_RTC on alpha (maybe in 2.6.30?), like most other arches - AFAIK,
all modern distributions use LIB_RTC anyway.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the standard magic.h for btrfs and squashfs.
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix __request_region() parameter kernel-doc notation and parameter name:
Warning(linux-2.6.28-git10//kernel/resource.c:627): No description found for parameter 'flags'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix jbd header file kernel-doc notation:
Warning(linux-2.6.28-git13//include/linux/jbd.h:823): No description found for parameter 'j_average_commit_time'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds an init_dummy_netdev() function that gets a network device
structure (allocation and lifetime entirely under caller's control) and
initialize the minimum amount of fields so it can be used to schedule
NAPI polls without registering a full blown interface. This is to be
used by drivers that need to tie several hardware interfaces to a single
NAPI poll scheduler due to HW limitations.
It also updates the ibm_newemac driver to use that, this fixing the
oops on 2.6.29 due to passing NULL as "dev" to netif_napi_add()
Symbol is exported GPL only a I don't think we want binary drivers doing
that sort of acrobatics (if we want them at all).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (29 commits)
powerpc/83xx: Move mcu_mpc8349emitx driver out of drivers/i2c/chips/
powerpc/83xx: Make serial ports work on MPC8315E-RDB w/ FSL U-Boots
powerpc/e500mc: Doorbells need to be taken w/exceptions disabled
powerpc: Enable PS3 options and QPACE in ppc64_defconfig
powerpc/powermac: Fix occasional SMP boot failure
powerpc/cacheinfo: Rename cache_dir per-cpu variable
hvc_console: Use kzalloc() instead of kmalloc() + memset()
hvc_console: Do not set low_latency when using interrupts
hvc_console: Call free_irq() only if request_irq() was successful
hvc_console: Change an mb() to smp_mb() and add some comments
powerpc: Cleanup from l64 to ll64 change: drivers/net
powerpc: Cleanup from l64 to ll64 change: drivers/char
powerpc: Cleanup from l64 to ll64 change: arch code
powerpc: Change u64/s64 to a long long integer type
powerpc/kexec: Check crash_base for relocatable kernel
powerpc: Make dummy section a valid note header
Xilinx: SPI: updated driver for device tree
drivers/of: Add the of_find_i2c_device_by_node function.
powerpc/xsysace: add compatible string for non-ipcore instance
powerpc/mpc52xx: remove dead code from GPIO driver
...
* 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits)
[CVE-2009-0029] s390 specific system call wrappers
[CVE-2009-0029] System call wrappers part 33
[CVE-2009-0029] System call wrappers part 32
[CVE-2009-0029] System call wrappers part 31
[CVE-2009-0029] System call wrappers part 30
[CVE-2009-0029] System call wrappers part 29
[CVE-2009-0029] System call wrappers part 28
[CVE-2009-0029] System call wrappers part 27
[CVE-2009-0029] System call wrappers part 26
[CVE-2009-0029] System call wrappers part 25
[CVE-2009-0029] System call wrappers part 24
[CVE-2009-0029] System call wrappers part 23
[CVE-2009-0029] System call wrappers part 22
[CVE-2009-0029] System call wrappers part 21
[CVE-2009-0029] System call wrappers part 20
[CVE-2009-0029] System call wrappers part 19
[CVE-2009-0029] System call wrappers part 18
[CVE-2009-0029] System call wrappers part 17
[CVE-2009-0029] System call wrappers part 16
[CVE-2009-0029] System call wrappers part 15
...
Add swab.h to kbuild.asm and remove the individual entries from
each arch, mark as unifdef as some arches have some kernel-only
bits inside.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
IDE: fix sparse signed-ness errors with host->host_busy
ide: fix suspend regression
tx4938ide: Fix build error due to read_sff_dma_status moving
ide: remove unused CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ
sl82c105: remove dead code
via82cxxx: fix cable warning message
ide: can't use SSD/non-rotational queue flag for all CFA devices
it821x.c: use dev->revision instead of pci_read_config_byte
it821x: Add ultra_mask quirk for Vortex86SX
ide: fix accidental LOCKDEP breakage caused by local_irq_set() removal
The host_busy field in struct ide_host defaults to a
signed-long, where most arch's test_and_set_bit_*
macros use an unsigned long.
Change to using an unsigned long, which on ARM removes
the following sparse errors:
drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:681:8: expected unsigned long volatile *p
drivers/ide/ide-io.c:681:8: got long volatile *<noident>
drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:681:8: expected unsigned long volatile *p
drivers/ide/ide-io.c:681:8: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
On Vortex86SX with IDE controller revision 0x11 ultra DMA must be
disabled. This patch was tested by DMP and seems to work.
It is a cleaned up version of their older Kernel patch:
http://www.dmp.com.tw/tech/vortex86sx/patch-2.6.24-DMP.gz
Tested-by: Shawn Lin <shawn@dmp.com.tw>
Signed-off-by: Brandon Philips <bphilips@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This assertion is incorrect for lockless pagecache. By definition if we
have an unpinned page that we are trying to take a speculative reference
to, it may become the tail of a compound page at any time (if it is
freed, then reallocated as a compound page).
It was still a valid assertion for the vmscan.c LRU isolation case, but
it doesn't seem incredibly helpful... if somebody wants it, they can
put it back directly where it applies in the vmscan code.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This enables the use of syscall wrappers to do proper sign extension
for 64-bit programs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
By selecting HAVE_SYSCALL_WRAPPERS architectures can activate
system call wrappers in order to sign extend system call arguments.
All architectures where the ABI defines that the caller of a function
has to perform sign extension probably need this.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Convert all system calls to return a long. This should be a NOP since all
converted types should have the same size anyway.
With the exception of sys_exit_group which returned void. But that doesn't
matter since the system call doesn't return.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
At run-time, if softlockup_thresh is changed to a much lower value,
touch_timestamp is likely to be much older than the new softlock_thresh.
This will cause a false softlockup to be detected. If softlockup_panic
is enabled, the system will panic.
The fix is to touch all watchdogs before changing softlockup_thresh.
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Change the protection parameter for track_pfn_vma_new() into a pgprot_t pointer.
Subsequent patch changes the x86 PAT handling to return a compatible
memtype in pgprot_t, if what was requested cannot be allowed due to conflicts.
No fuctionality change in this patch.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: reduce kernel image size
Hugh Dickins noticed that older gcc versions when the kernel
is built for code size didn't inline some of the bitops.
Mark all complex x86 bitops that have more than a single
asm statement or two as always inline to avoid this problem.
Probably should be done for other architectures too.
Ingo then found a better fix that only requires
a single line change, but it unfortunately only
works on gcc 4.3.
On older gccs the original patch still makes a ~0.3% defconfig
difference with CONFIG_OPTIMIZE_INLINING=y.
With gcc 4.1 and a defconfig like build:
6116998 1138540 883788 8139326 7c323e vmlinux-oi-with-patch
6137043 1138540 883788 8159371 7c808b vmlinux-optimize-inlining
~20k / 0.3% difference.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
virt_to_page() call should not be used on kernel text and data
addresses. virt_to_page() is used by sg_init_one(). So change padbuf
to be allocated within iscsi_segment.
Signed-off-by: Karen Xie <kxie@chelsio.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
reorder struct xt_match to remove 8 bytes of padding and make its size
128 bytes.
This saves a small amount of data space in each of the xt netfilter
modules and fits xt_match in one 128 byte cache line.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sparc64: Fix cpumask related build failure
smp_call_function_single(): be slightly less stupid, fix
smp_call_function_single(): be slightly less stupid
rcu: fix bug in rcutorture system-shutdown code
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
ucc_geth: use correct UCCE macros
net_dma: acquire/release dma channels on ifup/ifdown
cxgb3: Keep LRO off if disabled when interface is down
sfc: SFT9001: Fix condition for LNPGA power-off
dccp ccid-3: Fix RFC reference
smsc911x: register irq with device name, not driver name
smsc911x: fix smsc911x_reg_read compiler warning
forcedeth: napi schedule lock fix
net: fix section mismatch warnings in dccp/ccids/lib/tfrc.c
forcedeth: remove mgmt unit for mcp79 chipset
qlge: Remove dynamic alloc of rx ring control blocks.
qlge: Fix schedule while atomic issue.
qlge: Remove support for device ID 8000.
qlge: Get rid of split addresses in hardware control blocks.
qlge: Get rid of volatile usage for shadow register.
forcedeth: version bump and copyright
forcedeth: xmit lock fix
netdev: missing validate_address hooks
netdev: add missing set_mac_address hook
netdev: gianfar: add MII ioctl handler
...
* 'for_2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6:
Fix small typo
misdn: indentation and braces disagree - add braces
misdn: one handmade ARRAY_SIZE converted
drivers/isdn/hardware/mISDN: move a dereference below a NULL test
indentation & braces disagree - add braces
Make parameter debug writable
BUGFIX: used NULL pointer at ioctl(sk,IMGETDEVINFO,&devinfo) when devinfo.id not registered
warning: ignoring return value of 'device_register', declared with attribute
warn_unused_result
warning: ignoring return value of 'device_create_file', declared with
attribute warn_unused_result
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Alexander Beregalov reported that this warning is caused by the HPET code:
> hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
> hpet0: 3 comparators, 64-bit 14.318180 MHz counter
> ODEBUG: object is on stack, but not annotated
> ------------[ cut here ]------------
> WARNING: at lib/debugobjects.c:251 __debug_object_init+0x2a4/0x352()
> Bisected down to 26afe5f2fb
> (x86: HPET_MSI Initialise per-cpu HPET timers)
The commit is fine - but the on-stack workqueue entry needs annotation.
Reported-and-bisected-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The recent dmaengine rework removed the capability to remove dma device
driver modules while net_dma is active. Rather than notify
dmaengine-clients that channels are trying to be removed, we now rely on
clients to notify dmaengine when they no longer have a need for
channels. Teach net_dma to release channels by taking dmaengine
references at netdevice open and dropping references at netdevice close.
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel-doc was referring to member @debufs_dentry instead of
@debugfs_dentry.
Reported by Randy Dunlap http://marc.info/?l=linux-netdev&m=123147942302885&w=2
As well, escape the colon in the field's text description, as it is
causing the generated text to be erraticly broken up (with paragraphs
moved down). Could not find a reason why it is happening so, even when
other field descriptions use colons and work as expected.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If you do
smp_call_function_single(expression-with-side-effects, ...)
then expression-with-side-effects never gets evaluated on UP builds.
As always, implementing it in C is the correct thing to do.
While we're there, uninline it for size and possible header dependency
reasons.
And create a new kernel/up.c, as a place in which to put
uniprocessor-specific code and storage. It should mirror kernel/smp.c.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Changes from V1:
- Removed support for suspend_enable & suspend_disable functions.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
What the PCF05633 calls as a 'GPIO' is much more than the GPIO in the linux
sense and there are only 4 of them - which means, the gpiolib is not used
here.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch adds basic support for the PCF50633 ADC. The subtractive mode
is not supported yet.
Since we don't have adc subsystem, it currently lives in drivers/mfd.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Acked-by: Jonathan Cameron <jonathan.cameron@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch implements the core of the PCF50633 driver. This core driver has
generic register read/write functions and does interrupt management for its
sub devices.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch adds a per host flag that allows drivers to opt in into
having its busses scanned in parallel.
Drivers that do not set this flag get their ports scanned in
the "original" sequence.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
[IA64] fix typo in cpumask_of_pcibus()
x86: fix x86_32 builds for summit and es7000 arch's
cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs
cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
cpumask: use cpumask_var_t in acpi-cpufreq.c
cpumask: use work_on_cpu in acpi/cstate.c
cpumask: convert struct cpufreq_policy to cpumask_var_t
cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t
x86: cleanup remaining cpumask_t ops in smpboot code
cpumask: update pci_bus_show_cpuaffinity to use new cpumask API
cpumask: update local_cpus_show to use new cpumask API
ia64: cpumask fix for is_affinity_mask_valid()
The 'rb_first()', 'rb_last()', 'rb_next()' and 'rb_prev()' calls
take a pointer to an RB node or RB root. They do not change the
pointed objects, so add a 'const' qualifier in order to make life
of the users of these functions easier.
Indeed, if I have my own constant pointer &const struct my_type *p,
and I call 'rb_next(&p->rb)', I get a GCC warning:
warning: passing argument 1 of ‘rb_next’ discards qualifiers from pointer target type
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ioctls for the generic freeze feature are below.
o Freeze the filesystem
int ioctl(int fd, int FIFREEZE, arg)
fd: The file descriptor of the mountpoint
FIFREEZE: request code for the freeze
arg: Ignored
Return value: 0 if the operation succeeds. Otherwise, -1
o Unfreeze the filesystem
int ioctl(int fd, int FITHAW, arg)
fd: The file descriptor of the mountpoint
FITHAW: request code for unfreeze
arg: Ignored
Return value: 0 if the operation succeeds. Otherwise, -1
Error number: If the filesystem has already been unfrozen,
errno is set to EINVAL.
[akpm@linux-foundation.org: fix CONFIG_BLOCK=n]
Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
Cc: <xfs-masters@oss.sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, ext3 in mainline Linux doesn't have the freeze feature which
suspends write requests. So, we cannot take a backup which keeps the
filesystem's consistency with the storage device's features (snapshot and
replication) while it is mounted.
In many case, a commercial filesystem (e.g. VxFS) has the freeze feature
and it would be used to get the consistent backup.
If Linux's standard filesystem ext3 has the freeze feature, we can do it
without a commercial filesystem.
So I have implemented the ioctls of the freeze feature.
I think we can take the consistent backup with the following steps.
1. Freeze the filesystem with the freeze ioctl.
2. Separate the replication volume or create the snapshot
with the storage device's feature.
3. Unfreeze the filesystem with the unfreeze ioctl.
4. Take the backup from the separated replication volume
or the snapshot.
This patch:
VFS:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they can return an error.
Rename write_super_lockfs and unlockfs of the super block operation
freeze_fs and unfreeze_fs to avoid a confusion.
ext3, ext4, xfs, gfs2, jfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that write_super_lockfs returns an error if needed,
and unlockfs always returns 0.
reiserfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they always return 0 (success) to keep a current behavior.
Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
Cc: <xfs-masters@oss.sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The code was shifting the endianness appropriately everywhere, annotate
the structs to avoid the sparse warnings when assigning the endian types
to the struct members, or passing them to be[16|32]_to_cpu:
drivers/memstick/core/mspro_block.c:331:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:333:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:335:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:337:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:341:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:347:4: warning: cast to restricted __be32
drivers/memstick/core/mspro_block.c:356:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:358:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:364:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:367:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:369:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:371:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:377:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:478:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:480:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:482:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:484:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:486:4: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:689:22: expected unsigned int [unsigned] [assigned] data_address
drivers/memstick/core/mspro_block.c:689:22: got restricted __be32 [usertype] <noident>
drivers/memstick/core/mspro_block.c:697:3: warning: cast to restricted __be32
drivers/memstick/core/mspro_block.c:960:17: warning: incorrect type in initializer (different base types)
drivers/memstick/core/mspro_block.c:960:17: expected unsigned short [unsigned] data_count
drivers/memstick/core/mspro_block.c:960:17: got restricted __be16 [usertype] <noident>
drivers/memstick/core/mspro_block.c:993:6: warning: cast to restricted __be16
drivers/memstick/core/mspro_block.c:995:28: warning: cast to restricted __be16
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for_2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6: (28 commits)
mISDN: Add HFC USB driver
mISDN: Add layer1 prim MPH_INFORMATION_REQ
mISDN: Fix kernel crash when doing hardware conference with more than two members
mISDN: Added missing create_l1() call
mISDN: Add MODULE_DEVICE_TABLE() to hfcpci
mISDN: Minor cleanups
mISDN: Create /sys/class/mISDN
mISDN: Add missing release functions
mISDN: Add different different timer settings for hfc-pci
mISDN: Minor fixes
mISDN: Correct busy device detection
mISDN: Fix deactivation, if peer IP is removed from l1oip instance.
mISDN: Add ISDN_P_TE_UP0 / ISDN_P_NT_UP0
mISDN: Fix irq detection
mISDN: Add ISDN sample clock API to mISDN core
mISDN: Return error on E-channel access
mISDN: Add E-Channel logging features
mISDN: Use protocol to detect D-channel
mISDN: Fixed more indexing bugs
mISDN: Make debug output a little bit more verbose
...
This reverts commit 2831fe6f9c.
Turns out that device_initialize shouldn't fail silently.
This series needs to be reworked in order to get into proper
shape.
Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit 11c3b5c3e0.
Turns out that device_initialize shouldn't fail silently.
This series needs to be reworked in order to get into proper
shape.
Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The of_find_i2c_device_by_node function allows you to follow a
reference in the device tree to an i2c device node and then locate
the linux device instantiated by the device tree. Example use: an I2S
bus driver finding the i2c_device instance for a codec described by
a device tree node.
This was waiting for Anton's i2c patches that were just added.
Signed-off-by: Jon Smirl <jonsmirl@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This reverts commit 93e746db18.
Turns out that device_initialize shouldn't fail silently.
This series needs to be reworked in order to get into proper
shape.
Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit b9daa99ee5.
Turns out that device_initialize shouldn't fail silently.
This series needs to be reworked in order to get into proper
shape.
Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu:
NOMMU: Support XIP on initramfs
NOMMU: Teach kobjsize() about VMA regions.
FLAT: Don't attempt to expand the userspace stack to fill the space allocated
FDPIC: Don't attempt to expand the userspace stack to fill the space allocated
NOMMU: Improve procfs output using per-MM VMAs
NOMMU: Make mmap allocation page trimming behaviour configurable.
NOMMU: Make VMAs per MM as for MMU-mode linux
NOMMU: Delete askedalloc and realalloc variables
NOMMU: Rename ARM's struct vm_region
NOMMU: Fix cleanup handling in ramfs_nommu_get_umapped_area()
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds:
leds: ledtrig-timer - on deactivation hardware blinking should be disabled
leds: Add suspend/resume to the core class
leds: Add WM8350 LED driver
leds: leds-pcs9532 - Move i2c work to a workqueque
leds: leds-pca9532 - fix memory leak and properly handle errors
leds: Fix wrong loop direction on removal in leds-ams-delta
leds: fix Cobalt Raq LED dependency
leds: Fix sparse warning in leds-ams-delta
leds: Fixup kdoc comment to match parameter names
leds: Make header variable naming consistent
leds: eds-pca9532: mark pca9532_event() static
leds: ALIX.2 LEDs driver
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight:
backlight: Rename the corgi backlight driver to generic
backlight: add support for Toppoly TDO35S series to tdo24m lcd driver
backlight: Add suspend/resume support to the backlight core
bd->props.brightness doesn't reflect the actual backlight level.
backlight: Support VGA/QVGA mode switching in tosa_lcd
backlight: Catch invalid input in sysfs attributes
backlight: Value of ILI9320_RGB_IF2 register should not be hardcoded
backlight: crbllcd_bl - Use platform_device_register_simple()
backlight: progear_bl - Use platform_device_register_simple()
backlight: hp680_bl - Use platform_device_register_simple()
MPH_INFORMATION provides full D- and B-Channel status overview
- new layer1 primitive: MPF_INFORMATON_REQ
- layer1 replies with MPH_INFORMATION_IND containing
- dch->[state,Flags,nrbchan]
- bch[]->[protocol,Flags]
- hardware driver should send MPH_INFORMATION_IND
on all ph state changes and BChannel state changes to MISDN_ID_ANY
Signed-off-by: Martin Bachem <m.bachem@gmx.de>
Signed-off-by: Karsten Keil <kkeil@suse.de>
Added GETPEER operation.
Socket now checks if device is already busy at a differen mode.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <kkeil@suse.de>
- new layer1 protocols for UP0 bus
- helper #defines to test for TE/NT/S0/E1/UP0
Signed-off-by: Martin Bachem <m.bachem@gmx.de>
Signed-off-by: Karsten Keil <kkeil@suse.de>
Add ISDN sample clock API to mISDN core (new file clock.c)
hfcmulti and mISDNdsp use clock API.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <kkeil@suse.de>
New prim PH_DATA_E_IND.
- all E-ch frames are indicated by recv_Echannel(), which pushes E-Channel
frames into dch's rqueue
- if dchannel is opened with channel nr 0, no E-Channel logging
is requested
- if dchannel is opened with channel nr 1, E-Channel logging
is requested. if layer1 does not support that, -EINVAL
in return is appropriate
Signed-off-by: Martin Bachem <m.bachem@gmx.de>
Signed-off-by: Karsten Keil <kkeil@suse.de>
To get persistent device names with hotplug we need to rename devices
sometime.
Signed-off-by: Matthias Urlichs <matthias@urlichs.de>
Signed-off-by: Karsten Keil <kkeil@suse.de>
This prevents underrun of fifo when filled and in case of an underrun it
prevents subsequent underruns due to jitter.
Improve dsp, so buffers are kept filled with a certain delay, so moderate
jitter will not cause underrun all the time -> the audio quality is highly
improved. tones are not interrupted by gaps anymore, except when CPU is
stalling or in high load.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <kkeil@suse.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile: (31 commits)
powerpc/oprofile: fix whitespaces in op_model_cell.c
powerpc/oprofile: IBM CELL: add SPU event profiling support
powerpc/oprofile: fix cell/pr_util.h
powerpc/oprofile: IBM CELL: cleanup and restructuring
oprofile: make new cpu buffer functions part of the api
oprofile: remove #ifdef CONFIG_OPROFILE_IBS in non-ibs code
ring_buffer: fix ring_buffer_event_length()
oprofile: use new data sample format for ibs
oprofile: add op_cpu_buffer_get_data()
oprofile: add op_cpu_buffer_add_data()
oprofile: rework implementation of cpu buffer events
oprofile: modify op_cpu_buffer_read_entry()
oprofile: add op_cpu_buffer_write_reserve()
oprofile: rename variables in add_ibs_begin()
oprofile: rename add_sample() in cpu_buffer.c
oprofile: rename variable ibs_allowed to has_ibs in op_model_amd.c
oprofile: making add_sample_entry() inline
oprofile: remove backtrace code for ibs
oprofile: remove unused ibs macro
oprofile: remove unused components in struct oprofile_cpu_buffer
...
* git://git.infradead.org/mtd-2.6: (67 commits)
[MTD] [MAPS] Fix printk format warning in nettel.c
[MTD] [NAND] add cmdline parsing (mtdparts=) support to cafe_nand
[MTD] CFI: remove major/minor version check for command set 0x0002
[MTD] [NAND] ndfc driver
[MTD] [TESTS] Fix some size_t printk format warnings
[MTD] LPDDR Makefile and KConfig
[MTD] LPDDR extended physmap driver to support LPDDR flash
[MTD] LPDDR added new pfow_base parameter
[MTD] LPDDR Command set driver
[MTD] LPDDR PFOW definition
[MTD] LPDDR QINFO records definitions
[MTD] LPDDR qinfo probing.
[MTD] [NAND] pxa3xx: convert from ns to clock ticks more accurately
[MTD] [NAND] pxa3xx: fix non-page-aligned reads
[MTD] [NAND] fix nandsim sched.h references
[MTD] [NAND] alauda: use USB API functions rather than constants
[MTD] struct device - replace bus_id with dev_name(), dev_set_name()
[MTD] fix m25p80 64-bit divisions
[MTD] fix dataflash 64-bit divisions
[MTD] [NAND] Set the fsl elbc ECCM according the settings in bootloader.
...
Fixed up trivial debug conflicts in drivers/mtd/devices/{m25p80.c,mtd_dataflash.c}
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (94 commits)
ACPICA: hide private headers
ACPICA: create acpica/ directory
ACPI: fix build warning
ACPI : Use RSDT instead of XSDT by adding boot option of "acpi=rsdt"
ACPI: Avoid array address overflow when _CST MWAIT hint bits are set
fujitsu-laptop: Simplify SBLL/SBL2 backlight handling
fujitsu-laptop: Add BL power, LED control and radio state information
ACPICA: delete utcache.c
ACPICA: delete acdisasm.h
ACPICA: Update version to 20081204.
ACPICA: FADT: Update error msgs for consistency
ACPICA: FADT: set acpi_gbl_use_default_register_widths to TRUE by default
ACPICA: FADT parsing changes and fixes
ACPICA: Add ACPI_MUTEX_TYPE configuration option
ACPICA: Fixes for various ACPI data tables
ACPICA: Restructure includes into public/private
ACPI: remove private acpica headers from driver files
ACPI: reboot.c: use new acpi_reset interface
ACPICA: New: acpi_reset interface - write to reset register
ACPICA: Move all public H/W interfaces to new hwxface
...
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (22 commits)
ioat: fix self test for multi-channel case
dmaengine: bump initcall level to arch_initcall
dmaengine: advertise all channels on a device to dma_filter_fn
dmaengine: use idr for registering dma device numbers
dmaengine: add a release for dma class devices and dependent infrastructure
ioat: do not perform removal actions at shutdown
iop-adma: enable module removal
iop-adma: kill debug BUG_ON
iop-adma: let devm do its job, don't duplicate free
dmaengine: kill enum dma_state_client
dmaengine: remove 'bigref' infrastructure
dmaengine: kill struct dma_client and supporting infrastructure
dmaengine: replace dma_async_client_register with dmaengine_get
atmel-mci: convert to dma_request_channel and down-level dma_slave
dmatest: convert to dma_request_channel
dmaengine: introduce dma_request_channel and private channels
net_dma: convert to dma_find_channel
dmaengine: provide a common 'issue_pending_all' implementation
dmaengine: centralize channel allocation, introduce dma_find_channel
dmaengine: up-level reference counting to the module level
...
The NOR Flash memory K8P2815UQB from Samsung uses the major version
number '0'. Add a quirk to cope with it.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Conflicts:
drivers/acpi/pci_irq.c
Note that this merge disables
e1d3a90846
pci, acpi: reroute PCI interrupt to legacy boot interrupt equivalent
Signed-off-by: Len Brown <len.brown@intel.com>
On some boxes there exist both RSDT and XSDT table. But unfortunately
sometimes there exists the following error when XSDT table is used:
a. 32/64X address mismatch
b. The 32/64X FACS address mismatch
In such case the boot option of "acpi=rsdt" is provided so that
RSDT is tried instead of XSDT table when the system can't work well.
http://bugzilla.kernel.org/show_bug.cgi?id=8246
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
cc:Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits)
jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs
ext4: Remove "extents" mount option
block: Add Kconfig help which notes that ext4 needs CONFIG_LBD
ext4: Make printk's consistently prefixed with "EXT4-fs: "
ext4: Add sanity checks for the superblock before mounting the filesystem
ext4: Add mount option to set kjournald's I/O priority
jbd2: Submit writes to the journal using WRITE_SYNC
jbd2: Add pid and journal device name to the "kjournald2 starting" message
ext4: Add markers for better debuggability
ext4: Remove code to create the journal inode
ext4: provide function to release metadata pages under memory pressure
ext3: provide function to release metadata pages under memory pressure
add releasepage hooks to block devices which can be used by file systems
ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
ext4: Init the complete page while building buddy cache
ext4: Don't allow new groups to be added during block allocation
ext4: mark the blocks/inode bitmap beyond end of group as used
ext4: Use new buffer_head flag to check uninit group bitmaps initialization
ext4: Fix the race between read_inode_bitmap() and ext4_new_inode()
ext4: code cleanup
...
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (45 commits)
[SCSI] qla2xxx: Update version number to 8.03.00-k1.
[SCSI] qla2xxx: Add ISP81XX support.
[SCSI] qla2xxx: Use proper request/response queues with MQ instantiations.
[SCSI] qla2xxx: Correct MQ-chain information retrieval during a firmware dump.
[SCSI] qla2xxx: Collapse EFT/FCE copy procedures during a firmware dump.
[SCSI] qla2xxx: Don't pollute kernel logs with ZIO/RIO status messages.
[SCSI] qla2xxx: Don't fallback to interrupt-polling during re-initialization with MSI-X enabled.
[SCSI] qla2xxx: Remove support for reading/writing HW-event-log.
[SCSI] cxgb3i: add missing include
[SCSI] scsi_lib: fix DID_RESET status problems
[SCSI] fc transport: restore missing dev_loss_tmo callback to LLDD
[SCSI] aha152x_cs: Fix regression that keeps driver from using shared interrupts
[SCSI] sd: Correctly handle 6-byte commands with DIX
[SCSI] sd: DIF: Fix tagging on platforms with signed char
[SCSI] sd: DIF: Show app tag on error
[SCSI] Fix error handling for DIF/DIX
[SCSI] scsi_lib: don't decrement busy counters when inserting commands
[SCSI] libsas: fix test for negative unsigned and typos
[SCSI] a2091, gvp11: kill warn_unused_result warnings
[SCSI] fusion: Move a dereference below a NULL test
...
Fixed up trivial conflict due to moving the async part of sd_probe
around in the async probes vs using dev_set_name() in naming.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (84 commits)
wimax: fix kernel-doc for debufs_dentry member of struct wimax_dev
net: convert pegasus driver to net_device_ops
bnx2x: Prevent eeprom set when driver is down
net: switch kaweth driver to netdevops
pcnet32: round off carrier watch timer
i2400m/usb: wrap USB power saving in #ifdef CONFIG_PM
wimax: testing for rfkill support should also test for CONFIG_RFKILL_MODULE
wimax: fix kconfig interactions with rfkill and input layers
wimax: fix '#ifndef CONFIG_BUG' layout to avoid warning
r6040: bump release number to 0.20
r6040: warn about MAC address being unset
r6040: check PHY status when bringing interface up
r6040: make printks consistent with DRV_NAME
gianfar: Fixup use of BUS_ID_SIZE
mlx4_en: Returning real Max in get_ringparam
mlx4_en: Consider inline packets on completion
netdev: bfin_mac: enable bfin_mac net dev driver for BF51x
qeth: convert to net_device_ops
vlan: add neigh_setup
dm9601: warn on invalid mac address
...
* 'for-linus' of git://neil.brown.name/md:
md: don't retry recovery of raid1 that fails due to error on source drive.
md: Allow md devices to be created by name.
md: make devices disappear when they are no longer needed.
md: centralise all freeing of an 'mddev' in 'md_free'
md: move allocation of ->queue from mddev_find to md_probe
md: need another print_sb for mdp_superblock_1
md: use list_for_each_entry macro directly
md: raid0: make hash_spacing and preshift sector-based.
md: raid0: Represent the size of strip zones in sectors.
md: raid0 create_strip_zones(): Add KERN_INFO/KERN_ERR to printk's.
md: raid0 create_strip_zones(): Make two local variables sector-based.
md: raid0: Represent zone->zone_offset in sectors.
md: raid0: Represent device offset in sectors.
md: raid0_make_request(): Replace local variable block by sector.
md: raid0_make_request(): Remove local variable chunk_size.
md: raid0_make_request(): Replace chunksize_bits by chunksect_bits.
md: use sysfs_notify_dirent to notify changes to md/sync_action.
md: fix bitmap-on-external-file bug.
This matters for some controllers and in one or two cases almost doubles
PIO performance. Add a bmdma32 operations set we can inherit and activate
it for some controllers
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
If a raid1 has only one working drive and it has a sector which
gives an error on read, then an attempt to recover onto a spare will
fail, but as the single remaining drive is not removed from the
array, the recovery will be immediately re-attempted, resulting
in an infinite recovery loop.
So detect this situation and don't retry recovery once an error
on the lone remaining drive is detected.
Allow recovery to be retried once every time a spare is added
in case the problem wasn't actually a media error.
Signed-off-by: NeilBrown <neilb@suse.de>
Using sequential numbers to identify md devices is somewhat artificial.
Using names can be a lot more user-friendly.
Also, creating md devices by opening the device special file is a bit
awkward.
So this patch provides a new option for creating and naming devices.
Writing a name such as "md_home" to
/sys/modules/md_mod/parameters/new_array
will cause an array with that name to be created. It will appear in
/sys/block/ /proc/partitions and /proc/mdstat as 'md_home'.
It will have an arbitrary minor number allocated.
md devices that a created by an open are destroyed on the last
close when the device is inactive.
For named md devices, they will not be destroyed until the array
is explicitly stopped, either with the STOP_ARRAY ioctl or by
writing 'clear' to /sys/block/md_XXXX/md/array_state.
The name of the array must start 'md_' to avoid conflict with
other devices.
Signed-off-by: NeilBrown <neilb@suse.de>
Currently md devices, once created, never disappear until the module
is unloaded. This is essentially because the gendisk holds a
reference to the mddev, and the mddev holds a reference to the
gendisk, this a circular reference.
If we drop the reference from mddev to gendisk, then we need to ensure
that the mddev is destroyed when the gendisk is destroyed. However it
is not possible to hook into the gendisk destruction process to enable
this.
So we drop the reference from the gendisk to the mddev and destroy the
gendisk when the mddev gets destroyed. However this has a
complication.
Between the call
__blkdev_get->get_gendisk->kobj_lookup->md_probe
and the call
__blkdev_get->md_open
there is no obvious way to hold a reference on the mddev any more, so
unless something is done, it will disappear and gendisk will be
destroyed prematurely.
Also, once we decide to destroy the mddev, there will be an unlockable
moment before the gendisk is unlinked (blk_unregister_region) during
which a new reference to the gendisk can be created. We need to
ensure that this reference can not be used. i.e. the ->open must
fail.
So:
1/ in md_probe we set a flag in the mddev (hold_active) which
indicates that the array should be treated as active, even
though there are no references, and no appearance of activity.
This is cleared by md_release when the device is closed if it
is no longer needed.
This ensures that the gendisk will survive between md_probe and
md_open.
2/ In md_open we check if the mddev we expect to open matches
the gendisk that we did open.
If there is a mismatch we return -ERESTARTSYS and modify
__blkdev_get to retry from the top in that case.
In the -ERESTARTSYS sys case we make sure to wait until
the old gendisk (that we succeeded in opening) is really gone so
we loop at most once.
Some udev configurations will always open an md device when it first
appears. If we allow an md device that was just created by an open
to disappear on an immediate close, then this can race with such udev
configurations and result in an infinite loop the device being opened
and closed, then re-open due to the 'ADD' even from the first open,
and then close and so on.
So we make sure an md device, once created by an open, remains active
at least until some md 'ioctl' has been made on it. This means that
all normal usage of md devices will allow them to disappear promptly
when not needed, but the worst that an incorrect usage will do it
cause an inactive md device to be left in existence (it can easily be
removed).
As an array can be stopped by writing to a sysfs attribute
echo clear > /sys/block/mdXXX/md/array_state
we need to use scheduled work for deleting the gendisk and other
kobjects. This allows us to wait for any pending gendisk deletion to
complete by simply calling flush_scheduled_work().
Signed-off-by: NeilBrown <neilb@suse.de>
The rdev_for_each macro defined in <linux/raid/md_k.h> is identical to
list_for_each_entry_safe, from <linux/list.h>, it should be defined to
use list_for_each_entry_safe, instead of reinventing the wheel.
But some calls to each_entry_safe don't really need a safe version,
just a direct list_for_each_entry is enough, this could save a temp
variable (tmp) in every function that used rdev_for_each.
In this patch, most rdev_for_each loops are replaced by list_for_each_entry,
totally save many tmp vars; and only in the other situations that will call
list_del to delete an entry, the safe version is used.
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This patch renames the hash_spacing and preshift members of struct
raid0_private_data to spacing and sector_shift respectively and
changes the semantics as follows:
We always have spacing = 2 * hash_spacing. In case
sizeof(sector_t) > sizeof(u32) we also have sector_shift = preshift + 1
while sector_shift = preshift = 0 otherwise.
Note that the values of nb_zone and zone are unaffected by these changes
because in the sector_div() preceeding the assignement of these two
variables both arguments double.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
This completes the block -> sector conversion of struct strip_zone.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
For the same reason as in the previous patch, rename it from zone_offset
to zone_start.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Rename zone->dev_offset to zone->dev_start to make sure all users
have been converted.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
There is no compelling need for this, but sysfs_notify_dirent is a
nicer interface and the change is good for consistency.
Signed-off-by: NeilBrown <neilb@suse.de>
Reported by Randy Dunlap from a warning in the v2.6.29 merge window
tree as of 2009/1/8.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warnings in regulator/driver.h:
Warning(linux-next-20090108//include/linux/regulator/driver.h:95): Excess struct/union/enum/typedef member 'set_current' description in 'regulator_ops'
Warning(linux-next-20090108//include/linux/regulator/driver.h:95): Excess struct/union/enum/typedef member 'get_current' description in 'regulator_ops'
Warning(linux-next-20090108//include/linux/regulator/driver.h:124): No description found for parameter 'irq'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Liam Girdwood <lrg@slimlogic.co.uk>
cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
This is only the documentation that the kerneldoc system warns about.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Remove kerneldoc warnings that don't relate to missing documentation,
mostly by renaming parameters in the documentation to match their
actual names.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
This patch adds GRO support for IPv6. IPv6 GRO supports extension
headers in the same way as GSO (by using the same infrastructure).
It's also simpler compared to IPv4 since we no longer have to worry
about fragmentation attributes or header checksums.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add suspend/resume to the core class and remove all the now unneeded
code from various drivers. Originally the class code couldn't support
suspend/resume but since class_device can there is no reason for
each driver doing its own suspend/resume anymore.
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (53 commits)
serial: Add driver for the Cell Network Processor serial port NWP device
powerpc: enable dynamic ftrace
powerpc/cell: Fix the prototype of create_vma_map()
powerpc/mm: Make clear_fixmap() actually work
powerpc/kdump: Use ppc_save_regs() in crash_setup_regs()
powerpc: Export cacheable_memzero as its now used in a driver
powerpc: Fix missing semicolons in mmu_decl.h
powerpc/pasemi: local_irq_save uses an unsigned long
powerpc/cell: Fix some u64 vs. long types
powerpc/cell: Use correct types in beat files
powerpc: Use correct type in prom_init.c
powerpc: Remove unnecessary casts
mtd/ps3vram: Use _PAGE_NO_CACHE in memory ioremap
mtd/ps3vram: Use msleep in waits
mtd/ps3vram: Use proper kernel types
mtd/ps3vram: Cleanup ps3vram driver messages
mtd/ps3vram: Remove ps3vram debug routines
mtd/ps3vram: Add modalias support to the ps3vram driver
mtd/ps3vram: Add ps3vram driver for accessing video RAM as MTD
powerpc: Fix iseries drivers build failure without CONFIG_VIOPATH
...
There have been some local definitions of swap(), it's time to replace
them all with a uniform one.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While discussing[1] the need for glibc to have access to random bytes
during program load, it seems that an earlier attempt to implement
AT_RANDOM got stalled. This implements a random 16 byte string, available
to every ELF program via a new auxv AT_RANDOM vector.
[1] http://sourceware.org/ml/libc-alpha/2008-10/msg00006.html
Ulrich said:
glibc needs right after startup a bit of random data for internal
protections (stack canary etc). What is now in upstream glibc is that we
always unconditionally open /dev/urandom, read some data, and use it. For
every process startup. That's slow.
...
The solution is to provide a limited amount of random data to the
starting process in the aux vector. I suggested 16 bytes and this is
what the patch implements. If we need only 16 bytes or less we use the
data directly. If we need more we'll use the 16 bytes to see a PRNG.
This avoids the costly /dev/urandom use and it allows the kernel to use
the most adequate source of random data for this purpose. It might not
be the same pool as that for /dev/urandom.
Concerns were expressed about the depletion of the randomness pool. But
this patch doesn't make the situation worse, it doesn't deplete entropy
more than happens now.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently task_active_pid_ns is not safe to call after a task becomes a
zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By
reading the pid namespace from the pid of the task we can trivially solve
this problem at the cost of one extra memory read in what should be the
same cacheline as we read the namespace from.
When moving things around I have made task_active_pid_ns out of line
because keeping it in pid_namespace.h would require adding includes of
pid.h and sched.h that I don't think we want.
This change does make task_active_pid_ns unsafe to call during
copy_process until we attach a pid on the task_struct which seems to be a
reasonable trade off.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Bastian Blank <bastian@waldi.eu.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A current problem with the pid namespace is that it is easy to do pid
related work after exit_task_namespaces which drops the nsproxy pointer.
However if we are doing pid namespace related work we are always operating
on some struct pid which retains the pid_namespace pointer of the pid
namespace it was allocated in.
So provide ns_of_pid which allows us to find the pid namespace a pid was
allocated in.
Using this we have the needed infrastructure to do pid namespace related
work at anytime we have a struct pid, removing the chance of accidentally
having a NULL pointer dereference when accessing current->nsproxy.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Bastian Blank <bastian@waldi.eu.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: cleanups, use new cpumask API
Final trivial cleanups: mainly s/cpumask_t/struct cpumask
Note there is a FIXME in generate_sched_domains(). A future patch will
change struct cpumask *doms to struct cpumask *doms[].
(I suppose Rusty will do this.)
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add css_tryget(), that obtains a counted reference on a CSS. It is used
in situations where the caller has a "weak" reference to the CSS, i.e.
one that does not protect the cgroup from removal via a reference count,
but would instead be cleaned up by a destroy() callback.
css_tryget() will return true on success, or false if the cgroup is being
removed.
This is similar to Kamezawa Hiroyuki's patch from a week or two ago, but
with the difference that in the event of css_tryget() racing with a
cgroup_rmdir(), css_tryget() will only return false if the cgroup really
does get removed.
This implementation is done by biasing css->refcnt, so that a refcnt of 1
means "releasable" and 0 means "released or releasing". In the event of a
race, css_tryget() distinguishes between "released" and "releasing" by
checking for the CSS_REMOVED flag in css->flags.
Signed-off-by: Paul Menage <menage@google.com>
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These patches introduce new locking/refcount support for cgroups to
reduce the need for subsystems to call cgroup_lock(). This will
ultimately allow the atomicity of cgroup_rmdir() (which was removed
recently) to be restored.
These three patches give:
1/3 - introduce a per-subsystem hierarchy_mutex which a subsystem can
use to prevent changes to its own cgroup tree
2/3 - use hierarchy_mutex in place of calling cgroup_lock() in the
memory controller
3/3 - introduce a css_tryget() function similar to the one recently
proposed by Kamezawa, but avoiding spurious refcount failures in
the event of a race between a css_tryget() and an unsuccessful
cgroup_rmdir()
Future patches will likely involve:
- using hierarchy mutex in place of cgroup_lock() in more subsystems
where appropriate
- restoring the atomicity of cgroup_rmdir() with respect to cgroup_create()
This patch:
Add a hierarchy_mutex to the cgroup_subsys object that protects changes to
the hierarchy observed by that subsystem. It is taken by the cgroup
subsystem (in addition to cgroup_mutex) for the following operations:
- linking a cgroup into that subsystem's cgroup tree
- unlinking a cgroup from that subsystem's cgroup tree
- moving the subsystem to/from a hierarchy (including across the
bind() callback)
Thus if the subsystem holds its own hierarchy_mutex, it can safely
traverse its own hierarchy.
Signed-off-by: Paul Menage <menage@google.com>
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now, you can see following even when swap accounting is enabled.
1. Create Group 01, and 02.
2. allocate a "file" on tmpfs by a task under 01.
3. swap out the "file" (by memory pressure)
4. Read "file" from a task in group 02.
5. the charge of "file" is moved to group 02.
This is not ideal behavior. This is because SwapCache which was loaded
by read-ahead is not taken into account..
This is a patch to fix shmem's swapcache behavior.
- remove mem_cgroup_cache_charge_swapin().
- Add SwapCache handler routine to mem_cgroup_cache_charge().
By this, shmem's file cache is charged at add_to_page_cache()
with GFP_NOWAIT.
- pass the page of swapcache to shrink_mem_cgroup.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After previous patch, mem_cgroup_try_charge is not used by anyone, so we
can remove it.
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, inactive_ratio of memcg is calculated at setting limit.
because page_alloc.c does so and current implementation is straightforward
porting.
However, memcg introduced hierarchy feature recently. In hierarchy
restriction, memory limit is not only decided memory.limit_in_bytes of
current cgroup, but also parent limit and sibling memory usage.
Then, The optimal inactive_ratio is changed frequently. So, everytime
calculation is better.
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, /proc/sys/vm/swappiness can change swappiness ratio for global
reclaim. However, memcg reclaim doesn't have tuning parameter for itself.
In general, the optimal swappiness depend on workload. (e.g. hpc
workload need to low swappiness than the others.)
Then, per cgroup swappiness improve administrator tunability.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now, get_scan_ratio() return correct value although memcg reclaim. Then,
mem_cgroup_calc_reclaim() can be removed.
So, memcg reclaim get the same capability of anon/file reclaim balancing
as global reclaim now.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The inactive_anon_is_low() is key component of active/inactive anon
balancing on reclaim. However current inactive_anon_is_low() function
only consider global reclaim.
Therefore, we need following ugly scan_global_lru() condition.
if (lru == LRU_ACTIVE_ANON &&
(!scan_global_lru(sc) || inactive_anon_is_low(zone))) {
shrink_active_list(nr_to_scan, zone, sc, priority, file);
return 0;
it cause that memcg reclaim always deactivate pages when shrink_list() is
called. To make mem_cgroup_inactive_anon_is_low() improve active/inactive
anon balancing of memcgroup.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: "Pekka Enberg" <penberg@cs.helsinki.fi>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The inactive_anon_is_low() is called only vmscan. Then it can move to
vmscan.c
This patch doesn't have any functional change.
Reviewd-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
My patch, memcg-fix-gfp_mask-of-callers-of-charge.patch changed gfp_mask
of callers of charge to be GFP_HIGHUSER_MOVABLE for showing what will
happen at memory reclaim.
But in recent discussion, it's NACKed because it sounds ugly.
This patch is for reverting it and add some clean up to gfp_mask of
callers of charge. No behavior change but need review before generating
HUNK in deep queue.
This patch also adds explanation to meaning of gfp_mask passed to charge
functions in memcontrol.h.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current mmtom has new oom function as pagefault_out_of_memory(). It's
added for select bad process rathar than killing current.
When memcg hit limit and calls OOM at page_fault, this handler called and
system-wide-oom handling happens. (means kernel panics if panic_on_oom is
true....)
To avoid overkill, check memcg's recent behavior before starting
system-wide-oom.
And this patch also fixes to guarantee "don't accnout against process with
TIF_MEMDIE". This is necessary for smooth OOM.
[akpm@linux-foundation.org: build fix]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Jan Blunck <jblunck@suse.de>
Cc: Hirokazu Takahashi <taka@valinux.co.jp>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm_match_cgroup() calls cgroup_subsys_state().
We must use rcu_read_lock() to protect cgroup_subsys_state().
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for building hierarchies in resource counters. Cgroups allows
us to build a deep hierarchy, but we currently don't link the resource
counters belonging to the memory controller control groups, in the same
fashion as the corresponding cgroup entries in the cgroup hierarchy. This
patch provides the infrastructure for resource counters that have the same
hiearchy as their cgroup counter parts.
These set of patches are based on the resource counter hiearchy patches
posted by Pavel Emelianov.
NOTE: Building hiearchies is expensive, deeper hierarchies imply charging
the all the way up to the root. It is known that hiearchies are
expensive, so the user needs to be careful and aware of the trade-offs
before creating very deep ones.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We check mem_cgroup is disabled or not by checking
mem_cgroup_subsys.disabled. I think it has more references than expected,
now.
replacing
if (mem_cgroup_subsys.disabled)
with
if (mem_cgroup_disabled())
give us good look, I think.
[kamezawa.hiroyu@jp.fujitsu.com: fix typo]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A big patch for changing memcg's LRU semantics.
Now,
- page_cgroup is linked to mem_cgroup's its own LRU (per zone).
- LRU of page_cgroup is not synchronous with global LRU.
- page and page_cgroup is one-to-one and statically allocated.
- To find page_cgroup is on what LRU, you have to check pc->mem_cgroup as
- lru = page_cgroup_zoneinfo(pc, nid_of_pc, zid_of_pc);
- SwapCache is handled.
And, when we handle LRU list of page_cgroup, we do following.
pc = lookup_page_cgroup(page);
lock_page_cgroup(pc); .....................(1)
mz = page_cgroup_zoneinfo(pc);
spin_lock(&mz->lru_lock);
.....add to LRU
spin_unlock(&mz->lru_lock);
unlock_page_cgroup(pc);
But (1) is spin_lock and we have to be afraid of dead-lock with zone->lru_lock.
So, trylock() is used at (1), now. Without (1), we can't trust "mz" is correct.
This is a trial to remove this dirty nesting of locks.
This patch changes mz->lru_lock to be zone->lru_lock.
Then, above sequence will be written as
spin_lock(&zone->lru_lock); # in vmscan.c or swap.c via global LRU
mem_cgroup_add/remove/etc_lru() {
pc = lookup_page_cgroup(page);
mz = page_cgroup_zoneinfo(pc);
if (PageCgroupUsed(pc)) {
....add to LRU
}
spin_lock(&zone->lru_lock); # in vmscan.c or swap.c via global LRU
This is much simpler.
(*) We're safe even if we don't take lock_page_cgroup(pc). Because..
1. When pc->mem_cgroup can be modified.
- at charge.
- at account_move().
2. at charge
the PCG_USED bit is not set before pc->mem_cgroup is fixed.
3. at account_move()
the page is isolated and not on LRU.
Pros.
- easy for maintenance.
- memcg can make use of laziness of pagevec.
- we don't have to duplicated LRU/Active/Unevictable bit in page_cgroup.
- LRU status of memcg will be synchronized with global LRU's one.
- # of locks are reduced.
- account_move() is simplified very much.
Cons.
- may increase cost of LRU rotation.
(no impact if memcg is not configured.)
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch implements per cgroup limit for usage of memory+swap. However
there are SwapCache, double counting of swap-cache and swap-entry is
avoided.
Mem+Swap controller works as following.
- memory usage is limited by memory.limit_in_bytes.
- memory + swap usage is limited by memory.memsw_limit_in_bytes.
This has following benefits.
- A user can limit total resource usage of mem+swap.
Without this, because memory resource controller doesn't take care of
usage of swap, a process can exhaust all the swap (by memory leak.)
We can avoid this case.
And Swap is shared resource but it cannot be reclaimed (goes back to memory)
until it's used. This characteristic can be trouble when the memory
is divided into some parts by cpuset or memcg.
Assume group A and group B.
After some application executes, the system can be..
Group A -- very large free memory space but occupy 99% of swap.
Group B -- under memory shortage but cannot use swap...it's nearly full.
Ability to set appropriate swap limit for each group is required.
Maybe someone wonder "why not swap but mem+swap ?"
- The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
to move account from memory to swap...there is no change in usage of
mem+swap.
In other words, when we want to limit the usage of swap without affecting
global LRU, mem+swap limit is better than just limiting swap.
Accounting target information is stored in swap_cgroup which is
per swap entry record.
Charge is done as following.
map
- charge page and memsw.
unmap
- uncharge page/memsw if not SwapCache.
swap-out (__delete_from_swap_cache)
- uncharge page
- record mem_cgroup information to swap_cgroup.
swap-in (do_swap_page)
- charged as page and memsw.
record in swap_cgroup is cleared.
memsw accounting is decremented.
swap-free (swap_free())
- if swap entry is freed, memsw is uncharged by PAGE_SIZE.
There are people work under never-swap environments and consider swap as
something bad. For such people, this mem+swap controller extension is just an
overhead. This overhead is avoided by config or boot option.
(see Kconfig. detail is not in this patch.)
TODO:
- maybe more optimization can be don in swap-in path. (but not very safe.)
But we just do simple accounting at this stage.
[nishimura@mxp.nes.nec.co.jp: make resize limit hold mutex]
[hugh@veritas.com: memswap controller core swapcache fixes]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For accounting swap, we need a record per swap entry, at least.
This patch adds following function.
- swap_cgroup_swapon() .... called from swapon
- swap_cgroup_swapoff() ... called at the end of swapoff.
- swap_cgroup_record() .... record information of swap entry.
- swap_cgroup_lookup() .... lookup information of swap entry.
This patch just implements "how to record information". No actual method
for limit the usage of swap. These routine uses flat table to record and
lookup. "wise" lookup system like radix-tree requires requires memory
allocation at new records but swap-out is usually called under memory
shortage (or memcg hits limit.) So, I used static allocation. (maybe
dynamic allocation is not very hard but it adds additional memory
allocation in memory shortage path.)
Note1: In this, we use pointer to record information and this means
8bytes per swap entry. I think we can reduce this when we
create "id of cgroup" in the range of 0-65535 or 0-255.
Reported-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reported-by: Hugh Dickins <hugh@veritas.com>
Reported-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Config and control variable for mem+swap controller.
This patch adds CONFIG_CGROUP_MEM_RES_CTLR_SWAP
(memory resource controller swap extension.)
For accounting swap, it's obvious that we have to use additional memory to
remember "who uses swap". This adds more overhead. So, it's better to
offer "choice" to users. This patch adds 2 choices.
This patch adds 2 parameters to enable swap extension or not.
- CONFIG
- boot option
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SwapCache support for memory resource controller (memcg)
Before mem+swap controller, memcg itself should handle SwapCache in proper
way. This is cut-out from it.
In current memcg, SwapCache is just leaked and the user can create tons of
SwapCache. This is a leak of account and should be handled.
SwapCache accounting is done as following.
charge (anon)
- charged when it's mapped.
(because of readahead, charge at add_to_swap_cache() is not sane)
uncharge (anon)
- uncharged when it's dropped from swapcache and fully unmapped.
means it's not uncharged at unmap.
Note: delete from swap cache at swap-in is done after rmap information
is established.
charge (shmem)
- charged at swap-in. this prevents charge at add_to_page_cache().
uncharge (shmem)
- uncharged when it's dropped from swapcache and not on shmem's
radix-tree.
at migration, check against 'old page' is modified to handle shmem.
Comparing to the old version discussed (and caused troubles), we have
advantages of
- PCG_USED bit.
- simple migrating handling.
So, situation is much easier than several months ago, maybe.
[hugh@veritas.com: memcg: handle swap caches build fix]
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now, management of "charge" under page migration is done under following
manner. (Assume migrate page contents from oldpage to newpage)
before
- "newpage" is charged before migration.
at success.
- "oldpage" is uncharged at somewhere(unmap, radix-tree-replace)
at failure
- "newpage" is uncharged.
- "oldpage" is charged if necessary (*1)
But (*1) is not reliable....because of GFP_ATOMIC.
This patch tries to change behavior as following by charge/commit/cancel ops.
before
- charge PAGE_SIZE (no target page)
success
- commit charge against "newpage".
failure
- commit charge against "oldpage".
(PCG_USED bit works effectively to avoid double-counting)
- if "oldpage" is obsolete, cancel charge of PAGE_SIZE.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>