- update sanity checks
- add DLC to length conversion helpers
- can_dlc2len() - get data length from can_dlc with sanitized can_dlc
- can_len2dlc() - map the sanitized data length to an appropriate DLC
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
- introduce a new sockopt CAN_RAW_FD_FRAMES to allow CAN FD frames
- handle CAN frames and CAN FD frames simultaneously when enabled
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
- handle ETH_P_CAN and ETH_P_CANFD skbuffs
- update sanity checks for CAN and CAN FD
- make sure the CAN frame can pass the selected CAN netdevice on send
- bump core version and abi version to indicate the new CAN FD support
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
- add new struct canfd_frame
- check identical element offsets in struct can_frame and struct canfd_frame
- new ETH_P_CANFD definition to tag CAN FD skbs correctly
- add CAN_MTU and CANFD_MTU definitions for easy frame and mode detection
- add CAN[FD]_MAX_[DLC|DLEN] helper constants to remove hard coded values
- update existing struct can_frame with helper constants and comments
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
ERROR: "nfqnl_ct_parse" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_seq_adjust" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_put" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_get" [net/netfilter/nfnetlink_queue.ko] undefined!
We have to use CONFIG_NETFILTER_NETLINK_QUEUE_CT in
include/net/netfilter/nfnetlink_queue.h, not CONFIG_NF_CONNTRACK.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move AUTO_OFF_TIMEOUT to other constants changing name to
HCI_AUTO_OFF_TIMEOUT and convert to jiffies.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
In "9cb0176 netfilter: add glue code to integrate nfnetlink_queue and ctnetlink"
the compilation with NF_CONNTRACK disabled is broken. This patch fixes this
issue.
I have moved the conntrack part into nfnetlink_queue_ct.c to avoid
peppering the entire nfnetlink_queue.c code with ifdefs.
I also needed to rename nfnetlink_queue.c to nfnetlink_queue_pkt.c
to update the net/netfilter/Makefile to support conditional compilation
of the conntrack integration.
This patch also adds CONFIG_NETFILTER_QUEUE_CT in case you want to explicitly
disable the integration between nf_conntrack and nfnetlink_queue.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The asm-generic/bug.h __ASSEMBLY__ guarding is completely bogus, which
tripped up the powerpc build when the kernel.h include was added:
In file included from include/asm-generic/bug.h:5:0,
from arch/powerpc/include/asm/bug.h:127,
from arch/powerpc/kernel/head_64.S:31:
include/linux/kernel.h:44:0: warning: "ALIGN" redefined [enabled by default]
include/linux/linkage.h:57:0: note: this is the location of the previous definition
include/linux/sysinfo.h: Assembler messages:
include/linux/sysinfo.h:7: Error: Unrecognized opcode: `struct'
include/linux/sysinfo.h:8: Error: Unrecognized opcode: `__kernel_long_t'
Moving the __ASSEMBLY__ guard up and stashing the kernel.h include under
it fixes this up, as well as covering the case the original fix was
attempting to handle.
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 5963e317b1 ("ftrace/x86: Do not change stacks in DEBUG when
calling lockdep") prevented lockdep calls from the int3 breakpoint handler
from reseting the stack if a function that was called was in the process
of being converted for tracing and had a breakpoint on it. The idea is,
before calling the lockdep code, do a load_idt() to the special IDT that
kept the breakpoint stack from reseting. This worked well as a quick fix
for this kernel release, until a certain config caused a lockup in the
function tracer start up tests.
Investigating it, I found that the load_idt that was used to prevent
the int3 from changing stacks was itself being traced!
Even though the config had CONFIG_OPTIMIZE_INLINING disabled, and
all 'inline' tags were set to always inline, there were still cases that
it did not inline! This was caused by CONFIG_PARAVIRT_GUEST, where it
would add a pointer to the native_load_idt() which made that function
to be traced.
Commit 45959ee7aa ("ftrace: Do not function trace inlined functions")
only touched the 'inline' tags when CONFIG_OPMITIZE_INLINING was enabled.
PARAVIRT_GUEST shows that this was not enough and we need to also
mark always_inline with notrace as well.
Reported-by: Fengguang Wu <wfg@linux.intel.com>
Tested-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently there is a 'chicken and egg' issue when the DS is also the mounted
MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the
cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client
is not called to dereference the MDS usage of a deviceid which holds a
reference to the DS nfs_client. The result is the umount program returns,
but the nfs_client is not freed, and the cl_session hearbeat continues.
The MDS (and all other nfs mounts) lose their last nfs_client reference in
nfs_free_server when the last nfs_server (fsid) is umounted.
The file layout DS lose their last nfs_client reference in destroy_ds
when the last deviceid referencing the data server is put and destroy_ds is
called. This is triggered by a call to nfs4_deviceid_purge_client which
removes references to a pNFS deviceid used by an MDS mount.
The fix is to track how many pnfs enabled filesystems are mounted from
this server, and then to purge the device id cache once that count reaches
zero.
Reported-by: Jorge Mora <Jorge.Mora@netapp.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
As defined in section 13.10.9.3 Case D (802.11-2012), this
control variable is used to limit the mesh STA to send only
one PREQ to a root mesh STA within this interval of time
(in TUs). The default value for this variable is set to
2000 TUs. However, for current implementation, the maximum
configurable of dot11MeshHWMPconfirmationInterval is
restricted by dot11MeshHWMPactivePathTimeout.
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
[line-break commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
All driver specific and fairly small. The pxa-ssp changes are larger
than I'd like but they're build failures and are pretty clear to
inspection.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJP3kOeAAoJEBus8iNuMP3dpf0QAJLZjq1Ttzcm2ZrtYkr6edgm
axHhApj6ptOlLqb16zCJ+zjgmjUy/GteJg766KZqAkvMJxWSPRaPr/aTQnWlxUAE
CU/WjdywbVPKIkn5/r+jpEOifsHNTzj14/gF1Cn9V+58pN6YgflmfElV2vFe3rdX
IrNxGfRBHDgHwqDbHTi2OL8xUTE4oJnnhQZxk7FcsYKiwpwB2hVkbrTTKGNBPL4d
DoV8Xcy+Zb12iDQwh9BvzYAznQTQwCjGsF11UVAb0ChTV14XExQTABNZnAL3hgfj
AhBtm/coDmym61AvQMddsbg8Hrlp/d9Qzmr+0Fxj9HBb+SGVt9XlHY3ETePBM4hQ
wIjhedCed4Pv/Jb9IYDzQ1wJD/9iDG+Xuyj1iQC1QQKaoKg6eI9QgTLoVwRHCssg
DVTzFPOFEFiIfI+eH2cLyZZZlWL1LDbeVmH+KvWthb6+mbfOXo/VPlely2Odl+b7
NErfdjbggoc4Oy8o4Lo0XMb/6HxE7tHd/AzbKHdeHjNk8BCn5NCrdOLSuJb3xrr5
8fLP/0T9Z955R6YsHNg88rhQ74QhWSLGvs9TjUw7sRHN9xkC4gQIwcJ5i0dZcXjC
zIdP0Lj8XB0FdDGYNALdy2B48jU1uKs0cGW+xV/Z0uA0QllB0pfFy+O49z+jRKzJ
d/cIeM8hsxyOHWS4y+BJ
=67R8
-----END PGP SIGNATURE-----
Merge tag 'asoc-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for 3.5
All driver specific and fairly small. The pxa-ssp changes are larger
than I'd like but they're build failures and are pretty clear to
inspection.
Pablo says:
====================
This is the second batch of Netfilter updates for net-next. It contains the
kernel changes for the new user-space connection tracking helper
infrastructure.
More details on this infrastructure are provides here:
http://lwn.net/Articles/500196/
Still, I plan to provide some official documentation through the
conntrack-tools user manual on how to setup user-space utilities for this.
So far, it provides two helper in user-space, one for NFSv3 and another for
Oracle/SQLnet/TNS. Yet in my TODO list.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix code style - place the asterisk where it belongs.
Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are good reasons to supports helpers in user-space instead:
* Rapid connection tracking helper development, as developing code
in user-space is usually faster.
* Reliability: A buggy helper does not crash the kernel. Moreover,
we can monitor the helper process and restart it in case of problems.
* Security: Avoid complex string matching and mangling in kernel-space
running in privileged mode. Going further, we can even think about
running user-space helpers as a non-root process.
* Extensibility: It allows the development of very specific helpers (most
likely non-standard proprietary protocols) that are very likely not to be
accepted for mainline inclusion in the form of kernel-space connection
tracking helpers.
This patch adds the infrastructure to allow the implementation of
user-space conntrack helpers by means of the new nfnetlink subsystem
`nfnetlink_cthelper' and the existing queueing infrastructure
(nfnetlink_queue).
I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
two pieces. This change is required not to break NAT sequence
adjustment and conntrack confirmation for traffic that is enqueued
to our user-space conntrack helpers.
Basic operation, in a few steps:
1) Register user-space helper by means of `nfct':
nfct helper add ftp inet tcp
[ It must be a valid existing helper supported by conntrack-tools ]
2) Add rules to enable the FTP user-space helper which is
used to track traffic going to TCP port 21.
For locally generated packets:
iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
For non-locally generated packets:
iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
3) Run the test conntrackd in helper mode (see example files under
doc/helper/conntrackd.conf
conntrackd
4) Generate FTP traffic going, if everything is OK, then conntrackd
should create expectations (you can check that with `conntrack':
conntrack -E expect
[NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
[DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
This confirms that our test helper is receiving packets including the
conntrack information, and adding expectations in kernel-space.
The user-space helper can also store its private tracking information
in the conntrack structure in the kernel via the CTA_HELP_INFO. The
kernel will consider this a binary blob whose layout is unknown. This
information will be included in the information that is transfered
to user-space via glue code that integrates nfnetlink_queue and
ctnetlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
User-space programs that receive traffic via NFQUEUE may mangle packets.
If NAT is enabled, this usually puzzles sequence tracking, leading to
traffic disruptions.
With this patch, nfnl_queue will make the corresponding NAT TCP sequence
adjustment if:
1) The packet has been mangled,
2) the NFQA_CFG_F_CONNTRACK flag has been set, and
3) NAT is detected.
There are some records on the Internet complaning about this issue:
http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables
By now, we only support TCP since we have no helpers for DCCP or SCTP.
Better to add this if we ever have some helper over those layer 4 protocols.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows you to include the conntrack information together
with the packet that is sent to user-space via NFQUEUE.
Previously, there was no integration between ctnetlink and
nfnetlink_queue. If you wanted to access conntrack information
from your libnetfilter_queue program, you required to query
ctnetlink from user-space to obtain it. Thus, delaying the packet
processing even more.
Including the conntrack information is optional, you can set it
via NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch uses the new variable length conntrack extensions.
Instead of using union nf_conntrack_help that contain all the
helper private data information, we allocate variable length
area to store the private helper data.
This patch includes the modification of all existing helpers.
It also includes a couple of include header to avoid compilation
warnings.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We can now define conntrack extensions of variable size. This
patch is useful to get rid of these unions:
union nf_conntrack_help
union nf_conntrack_proto
union nf_conntrack_nat_help
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch modifies the struct nf_conntrack_helper to allocate
the room for the helper name. The maximum length is 16 bytes
(this was already introduced in 2.6.24).
For the maximum length for expectation policy names, I have
also selected 16 bytes.
This patch is required by the follow-up patch to support
user-space connection tracking helpers.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fix warnings on some architectures/configs (not on x86):
include/linux/vga_switcheroo.h:28:30: warning: 'struct pci_dev' declared inside parameter list [enabled by default]
include/linux/vga_switcheroo.h:28:30: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Takashi Iwai <tiwai@suse.de>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Conflicts:
net/ipv6/route.c
Pull in 'net' again to get the revert of Thomas's change
which introduced regressions.
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 2a0c451ade.
It causes crashes, because now ip6_null_entry is used before
it is initialized.
Signed-off-by: David S. Miller <davem@davemloft.net>
Minchan Kim reports that when a system has many swap areas, and tmpfs
swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
back the page cannot locate it, and the read fails with -ENOMEM.
Whoops. Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
swp_entry_to_pte()) technique for determining maximum usable swap
offset, without stopping to realize that that actually depends upon the
pte swap encoding shifting swap offset to the higher bits and truncating
it there. Whereas our radix_tree swap encoding leaves offset in the
lower bits: it's swap "type" (that is, index of swap area) that was
truncated.
Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().
This does not reduce the usable size of a swap area any further, it
leaves it as claimed when making the original commit: no change from 3.0
on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
per swapfile on i386 with PAE. It's not a change I would have risked
five years ago, but with x86_64 supported for ten years, I believe it's
appropriate now.
Hmm, and what if some architecture implements its swap pte with offset
encoded below type? That would equally break the maximum usable swap
offset check. Happily, they all follow the same tradition of encoding
offset above type, but I'll prepare a check on that for next.
Reported-and-Reviewed-and-Tested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org [3.1, 3.2, 3.3, 3.4]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Highlights include:
- Fix a couple of mount regressions due to the recent cleanups.
- Fix an Oops in the open recovery code
- Fix an rpc_pipefs upcall hang that results from some of the
net namespace work from 3.4.x (stable kernel candidate).
- Fix a couple of write and o_direct regressions that were found
at last weeks Bakeathon testing event in Ann Arbor.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJP2gmaAAoJEGcL54qWCgDyrBMP/RY/T++He8y5k3M9aEqiIv0q
D8ZVMwzID6f4Zgw4xRg96aYr02sBTw0q+0mP5x1EZmg8mK29rnBiVeKHE1iwSfXq
10/SYISlpIjhJC4I4kHXGd2KClgj7qRRCbDKFRWwoIIwYU+kJn8MRnPa9XqdL8kP
q68lrtayW8THSJDR8bk1GQn+ARxGeoY++qzHxm3vpQCbZVVb19VqKMWAWSN4VKqb
epWehOSAzB3iA7HrLRbf8Y8/sDdXewxCQpr9CC/wxuu++l5ifPphR0ToX+k9VZXI
BKFLUojCUZHTMAgCxuxjrFYehMeyClbzL2lLkz5Pgj0gQhOX6Myj+WMXoEg/uWfo
XNf51FH3yBbnfayTaOUs6Y50iuU+dQO7TUTAoWTPpW9V/iT5z/fWAKUVJhDtrPk5
DVDkR6SEgb4P1RqkehZKLq5k5GSAcTR+MZr452eDrFYXJrY8ORDE6o6kP4Rr3Nnd
n8gap0gHxzIYlhBghem6+nLN+HhpZQopWeD8mNub20VuXsChRDr9/+XWuMCSJaZF
2kleVdt2+rTDzi9bJTRYlsX397oaThL0NbRvshHAwnXIDtIQrzxx6+dUyOsEWMEu
go/EdSUUESXGNlsWTqewCBsOjPeE4L5ijI/QglfDkF+CzD5dDjrxl+5i57iMKVfc
Ydste3pQJkS7PiZu1sWA
=unbu
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- Fix a couple of mount regressions due to the recent cleanups.
- Fix an Oops in the open recovery code
- Fix an rpc_pipefs upcall hang that results from some of the net
namespace work from 3.4.x (stable kernel candidate).
- Fix a couple of write and o_direct regressions that were found at
last weeks Bakeathon testing event in Ann Arbor."
* tag 'nfs-for-3.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: add an endian notation for sparse
NFSv4.1: integer overflow in decode_cb_sequence_args()
rpc_pipefs: allow rpc_purge_list to take a NULL waitq pointer
NFSv4 do not send an empty SETATTR compound
NFSv2: EOF incorrectly set on short read
NFS: Use the NFS_DEFAULT_VERSION for v2 and v3 mounts
NFS: fix directio refcount bug on commit
NFSv4: Fix unnecessary delegation returns in nfs4_do_open
NFSv4.1: Convert another trivial printk into a dprintk
NFS4: Fix open bug when pnfs module blacklisted
NFS: Remove incorrect BUG_ON in nfs_found_client
NFS: Map minor mismatch error to protocol not support error.
NFS: Fix a commit bug
NFS4: Set parsed mount data version to 4
NFSv4.1: Ensure we clear session state flags after a session creation
NFSv4.1: Convert a trivial printk into a dprintk
NFSv4: Fix up decode_attr_mdsthreshold
NFSv4: Fix an Oops in the open recovery code
NFSv4.1: Fix a request leak on the back channel
Here are a bunch of tiny fixes for the USB core and drivers for 3.5-rc3
A bunch of gadget fixes, and new device ids, as well as some fixes for a number
of different regressions that have been reported recently. We also fixed some
PCI host controllers to resolve a long-standing bug with a whole class of host
controllers that have been plaguing people for a number of kernel releases,
preventing their systems from suspending properly.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk/byIEACgkQMUfUDdst+ykLBwCgiAi/cS4ObblcBEbnBvosPgsA
PqEAoJLMAwhcozeEOWLhb++45Vs+UjKv
=HAWc
-----END PGP SIGNATURE-----
Merge tag 'usb-3.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are a bunch of tiny fixes for the USB core and drivers for
3.5-rc3
A bunch of gadget fixes, and new device ids, as well as some fixes for
a number of different regressions that have been reported recently.
We also fixed some PCI host controllers to resolve a long-standing bug
with a whole class of host controllers that have been plaguing people
for a number of kernel releases, preventing their systems from
suspending properly.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (41 commits)
USB: fix gathering of interface associations
usb: ehci-sh: fix illegal phy_init() running when platform_data is NULL
usb: cdc-acm: fix devices not unthrottled on open
Fix OMAP EHCI suspend/resume failure (i693)
USB: ohci-hub: Mark ohci_finish_controller_resume() as __maybe_unused
usb: use usb_serial_put in usb_serial_probe errors
USB: EHCI: Fix build warning in xilinx ehci driver
USB: fix PS3 EHCI systems
xHCI: Increase the timeout for controller save/restore state operation
xhci: Don't free endpoints in xhci_mem_cleanup()
xhci: Fix invalid loop check in xhci_free_tt_info()
xhci: Fix error path return value.
USB: Checking the wrong variable in usb_disable_lpm()
usb-storage: Add 090c:1000 to unusal-devs
USB: serial-generic: use a single set of device IDs
USB: serial: Enforce USB driver and USB serial driver match
USB: add NO_D3_DURING_SLEEP flag and revert 151b612847
USB: option: add more YUGA device ids
USB: mos7840: Fix compilation of usb serial driver
USB: option: fix memory leak
...
Pull core updates (RCU and locking) from Ingo Molnar:
"Most of the diffstat comes from the RCU slow boot regression fixes,
but there's also a debuggability improvements/fixes."
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
memblock: Document memblock_is_region_{memory,reserved}()
rcu: Precompute RCU_FAST_NO_HZ timer offsets
rcu: Move RCU_FAST_NO_HZ per-CPU variables to rcu_dynticks structure
rcu: Update RCU_FAST_NO_HZ tracing for lazy callbacks
rcu: RCU_FAST_NO_HZ detection of callback adoption
spinlock: Indicate that a lockup is only suspected
kdump: Execute kmsg_dump(KMSG_DUMP_PANIC) after smp_send_stop()
panic: Make panic_on_oops configurable
Pull target updates from Nicholas Bellinger:
"This series contains post merge qla_target.c / tcm_qla2xxx bugfixes
from the past weeks, including the patch to allow target-core to use
an optional session shutdown callback to help address an active I/O
shutdown bug in tcm_qla2xxx code (Joern).
Also included is a target regression bugfix releated to explict ALUA
target port group CDB emulation that is CC'ed to stable (Roland)."
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
qla2xxx: Remove version.h header file inclusion
tcm_qla2xxx: Handle malformed wwn strings properly
tcm_qla2xxx: tcm_qla2xxx_handle_tmr() can be static
qla2xxx: Don't leak commands we give up on in qlt_do_work()
qla2xxx: Don't crash if we can't find cmd for failed CTIO
tcm_qla2xxx: Don't insert nacls without sessions into the btree
target: Return error to initiator if SET TARGET PORT GROUPS emulation fails
tcm_qla2xxx: Clear session s_id + loop_id earlier during shutdown
tcm_qla2xxx: Convert to TFO->put_session() usage
target: Add TFO->put_session() caller for HW fabric session shutdown
Conflicts:
net/ipv6/route.c
This deals with a merge conflict between the net-next addition of the
inetpeer network namespace ops, and Thomas Graf's bug fix in
2a0c451ade which makes sure we don't
register /proc/net/ipv6_route before it is actually safe to do so.
Signed-off-by: David S. Miller <davem@davemloft.net>
/proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
handler is installed in ip6_route_net_init() whereas fib_table_hash is
allocated in fib6_net_init() _after_ the proc handler has been installed.
This opens up a short time frame to access fib_table_hash with its pants
down.
fib6_init() as a whole can't be moved to an earlier position as it also
registers the rtnetlink message handlers which should be registered at
the end. Therefore split it into fib6_init() which is run early and
fib6_init_late() to register the rtnetlink message handlers.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts : Once packets
pass the bonding device and come to real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
sk_sndbuf is not too big)
We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.
Reverts commits :
fc6055a5ba net: Introduce skb_orphan_try()
87fd308cfc net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag
Reported-and-bisected-by: Jean-Michel Hautbois <jhautbois@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
to handle the error pass the 32-bit info cookie in network byte order
whereas ipv4 passes it around in host byte order.
Like the ipv4 side, we have two helper functions. One for when we
have a socket context and one for when we do not.
ip6ip6 tunnels are not handled here, because they handle PMTU events
by essentially relaying another ICMP packet-too-big message back to
the original sender.
This patch allows us to get rid of rt6_do_pmtu_disc(). It handles all
kinds of situations that simply cannot happen when we do the PMTU
update directly using a fully resolved route.
In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
likely be removed or changed into a BUG_ON() check. We should never
have a prefixed ipv6 route when we get there.
Another piece of strange history here is that TCP and DCCP, unlike in
ipv4, never invoke the update_pmtu() method from their ICMP error
handlers. This is incredibly astonishing since this is the context
where we have the most accurate context in which to make a PMTU
update, namely we have a fully connected socket and associated cached
socket route.
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide an iterator to receive the log buffer content, and convert all
kmsg_dump() users to it.
The structured data in the kmsg buffer now contains binary data, which
should no longer be copied verbatim to the kmsg_dump() users.
The iterator should provide reliable access to the buffer data, and also
supports proper log line-aware chunking of data while iterating.
Signed-off-by: Kay Sievers <kay@vrfy.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Reported-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Tested-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function was only used by btrfs code in btrfs_abort_devices()
(seems in a wrong way).
It was removed in commit d07eb91170,
So, Let's remove the dead code to avoid any confusion.
Changes in v2: update commit log, btrfs_abort_devices() was removed
already.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-kernel@vger.kernel.org
Cc: Chris Mason <chris.mason@oracle.com>
Cc: linux-btrfs@vger.kernel.org
Cc: David Sterba <dave@jikos.cz>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With ip_rt_frag_needed() removed, we have to explicitly update PMTU
information in every ICMP error handler.
Create two helper functions to facilitate this.
1) ipv4_sk_update_pmtu()
This updates the PMTU when we have a socket context to
work with.
2) ipv4_update_pmtu()
Raw version, used when no socket context is available. For this
interface, we essentially just pass in explicit arguments for
the flow identity information we would have extracted from the
socket.
And you'll notice that ipv4_sk_update_pmtu() is simply implemented
in terms of ipv4_update_pmtu()
Note that __ip_route_output_key() is used, rather than something like
ip_route_output_flow() or ip_route_output_key(). This is because we
absolutely do not want to end up with a route that does IPSEC
encapsulation and the like. Instead, we only want the route that
would get us to the node described by the outermost IP header.
Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix a regression of USB-audio PCM assignment since 3.4
- A few VGA-switcheroo-related fixes for proper HDMI audio enablement
- Fixed the missing initializations of HD-audio verbs, which may have
resulted in various breakage
- Some driver-specific ASoC updates
- A few fixes for the dynamic PCM code
- The addition of pinctrl support for the i.MX audmux which didn't make it
into -rc1 due to cross tree dependency issues
- A few minor fixes in compress API codes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIcBAABAgAGBQJP2KtCAAoJEGwxgFQ9KSmkCosQAIKjgSBIT9JMZ714q2LkyZkb
bo+Rr4N/WKq7E2xV/2nsziWhZoPlInws+hYvFx8eFP84gKlGUCZyJwV5Rzn37h3W
IK6ACkgDD3f9U50oO9jwM2M6lFxpu+4LJPIJEewYAfhtr9gLv9uxALDhocfhg6Gd
H+IaYdCKZxffCaXTNUiLURoEDA+qp7/T4d5CHuc7/l/RRCb2n7ZaKXoreCvon+iK
h6I3J91MdudDvn6X+h7KFOIq01L5R5gp3ml+T7REg9rNB2aQk9UGo+qBqsKEoihO
/7Ih0pvny167dVZJKGzjh3uM7G7cnLcnhmc88jFFRhRooETNDwk1Qz3V+ur+sSdw
rkoNicdgujXRsuPr+xqmWYRZrfmeBZ3SJWEA+aFA8pRhrU/WrbNAaj7EBVc0Gq3f
p1rEZ4t16Qqj+1Px+xHUEMjOidF1voCX3NiWZrfYBqO8AheerrNKe6zmCY20cRaC
4PYWJYzRui9rec1Ic5fW/7BWqWEv+2rAoBdisZmVAqjNEUlP/pHEdi4OTmCL8ss0
hrp0BLksfeedDcVhdVSQG6UjA4U6IOcA528Z731FGZ5QP7ZZ3IdwvuZFiact5QHx
HbBTjCYR52VsmTPSWJpQ5rc5Kq0lBz5VayTrTIt0f9nom7Qm7RgiFatDkMeObwgm
Ncwa8CIInqsFq384g4E4
=Hrh7
-----END PGP SIGNATURE-----
Merge tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
- Fix a regression of USB-audio PCM assignment since 3.4
- A few VGA-switcheroo-related fixes for proper HDMI audio enablement
- Fixed the missing initializations of HD-audio verbs, which may have
resulted in various breakage
- Some driver-specific ASoC updates
- A few fixes for the dynamic PCM code
- The addition of pinctrl support for the i.MX audmux which didn't make
it into -rc1 due to cross tree dependency issues
- A few minor fixes in compress API codes
* tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Don't forget to call init verbs added by fixup list
ALSA: HDA: Pin fixup for Zotac Z68 motherboard
ALSA: compress_core: cleanup pointers on stop
ALSA: compress_core: don't wake up on pause
ALSA: hda - Fix detection of Creative SoundCore3D controllers
vga_switcheroo: Enable/disable audio clients at the right time
ALSA: hda - HDMI Audio init all connectors when VGA-switcheroo is off
vga_switcheroo: Fix error without CONFIG_VGA_SWITCHEROO
ALSA: hda - Fix uninitialized HDMI controllers with VGA-switcheroo
vga_switcheroo: Add a helper function to get the client state
ALSA: usb-audio: Fix substream assignments
ASoC: tegra: add MODULE_DEVICE_TABLE to tegra30_ahub
ASoC: wm2000: Always use a 4s timeout for the firmware
ASoC: dapm: Fix input list to use source widgets
ASoC: dpcm: Fix dpcm_get_be() to check that DAI is BE
ASoC: wm8994: Apply volume updates with clocks enabled
ASoC: wm8994: Ensure all AIFnCLK events are run from the _late variants
ASoC: imx-audmux: add pinctrl support
ASoC: dapm: Fix connected widget capture path query.
Pull networking fixes from David S. Miller:
This has the fix for the wireless issues I ran into the other week as
well as:
1) Fix CAN c_can driver transmit handling resulting in BUG check
triggers, from AnilKumar Ch.
2) Fix packet drop monitor sleeping in atomic context, from Eric
Dumazet.
3) Fix mv643xx_eth driver build regression, from Andrew Lunn.
4) Inetpeer freeing needs an RCU grace period in order to avoid races
during tree invalidation. From Eric Dumazet.
5) Fix endianness bugs in xt_HMARK netfilter module, from Hans
Schillstrom.
6) Add proper module refcounting to l2tp_eth to avoid crash on module
unload, from Eric Dumazet.
7) Fix truncation of neighbour entry dumps due to logic errors in
neigh_dump_info() and friends, from Eric Dumazet.
8) The conversion of fib6_age() to dst_neigh_lookup() accidently
reversed the logic of a flags test, fix from Thomas Graf.
9) Fix checksum configuration in newer sky2 chips, from Stephen
Hemminger.
10) Revert BQL support in NIU driver, doesn't work.
11) l2tp_ip_sendmsg() illegally uses a route without a proper reference.
From Eric Dumazet.
12) be2net driver references an SKB after it's potentially been freed,
also from Eric Dumazet.
13) Fix RCU stalls in dummy net driver init. Also from Eric Dumazet.
14) lpc_eth has several bugs in it's transmit engine leading to packet
leaks and improper queue wakes, from Eric Dumazet.
15) Apply short DMA workaround to more tg3 chips, from Matt Carlson.
16) Add tilegx network driver.
17) Bonding queue mapping for a packet can get corrupted, fix from Eric
Dumazet.
18) Fix bug in netpoll_send_udp() SKB management that can leave garbage
in the payload in certain situations. From Eric Dumazet.
19) bnx2x driver interprets chip RX checksum offload incorrectly in
encapsulation situations. Fix from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
bnx2x: fix checksum validation
netpoll: fix netpoll_send_udp() bugs
bonding: Fix corrupted queue_mapping
bonding:record primary when modify it via sysfs
tilegx network driver: initial support
tg3: Apply short DMA frag workaround to 5906
net: stmmac: Fix clock en-/disable calls
lpc_eth: fix tx completion
lpc_eth: add missing ndo_change_mtu()
dummy: fix rcu_sched self-detected stalls
net: Reorder initialization in ip_route_output to fix gcc warning
virtio-net: fix a race on 32bit arches
r8169: avoid NAPI scheduling delay.
net: Make linux/tcp.h C++ friendly (trivial)
netdev: fix drivers/net/phy/ kernel-doc warnings
net/core: fix kernel-doc warnings
be2net: fix a race in be_xmit()
l2tp: fix a race in l2tp_ip_sendmsg()
mac80211: add back channel change flag
NFC: Fix possible NULL ptr deref when getting the name of a socket
...
Generate the proactive PREQ element as defined in
Sec. 13.10.9.3 (Case C) of IEEE Std. 802.11-2012
based on the selection of dot11MeshHWMPRootMode as follow:
dot11MeshHWMPRootMode (2) is proactivePREQnoPREP
dot11MeshHWMPRootMode (3) is proactivePREQwithPREP
The proactive PREQ is generated based on the interval
defined by dot11MeshHWMProotInterval.
With this change, proactive RANN element is now generated
if the dot11MeshHWMPRootMode is set to (4) instead of (1).
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
[line-break commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add the mesh configuration parameters dot11MeshHWMProotInterval
and dot11MeshHWMPactivePathToRootTimeout to be used by
proactive PREQ mechanism.
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
[line-break commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A handy function that we will use outside of ram_core soon. But
so far just factor it out and start using it in post_init().
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Without the update, we'll only see the new dmesg buffer after the
reboot, but previously we could see it right away. Making an oops
visible in pstore filesystem before reboot is a somewhat dubious
feature, but removing it wasn't an intentional change, so let's
restore it.
For this we have to make persistent_ram_save_old() safe for calling
multiple times, and also extern it.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch (as1558) fixes a problem affecting several ASUS computers:
The machine crashes or corrupts memory when going into suspend if the
ehci-hcd driver is bound to any controllers. Users have been forced
to unbind or unload ehci-hcd before putting their systems to sleep.
After extensive testing, it was determined that the machines don't
like going into suspend when any EHCI controllers are in the PCI D3
power state. Presumably this is a firmware bug, but there's nothing
we can do about it except to avoid putting the controllers in D3
during system sleep.
The patch adds a new flag to indicate whether the problem is present,
and avoids changing the controller's power state if the flag is set.
Runtime suspend is unaffected; this matters only for system suspend.
However as a side effect, the controller will not respond to remote
wakeup requests while the system is asleep. Hence USB wakeup is not
functional -- but of course, this is already true in the current state
of affairs.
A similar patch has already been applied as commit
151b612847 (USB: EHCI: fix crash during
suspend on ASUS computers). The patch supersedes that one and reverts
it. There are two differences:
The old patch added the flag at the USB level; this patch
adds it at the PCI level.
The old patch applied to all chipsets with the same vendor,
subsystem vendor, and product IDs; this patch makes an
exception for a known-good system (based on DMI information).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Dâniel Fraga <fragabr@gmail.com>
Tested-by: Andrey Rahmatullin <wrar@wrar.name>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()
commit 35f3d14dbb (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe->buffers.
Problem is some paths don't hold pipe mutex and assume pipe->buffers
doesn't change for their duration.
Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe->buffers where appropriate.
splice_shrink_spd() loses its struct pipe_inode_info argument.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tom Herbert <therbert@google.com>
Cc: stable <stable@vger.kernel.org> # 2.6.35
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It should be NL80211_SCHED_SCAN_MATCH_ATTR_SSID as
documented, not NL80211_ATTR_SCHED_SCAN_MATCH_SSID.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Conflicts:
MAINTAINERS
drivers/net/wireless/iwlwifi/pcie/trans.c
The iwlwifi conflict was resolved by keeping the code added
in 'net' that turns off the buggy chip feature.
The MAINTAINERS conflict was merely overlapping changes, one
change updated all the wireless web site URLs and the other
changed some GIT trees to be Johannes's instead of John's.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds an optional target_core_fabric_ops->put_session() caller
within the existing target_put_session() code path.
This is required by tcm_qla2xxx code in order to invoke it's own fabric
specific session shutdown handler using se_session->sess_kref.
Signed-off-by: Joern Engel <joern@logfs.org>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Change flags field to matches userspace structure.
This field needs to be converted to little endian before forward it.
Signed-off-by: Jefferson Delfes <jefferson.delfes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Add dev_loopback_xmit() in order to deduplicate functions
ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and
ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c).
I was about to reinvent the wheel when I noticed that
ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I
need and are not IP-only functions, but they were not available to reuse
elsewhere.
ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I
understand that this is harmless, and should be in dev_loopback_xmit().
Signed-off-by: Michel Machado <michel@digirati.com.br>
CC: "David S. Miller" <davem@davemloft.net>
CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
CC: James Morris <jmorris@namei.org>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Patrick McHardy <kaber@trash.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jpirko@redhat.com>
CC: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flag of EVENT_DEV_WAKING is not used any more, so just remove it.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the transmit path of the bonding driver, skb->cb is used to
stash the skb->queue_mapping so that the bonding device can set its
own queue mapping. This value becomes corrupted since the skb->cb is
also used in __dev_xmit_skb.
When transmitting through bonding driver, bond_select_queue is
called from dev_queue_xmit. In bond_select_queue the original
skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
and skb->queue_mapping is overwritten with the bond driver queue.
Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
the packet length into skb->cb, thereby overwriting the stashed
queue mappping. In bond_dev_queue_xmit (called from hard_start_xmit),
the queue mapping for the skb is set to the stashed value which is now
the skb length and hence is an invalid queue for the slave device.
If we want to save skb->queue_mapping into skb->cb[], best place is to
add a field in struct qdisc_skb_cb, to make sure it wont conflict with
other layers (eg : Qdiscc, Infiniband...)
This patchs also makes sure (struct qdisc_skb_cb)->data is aligned on 8
bytes :
netem qdisc for example assumes it can store an u64 in it, without
misalignment penalty.
Note : we only have 20 bytes left in (struct qdisc_skb_cb)->data[].
The largest user is CHOKe and it fills it.
Based on a previous patch from Tom Herbert.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Tom Herbert <therbert@google.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Routing of 127/8 is tradtionally forbidden, we consider
packets from that address block martian when routing and do
not process corresponding ARP requests.
This is a sane default but renders a huge address space
practically unuseable.
The RFC states that no address within the 127/8 block should
ever appear on any network anywhere but it does not forbid
the use of such addresses outside of the loopback device in
particular. For example to address a pool of virtual guests
behind a load balancer.
This patch adds a new interface option 'route_localnet'
enabling routing of the 127/8 address block and processing
of ARP requests on a specific interface.
Note that for the feature to work, the default local route
covering 127/8 dev lo needs to be removed.
Example:
$ sysctl -w net.ipv4.conf.eth0.route_localnet=1
$ ip route del 127.0.0.0/8 dev lo table local
$ ip addr add 127.1.0.1/16 dev eth0
$ ip route flush cache
V2: Fix invalid check to auto flush cache (thanks davem)
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before Kernel 3.5, no one was checking for the return value of
drm_connector_attach_property, so we never noticed that we were unable
to create some properties. Commit "drm: WARN() when
drm_connector_attach_property fails" added a WARN when we fail to
create a property, and the transition from "connector properties" to
"object properties" changed the warning message a little bit.
On i915 machines with many TV connectors we hit the maximum number of
properties (since each TV connector uses a lot of properties), so we
get a few backtraces in our logs. This commit increases the maximum
number of properties to 24 hoping we'll have enough room for
everybody.
Chris suggested that we convert this code to "lists", but I believe
this conversion can come after we make sure people's dmesgs are not
spammed by our driver.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fix kernel-doc warning in input.h:
Warning(include/linux/input.h:140): No description found for parameter 'len'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The HCI constants are always used in form of jiffies. So just
include the conversion from msecs in the define itself. This has the
advantage of making the code where the timeout is used more readable
and avoiding unnecessary conversions.
The patch is similar to commit ba13ccd9 doing the same job for L2CAP
Reported-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
If no peer actually gets attached (either because create is zero or
the peer allocation fails) we'll trigger a BUG because we
unconditionally do an rt{,6}_peer_ptr() afterwards.
Fix this by guarding it with the proper check.
Signed-off-by: David S. Miller <davem@davemloft.net>
__dev_get_by_name() is slow because pm_qos_req has been inserted between
name[] and name_hlist, adding cache misses.
pm_qos_req has nothing to do at the beginning of struct net_device
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this patch the owner field of the /dev/radio# device fops was set to
the snd-tea575x-tuner module itself. Meaning that the module which was using
it could be rmmod-ed while the device is open, and then BAD things happen.
I know, as I found out the hard way :)
Note that there is no need to also somehow increase the refcount of the
snd-tea575x-tuner module itself, since any drivers using it will have
symbolic references to it.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
CC: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Yuck. The VIDIOC_(TRY_)DECODER_CMD ioctls already had ioctl numbers 96 and 97,
and after merging the timings API I forgot to continue numbering from 98. So
now we have two ioctls with number 96 and two with 97.
With the new table-driver ioctl handling in v4l2-ioctl.c it is essential that
each ioctl has its own unique number, so let's fix this quickly for 3.5.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
We handle NULL in rt{,6}_set_peer but then our caller will try to pass
that NULL pointer into inet_putpeer() which isn't ready for it.
Fix this by moving the NULL check one level up, and then remove the
now unnecessary NULL check from inetpeer_ptr_set_peer().
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This implementation can deal with having many inetpeer roots, which is
a necessary prerequisite for per-FIB table rooted peer tables.
Each family (AF_INET, AF_INET6) has a sequence number which we bump
when we get a family invalidation request.
Each peer lookup cheaply checks whether the flush sequence of the
root we are using is out of date, and if so flushes it and updates
the sequence number.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is zero point to this function.
It's only real substance is to perform an extremely outdated BSD4.2
ICMP check, which we can safely remove. If you really have a MTU
limited link being routed by a BSD4.2 derived system, here's a nickel
go buy yourself a real router.
The other actions of ip_rt_frag_needed(), checking and conditionally
updating the peer, are done by the per-protocol handlers of the ICMP
event.
TCP, UDP, et al. have a handler which will receive this event and
transmit it back into the associated route via dst_ops->update_pmtu().
This simplification is important, because it eliminates the one place
where we do not have a proper route context in which to make an
inetpeer lookup.
Signed-off-by: David S. Miller <davem@davemloft.net>
We encode the pointer(s) into an unsigned long with one state bit.
The state bit is used so we can store the inetpeer tree root to use
when resolving the peer later.
Later the peer roots will be per-FIB table, and this change works to
facilitate that.
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge RCU fixes from Paul E. McKenney:
" This series has four patches, the major point of which is to eliminate
some slowdowns (including boot-time slowdowns) resulting from some
RCU_FAST_NO_HZ changes. The issue with the changes is that posting timers
from the idle loop has no effect if the CPU has entered dyntick-idle
mode because the CPU has already computed its wakeup time, and posting
a timer does not cause it to be recomputed. The short-term fix is for
RCU to precompute the timeout value so that the CPU's calculation is
correct. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
fix the coding style related to mesh parameters, especially the indentation,
as pointed out by Johannes Berg.
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add the missing kernel-doc for mesh configuration parameters as pointed
out by Johannes Berg.
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If I build with W=1, for every file that includes <net/route.h>, I get the warning
include/net/route.h: In function 'ip_route_output':
include/net/route.h:135:3: warning: initialized field overwritten [-Woverride-init]
include/net/route.h:135:3: warning: (near initialization for 'fl4') [-Woverride-init]
(This is with "gcc (Debian 4.6.3-1) 4.6.3")
A fix seems pretty trivial: move the initialization of .flowi4_tos
earlier. As far as I can tell, this has no effect on code generation.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
asm-generic/bug.h uses taint flags that are only defined in
linux/kernel.h, resulting in build failures on platforms that
don't include linux/kernel.h some other way:
arch/sh/include/asm/thread_info.h:172:2: error: 'TAINT_WARN' undeclared (first use in this function)
Caused by commit edd63a2763 ("set_restore_sigmask() is never called
without SIGPENDING (and never should be)").
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
pxa-ssp.c uses API like cpu_is_pxa3xx(), cpu_is_pxa2xx(), which is
defined under arch-pxa architecture, and drivers under mach-mmp
can't find it. so just use ssp->type to replace that API.
Signed-off-by: Qiao Zhou <zhouqiao@marvell.com>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
add pxa910-ssp into ssp_id_table, and fix pxa-ssp compiling issue
under mach-mmp architect.
Signed-off-by: Qiao Zhou <zhouqiao@marvell.com>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
I originally sent this patch to <trivial@kernel.org>, but Jiri Kosina did
not feel that this is fully appropriate for the trivial tree.
Using linux/tcp.h from C++ results in:
cat t.cc
#include <linux/tcp.h>
int main() { }
g++ -c t.cc
In file included from t.cc:1:
/usr/include/linux/tcp.h:72: error: '__u32 __fswab32(__u32)' cannot appear in a constant-expression
/usr/include/linux/tcp.h:72: error: a function call cannot appear in a constant-expression
...
Attached trivial patch fixes this problem.
Tested:
- the t.cc above compiles with g++ and
- the following program generates the same output before/after
the patch:
#include <linux/tcp.h>
#include <stdio.h>
int main ()
{
#define P(a) printf("%s: %08x\n", #a, (int)a)
P(TCP_FLAG_CWR);
P(TCP_FLAG_ECE);
P(TCP_FLAG_URG);
P(TCP_FLAG_ACK);
P(TCP_FLAG_PSH);
P(TCP_FLAG_RST);
P(TCP_FLAG_SYN);
P(TCP_FLAG_FIN);
P(TCP_RESERVED_BITS);
P(TCP_DATA_OFFSET);
#undef P
return 0;
}
Signed-off-by: Paul Pluzhnikov <ppluzhnikov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We only need one interface for this operation, since we always know
which inetpeer root we want to flush.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since it's guarenteed that we will access the inetpeer if we're trying
to do timewait recycling and TCP options were enabled on the
connection, just cache the peer in the timewait socket.
In the future, inetpeer lookups will be context dependent (per routing
realm), and this helps facilitate that as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a few kernel-doc descriptions that were missed
during development.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The get_peer method TCP uses is full of special cases that make no
sense accommodating, and it also gets in the way of doing more
reasonable things here.
First of all, if the socket doesn't have a usable cached route, there
is no sense in trying to optimize timewait recycling.
Likewise for the case where we have IP options, such as SRR enabled,
that make the IP header destination address (and thus the destination
address of the route key) differ from that of the connection's
destination address.
Just return a NULL peer in these cases, and thus we're also able to
get rid of the clumsy inetpeer release logic.
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a lot of places that open-code rt{,6}_get_peer() only because
they want to set 'create' to one. So add an rt{,6}_get_peer_create()
for their sake.
There were also a few spots open-coding plain rt{,6}_get_peer() and
those are transformed here as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
With LE/SMP the completion of a security level elavation from medium to
high is indicated by a HCI Encryption Key Refresh Complete event. The
necessary behavior upon receiving this event is a mix of what's done for
auth_complete and encryption_change, which is also where most of the
event handling code has been copied from.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
/proc/net/unix has quadratic behavior, and can hold unix_table_lock for
a while if high number of unix sockets are alive. (90 ms for 200k
sockets...)
We already have a hash table, so its quite easy to use it.
Problem is unbound sockets are still hashed in a single hash slot
(unix_socket_table[UNIX_HASH_TABLE])
This patch also spreads unbound sockets to 256 hash slots, to speedup
both /proc/net/unix and unix_diag.
Time to read /proc/net/unix with 200k unix sockets :
(time dd if=/proc/net/unix of=/dev/null bs=4k)
before : 520 secs
after : 2 secs
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
add struct net as a parameter of inet_getpeer_v[4,6],
use net to replace &init_net.
and modify some places to provide net for inet_getpeer_v[4,6]
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
now inetpeer doesn't support namespace,the information will
be leaking across namespace.
this patch move the global vars v4_peers and v6_peers to
netns_ipv4 and netns_ipv6 as a field peers.
add struct pernet_operations inetpeer_ops to initial pernet
inetpeer data.
and change family_to_base and inet_getpeer to support namespace.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf fixes from Ingo Molnar:
"A bit larger than what I'd wish for - half of it is due to hw driver
updates to Intel Ivy-Bridge which info got recently released,
cycles:pp should work there now too, amongst other things. (but we
are generally making exceptions for hardware enablement of this type.)
There are also callchain fixes in it - responding to mostly
theoretical (but valid) concerns. The tooling side sports perf.data
endianness/portability fixes which did not make it for the merge
window - and various other fixes as well."
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
perf/x86: Check user address explicitly in copy_from_user_nmi()
perf/x86: Check if user fp is valid
perf: Limit callchains to 127
perf/x86: Allow multiple stacks
perf/x86: Update SNB PEBS constraints
perf/x86: Enable/Add IvyBridge hardware support
perf/x86: Implement cycles:p for SNB/IVB
perf/x86: Fix Intel shared extra MSR allocation
x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix
perf: Remove duplicate invocation on perf_event_for_each
perf uprobes: Remove unnecessary check before strlist__delete
perf symbols: Check for valid dso before creating map
perf evsel: Fix 32 bit values endianity swap for sample_id_all header
perf session: Handle endianity swap on sample_id_all header data
perf symbols: Handle different endians properly during symbol load
perf evlist: Pass third argument to ioctl explicitly
perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP
perf tools: Make --version show kernel version instead of pull req tag
perf tools: Check if callchain is corrupted
perf callchain: Make callchain cursors TLS
...
Pull drm intel and exynos fixes from Dave Airlie:
"A bunch of fixes for Intel and exynos, nothing too major, a new intel
PCI ID, and a fix for CRT detection."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler
char/agp: add another Ironlake host bridge
drm/i915: fix up ivb plane 3 pageflips
drm/exynos: fixed blending for hdmi graphic layer
drm/exynos: Remove dummy encoder get_crtc operation implementation
drm/exynos: Keep a reference to frame buffer GEM objects
drm/exynos: Don't cast GEM object to Exynos GEM object when not needed
drm/exynos: DRIVER_BUS_PLATFORM is not a driver feature
drm/exynos: fixed size type.
drm/exynos: Use DRM_FORMAT_{NV12, YUV420} instead of DRM_FORMAT_{NV12M, YUV420M}
drm/i915: hold forcewake around ring hw init
drm/i915: Mark the ringbuffers as being in the GTT domain
drm/i915/crt: Do not rely upon the HPD presence pin
drm/i915: Reset last_retired_head when resetting ring
Add vga_switcheroo_get_client_state() to get the current state of the
client. This is necessary to determine the proper initial state of
audio clients in HD-audio driver.
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
* 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung:
drm/exynos: fixed blending for hdmi graphic layer
drm/exynos: Remove dummy encoder get_crtc operation implementation
drm/exynos: Keep a reference to frame buffer GEM objects
drm/exynos: Don't cast GEM object to Exynos GEM object when not needed
drm/exynos: DRIVER_BUS_PLATFORM is not a driver feature
drm/exynos: fixed size type.
drm/exynos: Use DRM_FORMAT_{NV12, YUV420} instead of DRM_FORMAT_{NV12M, YUV420M}
Commit 026cee0086 "params:
<level>_initcall-like kernel parameters" set old-style module
parameters to level 0. And we call those level 0 calls where we used
to, early in start_kernel().
We also loop through the initcall levels and call the levelled
module_params before the corresponding initcall. Unfortunately level
0 is early_init(), so we call the standard module_param calls twice.
(Turns out most things don't care, but at least ubi.mtd does).
Change the level to -1 for standard module_param calls.
Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Zero is written at clear_tid_address when the process exits. This
functionality is used by pthread_join().
We already have sys_set_tid_address() to change this address for the
current task but there is no way to obtain it from user space.
Without the ability to find this address and dump it we can't restore
pthread'ed apps which call pthread_join() once they have been restored.
This patch introduces the PR_GET_TID_ADDRESS prctl option which allows
the current process to obtain own clear_tid_address.
This feature is available iif CONFIG_CHECKPOINT_RESTORE is set.
[akpm@linux-foundation.org: fix prctl numbering]
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A fix for commit b32dfe3771 ("c/r: prctl: add ability to set new
mm_struct::exe_file").
After removing mm->num_exe_file_vmas kernel keeps mm->exe_file until
final mmput(), it never becomes NULL while task is alive.
We can check for other mapped files in mm instead of checking
mm->num_exe_file_vmas, and mark mm with flag MMF_EXE_FILE_CHANGED in
order to forbid second changing of mm->exe_file.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch extends the kernel's ethtool interface by adding support
for 2 new EEE commands - get_eee and set_eee.
Thanks goes to Giuseppe Cavallaro for his original patch adding this support.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mask option allows you put all address belonging that mask into
the same recent slot. This can be useful in case that recent is used
to detect attacks from the same network segment.
Tested for backward compatibility.
Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for cttimeout.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since the sysctl data for l[3|4]proto now resides in pernet nf_proto_net.
We can now remove this unused fields from struct nf_contrack_l[3,4]proto.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for ICMPv6 protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for ICMP protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for UDP protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for TCP protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for the generic layer 4 protocol
tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 3 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_l3proto_[un]register_sysctl.
* nf_conntrack_l3proto_[un]register.
We add a new nf_ct_l3proto_net is used to get the pernet data of l3proto.
This adds rhe new struct nf_ip_net that is used to store the sysctl header
and l3proto_ipv4,l4proto_tcp(6),l4proto_udp(6),l4proto_icmp(v6) because the
protos such tcp and tcp6 use the same data,so making nf_ip_net as a field
of netns_ct is the easiest way to manager it.
This patch also adds init_net to struct nf_conntrack_l3proto to initial
the layer 3 protocol pernet data.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 4 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_[un]register_sysctl
* nf_conntrack_l4proto_[un]register
to include the namespace parameter. We still use init_net in this patch
to prepare the ground for follow-up patches for each layer 4 protocol
tracker.
We add a new net_id field to struct nf_conntrack_l4proto that is used
to store the pernet_operations id for each layer 4 protocol tracker.
Note that AF_INET6's protocols do not need to do sysctl compat. Thus,
we only register compat sysctl when l4proto.l3proto != AF_INET6.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Implement a new "fail-open" mode where packets are not dropped
upon queue-full condition. This mode can be enabled/disabled per
queue using netlink NFQA_CFG_FLAGS & NFQA_CFG_MASK attributes.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Vivek Kashyap <vivk@us.ibm.com>
Signed-off-by: Sridhar Samudrala <samudrala@us.ibm.com>
It was scheduled to be removed.
Cc: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
It was scheduled to be removed.
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch addresses two issues:
a) Fix usage of u32 and __be32 that causes endianess warnings via sparse.
b) Ensure consistent hashing in a cluster that is composed of big and
little endian systems. Thus, we obtain the same hash mark in an
heterogeneous cluster.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick()
calls rcu_needs_cpu() see if RCU needs that CPU, and, if not, computes the
next wakeup time based on the timer wheels. Only later, when actually
entering the idle loop, rcu_prepare_for_idle() will be invoked. In some
cases, rcu_prepare_for_idle() will post timers to wake the CPU back up.
But all for naught: The next wakeup time for the CPU has already been
computed, and posting a timer afterwards does not force that wakeup
time to be recomputed. This means that rcu_prepare_for_idle()'s have
no effect.
This is not a problem on a busy system because something else will wake
up the CPU soon enough. However, on lightly loaded systems, the CPU
might stay asleep for a considerable length of time. If that CPU has
a callback that the rest of the system is waiting on, the system might
run very slowly or (in theory) even hang.
This commit avoids this problem by having rcu_needs_cpu() give
tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU
to wake back up, which tick_nohz_stop_sched_tick() takes into account
when programming the CPU's wakeup time. An alternative approach is
for rcu_prepare_for_idle() to use hrtimers instead of normal timers,
but timers are much more efficient than are hrtimers for frequently
and repeatedly posting and cancelling a given timer, which is exactly
what RCU_FAST_NO_HZ does.
Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
In the current code, a short dyntick-idle interval (where there is
at least one non-lazy callback on the CPU) and a long dyntick-idle
interval (where there are only lazy callbacks on the CPU) are traced
identically, which can be less than helpful. This commit therefore
emits different event traces in these two cases.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
Redesign all the off-channel code, getting rid of
the generic off-channel work concept, replacing
it with a simple remain-on-channel list.
This fixes a number of small issues with the ROC
implementation:
* offloaded remain-on-channel couldn't be queued,
now we can queue it as well, if needed
* in iwlwifi (the only user) offloaded ROC is
mutually exclusive with scanning, use the new
queue to handle that case -- I expect that it
will later depend on a HW flag
The bigger issue though is that there's a bad bug
in the current implementation: if we get a mgmt
TX request while HW roc is active, and this new
request has a wait time, we actually schedule a
software ROC instead since we can't guarantee the
existing offloaded ROC will still be that long.
To fix this, the queuing mechanism was needed.
The queuing mechanism for offloaded ROC isn't yet
optimal, ideally we should add API to have the HW
extend the ROC if needed. We could add that later
but for now use a software implementation.
Overall, this unifies the behaviour between the
offloaded and software-implemented case as much
as possible.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The IDLE handling in HW off-channel is broken right
now since we turn off IDLE only when the off-channel
period already started. Therefore, all drivers that
use it today (only iwlwifi!) must support off-channel
while idle, so playing with idle isn't needed at all.
Off-channel in general, since it's no longer used for
authentication/association, shouldn't affect PS, so
also remove that logic.
Also document a small caveat for reporting TX status
from off-channel frames in HW remain-on-channel.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The remain-on-channel time validation shouldn't
depend on the value of HZ, as it does now with
the check against jiffies, since then you might
use a value that works on one system but not on
another. Fix it by checking against a minimum
that's fixed.
Also add validation of the wait duration for a
management frame TX since this also translates
into remain-on-channel internally.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I found this core on a BCM4322, a PCI card in the Linksys WRT610N V1.
This core is not used by the driver, this patch just makes ssb show the
correct name.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Now that we've removed all uses of the set_channel
API except for the monitor channel and in libertas,
clarify this. Split the libertas mesh use into a
new libertas_set_mesh_channel() operation, just to
keep backward compatibility, and rename the normal
set_channel() to set_monitor_channel().
Also describe the desired set_monitor_channel()
semantics more clearly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Pull ACPI and Power Management changes from Len Brown.
This does an evil merge to fix up what I think is a mismerge by Len to
the gma500 driver, and restore it to the mainline state.
In that driver, both branches had commented out the call to
acpi_video_register(), and Len resolved the merge to that commented-out
version.
However, in mainline, further changes by Alan (commit d839ede47a56:
"gma500: opregion and ACPI" to be exact) had re-enabled the ACPI video
registration, so the current state of the driver seems to want it.
Alan is apparently still feeling the effects of partying with the Queen,
so he didn't reply to my query, but I'll do the evil merge anyway.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
ACPI: fix acpi_bus.h build warnings when ACPI is not enabled
drivers: acpi: Fix dependency for ACPI_HOTPLUG_CPU
tools/power turbostat: fix IVB support
tools/power turbostat: fix un-intended affinity of forked program
ACPI video: use after input_unregister_device()
gma500: don't register the ACPI video bus
acpi_video: Intel video is not always i915
acpi_video: fix leaking PCI references
ACPI: Ignore invalid _PSS entries, but use valid ones
ACPI battery: only refresh the sysfs files when pertinent information changes
commit 5faa5df1fa (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :
Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.
inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weird topologies can lead to asymmetric domain setups. This needs
further consideration since these setups are typically non-minimal
too.
For now, make it work by adding an extra mask selecting which CPUs
are allowed to iterate up.
The topology that triggered it is the one from David Rientjes:
10 20 20 30
20 10 20 20
20 20 10 20
30 20 20 10
resulting in boxes that wouldn't even boot.
Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-3p86l9cuaqnxz7uxsojmz5rm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
SDIO_CCCR_IF[1:0] in SDIO card is used for card data bus width
setting as below:
00b: 1-bit bus
01b: Reserved
10b: 4-bit bus
11b: 8-bit bus (only for embedded SDIO)
And sdio_enable_wide is for setting data bus width as 4-bit.
But currently, it first reads the register, second OR' 1b with
SDIO_CCCR_IF[1], and then writes it back.
As we can see, this is based on such assumption that the
SDIO_CCCR_IF[0] is always 0. Apparently, this is not right.
Signed-off-by: Yong Ding <yongd@marvell.com>
Acked-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Pull perf fixes from Arnaldo Carvalho de Melo:
* Endianness fixes from Jiri Olsa
* Fixes for make perf tarball
* Fix for DSO name in perf script callchains, from David Ahern
* Segfault fixes for perf top --callchain, from Namhyung Kim
* Minor function result fixes from Srikar Dronamraju
* Add missing 3rd ioctl parameter, from Namhyung Kim
* Fix pager usage in minimal embedded systems, from Avik Sil
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The new commit code fails to copy the verifier into the wb_verf field
of _all_ the nfs_page structures; it only copies it into the first entry.
The consequence is that most requests end up failing to match in
nfs_commit_release.
Fix is to copy the verifier into the req->wb_verf field in
nfs_write_completion.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
Just like the AP mode patch, instead of setting
the channel and then joining the mesh network,
provide the channel to join the network on to
the join_mesh() function.
Like in AP mode, you can also give the channel
to the join-mesh nl80211 command now.
Unlike AP mode, it picks a default channel if
none was given.
As libertas uses mesh mode interfaces but has
no join_mesh callback and we can't simply break
it, keep some compatibility code for that case
and configure the channel directly for it.
In the non-libertas case, where we store the
channel until join, allow setting it while the
interface is down.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of setting the channel first and then
starting the AP, let cfg80211 store the channel
and provide it as one of the AP settings.
This means that now you have to set the channel
before you can start an AP interface, but since
hostapd/wpa_supplicant always do that we're OK
with this change.
Alternatively, it's now possible to give the
channel as an attribute to the start-ap nl80211
command, overriding any preset channel.
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Change cfg80211_can_beacon_sec_chan() to return true
if there is no secondary channel to simplify all the
current users of it. They all check the channel type
before calling the function because it returns false
if there's no secondary channel.
Also actually document the return value.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_get_operstate() was used by drivers in order to
know whether the sta link is up, but it's no longer needed
(nor used) as mac80211 notifies the drivers about
authorization changes (via the sta_state callback)
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Simplify the use of #ifdef CONFIG_MAC80211_IBSS_DEBUG/#endif
by adding a logging macro to encapsulate the test.
Convert the appropriate uses too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Simplify the use of #ifdef CONFIG_MAC80211_HT_DEBUG/#endif
by adding a logging macro to encapsulate the test.
Convert the appropriate uses too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Tell userspace about beacon loss event.
This event doesn't replace the deauth/disassoc that
might come if the AP is not available.
The driver can send this event in order to hint
userspace what might follow (which in turn can
use it as roaming trigger).
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Low level drivers can now set certain netdev feature bits in
netdev_features member of the ieee80211_hw struct. These will be
propagated to every netdev created from this HW.
The white-listed features currently include only ones related to HW
checksumming.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Pull embedded i2c update from Wolfram Sang:
"This only contains one new driver which had multiple dependencies
(pinctrl, i2c-mux-rework, new devm_* functions), so I decided to wait
for rc1. Plus, it had to wait a little for the ack of a devicetree
maintainer since the bindings were not trivial enough for me to pass
through.
So, given that, I hope there is still something like the "new driver
rule", so we could have the driver in 3.5 and people can start using
it. That would make merging support for some boards easier for 3.6
since the dependency on this driver is gone then."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: Add generic I2C multiplexer using pinctrl API
Pull drm radeon fixes from Dave Airlie:
"This is just radeon fixes and a bunch of new PCI ids. The fixes are
for a deadlock, an audio regression, and a couple of audio fixes."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms: add new SI PCI ids
drm/radeon/kms: add new BTC PCI ids
drm/radeon/kms: add new Palm, Sumo PCI ids
drm/radeon/kms: add new Trinity PCI ids
drm/radeon: fix vm deadlocks on cayman
drm/radeon: fix gpu_init on si
drm/radeon/hdmi: don't set SEND_MAX_PACKETS bit
drm/radeon/audio: don't hardcode CRTC id
drm/radeon: make audio_init consistent across asics
This patch fixes bug in macro radix_tree_for_each_contig().
If radix_tree_next_slot() sees NULL in next slot it returns NULL, but following
radix_tree_next_chunk() switches iterating into next chunk. As result iterating
becomes non-contiguous and breaks vfs "splice" and all its users.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reported-and-bisected-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Reported-and-bisected-by: Ondrej Zary <linux@rainbow-software.org>
Reported-bisected-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Link: https://lkml.org/lkml/2012/6/5/64
Cc: stable <stable@vger.kernel.org> # 3.4.x
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull scheduler fixes from Ingo Molnar.
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Remove NULL assignment of dattr_cur
sched: Remove the last NULL entry from sched_feat_names
sched: Make sched_feat_names const
sched/rt: Fix SCHED_RR across cgroups
sched: Move nr_cpus_allowed out of 'struct sched_rt_entity'
sched: Make sure to not re-read variables after validation
sched: Fix SD_OVERLAP
sched: Don't try allocating memory from offline nodes
sched/nohz: Fix rq->cpu_load calculations some more
sched/x86: Use cpu_llc_shared_mask(cpu) for coregroup_mask
The open recovery code does not need to request a new value for the
mdsthreshold, and so does not allocate a struct nfs4_threshold.
The problem is that encode_getfattr_open() will still request an
mdsthreshold, and so we end up Oopsing in decode_attr_mdsthreshold.
This patch fixes encode_getfattr_open so that it doesn't request an
mdsthreshold when the caller isn't asking for one. It also fixes
decode_attr_mdsthreshold so that it errors if the server returns
an mdsthreshold that we didn't ask for (instead of Oopsing).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>
A2MP doesn't use part of the L2CAP chan ops API so we just create general
empty function instead of the A2MP specific one.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
This patch renames L2CAP_LE_DEFAULT_MTU macro to L2CAP_LE_MIN_MTU
since it represents the minimum MTU value, not the default MTU
value for LE.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Those are not used anywhere in code (and never were since introduction
in 2006) so just remove them.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
A2MP fixed channel do not have sk
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
A2MP status codes copied from Bluez patch sent by Peter Krystad
<pkrystad@codeaurora.org>.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Helper function to build and send A2MP messages.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Define AMP Manager and some basic functions.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
This move socket specific code to l2cap_sock.c.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This remove a bit more of socket code from l2cap core, this calls set the
SOCK_ZAPPED and do some clean up depending on the socket state.
Reported-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Use chan instead of void * makes more sense here.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Add HCI commands to deal with Bluetooth AMP controllers.
Those commands will be used by bluetooth and softamp code.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Define assigned Protocol and Service Multiplexor (PSM) identifiers
and use them instead of magic numbers.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Define Continuation flag which the only flag used from Flags field
in L2CAP Configuration Request and Response.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Most of the include were unnecessary or already included by some other
header.
Replace module.h by export.h where possible.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Fix all warning and errors reported by checkpatch but license trailing
whitespace and bdaddr_t definition.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Remove magic number with defined link key size.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
HCI_QUIRK_NO_RESET name is misleading - purpose of this quirk is to
reset device on close instead of init, not to not reset at all.
Rename it to HCI_QUIRK_RESET_ON_CLOSE to avoid confusion.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Now that l2cap_ctrl is used to set up control fields, these macros are
not needed.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
The ERTM specification requires the retransmit timer to be cancelled
when the monitor timer is set. The retransmit timer cannot be set
again while the monitor timer is pending.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
This deletes the receive code that had handlers for each frame type at
the top level, and then had logic to determine the receive state
within each handler.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
This fixes a regression from commit
2ead70b839 that is present in all
kernels starting at v3.0.
When L2CAP information was moved to struct l2cap_chan, a check was
added to l2cap_chan_del to avoid certain cleanup operations when ERTM
or streaming mode had not yet been initialized. The logic in the
check did not take in to account that chan->conf_state is set to 0 in
l2cap_chan_ready, so l2cap_chan_del failed to cancel timers and leaked
memory any time the ERTM queues or lists were not empty.
This change makes sure that l2cap_chan_del only returns early if
ERTM initialization was not performed.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
size type of drm_exynos_gem_mmap struct is changed to uint64_t and
it adds pad for the struct to be aligned as 64bit.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
This routine will be called by drivers whenever they receive data in target
mode. This should be unexpected events and as such should be handled by a
standalone API (i.e. not as a callback pointer from an existing API).
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Userspace gets a netlink event upon target mode activation.
The LLCP layer is also signaled when we get an ATR_REQ in order to get
the remote general bytes.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In some environments, dramatic performance savings may be obtained because
swapped pages are saved in RAM (or a RAM-like device) instead of a swap disk.
This tag provides the basic infrastructure along with some changes to the
existing backends.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJPsorBAAoJEFjIrFwIi8fJcz8H/RBXCtFo0kiJmRked3nMAIDO
/2zN/q/Qawsg9aeoGlP7G8hQi9PMipbhQj3ixHyCTMv0zMbH988GXbBce+gIcg6e
TOQi7xXAuPEwLizmSpiTv84XzN5bMgu1oJXEqIXw0EIpuZAmp+9m/o3WBwEAtyxi
B+hvjE7eZM8f75K3lxs6sOtmIcERj9zqmT933Y8+i9iiuRyGMey2SyKtvVLbYZ+j
HroFMUi0so5TzxT/cpkRiHu0U75c651o+LV00zh7InMqbwyRsWlKTf53k8Q/q2WP
I7dVmfItwN/TpOrYTfxglYFlbYuUP35ziFvZ2trd6hcs9RK8OuKw+OmBLReHTtc=
=x9Vp
-----END PGP SIGNATURE-----
Merge tag 'stable/frontswap.v16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm
Pull frontswap feature from Konrad Rzeszutek Wilk:
"Frontswap provides a "transcendent memory" interface for swap pages.
In some environments, dramatic performance savings may be obtained
because swapped pages are saved in RAM (or a RAM-like device) instead
of a swap disk. This tag provides the basic infrastructure along with
some changes to the existing backends."
Fix up trivial conflict in mm/Makefile due to removal of swap token code
changing a line next to the new frontswap entry.
This pull request came in before the merge window even opened, it got
delayed to after the merge window by me just wanting to make sure it had
actual users. Apparently IBM is using this on their embedded side, and
Jan Beulich says that it's already made available for SLES and OpenSUSE
users.
Also acked by Rik van Riel, and Konrad points to other people liking it
too. So in it goes.
By Dan Magenheimer (4) and Konrad Rzeszutek Wilk (2)
via Konrad Rzeszutek Wilk
* tag 'stable/frontswap.v16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm:
frontswap: s/put_page/store/g s/get_page/load
MAINTAINER: Add myself for the frontswap API
mm: frontswap: config and doc files
mm: frontswap: core frontswap functionality
mm: frontswap: core swap subsystem hooks and headers
mm: frontswap: add frontswap header file
Pull timer updates from Thomas Gleixner:
"The clocksource driver is pure hardware enablement and the skew option
is default off, well tested and non dangerous."
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick: Move skew_tick option into the HIGH_RES_TIMER section
clocksource: em_sti: Add DT support
clocksource: em_sti: Emma Mobile STI driver
clockevents: Make clockevents_config() a global symbol
tick: Add tick skew boot option
Adding socket backlog len in INET_DIAG_SKMEMINFO is really useful to
diagnose various TCP problems.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is useful for SoCs whose I2C module's signals can be routed to
different sets of pins at run-time, using the pinctrl API.
+-----+ +-----+
| dev | | dev |
+------------------------+ +-----+ +-----+
| SoC | | |
| /----|------+--------+
| +---+ +------+ | child bus A, on first set of pins
| |I2C|---|Pinmux| |
| +---+ +------+ | child bus B, on second set of pins
| \----|------+--------+--------+
| | | | |
+------------------------+ +-----+ +-----+ +-----+
| dev | | dev | | dev |
+-----+ +-----+ +-----+
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
introduced in Linux-3.5-rc1 by
66886d6f8c
(ACPI: Add stubs for (un)register_acpi_bus_type)
Fix header file warnings when CONFIG_ACPI is not enabled:
include/acpi/acpi_bus.h:443:42: warning: 'struct acpi_bus_type' declared inside parameter list
include/acpi/acpi_bus.h:443:42: warning: its scope is only this definition or declaration, which is probably not
include/acpi/acpi_bus.h:444:44: warning: 'struct acpi_bus_type' declared inside parameter list
Signed-off-by: Len Brown <len.brown@intel.com>
This reverts commit 5ceb9ce6fe.
That commit seems to be the cause of the mm compation list corruption
issues that Dave Jones reported. The locking (or rather, absense
there-of) is dubious, as is the use of the 'page' variable once it has
been found to be outside the pageblock range.
So revert it for now, we can re-visit this for 3.6. If we even need to:
as Minchan Kim says, "The patch wasn't a bug fix and even test workload
was very theoretical".
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The comment above it says "Stat data, not accessed from path walking",
but in fact some of inode fields we use for the common stat data was way
down at the end of the inode, causing unnecessary cache misses for the
common stat operations.
The inode structure is pretty big, and this can change padding depending
on field width, but at least on the common 64-bit configurations this
doesn't change the size. Some of our inode layout has historically been
to tro to avoid unnecessary padding fields, but cache locality is at
least as important for layout, if not more.
Noticed by looking at kernel profiles, and noticing that the "i_blkbits"
access stood out like a sore thumb.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
1) Make syn floods consume significantly less resources by
a) Not pre-COW'ing routing metrics for SYN/ACKs
b) Mirroring the device queue mapping of the SYN for the SYN/ACK
reply.
Both from Eric Dumazet.
2) Fix calculation errors in Byte Queue Limiting, from Hiroaki SHIMODA.
3) Validate the length requested when building a paged SKB for a
socket, so we don't overrun the page vector accidently. From Jason
Wang.
4) When netlabel is disabled, we abort all IP option processing when we
see a CIPSO option. This isn't the right thing to do, we should
simply skip over it and continue processing the remaining options
(if any). Fix from Paul Moore.
5) SRIOV fixes for the mellanox driver from Jack orgenstein and Marcel
Apfelbaum.
6) 8139cp enables the receiver before the ring address is properly
programmed, which potentially lets the device crap over random
memory. Fix from Jason Wang.
7) e1000/e1000e fixes for i217 RST handling, and an improper buffer
address reference in jumbo RX frame processing from Bruce Allan and
Sebastian Andrzej Siewior, respectively.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
fec_mpc52xx: fix timestamp filtering
mcs7830: Implement link state detection
e1000e: fix Rapid Start Technology support for i217
e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats()
r8169: call netif_napi_del at errpaths and at driver unload
tcp: reflect SYN queue_mapping into SYNACK packets
tcp: do not create inetpeer on SYNACK message
8139cp/8139too: terminate the eeprom access with the right opmode
8139cp: set ring address before enabling receiver
cipso: handle CIPSO options correctly when NetLabel is disabled
net: sock: validate data_len before allocating skb in sock_alloc_send_pskb()
bql: Avoid possible inconsistent calculation.
bql: Avoid unneeded limit decrement.
bql: Fix POSDIFF() to integer overflow aware.
net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP
net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap
net/mlx4_core: Fixes for VF / Guest startup flow
net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event
net/mlx4_core: Fix number of EQs used in ICM initialisation
net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int
This reverts the tty layer change to use per-tty locking, because it's
not correct yet, and fixing it will require some more deep surgery.
The main revert is d29f3ef39b ("tty_lock: Localise the lock"), but
there are several smaller commits that built upon it, they also get
reverted here. The list of reverted commits is:
fde86d3108 - tty: add lockdep annotations
8f6576ad47 - tty: fix ldisc lock inversion trace
d3ca8b64b9 - pty: Fix lock inversion
b1d679afd7 - tty: drop the pty lock during hangup
abcefe5fc3 - tty/amiserial: Add missing argument for tty_unlock()
fd11b42e35 - cris: fix missing tty arg in wait_event_interruptible_tty call
d29f3ef39b - tty_lock: Localise the lock
The revert had a trivial conflict in the 68360serial.c staging driver
that got removed in the meantime.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It includes:
- driver for AUO-K1900 and AUO-K1901 epaper controller
- large updates for OMAP (e.g. decouple HDMI audio and video)
- some updates for Exynos and SH Mobile
- various other small fixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.12 (GNU/Linux)
iQIcBAABAgAGBQJPyAhmAAoJECSVL5KnPj1PBcoQAIWftuoXo3sk94f5jKcV4Ucx
MthEc5iEpMVs8xaEruHHNHXWv8ic0x/PfdC2xrpKOEbNXQcNPlb/QE2xWmBRxmT1
ucDyu10HJ36jKcwcK4ra5IQwOW+GtbTBEoBZT+WNAjxHZtJmxzjQGM4C12zVQpdJ
+qV2RP93JmsJoVBL9aKVAg1Ko135LLfD8TcKd+z8TmgFnLfSwKhfl7Jtd2xXwyvz
/hmW3kJUEnD8E5wuj+/g8sKJhQkGalEiITTqG2j2vJyFgxHSqyLSw8BBixrFW1uT
B9VnZsHF35ccCo+96UZRH4QsGJTx08+rea/qsv8IMSGczyRp5ey1ufjL+CzKiiIN
FWfex6fY0HHqZGAopQhjag54e914SIbSxdBwWS/iRrtVt3e9d03BzkhYs4rXl4Ey
CTC5obzWNTbQ6hLEjgWfVKkKcrF56BnRn3zGPgCTKGp2NK3vODdBkt/EmzUFvCWR
CcyQhh+PvZzEWp3XsdOGossYs/0aP4bO+7XPGJxZaa3+WVcRaZwAG/uZvJXXBfnp
DGRFy4wPsTTwKYIx4+t/KrsLtNVKioSMS5GEtuM1YEb8pA7mkUIkqwJv1I261h58
heTr6vWUsviUqHlKALJ+1CdwWGr3CtktCZssGsSUri61nm8CvlSRn2Nr2aJ/L3RN
AkemC/33RE5X/+lfkdMx
=tmIU
-----END PGP SIGNATURE-----
Merge tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6
Pull fbdev updates from Florian Tobias Schandinat:
- driver for AUO-K1900 and AUO-K1901 epaper controller
- large updates for OMAP (e.g. decouple HDMI audio and video)
- some updates for Exynos and SH Mobile
- various other small fixes and cleanups
* tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6: (130 commits)
video: bfin_adv7393fb: Fix cleanup code
video: exynos_dp: reduce delay time when configuring video setting
video: exynos_dp: move sw reset prioir to enabling sw defined function
video: exynos_dp: use devm_ functions
fb: handle NULL pointers in framebuffer release
OMAPDSS: HDMI: OMAP4: Update IRQ flags for the HPD IRQ request
OMAPDSS: Apply VENC timings even if panel is disabled
OMAPDSS: VENC/DISPC: Delay dividing Y resolution for managers connected to VENC
OMAPDSS: DISPC: Support rotation through TILER
OMAPDSS: VRFB: remove compiler warnings when CONFIG_BUG=n
OMAPFB: remove compiler warnings when CONFIG_BUG=n
OMAPDSS: remove compiler warnings when CONFIG_BUG=n
OMAPDSS: DISPC: fix usage of dispc_ovl_set_accu_uv
OMAPDSS: use DSI_FIFO_BUG workaround only for manual update displays
OMAPDSS: DSI: Support command mode interleaving during video mode blanking periods
OMAPDSS: DISPC: Update Accumulator configuration for chroma plane
drivers/video: fsl-diu-fb: don't initialize the THRESHOLDS registers
video: exynos mipi dsi: support reverse panel type
video: exynos mipi dsi: Properly interpret the interrupt source flags
video: exynos mipi dsi: Avoid races in probe()
...
- Updates to mxc_nand and gpmi drivers to support new boards and device tree
- Improve consistency of information about ECC strength in NAND devices
- Clean up partition handling of plat_nand
- Support NAND drivers without dedicated access to OOB area
- BCH hardware ECC support for OMAP
- Other fixes and cleanups, and a few new device IDs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAk/JG1wACgkQdwG7hYl686M80wCglN4kutx20j+KJWuZofkr9Hog
weEAoI4jrqEWEdW9EcT2CIWQw7eG+1v+
=7tdo
-----END PGP SIGNATURE-----
Merge tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd
Pull mtd update from David Woodhouse:
- More robust parsing especially of xattr data in JFFS2
- Updates to mxc_nand and gpmi drivers to support new boards and device tree
- Improve consistency of information about ECC strength in NAND devices
- Clean up partition handling of plat_nand
- Support NAND drivers without dedicated access to OOB area
- BCH hardware ECC support for OMAP
- Other fixes and cleanups, and a few new device IDs
Fixed trivial conflict in drivers/mtd/nand/gpmi-nand/gpmi-nand.c due to
added include files next to each other.
* tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd: (75 commits)
mtd: mxc_nand: move ecc strengh setup before nand_scan_tail
mtd: block2mtd: fix recursive call of mtd_writev
mtd: gpmi-nand: define ecc.strength
mtd: of_parts: fix breakage in Kconfig
mtd: nand: fix scan_read_raw_oob
mtd: docg3 fix in-middle of blocks reads
mtd: cfi_cmdset_0002: Slight cleanup of fixup messages
mtd: add fixup for S29NS512P NOR flash.
jffs2: allow to complete xattr integrity check on first GC scan
jffs2: allow to discriminate between recoverable and non-recoverable errors
mtd: nand: omap: add support for hardware BCH ecc
ARM: OMAP3: gpmc: add BCH ecc api and modes
mtd: nand: check the return code of 'read_oob/read_oob_raw'
mtd: nand: remove 'sndcmd' parameter of 'read_oob/read_oob_raw'
mtd: m25p80: Add support for Winbond W25Q80BW
jffs2: get rid of jffs2_sync_super
jffs2: remove unnecessary GC pass on sync
jffs2: remove unnecessary GC pass on umount
jffs2: remove lock_super
mtd: gpmi: add gpmi support for mx6q
...
Pull third pile of signal handling patches from Al Viro:
"This time it's mostly helpers and conversions to them; there's a lot
of stuff remaining in the tree, but that'll either go in -rc2
(isolated bug fixes, ideally via arch maintainers' trees) or will sit
there until the next cycle."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
x86: get rid of calling do_notify_resume() when returning to kernel mode
blackfin: check __get_user() return value
whack-a-mole with TIF_FREEZE
FRV: Optimise the system call exit path in entry.S [ver #2]
FRV: Shrink TIF_WORK_MASK [ver #2]
FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions
new helper: signal_delivered()
powerpc: get rid of restore_sigmask()
most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set
set_restore_sigmask() is never called without SIGPENDING (and never should be)
TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set
don't call try_to_freeze() from do_signal()
pull clearing RESTORE_SIGMASK into block_sigmask()
sh64: failure to build sigframe != signal without handler
openrisc: tracehook_signal_handler() is supposed to be called on success
new helper: sigmask_to_save()
new helper: restore_saved_sigmask()
new helpers: {clear,test,test_and_clear}_restore_sigmask()
HAVE_RESTORE_SIGMASK is defined on all architectures now
When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system
receives a CIPSO tagged packet it is dropped (cipso_v4_validate()
returns non-zero). In most cases this is the correct and desired
behavior, however, in the case where we are simply forwarding the
traffic, e.g. acting as a network bridge, this becomes a problem.
This patch fixes the forwarding problem by providing the basic CIPSO
validation code directly in ip_options_compile() without the need for
the NetLabel or CIPSO code. The new validation code can not perform
any of the CIPSO option label/value verification that
cipso_v4_validate() does, but it can verify the basic CIPSO option
format.
The behavior when NetLabel is enabled is unchanged.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull vfs changes from Al Viro.
"A lot of misc stuff. The obvious groups:
* Miklos' atomic_open series; kills the damn abuse of
->d_revalidate() by NFS, which was the major stumbling block for
all work in that area.
* ripping security_file_mmap() and dealing with deadlocks in the
area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
general.
* ->encode_fh() switched to saner API; insane fake dentry in
mm/cleancache.c gone.
* assorted annotations in fs (endianness, __user)
* parts of Artem's ->s_dirty work (jff2 and reiserfs parts)
* ->update_time() work from Josef.
* other bits and pieces all over the place.
Normally it would've been in two or three pull requests, but
signal.git stuff had eaten a lot of time during this cycle ;-/"
Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
'truncate_range' inode method was removed by the VM changes, the VFS
update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
to sparse fix added twice, with other changes nearby).
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
nfs: don't open in ->d_revalidate
vfs: retry last component if opening stale dentry
vfs: nameidata_to_filp(): don't throw away file on error
vfs: nameidata_to_filp(): inline __dentry_open()
vfs: do_dentry_open(): don't put filp
vfs: split __dentry_open()
vfs: do_last() common post lookup
vfs: do_last(): add audit_inode before open
vfs: do_last(): only return EISDIR for O_CREAT
vfs: do_last(): check LOOKUP_DIRECTORY
vfs: do_last(): make ENOENT exit RCU safe
vfs: make follow_link check RCU safe
vfs: do_last(): use inode variable
vfs: do_last(): inline walk_component()
vfs: do_last(): make exit RCU safe
vfs: split do_lookup()
Btrfs: move over to use ->update_time
fs: introduce inode operation ->update_time
reiserfs: get rid of resierfs_sync_super
reiserfs: mark the superblock as dirty a bit later
...
The major new feature added in this update is Darrick J. Wong's
metadata checksum feature, which adds crc32 checksums to ext4's
metadata fields. There is also the usual set of cleanups and bug
fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABCAAGBQJPyNleAAoJENNvdpvBGATwtLMP/i3WsPyTvxmYP6HttHXQb8Jk
GYCoTQ5bZMuTbOwOGg3w137cXWBv5uuPpxIk79YVLHSWx6HuanlGIa7/VnPKIaLu
2ihuvVfnrDqpwQ4MJaSq4R1Eka9JCwZ7HbYYo+fYOVobxgw588JVV9VVI9EdKRGz
z11UkW8iHE0f6Xa5gOhdAMkR0uaPnxwJX/qHZYiHuognRivuwMglqWJSiMr8nQmo
A2GmeoLehhW+k65IqgTCmSW6ZgFTvZdk6bskQIij3fOYHW3hHn/gcLFtmLTIZ/B5
LZdg/lngPYve+R/UyypliGKi+pv1qNEiTiBm0rrBgsdZFkBdGj0soSvGZzeK+Mp4
Q1vAmOBPYPFzs6nVzPst2n/osryyykFCK6TgSGZ50dosJ0NO8cBeDdX/gh9JKD2R
yQUMUltOCCSj/eWU4iwqZ0T3FXRiH/+S3XMHznoKJiwUyGDBNQy4+Yg2k2WzUXrz
Cu5t5BwNG2WNP7y5Et/wmUIzpC7VPId4qYmGyHe7OwTxSJgW+6f7GVkHfjWcDMuv
pGgEUiInbMmLajP3v2/LKfVU4hXLZy4uJbhoBgDdeIpZrnPifJG/MwDOS4W+dLVT
tDzgO1SAh3/E4jATreZ5bjzD/HGsfe1OX09UH3Pbc1EcgkrLnyrQXFwdHshdVu4A
cxMoKNPVCQJySb1UrLkO
=SdJJ
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull Ext4 updates from Theodore Ts'o:
"The major new feature added in this update is Darrick J Wong's
metadata checksum feature, which adds crc32 checksums to ext4's
metadata fields.
There is also the usual set of cleanups and bug fixes."
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (44 commits)
ext4: hole-punch use truncate_pagecache_range
jbd2: use kmem_cache_zalloc wrapper instead of flag
ext4: remove mb_groups before tearing down the buddy_cache
ext4: add ext4_mb_unload_buddy in the error path
ext4: don't trash state flags in EXT4_IOC_SETFLAGS
ext4: let getattr report the right blocks in delalloc+bigalloc
ext4: add missing save_error_info() to ext4_error()
ext4: add debugging trigger for ext4_error()
ext4: protect group inode free counting with group lock
ext4: use consistent ssize_t type in ext4_file_write()
ext4: fix format flag in ext4_ext_binsearch_idx()
ext4: cleanup in ext4_discard_allocated_blocks()
ext4: return ENOMEM when mounts fail due to lack of memory
ext4: remove redundundant "(char *) bh->b_data" casts
ext4: disallow hard-linked directory in ext4_lookup
ext4: fix potential integer overflow in alloc_flex_gd()
ext4: remove needs_recovery in ext4_mb_init()
ext4: force ro mount if ext4_setup_super() fails
ext4: fix potential NULL dereference in ext4_free_inodes_counts()
ext4/jbd2: add metadata checksumming to the list of supported features
...
Does block_sigmask() + tracehook_signal_handler(); called when
sigframe has been successfully built. All architectures converted
to it; block_sigmask() itself is gone now (merged into this one).
I'm still not too happy with the signature, but that's a separate
story (IMO we need a structure that would contain signal number +
siginfo + k_sigaction, so that get_signal_to_deliver() would fill one,
signal_delivered(), handle_signal() and probably setup...frame() -
take one).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Only 3 out of 63 do not. Renamed the current variant to __set_current_blocked(),
added set_current_blocked() that will exclude unblockable signals, switched
open-coded instances to it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
replace boilerplate "should we use ->saved_sigmask or ->blocked?"
with calls of obvious inlined helper...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
first fruits of ..._restore_sigmask() helpers: now we can take
boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK
and restore the blocked mask from ->saved_mask" into a common
helper. Open-coded instances switched...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Everyone either defines it in arch thread_info.h or has TIF_RESTORE_SIGMASK
and picks default set_restore_sigmask() in linux/thread_info.h. Kill the
ifdefs, slap #error in linux/thread_info.h to catch breakage when new ones
get merged.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
NFS optimizes away d_revalidates for last component of open. This means that
open itself can find the dentry stale.
This patch allows the filesystem to return EOPENSTALE and the VFS will retry the
lookup on just the last component if possible.
If the lookup was done using RCU mode, including the last component, then this
is not possible since the parent dentry is lost. In this case fall back to
non-RCU lookup. Currently this is not used since NFS will always leave RCU
mode.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Btrfs has to make sure we have space to allocate new blocks in order to modify
the inode, so updating time can fail. We've gotten around this by having our
own file_update_time but this is kind of a pain, and Christoph has indicated he
would like to make xfs do something different with atime updates. So introduce
->update_time, where we will deal with i_version an a/m/c time updates and
indicate which changes need to be made. The normal version just does what it
has always done, updates the time and marks the inode dirty, and then
filesystems can choose to do something different.
I've gone through all of the users of file_update_time and made them check for
errors with the exception of the fault code since it's complicated and I wasn't
quite sure what to do there, also Jan is going to be pushing the file time
updates into page_mkwrite for those who have it so that should satisfy btrfs and
make it not a big deal to check the file_update_time() return code in the
generic fault path. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Pull the rest of the nfsd commits from Bruce Fields:
"... and then I cherry-picked the remainder of the patches from the
head of my previous branch"
This is the rest of the original nfsd branch, rebased without the
delegation stuff that I thought really needed to be redone.
I don't like rebasing things like this in general, but in this situation
this was the lesser of two evils.
* 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits)
nfsd4: fix, consolidate client_has_state
nfsd4: don't remove rebooted client record until confirmation
nfsd4: remove some dprintk's and a comment
nfsd4: return "real" sequence id in confirmed case
nfsd4: fix exchange_id to return confirm flag
nfsd4: clarify that renewing expired client is a bug
nfsd4: simpler ordering of setclientid_confirm checks
nfsd4: setclientid: remove pointless assignment
nfsd4: fix error return in non-matching-creds case
nfsd4: fix setclientid_confirm same_cred check
nfsd4: merge 3 setclientid cases to 2
nfsd4: pull out common code from setclientid cases
nfsd4: merge last two setclientid cases
nfsd4: setclientid/confirm comment cleanup
nfsd4: setclientid remove unnecessary terms from a logical expression
nfsd4: move rq_flavor into svc_cred
nfsd4: stricter cred comparison for setclientid/exchange_id
nfsd4: move principal name into svc_cred
nfsd4: allow removing clients not holding state
nfsd4: rearrange exchange_id logic to simplify
...
Pull second pile of signal handling patches from Al Viro:
"This one is just task_work_add() series + remaining prereqs for it.
There probably will be another pull request from that tree this
cycle - at least for helpers, to get them out of the way for per-arch
fixes remaining in the tree."
Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
keys: kill task_struct->replacement_session_keyring
keys: kill the dummy key_replace_session_keyring()
keys: change keyctl_session_to_parent() to use task_work_add()
genirq: reimplement exit_irq_thread() hook via task_work_add()
task_work_add: generic process-context callbacks
avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
parisc: need to check NOTIFY_RESUME when exiting from syscall
move key_repace_session_keyring() into tracehook_notify_resume()
TIF_NOTIFY_RESUME is defined on all targets now
Pull nfsd update from Bruce Fields.
* 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits)
nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek
SUNRPC: split upcall function to extract reusable parts
nfsd: allocate id-to-name and name-to-id caches in per-net operations.
nfsd: make name-to-id cache allocated per network namespace context
nfsd: make id-to-name cache allocated per network namespace context
nfsd: pass network context to idmap init/exit functions
nfsd: allocate export and expkey caches in per-net operations.
nfsd: make expkey cache allocated per network namespace context
nfsd: make export cache allocated per network namespace context
nfsd: pass pointer to export cache down to stack wherever possible.
nfsd: pass network context to export caches init/shutdown routines
Lockd: pass network namespace to creation and destruction routines
NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches
nfsd: pass pointer to expkey cache down to stack wherever possible.
nfsd: use hash table from cache detail in nfsd export seq ops
nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops
nfsd: use exp_put() for svc_export_cache put
nfsd: use cache detail pointer from svc_export structure on cache put
nfsd: add link to owner cache detail to svc_export structure
nfsd: use passed cache_detail pointer expkey_parse()
...
Merge misc patches from Andrew Morton:
- the "misc" tree - stuff from all over the map
- checkpatch updates
- fatfs
- kmod changes
- procfs
- cpumask
- UML
- kexec
- mqueue
- rapidio
- pidns
- some checkpoint-restore feature work. Reluctantly. Most of it
delayed a release. I'm still rather worried that we don't have a
clear roadmap to completion for this work.
* emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches)
kconfig: update compression algorithm info
c/r: prctl: add ability to set new mm_struct::exe_file
c/r: prctl: extend PR_SET_MM to set up more mm_struct entries
c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat
syscalls, x86: add __NR_kcmp syscall
fs, proc: introduce /proc/<pid>/task/<tid>/children entry
sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE
aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()
eventfd: change int to __u64 in eventfd_signal()
fs/nls: add Apple NLS
pidns: make killed children autoreap
pidns: use task_active_pid_ns in do_notify_parent
rapidio/tsi721: add DMA engine support
rapidio: add DMA engine support for RIO data transfers
ipc/mqueue: add rbtree node caching support
tools/selftests: add mq_perf_tests
ipc/mqueue: strengthen checks on mqueue creation
ipc/mqueue: correct mq_attr_ok test
ipc/mqueue: improve performance of send/recv
selftests: add mq_open_tests
...
When we do restore we would like to have a way to setup a former
mm_struct::exe_file so that /proc/pid/exe would point to the original
executable file a process had at checkpoint time.
For this the PR_SET_MM_EXE_FILE code is introduced. This option takes a
file descriptor which will be set as a source for new /proc/$pid/exe
symlink.
Note it allows to change /proc/$pid/exe if there are no VM_EXECUTABLE
vmas present for current process, simply because this feature is a special
to C/R and mm::num_exe_file_vmas become meaningless after that.
To minimize the amount of transition the /proc/pid/exe symlink might have,
this feature is implemented in one-shot manner. Thus once changed the
symlink can't be changed again. This should help sysadmins to monitor the
symlinks over all process running in a system.
In particular one could make a snapshot of processes and ring alarm if
there unexpected changes of /proc/pid/exe's in a system.
Note -- this feature is available iif CONFIG_CHECKPOINT_RESTORE is set and
the caller must have CAP_SYS_RESOURCE capability granted, otherwise the
request to change symlink will be rejected.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During checkpoint we dump whole process memory to a file and the dump
includes process stack memory. But among stack data itself, the stack
carries additional parameters such as command line arguments, environment
data and auxiliary vector.
So when we do restore procedure and once we've restored stack data itself
we need to setup mm_struct::arg_start/end, env_start/end, so restored
process would be able to find command line arguments and environment data
it had at checkpoint time. The same applies to auxiliary vector.
For this reason additional PR_SET_MM_(ARG_START | ARG_END | ENV_START |
ENV_END | AUXV) codes are introduced.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While doing the checkpoint-restore in the user space one need to determine
whether various kernel objects (like mm_struct-s of file_struct-s) are
shared between tasks and restore this state.
The 2nd step can be solved by using appropriate CLONE_ flags and the
unshare syscall, while there's currently no ways for solving the 1st one.
One of the ways for checking whether two tasks share e.g. mm_struct is to
provide some mm_struct ID of a task to its proc file, but showing such
info considered to be not that good for security reasons.
Thus after some debates we end up in conclusion that using that named
'comparison' syscall might be the best candidate. So here is it --
__NR_kcmp.
It takes up to 5 arguments - the pids of the two tasks (which
characteristics should be compared), the comparison type and (in case of
comparison of files) two file descriptors.
Lookups for pids are done in the caller's PID namespace only.
At moment only x86 is supported and tested.
[akpm@linux-foundation.org: fix up selftests, warnings]
[akpm@linux-foundation.org: include errno.h]
[akpm@linux-foundation.org: tweak comment text]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: Michal Marek <mmarek@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.
Rather than having an additional check_access parameter to these
functions, the first paramater type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY which means check that the contents of the iovec
are valid, but do not check the memory that they point to. This is used
by process_vm_readv/writev where we need to validate that a iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.
Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
eventfd_ctx->count is an __u64 counter which is allowed to reach
ULLONG_MAX. eventfd_write() adds a __u64 value to "count", but the kernel
side eventfd_signal() only adds an int value to it. Make them consistent.
[akpm@linux-foundation.org: update interface documentation]
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adds DMA Engine framework support into RapidIO subsystem.
Uses DMA Engine DMA_SLAVE interface to generate data transfers to/from
remote RapidIO target devices.
Introduces RapidIO-specific wrapper for prep_slave_sg() interface with an
extra parameter to pass target specific information.
Uses scatterlist to describe local data buffer. Address flat data buffer
on a remote side.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Vinod Koul <vinod.koul@linux.intel.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b231cca438 ("message queues: increase range limits") changed
mqueue default value when attr parameter is specified NULL from hard
coded value to fs.mqueue.{msg,msgsize}_max sysctl value.
This made large side effect. When user need to use two mqueue
applications 1) using !NULL attr parameter and it require big message
size and 2) using NULL attr parameter and only need small size message,
app (1) require to raise fs.mqueue.msgsize_max and app (2) consume large
memory size even though it doesn't need.
Doug Ledford propsed to switch back it to static hard coded value.
However it also has a compatibility problem. Some applications might
started depend on the default value is tunable.
The solution is to separate default value from maximum value.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Joe Korty <joe.korty@ccur.com>
Cc: Amerigo Wang <amwang@redhat.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mqueue limitation is slightly naieve parameter likes other ipcs because
unprivileged user can consume kernel memory by using ipcs.
Thus, too aggressive raise bring us security issue. Example, current
setting allow evil unprivileged user use 256GB (= 256 * 1024 * 1024*1024)
and it's enough large to system will belome unresponsive. Don't do that.
Instead, every admin should adjust the knobs for their own systems.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Joe Korty <joe.korty@ccur.com>
Cc: Amerigo Wang <amwang@redhat.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b231cca438 ("message queues: increase range limits") changed the
maximum size of a message in a message queue from INT_MAX to 8192*128.
Unfortunately, we had customers that relied on a size much larger than
8192*128 on their production systems. After reviewing POSIX, we found
that it is silent on the maximum message size. We did find a couple other
areas in which it was not silent. Fix up the mqueue maximums so that the
customer's system can continue to work, and document both the POSIX and
real world requirements in ipc_namespace.h so that we don't have this
issue crop back up.
Also, commit 9cf18e1dd7 ("ipc: HARD_MSGMAX should be higher not lower
on 64bit") fiddled with HARD_MSGMAX without realizing that the number was
intentionally in place to limit the msg queue depth to one that was small
enough to kmalloc an array of pointers (hence why we divided 128k by
sizeof(long)). If we wish to meet POSIX requirements, we have no choice
but to change our allocation to a vmalloc instead (at least for the large
queue size case). With that, it's possible to increase our allowed
maximum to the POSIX requirements (or more if we choose).
[sfr@canb.auug.org.au: using vmalloc requires including vmalloc.h]
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b231cca438 ("message queues: increase range limits") changed
how we create a queue that does not include an attr struct passed to
open so that it creates the queue with whatever the maximum values are.
However, if the admin has set the maximums to allow flexibility in
creating a queue (aka, both a large size and large queue are allowed,
but combined they create a queue too large for the RLIMIT_MSGQUEUE of
the user), then attempts to create a queue without an attr struct will
fail. Switch back to using acceptable defaults regardless of what the
maximums are.
Note: so far, we only know of a few applications that rely on this
behavior (specifically, set the maximums in /proc, then run the
application which calls mq_open() without passing in an attr struct, and
the application expects the newly created message queue to have the
maximum sizes that were set in /proc used on the mq_open() call, and all
of those applications that we know of are actually part of regression
test suites that were coded to do something like this:
for size in 4096 65536 $((1024 * 1024)) $((16 * 1024 * 1024)); do
echo $size > /proc/sys/fs/mqueue/msgsize_max
mq_open || echo "Error opening mq with size $size"
done
These test suites that depend on any behavior like this are broken. The
concept that programs should rely upon the system wide maximum in order
to get their desired results instead of simply using a attr struct to
specify what they want is fundamentally unfriendly programming practice
for any multi-tasking OS.
Fixing this will break those few apps that we know of (and those app
authors recognize the brokenness of their code and the need to fix it).
However, the following patch "mqueue: separate mqueue default value"
allows a workaround in the form of new knobs for the default msg queue
creation parameters for any software out there that we don't already
know about that might rely on this behavior at the moment.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit b231cca438 ("message queues: increase range limits") on
Oct 18, 2008, calls to mq_open() that did not pass in an attribute
struct and expected to get default values for the size of the queue and
the max message size now get the system wide maximums instead of
hardwired defaults like they used to get.
This was uncovered when one of the earlier patches in this patch set
increased the default system wide maximums at the same time it increased
the hard ceiling on the system wide maximums (a customer specifically
needed the hard ceiling brought back up, the new ceiling that commit
b231cca438 introduced was too low for their production systems). By
increasing the default maximums and not realising they were tied to any
attempt to create a message queue without an attribute struct, I had
inadvertently made it such that all message queue creation attempts
without an attribute struct were failing because the new default
maximums would create a queue that exceeded the default rlimit for
message queue bytes.
As a result, the system wide defaults were brought back down to their
previous levels, and the system wide ceilings on the maximums were
raised to meet the customer's needs. However, the fact that the no
attribute struct behavior of mq_open() could be broken by changing the
system wide maximums for message queues was seen as fundamentally broken
itself. So we hardwired the no attribute case back like it used to be.
But, then we realized that on the very off chance that some piece of
software in the wild depended on that behavior, we could work around
that issue by adding two new knobs to /proc that allowed setting the
defaults for message queues created without an attr struct separately
from the system wide maximums.
What is not an option IMO is to leave the current behavior in place. No
piece of software should ever rely on setting the system wide maximums
in order to get a desired message queue. Such a reliance would be so
fundamentally multitasking OS unfriendly as to not really be tolerable.
Fortunately, we don't know of any software in the wild that uses this
except for a regression test program that caught the issue in the first
place. If there is though, we have made accommodations with the two new
/proc knobs (and that's all the accommodations such fundamentally broken
software can be allowed)..
This patch:
The various defines for minimums and maximums of the sysctl controllable
mqueue values are scattered amongst different files and named
inconsistently. Move them all into ipc_namespace.h and make them have
consistent names. Additionally, make the number of queues per namespace
also have a minimum and maximum and use the same sysctl function as the
other two settable variables.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add userspace definitions, guard all relevant kernel structures. While at
it document stuff and remove now useless userspace hint.
It is easy to add the relevant system call to respective libc's, but it
seems pointless to have to duplicate the data structures.
This is based on the kexec-tools headers, with the exception of just using
int on return (succes or failure) and using size_t instead of 'unsigned
long int' for the number of segments argument of kexec_load().
Signed-off-by: maximilian attems <max@stro.at>
Cc: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Haren Myneni <hbabu@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many architectures clear tasks' mm_cpumask like this:
read_lock(&tasklist_lock);
for_each_process(p) {
if (p->mm)
cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
}
read_unlock(&tasklist_lock);
Depending on the context, the code above may have several problems,
such as:
1. Working with task->mm w/o getting mm or grabing the task lock is
dangerous as ->mm might disappear (exit_mm() assigns NULL under
task_lock(), so tasklist lock is not enough).
2. Checking for process->mm is not enough because process' main
thread may exit or detach its mm via use_mm(), but other threads
may still have a valid mm.
This patch implements a small helper function that does things
correctly, i.e.:
1. We take the task's lock while whe handle its mm (we can't use
get_task_mm()/mmput() pair as mmput() might sleep);
2. To catch exited main thread case, we use find_lock_task_mm(),
which walks up all threads and returns an appropriate task
(with task lock held).
Also, Per Peter Zijlstra's idea, now we don't grab tasklist_lock in
the new helper, instead we take the rcu read lock. We can do this
because the function is called after the cpu is taken down and marked
offline, so no new tasks will get this cpu set in their mm mask.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 8f92054e7c ("CRED: Fix __task_cred()'s lockdep check and banner
comment"):
add the following validation condition:
task->exit_state >= 0
to permit the access if the target task is dead and therefore
unable to change its own credentials.
OK, but afaics currently this can only help wait_task_zombie() which calls
__task_cred() without rcu lock.
Remove this validation and change wait_task_zombie() to use task_uid()
instead. This means we do rcu_read_lock() only to shut up the lockdep,
but we already do the same in, say, wait_task_stopped().
task_is_dead() should die, task->exit_state != 0 means that this task has
passed exit_notify(), only do_wait-like code paths should use this.
Unfortunately, we can't kill task_is_dead() right now, it has already
acquired buggy users in drivers/staging. The fix already exists.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we move call_usermodehelper_fns() to kmod.c file and EXPORT_SYMBOL it
we can avoid exporting all it's helper functions:
call_usermodehelper_setup
call_usermodehelper_setfns
call_usermodehelper_exec
And make all of them static to kmod.c
Since the optimizer will see all these as a single call site it will
inline them inside call_usermodehelper_fns(). So we loose the call to
_fns but gain 3 calls to the helpers. (Not that it matters)
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
call_usermodehelper_freeinfo() is not used outside of kmod.c. So unexport
it, and make it static to kmod.c
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is patchset makes fatfs stop using the VFS '->write_super()' method
for writing out the FSINFO block.
The final goal is to get rid of the 'sync_supers()' kernel thread. This
kernel thread wakes up every 5 seconds (by default) and calls
'->write_super()' for all mounted file-systems. And the bad thing is that
this is done even if all the superblocks are clean. Moreover, some
file-systems do not even need this end they do not register the
'->write_super()' method at all (e.g., btrfs).
So 'sync_supers()' most often just generates useless wake-ups and wastes
power. I am trying to make all file-systems independent of
'->write_super()' and plan to remove 'sync_supers()' and '->write_super'
completely once there are no more users.
The '->write_supers()' method is mostly used by baroque file-systems like
hfs, udf, etc. Modern file-systems like btrfs and xfs do not use it.
This justifies removing this stuff from VFS completely and make every FS
self-manage own superblock.
Tested with xfstests.
This patch:
Preparation for further changes. It introduces a special inode
('fsinfo_inode') in FAT file-system which we'll later use for managing the
FSINFO block. Note, this there is already one special inode ('fat_inode')
which is used for managing the FAT tables.
Introduce new 'MSDOS_FSINFO_INO' constant for this special inode. It is
safe to do because FAT file-system does not store inode numbers on the
media but generates them run-time.
I've also cleaned up the comment to existing 'MSDOS_ROOT_INO' constant,
while on it.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Previous code was using optimizations which were developed to work well
even on narrow-word CPUs (by today's standards). But Linux runs only on
32-bit and wider CPUs. We can use that.
First: using 32x32->64 multiply and trivial 32-bit shift, we can correctly
divide by 10 much larger numbers, and thus we can print groups of 9 digits
instead of groups of 5 digits.
Next: there are two algorithms to print larger numbers. One is generic:
divide by 1000000000 and repeatedly print groups of (up to) 9 digits.
It's conceptually simple, but requires an (unsigned long long) /
1000000000 division.
Second algorithm splits 64-bit unsigned long long into 16-bit chunks,
manipulates them cleverly and generates groups of 4 decimal digits. It so
happens that it does NOT require long long division.
If long is > 32 bits, division of 64-bit values is relatively easy, and we
will use the first algorithm. If long long is > 64 bits (strange
architecture with VERY large long long), second algorithm can't be used,
and we again use the first one.
Else (if long is 32 bits and long long is 64 bits) we use second one.
And third: there is a simple optimization which takes fast path not only
for zero as was done before, but for all one-digit numbers.
In all tested cases new code is faster than old one, in many cases by 30%,
in few cases by more than 50% (for example, on x86-32, conversion of
12345678). Code growth is ~0 in 32-bit case and ~130 bytes in 64-bit
case.
This patch is based upon an original from Michal Nazarewicz.
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ULONG_MAX is often used to check for integer overflow when calculating
allocation size. While ULONG_MAX happens to work on most systems, there
is no guarantee that `size_t' must be the same size as `long'.
This patch introduces SIZE_MAX, the maximum value of `size_t', to improve
portability and readability for allocation size validation.
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Alex Elder <elder@dreamhost.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the rq_flavor into struct svc_cred, and use it in setclientid and
exchange_id comparisons as well.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Instead of keeping the principal name associated with a request in a
structure that's private to auth_gss and using an accessor function,
move it to svc_cred.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This new routine is responsible for service registration in a specified
network context.
The idea is to separate service creation from per-net operations.
Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
In SRIOV mode, the number of EQs used when computing the total ICM size
was incorrect.
To fix this, we do the following:
1. We add a new structure to mlx4_dev, mlx4_phys_caps, to contain physical HCA
capabilities. The PPF uses the phys capabilities when it computes things
like ICM size.
The dev_caps structure will then contain the paravirtualized values, making
bookkeeping much easier in SRIOV mode. We add a structure rather than a
single parameter because there will be other fields in the phys_caps.
The first field we add to the mlx4_phys_caps structure is num_phys_eqs.
2. In INIT_HCA, when running in SRIOV mode, the "log_num_eqs" parameter
passed to the FW is the number of EQs per VF/PF; each function (PF or VF)
has this number of EQs available.
However, the total number of EQs which must be allowed for in the ICM is
(1 << log_num_eqs) * (#VFs + #PFs). Rather than compute this quantity,
we allocate ICM space for 1024 EQs (which is the device maximum
number of EQs, and which is the value we place in the mlx4_phys_caps structure).
For INIT_HCA, however, we use the per-function number of EQs as described
above.
Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIcBAABAgAGBQJPxr4VAAoJEGgI9fZJve1bmOkP/R+hVC0lHRzbevDiDXVzxx+M
XNG3krM73Y6jC9sdIxUj5wU1/BpQ3z6wYNEKKPKeXJoHPW+UJaN+wjhm6+uYQPx/
6QM7Fkraxcya98I7vKIsz+uVRd9vETMBvgrix6hZ/ec8xO9q62d5ozkXjfG3E4qO
3vUFSihGmeVVGES1BFehIMkLEHRqlEuiUsXwMw71cBaIYATXruzy46iRqS9e3fVS
mLc5+Ylvsm9q65wY1djv2Kieq5AuZ1dOgH8du2FYWJED+vogkqRcZTWeJeuZ3HAc
ql72WhN2ga2U+xuxypVt+mVl2Gb1pjfE1j802EDjZX6ir1E50iSWpcaKIM1M8th7
kZwCvzcMihtzBaKvZjXf1IVazZyBo8Chi02YNbG3UVWW/rSrYjV9GSSh5HPKfY3A
Arw80C4t/I3kiCPr5uwdZZO/D5lsleMF677NEilTmsKZgPSOe9EqbZOTOGPKuALj
A/y2SVCV0sPVb2ki8wYlQ5dpNRsz/Uya+I3cqRFW63YGeSyGHm8VysxWRjzzu6LN
+n7I2q/3Un95aHGMMIoJ/3crHcdtKSqtXeBKbBSiRdRpyO4b4rVRkLJRARv36V7m
eiaduSZJEf5ZrEb3z20FmM2SqjiFHz+d0Q1QH6MQmhwDQ44acKxx/iw6moxGtLTQ
JsDneVe+4d3WlJFqXd3s
=B6u0
-----END PGP SIGNATURE-----
Merge tag 'for-v3.5' of git://git.infradead.org/battery-2.6
Pull battery updates from Anton Vorontsov:
"A bunch of fixes for v3.5, nothing extraordinary."
* tag 'for-v3.5' of git://git.infradead.org/battery-2.6: (27 commits)
smb347-charger: Include missing <linux/err.h>
smb347-charger: Clean up battery attributes
max17042_battery: Add support for max17047/50 chip
sbs-battery.c: Capacity attr = remaining relative capacity
isp1704_charger: Use after free on probe error
ds2781_battery: Use DS2781_PARAM_EEPROM_SIZE and DS2781_USER_EEPROM_SIZE
power_supply: Fix a typo in BATTERY_DS2781 Kconfig entry
charger-manager: Provide cm_notify_event function for in-kernel use
charger-manager: Poll battery health in normal state
smb347-charger: Convert to regmap API
smb347-charger: Move IRQ enabling to the end of probe
smb347-charger: Rename few functions to match better what they are doing
smb347-charger: Convert to use module_i2c_driver()
smb347_charger: Cleanup power supply registration code in probe
ab8500: Clean up probe routines
ab8500_fg: Harden platform data check
ab8500_btemp: Harden platform data check
ab8500_charger: Harden platform data check
MAINTAINERS: Fix 'F' entry for the power supply class
max17042_battery: Handle irq request failure case
...
Pull two small kvm fixes from Avi Kivity:
"A build fix for non-kvm archs and a transparent hugepage refcount
bugfix on hosts with 4M pages."
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Export asm-generic/kvm_para.h
KVM: MMU: fix huge page adapted on non-PAE host
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJPx1M+AAoJEDeqqVYsXL0MNOMH/jSbgDAHQskBuZMCEoVUHykZ
3aKiPFJQfnF1nQqN/xxECGFc7glrKSHv1fpAG9wDk0HLHNhP+QoOBVYdDGHpzktk
eP1hB6rWE/auJz90rIrKomJoD+cVYDRHkhlbNr1DsYBuXI+BGX0aUp+uAaajoxAT
8wp4/Z5007llQQXnep2Z0AvzIWBdCeR4PBXX5YvalJ8Qz3Rj8bYeY10oDpx6nO7v
iGcyh+h0Eo+q9KEQ3PosoDnqaskq44yTY4MWeE1Kd64fQM1JYTJo0SxOGGVxHHwQ
ZLfhX+fH3jCyBP0qRzCqBvSKTuiWeMBc8POdLbLMnq6ClCgQTr41iHH7UTuXXjE=
=fZOy
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull final round of SCSI updates from James Bottomley:
"This is primarily another round of driver updates (bnx2fc, qla2xxx,
qla4xxx) including the target mode driver for qla2xxx. We've also got
a couple of regression fixes (async scanning, broken this merge window
and a fix to a long standing break in the scsi_wait_scan module)."
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (45 commits)
[SCSI] fix scsi_wait_scan
[SCSI] fix async probe regression
[SCSI] be2iscsi: fix dma free size mismatch regression
[SCSI] qla4xxx: Update driver version to 5.02.00-k17
[SCSI] qla4xxx: Capture minidump for ISP82XX on firmware failure
[SCSI] qla4xxx: Add change_queue_depth API support
[SCSI] qla4xxx: Fix clear ddb mbx command failure issue.
[SCSI] qla4xxx: Fix kernel panic during discovery logout.
[SCSI] qla4xxx: Correct early completion of pending mbox.
[SCSI] fcoe, bnx2fc, libfcoe: SW FCoE and bnx2fc use FCoE Syfs
[SCSI] libfcoe: Add fcoe_sysfs
[SCSI] bnx2fc: Allocate fcoe_ctlr with bnx2fc_interface, not as a member
[SCSI] fcoe: Allocate fcoe_ctlr with fcoe_interface, not as a member
[SCSI] Fix dm-multipath starvation when scsi host is busy
[SCSI] ufs: fix potential NULL pointer dereferencing error in ufshcd_prove.
[SCSI] qla2xxx: don't free pool that wasn't allocated
[SCSI] mptfusion: unlock on error in mpt_config()
[SCSI] tcm_qla2xxx: Add >= 24xx series fabric module for target-core
[SCSI] qla2xxx: Add LLD target-mode infrastructure for >= 24xx series
[SCSI] Revert "qla2xxx: During loopdown perform Diagnostic loopback."
...
Use the same mechanism as the block devices are using, but move the
helper functions from fs/direct-io.c into fs/inode.c to remove the
dependency on CONFIG_BLOCK.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking changes from David S. Miller:
1) Fix IPSEC header length calculation for transport mode in ESP. The
issue is whether to do the calculation before or after alignment.
Fix from Benjamin Poirier.
2) Fix regression in IPV6 IPSEC fragment length calculations, from Gao
Feng. This is another transport vs tunnel mode issue.
3) Handle AF_UNSPEC connect()s properly in L2TP to avoid OOPSes. Fix
from James Chapman.
4) Fix USB ASIX driver's reception of full sized VLAN packets, from
Eric Dumazet.
5) Allow drop monitor (and, more generically, all generic netlink
protocols) to be automatically loaded as a module. From Neil
Horman.
Fix up trivial conflict in Documentation/feature-removal-schedule.txt
due to new entries added next to each other at the end. As usual.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
net/smsc911x: Repair broken failure paths
virtio-net: remove useless disable on freeze
netdevice: Update netif_dbg for CONFIG_DYNAMIC_DEBUG
drop_monitor: Add module alias to enable automatic module loading
genetlink: Build a generic netlink family module alias
net: add MODULE_ALIAS_NET_PF_PROTO_NAME
r6040: Do a Proper deinit at errorpath and also when driver unloads (calling r6040_remove_one)
r6040: disable pci device if the subsequent calls (after pci_enable_device) fails
skb: avoid unnecessary reallocations in __skb_cow
net: sh_eth: fix the rxdesc pointer when rx descriptor empty happens
asix: allow full size 8021Q frames to be received
rds_rdma: don't assume infiniband device is PCI
l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case
mac80211: fix ADDBA declined after suspend with wowlan
wlcore: fix undefined symbols when CONFIG_PM is not defined
mac80211: fix flag check for QoS NOACK frames
ath9k_hw: apply internal regulator settings on AR933x
ath9k_hw: update AR933x initvals to fix issues with high power devices
ath9k: fix a use-after-free-bug when ath_tx_setup_buffer() fails
ath9k: stop rx dma before stopping tx
...
We faced segmentation fault on perf top -G at very high sampling rate
due to a corrupted callchain. While the root cause was not revealed (I
failed to figure it out), this patch tries to protect us from the
segfault on such cases.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sunjin Yang <fan4326@gmail.com>
Link: http://lkml.kernel.org/r/1338443007-24857-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Recently I'm working on fanotify and found the following strange
behaviors.
I wrote a program to set fanotify_mark on "/tmp/block" and FAN_DENY
all events notified.
fanotify_mask = FAN_ALL_EVENTS | FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_CHILD:
$ cd /tmp/block; cat foo
cat: foo: Operation not permitted
Operation on the file is blocked as expected.
But,
fanotify_mask = FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_CHILD:
$ cd /tmp/block; cat foo
aaa
It's not blocked anymore. This is confusing behavior. Also reading
commit "fsnotify: call fsnotify_parent in perm events", it seems like
fsnotify should handle subfiles' perm events as well as the other notify
events.
With this patch, regardless of FAN_ALL_EVENTS set or not:
$ cd /tmp/block; cat foo
cat: foo: Operation not permitted
Operation on the file is now blocked properly.
FS_OPEN_PERM and FS_ACCESS_PERM are not listed on FS_EVENTS_POSS_ON_CHILD.
Due to fsnotify_inode_watches_children() check, if you only specify only
these events as fsnotify_mask, you don't get subfiles' perm events
notified.
This patch add the events to FS_EVENTS_POSS_ON_CHILD to get them notified
even if only these events are specified to fsnotify_mask.
Signed-off-by: Naohiro Aota <naota@elisp.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make netif_dbg use dynamic debugging whenever
CONFIG_DYNAMIC_DEBUG is enabled.
commit b558c96ffa
("dynamic_debug: make dynamic-debug supersede DEBUG ccflag")
missed updating the netif_dbg variant.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ceph updates from Sage Weil:
"There are some updates and cleanups to the CRUSH placement code, a bug
fix with incremental maps, several cleanups and fixes from Josh Durgin
in the RBD block device code, a series of cleanups and bug fixes from
Alex Elder in the messenger code, and some miscellaneous bounds
checking and gfp cleanups/fixes."
Fix up trivial conflicts in net/ceph/{messenger.c,osdmap.c} due to the
networking people preferring "unsigned int" over just "unsigned".
* git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (45 commits)
libceph: fix pg_temp updates
libceph: avoid unregistering osd request when not registered
ceph: add auth buf in prepare_write_connect()
ceph: rename prepare_connect_authorizer()
ceph: return pointer from prepare_connect_authorizer()
ceph: use info returned by get_authorizer
ceph: have get_authorizer methods return pointers
ceph: ensure auth ops are defined before use
ceph: messenger: reduce args to create_authorizer
ceph: define ceph_auth_handshake type
ceph: messenger: check return from get_authorizer
ceph: messenger: rework prepare_connect_authorizer()
ceph: messenger: check prepare_write_connect() result
ceph: don't set WRITE_PENDING too early
ceph: drop msgr argument from prepare_write_connect()
ceph: messenger: send banner in process_connect()
ceph: messenger: reset connection kvec caller
libceph: don't reset kvec in prepare_write_banner()
ceph: ignore preferred_osd field
ceph: fully initialize new layout
...
Pull i2c updates from Jean Delvare.
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c: Split I2C_M_NOSTART support out of I2C_FUNC_PROTOCOL_MANGLING
i2c-dev: Add support for I2C_M_RECV_LEN
Pull second set of watchdog updates from Wim Van Sebroeck:
"This changeset contains following changes:
* Add support for multiple watchdog devices. We use dynamically
allocated device id's for this.
* Add locking into the generic watchdog infrastructure.
* Add support for dynamically allocated watchdog_device structs so
that we can deal with devices that get unbound.
* convert following drivers to the generic watchdog framework:
sch5627, sch5636 and sp805_wdt.
* Add DA9052/53 PMIC watchdog support
* Fix printk format warnings for iTCO_wdt.c"
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: iTCO_wdt.c: fix printk format warnings
watchdog: sp805_wdt: Add clk_{un}prepare support
watchdog: sp805_wdt: convert to watchdog core
hwmon/sch56xx: Depend on watchdog for watchdog core functions
watchdog: sch56xx-common: set correct bits in register()
Watchdog: DA9052/53 PMIC watchdog support
watchdog: sch56xx-common: Add proper ref-counting of watchdog data
watchdog: sch56xx: Remove unnecessary checks for register changes
watchdog: sch56xx: Use watchdog core
watchdog: Add support for dynamically allocated watchdog_device structs
watchdog: Add Locking support
watchdog: watchdog_dev: Rewrite wrapper code
watchdog: use dev_ functions
watchdog: create all the proper device files
watchdog: Add a flag to indicate the watchdog doesn't reboot things
watchdog: Add multiple device support
watchdog: watchdog_core.h: make functions extern
watchdog: correct the name of the watchdog_core inlude file
watchdog: Add watchdog_active() routine
watchdog: watchdog_dev: include private header to pickup global symbol prototypes
Pull block driver updates from Jens Axboe:
"Here are the driver related changes for 3.5. It contains:
- The floppy changes from Jiri. Jiri is now also marked as the
maintainer of floppy.c, I shall be publically branding his forehead
with red hot iron at the next opportune moment.
- A batch of drbd updates and fixes from the linbit crew, as well as
fixes from others.
- Two small fixes for xen-blkfront courtesy of Jan."
* 'for-3.5/drivers' of git://git.kernel.dk/linux-block: (70 commits)
floppy: take over maintainership
floppy: remove floppy-specific O_EXCL handling
floppy: convert to delayed work and single-thread wq
xen-blkfront: module exit handling adjustments
xen-blkfront: properly name all devices
drbd: grammar fix in log message
drbd: check MODULE for THIS_MODULE
drbd: Restore the request restart logic
drbd: introduce a bio_set to allocate housekeeping bios from
drbd: remove unused define
drbd: bm_page_async_io: properly initialize page->private
drbd: use the newly introduced page pool for bitmap IO
drbd: add page pool to be used for meta data IO
drbd: allow bitmap to change during writeout from resync_finished
drbd: fix race between drbdadm invalidate/verify and finishing resync
drbd: fix resend/resubmit of frozen IO
drbd: Ensure that data_size is not 0 before using data_size-1 as index
drbd: Delay/reject other state changes while establishing a connection
drbd: move put_ldev from __req_mod() to the endio callback
drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE
...
Merge block/IO core bits from Jens Axboe:
"This is a bit bigger on the core side than usual, but that is purely
because we decided to hold off on parts of Tejun's submission on 3.4
to give it a bit more time to simmer. As a consequence, it's seen a
long cycle in for-next.
It contains:
- Bug fix from Dan, wrong locking type.
- Relax splice gifting restriction from Eric.
- A ton of updates from Tejun, primarily for blkcg. This improves
the code a lot, making the API nicer and cleaner, and also includes
fixes for how we handle and tie policies and re-activate on
switches. The changes also include generic bug fixes.
- A simple fix from Vivek, along with a fix for doing proper delayed
allocation of the blkcg stats."
Fix up annoying conflict just due to different merge resolution in
Documentation/feature-removal-schedule.txt
* 'for-3.5/core' of git://git.kernel.dk/linux-block: (92 commits)
blkcg: tg_stats_alloc_lock is an irq lock
vmsplice: relax alignement requirements for SPLICE_F_GIFT
blkcg: use radix tree to index blkgs from blkcg
blkcg: fix blkcg->css ref leak in __blkg_lookup_create()
block: fix elvpriv allocation failure handling
block: collapse blk_alloc_request() into get_request()
blkcg: collapse blkcg_policy_ops into blkcg_policy
blkcg: embed struct blkg_policy_data in policy specific data
blkcg: mass rename of blkcg API
blkcg: style cleanups for blk-cgroup.h
blkcg: remove blkio_group->path[]
blkcg: blkg_rwstat_read() was missing inline
blkcg: shoot down blkgs if all policies are deactivated
blkcg: drop stuff unused after per-queue policy activation update
blkcg: implement per-queue policy activation
blkcg: add request_queue->root_blkg
blkcg: make request_queue bypassing on allocation
blkcg: make sure blkg_lookup() returns %NULL if @q is bypassing
blkcg: make blkg_conf_prep() take @pol and return with queue lock held
blkcg: remove static policy ID enums
...
Not much stuff this time. The only change to the IOMMU core code is the
addition of a handle to the fault handling code. A few updates to the
AMD IOMMU driver to work around new errata. The other patches are mostly
fixes and enhancements to the existing ARM IOMMU drivers and
documentation updates.
A new IOMMU driver for the Exynos platform was also underway but got
merged via the Samsung tree and is not part of this tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPxfseAAoJECvwRC2XARrjvL4QAL39988y7ajHSI3ym3Dxovn9
w8md63xKNlTpCB8NJPRIJpcGrE7QFtNXPFCagTqO713ulwCoKayEwKGOU7VQagFc
0/JoHxE5usE5OuA6tyAJbpWK10kWKDzu6HjZfqF2yoa0q/REbsu65KsY7zc7HbpF
qEAXX1xr9IC7GUM7gv75OR8CP2VJCW3+6VyhiD/37t3KpNwINMpRDO/eN/KiwoUI
1t+/DVwO6pH5UrGReWrmjs/gcxFMzkeelt+iCA32kzkWLtyWjeWBujVWnFvVtpkz
R4pV2T2jvs6fWPU5MMBXZRd5AvLLqcu/g/Yr21WYHz07jCcGxlCUp9qpnGLt2el0
/YTY3LBZUQJ5sx3OSJV+oQVTtI5x0EkAiOrJ8Dx20wNAFqun9bhJb1WX0IXflmZc
oC7SF5wjXq8pUQmX/wpGMbW7XYompypJGqlEsftJEytf4dfR6KJ2Vo1h3pHtpaex
IaY6TqmdW44e0EgbFTM7RMNFtC7GrIY9NE+WKlrFtsHhUFrqt1NVBEcO3faU0ES6
UAguFRPM/HAdkVmY620+DUT/JkEMemWq2jgWExLGLC9gI8L1Xj2cdU8esstuMUoV
GGG4u9a5W1rALwg+zPCQGoVxPKmd6fpeC3U+Rmg2639chy+h4c/cBXkzfUsxe2lg
wvMDVbjDN1Fz0c29YJit
=K23I
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"Not much stuff this time. The only change to the IOMMU core code is
the addition of a handle to the fault handling code. A few updates to
the AMD IOMMU driver to work around new errata. The other patches are
mostly fixes and enhancements to the existing ARM IOMMU drivers and
documentation updates.
A new IOMMU driver for the Exynos platform was also underway but got
merged via the Samsung tree and is not part of this tree."
* tag 'iommu-updates-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
Documentation: kernel-parameters.txt Add amd_iommu_dump
iommu/core: pass a user-provided token to fault handlers
iommu/tegra: gart: Fix register offset correctly
iommu: OMAP: device detach on domain destroy
iommu: tegra/gart: Add device tree support
iommu: tegra/gart: use correct gart_device
iommu/tegra: smmu: Print device name correctly
iommu/amd: Add workaround for event log erratum
iommu/amd: Check for the right TLP prefix bit
dma-debug: release free_entries_lock before saving stack trace
Since nr_cpus_allowed is used outside of sched/rt.c and wants to be
used outside of there more, move it to a more natural site.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-kr61f02y9brwzkh6x53pdptm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Follow up on commit 556061b00 ("sched/nohz: Fix rq->cpu_load[]
calculations") since while that fixed the busy case it regressed the
mostly idle case.
Add a callback from the nohz exit to also age the rq->cpu_load[]
array. This closes the hole where either there was no nohz load
balance pass during the nohz, or there was a 'significant' amount of
idle time between the last nohz balance and the nohz exit.
So we'll update unconditionally from the tick to not insert any
accidental 0 load periods while busy, and we try and catch up from
nohz idle balance and nohz exit. Both these are still prone to missing
a jiffy, but that has always been the case.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: pjt@google.com
Cc: Venkatesh Pallipadi <venki@google.com>
Link: http://lkml.kernel.org/n/tip-kt0trz0apodbf84ucjfdbr1a@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since there are uses for I2C_M_NOSTART which are much more sensible and
standard than most of the protocol mangling functionality (the main one
being gather writes to devices where something like a register address
needs to be inserted before a block of data) create a new I2C_FUNC_NOSTART
for this feature and update all the users to use it.
Also strengthen the disrecommendation of the protocol mangling while we're
at it.
In the case of regmap-i2c we remove the requirement for mangling as
I2C_M_NOSTART is the only mangling feature which is being used.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
If a driver's watchdog_device struct is part of a dynamically allocated
struct (which it often will be), merely locking the module is not enough,
even with a drivers module locked, the driver can be unbound from the device,
examples:
1) The root user can unbind it through sysfd
2) The i2c bus master driver being unloaded for an i2c watchdog
I will gladly admit that these are corner cases, but we still need to handle
them correctly.
The fix for this consists of 2 parts:
1) Add ref / unref operations, so that the driver can refcount the struct
holding the watchdog_device struct and delay freeing it until any
open filehandles referring to it are closed
2) Most driver operations will do IO on the device and the driver should not
do any IO on the device after it has been unbound. Rather then letting each
driver deal with this internally, it is better to ensure at the watchdog
core level that no operations (other then unref) will get called after
the driver has called watchdog_unregister_device(). This actually is the
bulk of this patch.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
This patch fixes some potential multithreading issues, despite only
allowing one process to open the /dev/watchdog device, we can still get
called multiple times at the same time, since a program could be using thread,
or could share the fd after a fork.
This causes 2 potential problems:
1) watchdog_start / open do an unlocked test_n_set / test_n_clear,
if these 2 race, the watchdog could be stopped while the active
bit indicates it is running or visa versa.
2) Most watchdog_dev drivers probably assume that only one
watchdog-op will get called at a time, this is not necessary
true atm.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Create the watchdog class and it's associated devices.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Some watchdogs merely trigger external alarms and controls. In a managed
environment this is very useful but we want drivers to be able to figure
out which is which now multiple dogs can be loaded. Thus add an ALARMONLY
feature flag.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
We keep the old /dev/watchdog interface file for the first watchdog via
miscdev. This is basically a cut and paste of the relevant interface code
from the rtc driver layer tweaked for watchdog.
Revised to fix problems noted by Hans de Goede
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Some watchdog may need to check if watchdog is ACTIVE or not, for example in
their suspend/resume hooks.
This patch adds this routine and changes the core drivers to use it.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
lglocks and brlocks are currently generated with some complicated macros
in lglock.h. But there's no reason to not just use common utility
functions and put all the data into a common data structure.
Since there are at least two users it makes sense to share this code in a
library. This is also easier maintainable than a macro forest.
This will also make it later possible to dynamically allocate lglocks and
also use them in modules (this would both still need some additional, but
now straightforward, code)
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Optimizing the slow paths adds a lot of complexity. If you need to
grab every lock often, you have other problems.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
pass inode + parent's inode or NULL instead of dentry + bool saying
whether we want the parent or not.
NOTE: that needs ceph fix folded in.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Generic netlink searches for -type- formatted aliases when requesting a module to
fulfill a protocol request (i.e. net-pf-16-proto-16-type-<x>, where x is a type
value). However generic netlink protocols have no well defined type numbers,
they have string names. Modify genl_ctrl_getfamily to request an alias in the
format net-pf-16-proto-16-family-<x> instead, where x is a generic string, and
add a macro that builds on the previously added MODULE_ALIAS_NET_PF_PROTO_NAME
macro to allow modules to specifify those generic strings.
Note, l2tp previously hacked together an net-pf-16-proto-16-type-l2tp alias
using the MODULE_ALIAS macro, with these updates we can convert that to use the
PROTO_NAME macro.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: James Chapman <jchapman@katalix.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MODULE_ALAIS_NET_PF macro set is missing a variant that allows for the
appending of an arbitrary string to the net-pf-<x>-proto-<y> base. while
MODULE_ALIAS_NET_PF_PROTO_NAME_TYPE allows an appending of a numerical type, we
need to be able to append a generic string to support generic netlink families
that have neither a fix numberical protocol nor type number
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull EDAC internal API changes from Mauro Carvalho Chehab:
"This changeset is the first part of a series of patches that fixes the
EDAC sybsystem. On this set, it changes the Kernel EDAC API in order
to properly represent the Intel i3/i5/i7, Xeon 3xxx/5xxx/7xxx, and
Intel E5-xxxx memory controllers.
The EDAC core used to assume that:
- the DRAM chip select pin is directly accessed by the memory
controller
- when multiple channels are used, they're all filled with the
same type of memory.
None of the above premises is true on Intel memory controllers since
2002, when RAMBUS and FB-DIMMs were introduced, and Advanced Memory
Buffer or by some similar technologies hides the direct access to the
DRAM pins.
So, the existing drivers for those chipsets had to lie to the EDAC
core, in general telling that just one channel is filled. That
produces some hard to understand error messages like:
EDAC MC0: CE row 3, channel 0, label "DIMM1": 1 Unknown error(s): memory read error on FATAL area : cpu=0 Err=0008:00c2 (ch=2), addr = 0xad1f73480 => socket=0, Channel=0(mask=2), rank=1
The location information there (row3 channel 0) is completely bogus:
it has no physical meaning, and are just some random values that the
driver uses to talk with the EDAC core. The error actually happened
at CPU socket 0, channel 0, slot 1, but this is not reported anywhere,
as the EDAC core doesn't know anything about the memory layout. So,
only advanced users that know how the EDAC driver works and that tests
their systems to see how DIMMs are mapped can actually benefit for
such error logs.
This patch series fixes the error report logic, in order to allow the
EDAC to expose the memory architecture used by them to the EDAC core.
So, as the EDAC core now understands how the memory is organized, it
can provide an useful report:
EDAC MC0: CE memory read error on DIMM1 (channel:0 slot:1 page:0x364b1b offset:0x600 grain:32 syndrome:0x0 - count:1 area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:4)
The location of the DIMM where the error happened is reported by "MC0"
(cpu socket #0), at "channel:0 slot:1" location, and matches the
physical location of the DIMM.
There are two remaining issues not covered by this patch series:
- The EDAC sysfs API will still report bogus values. So,
userspace tools like edac-utils will still use the bogus data;
- Add a new tracepoint-based way to get the binary information
about the errors.
Those are on a second series of patches (also at -next), but will
probably miss the train for 3.5, due to the slow review process."
Fix up trivial conflict (due to spelling correction of removed code) in
drivers/edac/edac_device.c
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (42 commits)
i7core: fix ranks information at the per-channel struct
i5000: Fix the fatal error handling
i5100_edac: Fix a warning when compiled with 32 bits
i82975x_edac: Test nr_pages earlier to save a few CPU cycles
e752x_edac: provide more info about how DIMMS/ranks are mapped
i5000_edac: Fix the logic that retrieves memory information
i5400_edac: improve debug messages to better represent the filled memory
edac: Cleanup the logs for i7core and sb edac drivers
edac: Initialize the dimm label with the known information
edac: Remove the legacy EDAC ABI
x38_edac: convert driver to use the new edac ABI
tile_edac: convert driver to use the new edac ABI
sb_edac: convert driver to use the new edac ABI
r82600_edac: convert driver to use the new edac ABI
ppc4xx_edac: convert driver to use the new edac ABI
pasemi_edac: convert driver to use the new edac ABI
mv64x60_edac: convert driver to use the new edac ABI
mpc85xx_edac: convert driver to use the new edac ABI
i82975x_edac: convert driver to use the new edac ABI
i82875p_edac: convert driver to use the new edac ABI
...
Pull MIPS updates from Ralf Baechle:
"The whole series has been sitting in -next for quite a while with no
complaints. The last change to the series was before the weekend the
removal of an SPI patch which Grant - even though previously acked by
himself - appeared to raise objections. So I removed it until the
situation is clarified. Other than that all the patches have the acks
from their respective maintainers, all MIPS and x86 defconfigs are
building fine and I'm not aware of any problems introduced by this
series.
Among the key features for this patch series is a sizable patchset for
Lantiq which among other things introduces support for Lantiq's
flagship product, the FALCON SOC. It also means that the opensource
developers behind this patchset have overtaken Lantiq's competing
inhouse development team that was working behind closed doors.
Less noteworthy the ath79 patchset which adds support for a few more
chip variants, cleanups and fixes. Finally the usual dose of tweaking
of generic code."
Fix up trivial conflicts in arch/mips/lantiq/xway/gpio_{ebu,stp}.c where
printk spelling fixes clashed with file move and eventual removal of the
printk.
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (81 commits)
MIPS: lantiq: remove orphaned code
MIPS: Remove all -Wall and almost all -Werror usage from arch/mips.
MIPS: lantiq: implement support for FALCON soc
MTD: MIPS: lantiq: verify that the NOR interface is available on falcon soc
MTD: MIPS: lantiq: implement OF support
watchdog: MIPS: lantiq: implement OF support and minor fixes
SERIAL: MIPS: lantiq: implement OF support
GPIO: MIPS: lantiq: convert gpio-stp-xway to OF
GPIO: MIPS: lantiq: convert gpio-mm-lantiq to OF and of_mm_gpio
GPIO: MIPS: lantiq: move gpio-stp and gpio-ebu to the subsystem folder
MIPS: pci: convert lantiq driver to OF
MIPS: lantiq: convert dma to platform driver
MIPS: lantiq: implement support for clkdev api
MIPS: lantiq: drop ltq_gpio_request() and gpio_to_irq()
OF: MIPS: lantiq: implement irq_domain support
OF: MIPS: lantiq: implement OF support
MIPS: lantiq: drop mips_machine support
OF: PCI: const usage needed by MIPS
MIPS: Cavium: Remove smp_reserve_lock.
MIPS: Move cache setup to setup_arch().
...
Some DS13XX devices have "trickle chargers". Its configuration register
is at different locations, the setup is the same, though. Since the
configuration is board specific, introduce a platform_data to this driver.
Tested with a DS1339 on a custom board.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently there is no generic way to get the RTC battery status within an
application. So add an ioctl to read the status bit. The idea is that
the bit is set once a low voltage is detected. It stays there until it is
reset using the RTC_VL_CLR ioctl.
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using %ps in a printk format will sometimes fail silently and print the
empty string if the address passed in does not match a symbol that
kallsyms knows about. But using %pS will fall back to printing the full
address if kallsyms can't find the symbol. Make %ps act the same as %pS
by falling back to printing the address.
While we're here also make %ps print the module that a symbol comes from
so that it matches what %pS already does. Take this simple function for
example (in a module):
static void test_printk(void)
{
int test;
pr_info("with pS: %pS\n", &test);
pr_info("with ps: %ps\n", &test);
}
Before this patch:
with pS: 0xdff7df44
with ps:
After this patch:
with pS: 0xdff7df44
with ps: 0xdff7df44
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
max brightness is 127, so the range of brt_val should be from 0 to 127
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Shreshtha Kumar SAHU <shreshthakumar.sahu@stericsson.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new field to led_classdev to save activattion state after activate
routine is successful. This saved state is used in deactivate routine to
do cleanup such as removing device files, and free memory allocated during
activation. Currently trigger_data not being null is used for this
purpose.
Existing triggers will need changes to use this new field.
Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Include the header to pickup the exported symbol prototype.
Quiets the sparse warning:
warning: symbol 'apple_bl_register' was not declared. Should it be static?
warning: symbol 'apple_bl_unregister' was not declared. Should it be static?
[akpm@linux-foundation.org: fix resulting build error]
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset adds early fb blank feature that a callback of lcd panel
driver is called prior to specific fb driver's one. In the case of
MIPI-DSI based video mode LCD Panel, for lcd power off, the power off
commands should be transferred to lcd panel with display and mipi-dsi
controller enabled because the commands is set to lcd panel at vsync porch
period. and in opposite case, the callback of fb driver should be called
prior to lcd panel driver's one because of same issue. Also if fb_blank
mode is changed to FB_BLANK_POWERDOWN then display controller would be
off(clock disable) but lcd panel would be still on. at this time, you
could see some issue like sparkling on lcd panel because video clock to be
delivered to ldi module of lcd panel was disabled. this issue could
occurs for all lcd panels.
The callback order is as the following:
at fb_blank function of fbmem.c
-> fb_notifier_call_chain(FB_EARLY_EVENT_BLANK)
-> lcd panel driver's early_set_power()
-> info->fbops->fb_blank()
-> spcefic fb driver's fb_blank()
-> fb_notifier_call_chain(FB_EVENT_BLANK)
-> lcd panel driver's set_power()
-> fb_notifier_call_chain(FB_R_EARLY_EVENT_BLANK) if
info->fops->fb_blank() was failed.
fb_notifier_call_chain(FB_R_EARLY_EVENT_BLANK) would be called to revert
the effects of previous FB_EARLY_EVENT_BLANK call. and note that if
early_set_power() of lcd_ops is NULL then early fb blank callback would be
ignored.
This patch:
Add early_set_power and r_early_set_power callbacks. early_set_power
callback is called prior to fb_blank() of fbmem.c and r_early_set_power
callback is called if fb_blank() was failed to revert the effects of the
early_set_power call of lcd panel driver.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add FB_EARLY_EVENT_BLANK and FB_R_EARLY_EVENT_BLANK event mode supports.
first, fb_notifier_call_chain() is called with FB_EARLY_EVENT_BLANK and
fb_blank() of specific fb driver is called and then
fb_notifier_call_chain() is called with FB_EVENT_BLANK again at
fb_blank(). and if fb_blank() was failed then fb_nitifier_call_chain()
would be called with FB_R_EARLY_EVENT_BLANK to revert the previous
effects.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We call the destroy function when a cgroup starts to be removed, such as
by a rmdir event.
However, because of our reference counters, some objects are still
inflight. Right now, we are decrementing the static_keys at destroy()
time, meaning that if we get rid of the last static_key reference, some
objects will still have charges, but the code to properly uncharge them
won't be run.
This becomes a problem specially if it is ever enabled again, because now
new charges will be added to the staled charges making keeping it pretty
much impossible.
We just need to be careful with the static branch activation: since there
is no particular preferred order of their activation, we need to make sure
that we only start using it after all call sites are active. This is
achieved by having a per-memcg flag that is only updated after
static_key_slow_inc() returns. At this time, we are sure all sites are
active.
This is made per-memcg, not global, for a reason: it also has the effect
of making socket accounting more consistent. The first memcg to be
limited will trigger static_key() activation, therefore, accounting. But
all the others will then be accounted no matter what. After this patch,
only limited memcgs will have its sockets accounted.
[akpm@linux-foundation.org: move enum sock_flag_bits into sock.h,
document enum sock_flag_bits,
convert memcg_proto_active() and memcg_proto_activated() to test_bit(),
redo tcp_update_limit() comment to 80 cols]
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Take lruvec further: pass it instead of zone to add_page_to_lru_list() and
del_page_from_lru_list(); and pagevec_lru_move_fn() pass lruvec down to
its target functions.
This cleanup eliminates a swathe of cruft in memcontrol.c, including
mem_cgroup_lru_add_list(), mem_cgroup_lru_del_list() and
mem_cgroup_lru_move_lists() - which never actually touched the lists.
In their place, mem_cgroup_page_lruvec() to decide the lruvec, previously
a side-effect of add, and mem_cgroup_update_lru_size() to maintain the
lru_size stats.
Whilst these are simplifications in their own right, the goal is to bring
the evaluation of lruvec next to the spin_locking of the lrus, in
preparation for a future patch.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin just introduced mem_cgroup_get_lruvec_size() and
get_lruvec_size(), I'm about to add mem_cgroup_update_lru_size(): but
we're dealing with the same thing, lru_size[lru]. We ought to agree on
the naming, and I do think lru_size is the more correct: so rename his
ones to get_lru_size().
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since we will succeed with the allocation no matter what, there isn't a
need to use __must_check with it. It can very well be optional.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ying Han <yinghan@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When killing a res_counter which is a child of other counter, we need to
do
res_counter_uncharge(child, xxx)
res_counter_charge(parent, xxx)
This is not atomic and wastes CPU. This patch adds
res_counter_uncharge_until(). This function's uncharge propagates to
ancestors until specified res_counter.
res_counter_uncharge_until(child, parent, xxx)
Now the operation is atomic and efficient.
Signed-off-by: Frederic Weisbecker <fweisbec@redhat.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ying Han <yinghan@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If memory cgroup is enabled we always use lruvecs which are embedded into
struct mem_cgroup_per_zone, so we can reach lru_size counters via
container_of().
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the first stage of struct mem_cgroup_zone removal. Further
patches replace struct mem_cgroup_zone with a pointer to struct lruvec.
If CONFIG_CGROUP_MEM_RES_CTLR=n lruvec_zone() is just container_of().
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch kills mem_cgroup_lru_del(), we can use
mem_cgroup_lru_del_list() instead. On 0-order isolation we already have
right lru list id.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After patch "mm: forbid lumpy-reclaim in shrink_active_list()" we can
completely remove anon/file and active/inactive lru type filters from
__isolate_lru_page(), because isolation for 0-order reclaim always
isolates pages from right lru list. And pages-isolation for lumpy
shrink_inactive_list() or memory-compaction anyway allowed to isolate
pages from all evictable lru lists.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
GCC sometimes ignores "inline" directives even for small and simple functions.
This supposed to be fixed in gcc 4.7, but it was released only yesterday.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With mem_cgroup_disabled() now explicit, it becomes clear that the
zone_reclaim_stat structure actually belongs in lruvec, per-zone when
memcg is disabled but per-memcg per-zone when it's enabled.
We can delete mem_cgroup_get_reclaim_stat(), and change
update_page_reclaim_stat() to update just the one set of stats, the one
which get_scan_count() will actually use.
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes memcg's behavior at task_move().
At task_move(), the kernel scans a task's page table and move the changes
for mapped pages from source cgroup to target cgroup. There has been a
bug at handling shared anonymous pages for a long time.
Before patch:
- The spec says 'shared anonymous pages are not moved.'
- The implementation was 'shared anonymoys pages may be moved'.
If page_mapcount <=2, shared anonymous pages's charge were moved.
After patch:
- The spec says 'all anonymous pages are moved'.
- The implementation is 'all anonymous pages are moved'.
Considering usage of memcg, this will not affect user's experience.
'shared anonymous' pages only exists between a tree of processes which
don't do exec(). Moving one of process without exec() seems not sane.
For example, libcgroup will not be affected by this change. (Anyway, no
one noticed the implementation for a long time...)
Below is a discussion log:
- current spec/implementation are complex
- Now, shared file caches are moved
- It adds unclear check as page_mapcount(). To do correct check,
we should check swap users, etc.
- No one notice this implementation behavior. So, no one get benefit
from the design.
- In general, once task is moved to a cgroup for running, it will not
be moved....
- Finally, we have control knob as memory.move_charge_at_immigrate.
Here is a patch to allow moving shared pages, completely. This makes
memcg simpler and fix current broken code.
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Transparent huge pages can change page->flags (PG_compound_lock) without
taking Slab lock. Since THP can not break slab pages we can safely access
compound page without taking compound lock.
Specifically this patch fixes a race between compound_unlock() and slab
functions which perform page-flags updates. This can occur when
get_page()/put_page() is called on a page from slab.
[akpm@linux-foundation.org: tweak comment text, fix comment layout, fix label indenting]
Reported-by: Amey Bhide <abhide@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>