Unlike recent modern userspace API such as:
epoll_create1 (EPOLL_CLOEXEC), eventfd (EFD_CLOEXEC),
fanotify_init (FAN_CLOEXEC), inotify_init1 (IN_CLOEXEC),
signalfd (SFD_CLOEXEC), timerfd_create (TFD_CLOEXEC),
or the venerable general purpose open (O_CLOEXEC),
perf_event_open() syscall lack a flag to atomically set FD_CLOEXEC
(eg. close-on-exec) flag on file descriptor it returns to userspace.
The present patch adds a PERF_FLAG_FD_CLOEXEC flag to allow
perf_event_open() syscall to atomically set close-on-exec.
Having this flag will enable userspace to remove the file descriptor
from the list of file descriptors being inherited across exec,
without the need to call fcntl(fd, F_SETFD, FD_CLOEXEC) and the
associated race condition between the current thread and another
thread calling fork(2) then execve(2).
Links:
- Secure File Descriptor Handling (Ulrich Drepper, 2008)
http://udrepper.livejournal.com/20407.html
- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
http://danwalsh.livejournal.com/53603.html
- Notes in DMA buffer sharing: leak and security hole
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dma-buf-sharing.txt?id=v3.13-rc3#n428
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Still a slightly high amount of changes than wished, but they are all
good regression and/or device-specific fixes. Majority of commits are
for HD-audio, an HDMI ctl index fix that hits old graphics boards,
regression fixes for AD codecs and a few quirks. Other than that, two
major fixes are included: a 64bit ABI fix for compress offload, and
64bit dma_addr_t truncation fix, which had hit on PAE kernels.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJSqCKIAAoJEGwxgFQ9KSmkPqAQALL4GbkCVyMvKFP6f98jhDX6
+owx2h4FiQMzw7aApHCjrZmZIpsHTiPL4Ma5vcTcjL3H8KE1UHaHwQ4NObvKLWmX
IncxEy3uMurHA3v5I00X67VarCtPdXN51C1Ky6ShcQ1y9lcBGd+RSEaxlpK605n5
nMqNF0wIAmFdesQ22hnCNiwIuqvkurcHUEaZvEh+dCXniv2zSQ6Y/e/Qjzy7x5uF
rAzulso5X3ERjDA7B27OpoWGeU4n4OcY/3leAtiz0k7fCeAqZaXNchaBScAwIztH
H6ydHe0v9NpDKDYTjuPoGkuYAj2vj9rJIsz1ZW8yrEdwAFWAUOev68F6hhDmC90o
2+k6ibb9tXHrjDM2MP3m6d/Hl98u0q6r8fFy3Hnwyfpr7ZXi1SmS7i/VmW6e/JYt
cpMrTPVGV46boDaegeIqKpSDIqhZqtCHTESN0pl4UOlGqsqdViXLOQ2EzWzzcDuh
qMGlY7uCRN7vp7NwtK/oFVva8HT/myM47TF3116z/5SaYT3RNOjquwbsCNQC5Wt5
usUQws46Rn8XlruXsKMuzXosgEuWYs7XUVZckm796m/acEFgMPQJV9qfDA2q6O2Q
Mzu+W5CDx3Zi50T/RSBeuntnis1spP6JV7YkalofLHwNxIi7mNXkWcd2oI/66d4r
+W2+w0O1TEbjdfv2DxsZ
=R0zk
-----END PGP SIGNATURE-----
Merge tag 'sound-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Still a slightly high amount of changes than wished, but they are all
good regression and/or device-specific fixes. Majority of commits are
for HD-audio, an HDMI ctl index fix that hits old graphics boards,
regression fixes for AD codecs and a few quirks.
Other than that, two major fixes are included: a 64bit ABI fix for
compress offload, and 64bit dma_addr_t truncation fix, which had hit
on PAE kernels"
* tag 'sound-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add static DAC/pin mapping for AD1986A codec
ALSA: hda - One more Dell headset detection quirk
ALSA: hda - hdmi: Fix IEC958 ctl indexes for some simple HDMI devices
ALSA: hda - Mute all aamix inputs as default
ALSA: compress: Fix 64bit ABI incompatibility
ALSA: memalloc.h - fix wrong truncation of dma_addr_t
ALSA: hda - Another Dell headset detection quirk
ALSA: hda - A Dell headset detection quirk
ALSA: hda - Remove quirk for Dell Vostro 131
ALSA: usb-audio: fix uninitialized variable compile warning
ALSA: hda - fix mic issues on Acer Aspire E-572
snd_pcm_uframes_t is defined as unsigned long so it would take
different sizes depending on 32 or 64bit architectures. As we don't
want this ABI incompatibility, and there is no real 64bit user yet,
let's make it the fixed size with __u32.
Also bump the protocol version number to 0.1.2.
Acked-by: Vinod Koul <vinod.koul@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull input updates from Dmitry Torokhov:
"An update to ALPS to support devices on Dell XT2 (hopefully working
better this time around and although it is largish it should not
affect any other ALPS devices) and a tiny update to Elantech driver to
support newer devices as well.
Also a coupe of new input event codes have been defined"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: ALPS - add support for DualPoint device on Dell XT2 model
Input: elantech - add support for newer (August 2013) devices
Input: add SW_MUTE_DEVICE switch definition
Input: usbtouchscreen - separate report and transmit buffer size handling
Input: sur40 - suppress false uninitialized variable warning
Input: add key code for ambient light sensor button
Input: keyboard - "keycode & KEY_MAX" changes some keycode values
Nothing huge, just a few small bugfixes for problems reported, and a device id
update.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iEYEABECAAYFAlKiE8QACgkQMUfUDdst+yl9dQCgwXrgSRyBlULDHOdBnKblqrMq
gYQAoLVKfuEOXRXBJOD4CzNeCMLButsM
=HbWB
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Nothing huge, just a few small bugfixes for problems reported, and a
device id update"
* tag 'char-misc-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mei: add 9 series PCH mei device ids
drivers/char/i8k.c: add Dell XPLS L421X
MAINTAINERS: add HSI subsystem
misc: mic: Suppress memory space sparse warnings
misc: mic: Fix endianness issues.
misc: mic: Fix user space namespace pollution from mic_common.h.
misc: mic: Bug fix for sysfs poll usage.
misc: mic: Minor bug fix in 'retry' loops.
misc: mic: Change mic_notify(...) to return true.
extcon: remove freed groups caused the panic or warning in unregister flow
extcon: arizona: Get pdata from arizona structure not device
- cpufreq regression fix from Bjørn Mork restoring the pre-3.12
behavior of the framework during system suspend/hibernation to
avoid garbage sysfs files from being left behind in case of a
suspend error.
- PNP regression fix to restore the correct states of devices after
resume from hibernation broken in 3.12. From Dmitry Torokhov.
- cpuidle fix to prevent cpuidle device unregistration from crashing
due to a NULL pointer dereference if cpuidle has been disabled
from the kernel command line. From Konrad Rzeszutek Wilk.
- intel_idle fix for the C6 state definition on Intel Avoton/Rangeley
processors from Arne Bockholdt.
- Power capping framework fix to make the energy_uj sysfs attribute
work in accordance with the documentation. From Srinivas Pandruvada.
- epoll fix to make it ignore the EPOLLWAKEUP flag if the kernel has
been compiled with CONFIG_PM_SLEEP unset (in which case that flag
should not have any effect). From Amit Pundir.
- cpufreq fix to prevent governor sysfs files from being lost over
system suspend/resume in some (arguably unusual) situations. From
Viresh Kumar.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABCAAGBQJSoTE5AAoJEILEb/54YlRxh/sP/jFZhLTc8g4MC3XzhguuROzT
u0Pu9VJVfACqz8LyiCOtfOvvb2EPV7VSq7qYqszL5y9Dn5gwIHvQMMBK2PjZ4cc2
MtSiw02Bk/DEESXYOjt++n5ja/0lc05CtJTlb3uoJXBOCqp3cMrvW7+QqnLbEfbG
S+TcPFBr+4Owt/J7r2Z2JBYGZ6NbVol/x1hAFjiM+rBan6UGw7uNcg2LgQrVHcs4
S1Cm6lsJTwRcSiswvJv9/C+ML9Z/1gYYUyu7ijQnGdbNUolyzHY6AxLWZdnSkAQO
s8JVDRKy9+V44LtnWSENnJNftjlOoXWcZRJxvDePyM3dVpxESBa8Z/AxYWwCcmcB
e4rsgm/WOF86DMhRu+gfeTF+1OkU7KhuPhbXskbw+JDcZKCui2FP/xti6IAaTsU/
9M30/VeOpD1UBqckLnDTGcsFif7hVZ9LOHH5wK8OctjyaTMfUYtPd7WxfTQCpcSc
1M0NQapwfXHASmPmMW4SszAaeduecUdgXU1epOPx0EpOMQhvuLeENJBgVC5uu1cA
KAQ7suOx9ReS3slso8lpTlTEw2rsDPRuiHcF3hv7YpklNXV2jvtxsl4upHT5VN2t
3L59Unq8vY2lt3SdzWMypaAquphc3Te5woYoFwgsSPfw40Kr+jg+oUtO6IHqM/ga
OATUkTffzp+Rp3pg036Y
=gHBV
-----END PGP SIGNATURE-----
Merge tag 'pm-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
- cpufreq regression fix from Bjørn Mork restoring the pre-3.12
behavior of the framework during system suspend/hibernation to avoid
garbage sysfs files from being left behind in case of a suspend error
- PNP regression fix to restore the correct states of devices after
resume from hibernation broken in 3.12. From Dmitry Torokhov.
- cpuidle fix to prevent cpuidle device unregistration from crashing
due to a NULL pointer dereference if cpuidle has been disabled from
the kernel command line. From Konrad Rzeszutek Wilk.
- intel_idle fix for the C6 state definition on Intel Avoton/Rangeley
processors from Arne Bockholdt.
- Power capping framework fix to make the energy_uj sysfs attribute
work in accordance with the documentation. From Srinivas Pandruvada.
- epoll fix to make it ignore the EPOLLWAKEUP flag if the kernel has
been compiled with CONFIG_PM_SLEEP unset (in which case that flag
should not have any effect). From Amit Pundir.
- cpufreq fix to prevent governor sysfs files from being lost over
system suspend/resume in some (arguably unusual) situations. From
Viresh Kumar.
* tag 'pm-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PowerCap: Fix mode for energy counter
PNP: fix restoring devices after hibernation
cpuidle: Check for dev before deregistering it.
epoll: drop EPOLLWAKEUP if PM_SLEEP is disabled
cpufreq: fix garbage kobjects on errors during suspend/resume
cpufreq: suspend governors on system suspend/hibernate
intel_idle: Fixed C6 state on Avoton/Rangeley processors
Some devices, such as new Intuos series tablets, have a hardware switch to
turn touch data on/off. To report the state, SW_MUTE_DEVICE is added
in include/uapi/linux/input.h.
Reviewed_by: Chris Bagwell <chris@cnpbagwell.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Tested-by: Jason Gerecke <killertofu@gmail.com>
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Drop EPOLLWAKEUP from epoll events mask if CONFIG_PM_SLEEP is disabled.
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull networking updates from David Miller:
"Here is a pile of bug fixes that accumulated while I was in Europe"
1) In fixing kernel leaks to userspace during copying of socket
addresses, we broke a case that used to work, namely the user
providing a buffer larger than the in-kernel generic socket address
structure. This broke Ruby amongst other things. Fix from Dan
Carpenter.
2) Fix regression added by byte queue limit support in 8139cp driver,
from Yang Yingliang.
3) The addition of MSG_SENDPAGE_NOTLAST buggered up a few sendpage
implementations, they should just treat it the same as MSG_MORE.
Fix from Richard Weinberger and Shawn Landden.
4) Handle icmpv4 errors received on ipv6 SIT tunnels correctly, from
Oussama Ghorbel. In particular we should send an ICMPv6 unreachable
in such situations.
5) Fix some regressions in the recent genetlink fixes, in particular
get the pmcraid driver to use the new safer interfaces correctly.
From Johannes Berg.
6) macvtap was converted to use a per-cpu set of statistics, but some
code was still bumping tx_dropped elsewhere. From Jason Wang.
7) Fix build failure of xen-netback due to missing include on some
architectures, from Andy Whitecroft.
8) macvtap double counts received packets in statistics, fix from Vlad
Yasevich.
9) Fix various cases of using *_STATS_BH() when *_STATS() is more
appropriate. From Eric Dumazet and Hannes Frederic Sowa.
10) Pktgen ipsec mode doesn't update the ipv4 header length and checksum
properly after encapsulation. Fix from Fan Du.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
net/mlx4_en: Remove selftest TX queues empty condition
{pktgen, xfrm} Update IPv4 header total len and checksum after tranformation
virtio_net: make all RX paths handle erors consistently
virtio_net: fix error handling for mergeable buffers
virtio_net: Fixed a trivial typo (fitler --> filter)
netem: fix gemodel loss generator
netem: fix loss 4 state model
netem: missing break in ge loss generator
net/hsr: Support iproute print_opt ('ip -details ...')
net/hsr: Very small fix of comment style.
MAINTAINERS: Added net/hsr/ maintainer
ipv6: fix possible seqlock deadlock in ip6_finish_output2
ixgbe: Make ixgbe_identify_qsfp_module_generic static
ixgbe: turn NETIF_F_HW_L2FW_DOFFLOAD off by default
ixgbe: ixgbe_fwd_ring_down needs to be static
e1000: fix possible reset_task running after adapter down
e1000: fix lockdep warning in e1000_reset_task
e1000: prevent oops when adapter is being closed and reset simultaneously
igb: Fixed Wake On LAN support
inet: fix possible seqlock deadlocks
...
This implements the rtnl_link_ops fill_info routine for HSR.
Signed-off-by: Arvid Brodin <arvid.brodin@alten.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pmcraid driver is abusing the genetlink API and is using its
family ID as the multicast group ID, which is invalid and may
belong to somebody else (and likely will.)
Make it use the correct API, but since this may already be used
as-is by userspace, reserve a family ID for this code and also
reserve that group ID to not break userspace assumptions.
My previous patch broke event delivery in the driver as I missed
that it wasn't using the right API and forgot to update it later
in my series.
While changing this, I noticed that the genetlink code could use
the static group ID instead of a strcmp(), so also do that for
the VFS_DQUOT family.
Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first netlink attribute (value 0) must always be defined as none/unspec.
This is correctly done in inet_diag.h, but other diag interfaces are wrong.
Because we cannot change an existing API, I add a comment to point the mistake
and avoid to propagate it in a new diag API in the future.
CC: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Endianness issues are now consistent as per the documentation in
host/mic_virtio.h. Sparse warnings related to endianness are also fixed.
Note that the MIC driver implementation assumes that the host can be
both BE or LE whereas the card is always LE.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid declaring ALIGN() and __aligned() in
include/uapi/linux/mic_common.h since they pollute user space
namespace. Also, mic_aligned_size() can be simply replaced simply by
sizeof() since all structures where mic_aligned_size() is used are
declared using __attribute__ ((aligned(8)));
--
>From mail from H Peter Anvin about this:
On Fri, Nov 08, 2013 H Peter Anvin <h.peter.anvin@intel.com> wrote:
Subject: Namespace pollution in mic_common.h
This puts two macros, ALIGN() and __aligned(), into arbitrary user space
namespace. This really isn't safe or acceptable, especially since those
symbols are highly generic.
...
When these structures are forced-aligned, they will in fact have padding
automatically added by the compiler to an 8-byte boundary anyway, so
mic_aligned_size() does nothing.
...
Reported-by: H Peter Anvin <h.peter.anvin@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Many notebooks have a special button for enabling/disabling ambient
light sensor.
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull DRM fixes from Dave Airlie:
"I was going to leave this until post -rc1 but sysfs fixes broke
hotplug in userspace, so I had to fix it harder, otherwise a set of
pulls from intel, radeon and vmware,
The vmware/ttm changes are bit larger but since its early and they are
unlikely to break anything else I put them in, it lets vmware work
with dri3"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (36 commits)
drm/sysfs: fix hotplug regression since lifetime changes
drm/exynos: g2d: fix memory leak to userptr
drm/i915: Fix gen3 self-refresh watermarks
drm/ttm: Remove set_need_resched from the ttm fault handler
drm/ttm: Don't move non-existing data
drm/radeon: hook up backlight functions for CI and KV family.
drm/i915: Replicate BIOS eDP bpp clamping hack for hsw
drm/i915: Do not enable package C8 on unsupported hardware
drm/i915: Hold pc8 lock around toggling pc8.gpu_idle
drm/i915: encoder->get_config is no longer optional
drm/i915/tv: add ->get_config callback
drm/radeon/cik: Add macrotile mode array query
drm/radeon/cik: Return backend map information to userspace
drm/vmwgfx: Make vmwgfx dma buffers prime aware
drm/vmwgfx: Make surfaces prime-aware
drm/vmwgfx: Hook up the prime ioctls
drm/ttm: Add a minimal prime implementation for ttm base objects
drm/vmwgfx: Fix false lockdep warning
drm/ttm: Allow execbuf util reserves without ticket
drm/i915: restore the early forcewake cleanup
...
Pull security subsystem updates from James Morris:
"In this patchset, we finally get an SELinux update, with Paul Moore
taking over as maintainer of that code.
Also a significant update for the Keys subsystem, as well as
maintenance updates to Smack, IMA, TPM, and Apparmor"
and since I wanted to know more about the updates to key handling,
here's the explanation from David Howells on that:
"Okay. There are a number of separate bits. I'll go over the big bits
and the odd important other bit, most of the smaller bits are just
fixes and cleanups. If you want the small bits accounting for, I can
do that too.
(1) Keyring capacity expansion.
KEYS: Consolidate the concept of an 'index key' for key access
KEYS: Introduce a search context structure
KEYS: Search for auth-key by name rather than target key ID
Add a generic associative array implementation.
KEYS: Expand the capacity of a keyring
Several of the patches are providing an expansion of the capacity of a
keyring. Currently, the maximum size of a keyring payload is one page.
Subtract a small header and then divide up into pointers, that only gives
you ~500 pointers on an x86_64 box. However, since the NFS idmapper uses
a keyring to store ID mapping data, that has proven to be insufficient to
the cause.
Whatever data structure I use to handle the keyring payload, it can only
store pointers to keys, not the keys themselves because several keyrings
may point to a single key. This precludes inserting, say, and rb_node
struct into the key struct for this purpose.
I could make an rbtree of records such that each record has an rb_node
and a key pointer, but that would use four words of space per key stored
in the keyring. It would, however, be able to use much existing code.
I selected instead a non-rebalancing radix-tree type approach as that
could have a better space-used/key-pointer ratio. I could have used the
radix tree implementation that we already have and insert keys into it by
their serial numbers, but that means any sort of search must iterate over
the whole radix tree. Further, its nodes are a bit on the capacious side
for what I want - especially given that key serial numbers are randomly
allocated, thus leaving a lot of empty space in the tree.
So what I have is an associative array that internally is a radix-tree
with 16 pointers per node where the index key is constructed from the key
type pointer and the key description. This means that an exact lookup by
type+description is very fast as this tells us how to navigate directly to
the target key.
I made the data structure general in lib/assoc_array.c as far as it is
concerned, its index key is just a sequence of bits that leads to a
pointer. It's possible that someone else will be able to make use of it
also. FS-Cache might, for example.
(2) Mark keys as 'trusted' and keyrings as 'trusted only'.
KEYS: verify a certificate is signed by a 'trusted' key
KEYS: Make the system 'trusted' keyring viewable by userspace
KEYS: Add a 'trusted' flag and a 'trusted only' flag
KEYS: Separate the kernel signature checking keyring from module signing
These patches allow keys carrying asymmetric public keys to be marked as
being 'trusted' and allow keyrings to be marked as only permitting the
addition or linkage of trusted keys.
Keys loaded from hardware during kernel boot or compiled into the kernel
during build are marked as being trusted automatically. New keys can be
loaded at runtime with add_key(). They are checked against the system
keyring contents and if their signatures can be validated with keys that
are already marked trusted, then they are marked trusted also and can
thus be added into the master keyring.
Patches from Mimi Zohar make this usable with the IMA keyrings also.
(3) Remove the date checks on the key used to validate a module signature.
X.509: Remove certificate date checks
It's not reasonable to reject a signature just because the key that it was
generated with is no longer valid datewise - especially if the kernel
hasn't yet managed to set the system clock when the first module is
loaded - so just remove those checks.
(4) Make it simpler to deal with additional X.509 being loaded into the kernel.
KEYS: Load *.x509 files into kernel keyring
KEYS: Have make canonicalise the paths of the X.509 certs better to deduplicate
The builder of the kernel now just places files with the extension ".x509"
into the kernel source or build trees and they're concatenated by the
kernel build and stuffed into the appropriate section.
(5) Add support for userspace kerberos to use keyrings.
KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches
KEYS: Implement a big key type that can save to tmpfs
Fedora went to, by default, storing kerberos tickets and tokens in tmpfs.
We looked at storing it in keyrings instead as that confers certain
advantages such as tickets being automatically deleted after a certain
amount of time and the ability for the kernel to get at these tokens more
easily.
To make this work, two things were needed:
(a) A way for the tickets to persist beyond the lifetime of all a user's
sessions so that cron-driven processes can still use them.
The problem is that a user's session keyrings are deleted when the
session that spawned them logs out and the user's user keyring is
deleted when the UID is deleted (typically when the last log out
happens), so neither of these places is suitable.
I've added a system keyring into which a 'persistent' keyring is
created for each UID on request. Each time a user requests their
persistent keyring, the expiry time on it is set anew. If the user
doesn't ask for it for, say, three days, the keyring is automatically
expired and garbage collected using the existing gc. All the kerberos
tokens it held are then also gc'd.
(b) A key type that can hold really big tickets (up to 1MB in size).
The problem is that Active Directory can return huge tickets with lots
of auxiliary data attached. We don't, however, want to eat up huge
tracts of unswappable kernel space for this, so if the ticket is
greater than a certain size, we create a swappable shmem file and dump
the contents in there and just live with the fact we then have an
inode and a dentry overhead. If the ticket is smaller than that, we
slap it in a kmalloc()'d buffer"
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (121 commits)
KEYS: Fix keyring content gc scanner
KEYS: Fix error handling in big_key instantiation
KEYS: Fix UID check in keyctl_get_persistent()
KEYS: The RSA public key algorithm needs to select MPILIB
ima: define '_ima' as a builtin 'trusted' keyring
ima: extend the measurement list to include the file signature
kernel/system_certificate.S: use real contents instead of macro GLOBAL()
KEYS: fix error return code in big_key_instantiate()
KEYS: Fix keyring quota misaccounting on key replacement and unlink
KEYS: Fix a race between negating a key and reading the error set
KEYS: Make BIG_KEYS boolean
apparmor: remove the "task" arg from may_change_ptraced_domain()
apparmor: remove parent task info from audit logging
apparmor: remove tsk field from the apparmor_audit_struct
apparmor: fix capability to not use the current task, during reporting
Smack: Ptrace access check mode
ima: provide hash algo info in the xattr
ima: enable support for larger default filedata hash algorithms
ima: define kernel parameter 'ima_template=' to change configured default
ima: add Kconfig default measurement list template
...
Pull audit updates from Eric Paris:
"Nothing amazing. Formatting, small bug fixes, couple of fixes where
we didn't get records due to some old VFS changes, and a change to how
we collect execve info..."
Fixed conflict in fs/exec.c as per Eric and linux-next.
* git://git.infradead.org/users/eparis/audit: (28 commits)
audit: fix type of sessionid in audit_set_loginuid()
audit: call audit_bprm() only once to add AUDIT_EXECVE information
audit: move audit_aux_data_execve contents into audit_context union
audit: remove unused envc member of audit_aux_data_execve
audit: Kill the unused struct audit_aux_data_capset
audit: do not reject all AUDIT_INODE filter types
audit: suppress stock memalloc failure warnings since already managed
audit: log the audit_names record type
audit: add child record before the create to handle case where create fails
audit: use given values in tty_audit enable api
audit: use nlmsg_len() to get message payload length
audit: use memset instead of trying to initialize field by field
audit: fix info leak in AUDIT_GET requests
audit: update AUDIT_INODE filter rule to comparator function
audit: audit feature to set loginuid immutable
audit: audit feature to only allow unsetting the loginuid
audit: allow unsetting the loginuid (with priv)
audit: remove CONFIG_AUDIT_LOGINUID_IMMUTABLE
audit: loginuid functions coding style
selinux: apply selinux checks on new audit message types
...
More fixes for radeon. This adds new queries for tiling on CIK, and
fixes a crash in handling acpi atif backlight events on CIK.
Some fixes for radeon for 3.13. Mostly CI stability fixes. I think
I've tracked down the stability problems with dpm on Trinity/Richland,
so I'm going to enable that by default now.
* 'drm-next-3.13' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: hook up backlight functions for CI and KV family.
drm/radeon/cik: Add macrotile mode array query
drm/radeon/cik: Return backend map information to userspace
drm/radeon: enable DPM by default in TN asics
drm/radeon: adjust TN dpm parameters for stability (v2)
drm/radeon: use a single doorbell for cik kms compute
drm/radeon/vm: don't attempt to update ptes if ib allocation fails
drm/radeon: disable CIK CP semaphores for now
drm/radeon: allow semaphore emission to fail
drm/radeon: add semaphore trace point
radeon: workaround pinning failure on low ram gpu
radeon/i2c: do not count reg index in number of i2c byte we are writing.
drm/radeon: cypress_dpm: Fix unused variable warning when CONFIG_ACPI=n
drm: radeon: ni_dpm: Fix unused variable warning when CONFIG_ACPI=n
Mostly optimisations and obscure bug fixes.
- raid5 gets less lock contention
- raid1 gets less contention between normal-io and resync-io
during resync.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIVAwUAUovzDznsnt1WYoG5AQJ1pQ//bDuXadoJ5dwjWjVxFOKoQ9j/9joEI0yH
XTApD3ADKckdBc4TSLOIbCNLW1Pbe23HlOI/GjCiJ/7mePL3OwHd7Fx8Rfq3BubV
f7NgjVwu8nwYD0OXEZsshImptEtrbYwQdy+qlKcHXcZz1MUfR+Egih3r/ouTEfEt
FNq/6MpyN0IKSY82xP/jFZgesBucgKz/YOUIbwClxm7UiyISKvWQLBIAfLB3dyI3
HoEdEzQX6I56Rw0mkSUG4Mk+8xx/8twxL+yqEUqfdJREWuB56Km8kl8y/e465Nk0
ZZg6j/TrslVEwbEeVMx0syvYcaAWFZ4X2jdKfo1lI0g9beZp7H1GRF8yR1s2t/h4
g/vb55MEN++4LPaE9ut4z7SG2yLyGkZgFTzTjyq5of+DFL0cayO7wXxbgpcD7JYf
Doef/OSa6csKiGiJI48iQa08Bolmz9ZWzZQXhAthKfFQ9Rv+GEtIAi4kLR8EZPbu
0/FL1ylYNUY9O7p0g+iy9Kcoc+xW36I95pPZf8pO8GFcXTjyuCCBVh/SNvFZZHPl
3xk3aZJknAEID8VrVG2IJPkeDI8WK8YxmpU/nARCoytn07Df6Ye8jGvLdR8pL3lB
TIZV6eRY4yciB8LtoK9Kg4XTmOMhBtjt4c3znkljp98vhOQQb/oHN+BXMGcwqvr9
fk0KGrg31VA=
=8RCg
-----END PGP SIGNATURE-----
Merge tag 'md/3.13' of git://neil.brown.name/md
Pull md update from Neil Brown:
"Mostly optimisations and obscure bug fixes.
- raid5 gets less lock contention
- raid1 gets less contention between normal-io and resync-io during
resync"
* tag 'md/3.13' of git://neil.brown.name/md:
md/raid5: Use conf->device_lock protect changing of multi-thread resources.
md/raid5: Before freeing old multi-thread worker, it should flush them.
md/raid5: For stripe with R5_ReadNoMerge, we replace REQ_FLUSH with REQ_NOMERGE.
UAPI: include <asm/byteorder.h> in linux/raid/md_p.h
raid1: Rewrite the implementation of iobarrier.
raid1: Add some macros to make code clearly.
raid1: Replace raise_barrier/lower_barrier with freeze_array/unfreeze_array when reconfiguring the array.
raid1: Add a field array_frozen to indicate whether raid in freeze state.
md: Convert use of typedef ctl_table to struct ctl_table
md/raid5: avoid deadlock when raid5 array has unack badblocks during md_stop_writes.
md: use MD_RECOVERY_INTR instead of kthread_should_stop in resync thread.
md: fix some places where mddev_lock return value is not checked.
raid5: Retry R5_ReadNoMerge flag when hit a read error.
raid5: relieve lock contention in get_active_stripe()
raid5: relieve lock contention in get_active_stripe()
wait: add wait_event_cmd()
md/raid5.c: add proper locking to error path of raid5_start_reshape.
md: fix calculation of stacking limits on level change.
raid5: Use slow_path to release stripe when mddev->thread is null
Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"
1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols we passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.
We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.
In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.
2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.
3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.
4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.
5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.
6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.
7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.
8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumzaet, reported by Dave Jones.
9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.
10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.
11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.
12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.
13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.
14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.
15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.
16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.
17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.
18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.
19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.
20) Fix route leak in VTI tunnel transmit, from Fan Du.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...
The quota code is abusing the genetlink API and is using
its family ID as the multicast group ID, which is invalid
and may belong to somebody else (and likely will.)
Make the quota code use the correct API, but since this
is already used as-is by userspace, reserve a family ID
for this code and also reserve that group ID to not break
userspace assumptions.
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
linux/raid/md_p.h is using conditionals depending on endianess and fails
with an error if neither of __BIG_ENDIAN, __LITTLE_ENDIAN or
__BYTE_ORDER are defined, but it doesn't include any header which can
define these constants. This make this header unusable alone.
This patch adds a #include <asm/byteorder.h> at the beginning of this
header to make it usable alone. This is needed to compile klibc on MIPS.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: NeilBrown <neilb@suse.de>
- Re-enable flow steering verbs with new improved userspace ABI
- Fixes for slow connection due to GID lookup scalability
- IPoIB fixes
- Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
- Further improvements to SRP error handling
- Add new transport type for Cisco usNIC
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABCAAGBQJSil7BAAoJEENa44ZhAt0hbtgP/A+AmUalbOX6ZKzuOFxsrtY2
r55CX9b1JBeFM/Zhn2o6y+81lpCjkckJSggESMe4izNgocGw0nW4vYGN4SBynatj
y8sR9OSn+G3ihuENrzG41MJUGEa5WbcNMy4boN+Oa+qyTlV/WjLR7Fv4WbikK7Wm
o8FNlXiiDhMoGfHHG5J0MD0EQsnxuLDk2XP+ciu4tLtTs+wBka+gFK8WnMvztle3
gTeMNna5ilvCS2fdBxteuPA3KeDnJE9AgJSMJ2a4Rh+DR8uTgWYQ6n7amjmOc546
yhAKkoBkxPE10+Yj82WOPhCFxSeWcuSwJvpgv5dTVZ1XqUUcC1V3TEcZDHmyyHQ7
uPXgS1A+erBW3OYPBjZqtKvnHObscV12fL+rId3vIhcAQIbFroci08ZwPidEYRkn
fvwlEKcrIsBIpRXEyjlFCxsiiDnfq1wC1VayMR3jrIK0P6idf1SXf/geiRp9+RGT
wKUc0j51jvEx29qc65xuhEP9FQV9pCMxyd+FEE0d0KkjMz5hsIkjmcUcBbgF0CGg
GEyDPlgRLv+vmWDGpT8XraaV/0CJOEQDIgB4WSN87/AZ4UoNt7spW2xqsLsp1toy
5e0100tpWUleTPLe/Wig5GtBdagQ2jAUK1+186CP93pFPtkwc4/7X3hyp7qPIPTz
VDvT9DEy6zjSMCLpMcdo
=xxC+
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband/rdma updates from Roland Dreier:
- Re-enable flow steering verbs with new improved userspace ABI
- Fixes for slow connection due to GID lookup scalability
- IPoIB fixes
- Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
- Further improvements to SRP error handling
- Add new transport type for Cisco usNIC
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
IB/core: Re-enable create_flow/destroy_flow uverbs
IB/core: extended command: an improved infrastructure for uverbs commands
IB/core: Remove ib_uverbs_flow_spec structure from userspace
IB/core: Use a common header for uverbs flow_specs
IB/core: Make uverbs flow structure use names like verbs ones
IB/core: Rename 'flow' structs to match other uverbs structs
IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
IB/ucma: Convert use of typedef ctl_table to struct ctl_table
IB/cm: Convert to using idr_alloc_cyclic()
IB/mlx5: Fix page shift in create CQ for userspace
IB/mlx4: Fix device max capabilities check
IB/mlx5: Fix list_del of empty list
IB/mlx5: Remove dead code
IB/core: Encorce MR access rights rules on kernel consumers
IB/mlx4: Fix endless loop in resize CQ
RDMA/cma: Remove unused argument and minor dead code
RDMA/ucma: Discard events for IDs not yet claimed by user space
IB/core: Add Cisco usNIC rdma node and transport types
RDMA/nes: Remove self-assignment from nes_query_qp()
IB/srp: Report receive errors correctly
...
Pull media updates from Mauro Carvalho Chehab:
"This series include:
- a new Remote Controller driver for ST SoC with the corresponding DT
bindings
- a new frontend (cx24117)
- a new I2C camera flash driver (lm3560)
- a new mem2mem driver for TI SoC (ti-vpe)
- support for Raphael r828d added to r820t driver
- some improvements on buffer allocation at VB2 core
- usual driver fixes and improvements
PS this time, we have a smaller number of patches. While it is hard
to pinpoint to the reasons, I believe that it is mainly due to:
1) there are several patch series ready, but depending on DT review.
I decided to grant some extra time for DT maintainers to look on
it, as they're expecting to have more time with the changes agreed
during ARM mini-summit and KS. If they can't review in time for
3.14, I'll review myself and apply for the next merge window.
2) I suspect that having both LinuxCon EU and LinuxCon NA happening
during the same merge window affected the development
productivity, as several core media developers participated on
both events"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (151 commits)
[media] media: st-rc: Add ST remote control driver
[media] gpio-ir-recv: Include linux/of.h header
[media] tvp7002: Include linux/of.h header
[media] tvp514x: Include linux/of.h header
[media] ths8200: Include linux/of.h header
[media] adv7343: Include linux/of.h header
[media] v4l: Fix typo in v4l2_subdev_get_try_crop()
[media] media: i2c: add driver for dual LED Flash, lm3560
[media] rtl28xxu: add 15f4:0131 Astrometa DVB-T2
[media] rtl28xxu: add RTL2832P + R828D support
[media] rtl2832: add new tuner R828D
[media] r820t: add support for R828D
[media] media/i2c: ths8200: fix build failure with gcc 4.5.4
[media] Add support for KWorld UB435-Q V2
[media] staging/media: fix msi3101 build errors
[media] ddbridge: Remove casting the return value which is a void pointer
[media] ngene: Remove casting the return value which is a void pointer
[media] dm1105: remove unneeded not-null test
[media] sh_mobile_ceu_camera: remove deprecated IRQF_DISABLED
[media] media: rcar_vin: Add preliminary r8a7790 support
...
This is required to properly calculate the tiling parameters
in userspace.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit reverts commit 7afbddfae9 ("IB/core: Temporarily disable
create_flow/destroy_flow uverbs"). Since the uverbs extensions
functionality was experimental for v3.12, this patch re-enables the
support for them and flow-steering for v3.13.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 400dbc9658 ("IB/core: Infrastructure for extensible uverbs
commands") added an infrastructure for extensible uverbs commands
while later commit 436f2ad05a ("IB/core: Export ib_create/destroy_flow
through uverbs") exported ib_create_flow()/ib_destroy_flow() functions
using this new infrastructure.
According to the commit 400dbc9658, the purpose of this
infrastructure is to support passing around provider (eg. hardware)
specific buffers when userspace issue commands to the kernel, so that
it would be possible to extend uverbs (eg. core) buffers independently
from the provider buffers.
But the new kernel command function prototypes were not modified to
take advantage of this extension. This issue was exposed by Roland
Dreier in a previous review[1].
So the following patch is an attempt to a revised extensible command
infrastructure.
This improved extensible command infrastructure distinguish between
core (eg. legacy)'s command/response buffers from provider
(eg. hardware)'s command/response buffers: each extended command
implementing function is given a struct ib_udata to hold core
(eg. uverbs) input and output buffers, and another struct ib_udata to
hold the hw (eg. provider) input and output buffers.
Having those buffers identified separately make it easier to increase
one buffer to support extension without having to add some code to
guess the exact size of each command/response parts: This should make
the extended functions more reliable.
Additionally, instead of relying on command identifier being greater
than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure rely on
unused bits in command field: on the 32 bits provided by command
field, only 6 bits are really needed to encode the identifier of
commands currently supported by the kernel. (Even using only 6 bits
leaves room for about 23 new commands).
So this patch makes use of some high order bits in command field to
store flags, leaving enough room for more command identifiers than one
will ever need (eg. 256).
The new flags are used to specify if the command should be processed
as an extended one or a legacy one. While designing the new command
format, care was taken to make usage of flags itself extensible.
Using high order bits of the commands field ensure that newer
libibverbs on older kernel will properly fail when trying to call
extended commands. On the other hand, older libibverbs on newer kernel
will never be able to issue calls to extended commands.
The extended command header includes the optional response pointer so
that output buffer length and output buffer pointer are located
together in the command, allowing proper parameters checking. This
should make implementing functions easier and safer.
Additionally the extended header ensure 64bits alignment, while making
all sizes multiple of 8 bytes, extending the maximum buffer size:
legacy extended
Maximum command buffer: 256KBytes 1024KBytes (512KBytes + 512KBytes)
Maximum response buffer: 256KBytes 1024KBytes (512KBytes + 512KBytes)
For the purpose of doing proper buffer size accounting, the headers
size are no more taken in account in "in_words".
One of the odds of the current extensible infrastructure, reading
twice the "legacy" command header, is fixed by removing the "legacy"
command header from the extended command header: they are processed as
two different parts of the command: memory is read once and
information are not duplicated: it's making clear that's an extended
command scheme and not a different command scheme.
The proposed scheme will format input (command) and output (response)
buffers this way:
- command:
legacy header +
extended header +
command data (core + hw):
+----------------------------------------+
| flags | 00 00 | command |
| in_words | out_words |
+----------------------------------------+
| response |
| response |
| provider_in_words | provider_out_words |
| padding |
+----------------------------------------+
| |
. <uverbs input> .
. (in_words * 8) .
| |
+----------------------------------------+
| |
. <provider input> .
. (provider_in_words * 8) .
| |
+----------------------------------------+
- response, if present:
+----------------------------------------+
| |
. <uverbs output space> .
. (out_words * 8) .
| |
+----------------------------------------+
| |
. <provider output space> .
. (provider_out_words * 8) .
| |
+----------------------------------------+
The overall design is to ensure that the extensible infrastructure is
itself extensible while begin more reliable with more input and bound
checking.
Note:
The unused field in the extended header would be perfect candidate to
hold the command "comp_mask" (eg. bit field used to handle
compatibility). This was suggested by Roland Dreier in a previous
review[2]. But "comp_mask" field is likely to be present in the uverb
input and/or provider input, likewise for the response, as noted by
Matan Barak[3], so it doesn't make sense to put "comp_mask" in the
header.
[1]:
http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com
[2]:
http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com
[3]:
http://marc.info/?i=525C1149.6000701@mellanox.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
[ Convert "ret ? ret : 0" to the equivalent "ret". - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>
The structure holding any types of flow_spec is of no use to
userspace. It would be wrong for userspace to do:
struct ib_uverbs_flow_spec flow_spec;
flow_spec.type = IB_FLOW_SPEC_TCP;
flow_spec.size = sizeof(flow_spec);
Instead, userspace should use the dedicated flow_spec structure for
- Ethernet : struct ib_uverbs_flow_spec_eth,
- IPv4 : struct ib_uverbs_flow_spec_ipv4,
- TCP/UDP : struct ib_uverbs_flow_spec_tcp_udp.
In other words, struct ib_uverbs_flow_spec is a "virtual" data
structure that can only be use by the kernel as an alias to the other.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
A common header will allows better checking of flow specs size, while
ensuring strict alignment to 64 bits.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch adds "flow" prefix to most of data structure added as part
of commit 436f2ad05a ("IB/core: Export ib_create/destroy_flow through
uverbs") to keep those names in sync with the data structures added in
commit 319a441d13 ("IB/core: Add receive flow steering support").
It's just a matter of translating 'ib_flow' to 'ib_uverbs_flow'.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 436f2ad05a ("IB/core: Export ib_create/destroy_flow through
uverbs") added public data structures to support receive flow
steering. The new structs are not following the 'uverbs' pattern:
they're lacking the common prefix 'ib_uverbs'.
This patch replaces ib_kern prefix by ib_uverbs.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch fixes the following issues:
1. Unneeded checks were removed
2. Removed the fixed size out of flow_attr.size, thus simplifying the checks.
3. Remove a 32bit hole on 64bit systems with strict alignment in
struct ib_kern_flow_att by adding a reserved field.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
For performance reasons, sch_fq tried hard to not setup timers for every
sent packet, using a quantum based heuristic : A delay is setup only if
the flow exhausted its credit.
Problem is that application limited flows can refill their credit
for every queued packet, and they can evade pacing.
This problem can also be triggered when TCP flows use small MSS values,
as TSO auto sizing builds packets that are smaller than the default fq
quantum (3028 bytes)
This patch adds a 40 ms delay to guard flow credit refill.
Fixes: afe4fd0624 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7eec4174ff ("pkt_sched: fq: fix non TCP flows pacing")
obsoleted TCA_FQ_FLOW_DEFAULT_RATE without notice for the users.
Suggested by David Miller
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull second round of block driver updates from Jens Axboe:
"As mentioned in the original pull request, the bcache bits were pulled
because of their dependency on the immutable bio vecs. Kent re-did
this part and resubmitted it, so here's the 2nd round of (mostly)
driver updates for 3.13. It contains:
- The bcache work from Kent.
- Conversion of virtio-blk to blk-mq. This removes the bio and request
path, and substitutes with the blk-mq path instead. The end result
almost 200 deleted lines. Patch is acked by Asias and Christoph, who
both did a bunch of testing.
- A removal of bootmem.h include from Grygorii Strashko, part of a
larger series of his killing the dependency on that header file.
- Removal of __cpuinit from blk-mq from Paul Gortmaker"
* 'for-linus' of git://git.kernel.dk/linux-block: (56 commits)
virtio_blk: blk-mq support
blk-mq: remove newly added instances of __cpuinit
bcache: defensively handle format strings
bcache: Bypass torture test
bcache: Delete some slower inline asm
bcache: Use ida for bcache block dev minor
bcache: Fix sysfs splat on shutdown with flash only devs
bcache: Better full stripe scanning
bcache: Have btree_split() insert into parent directly
bcache: Move spinlock into struct time_stats
bcache: Kill sequential_merge option
bcache: Kill bch_next_recurse_key()
bcache: Avoid deadlocking in garbage collection
bcache: Incremental gc
bcache: Add make_btree_freeing_key()
bcache: Add btree_node_write_sync()
bcache: PRECEDING_KEY()
bcache: bch_(btree|extent)_ptr_invalid()
bcache: Don't bother with bucket refcount for btree node allocations
bcache: Debug code improvements
...
Pull drm updates from Dave Airlie:
"This is a combo of -next and some -fixes that came in in the
intervening time.
Highlights:
New drivers:
ARM Armada driver for Marvell Armada 510 SOCs
Intel:
Broadwell initial support under a default off switch,
Stereo/3D HDMI mode support
Valleyview improvements
Displayport improvements
Haswell fixes
initial mipi dsi panel support
CRC support for debugging
build with CONFIG_FB=n
Radeon:
enable DPM on a number of GPUs by default
secondary GPU powerdown support
enable HDMI audio by default
Hawaii support
Nouveau:
dynamic pm code infrastructure reworked, does nothing major yet
GK208 modesetting support
MSI fixes, on by default again
PMPEG improvements
pageflipping fixes
GMA500:
minnowboard SDVO support
VMware:
misc fixes
MSM:
prime, plane and rendernodes support
Tegra:
rearchitected to put the drm driver into the drm subsystem.
HDMI and gr2d support for tegra 114 SoC
QXL:
oops fix, and multi-head fixes
DRM core:
sysfs lifetime fixes
client capability ioctl
further cleanups to device midlayer
more vblank timestamp fixes"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (789 commits)
drm/nouveau: do not map evicted vram buffers in nouveau_bo_vma_add
drm/nvc0-/gr: shift wrapping bug in nvc0_grctx_generate_r406800
drm/nouveau/pwr: fix missing mutex unlock in a failure path
drm/nv40/therm: fix slowing down fan when pstate undefined
drm/nv11-: synchronise flips to vblank, unless async flip requested
drm/nvc0-: remove nasty fifo swmthd hack for flip completion method
drm/nv10-: we no longer need to create nvsw object on user channels
drm/nouveau: always queue flips relative to kernel channel activity
drm/nouveau: there is no need to reserve/fence the new fb when flipping
drm/nouveau: when bailing out of a pushbuf ioctl, do not remove previous fence
drm/nouveau: allow nouveau_fence_ref() to be a noop
drm/nvc8/mc: msi rearm is via the nvc0 method
drm/ttm: Fix vma page_prot bit manipulation
drm/vmwgfx: Fix a couple of compile / sparse warnings and errors
drm/vmwgfx: Resource evict fixes
drm/edid: compare actual vrefresh for all modes for quirks
drm: shmob_drm: Convert to clk_prepare/unprepare
drm/nouveau: fix 32-bit build
drm/i915/opregion: fix build error on CONFIG_ACPI=n
Revert "drm/radeon/audio: don't set speaker allocation on DCE4+"
...
side: the HV and emulation flavors can now coexist in a single kernel
is probably the most interesting change from a user point of view.
On the x86 side there are nested virtualization improvements and a
few bugfixes. ARM got transparent huge page support, improved
overcommit, and support for big endian guests.
Finally, there is a new interface to connect KVM with VFIO. This
helps with devices that use NoSnoop PCI transactions, letting the
driver in the guest execute WBINVD instructions. This includes
some nVidia cards on Windows, that fail to start without these
patches and the corresponding userspace changes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJShPAhAAoJEBvWZb6bTYbyl48P/297GgmELHAGBgjvb6q7yyGu
L8+eHjKbh4XBAkPwyzbvUjuww5z2hM0N3JQ0BDV9oeXlO+zwwCEns/sg2Q5/NJXq
XxnTeShaKnp9lqVBnE6G9rAOUWKoyLJ2wItlvUL8JlaO9xJ0Vmk0ta4n2Nv5GqDp
db6UD7vju6rHtIAhNpvvAO51kAOwc01xxRixCVb7KUYOnmO9nvpixzoI/S0Rp1gu
w/OWMfCosDzBoT+cOe79Yx1OKcpaVW94X6CH1s+ShCw3wcbCL2f13Ka8/E3FIcuq
vkZaLBxio7vjUAHRjPObw0XBW4InXEbhI1DjzIvm8dmc4VsgmtLQkTCG8fj+jINc
dlHQUq6Do+1F4zy6WMBUj8tNeP1Z9DsABp98rQwR8+BwHoQpGQBpAxW0TE0ZMngC
t1caqyvjZ5pPpFUxSrAV+8Kg4AvobXPYOim0vqV7Qea07KhFcBXLCfF7BWdwq/Jc
0CAOlsLL4mHGIQWZJuVGw0YGP7oATDCyewlBuDObx+szYCoV4fQGZVBEL0KwJx/1
7lrLN7JWzRyw6xTgJ5VVwgYE1tUY4IFQcHu7/5N+dw8/xg9KWA3f4PeMavIKSf+R
qteewbtmQsxUnvuQIBHLs8NRWPnBPy+F3Sc2ckeOLIe4pmfTte6shtTXcLDL+LqH
NTmT/cfmYp2BRkiCfCiS
=rWNf
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Paolo Bonzini:
"Here are the 3.13 KVM changes. There was a lot of work on the PPC
side: the HV and emulation flavors can now coexist in a single kernel
is probably the most interesting change from a user point of view.
On the x86 side there are nested virtualization improvements and a few
bugfixes.
ARM got transparent huge page support, improved overcommit, and
support for big endian guests.
Finally, there is a new interface to connect KVM with VFIO. This
helps with devices that use NoSnoop PCI transactions, letting the
driver in the guest execute WBINVD instructions. This includes some
nVidia cards on Windows, that fail to start without these patches and
the corresponding userspace changes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
kvm, vmx: Fix lazy FPU on nested guest
arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
arm/arm64: KVM: MMIO support for BE guest
kvm, cpuid: Fix sparse warning
kvm: Delete prototype for non-existent function kvm_check_iopl
kvm: Delete prototype for non-existent function complete_pio
hung_task: add method to reset detector
pvclock: detect watchdog reset at pvclock read
kvm: optimize out smp_mb after srcu_read_unlock
srcu: API for barrier after srcu read unlock
KVM: remove vm mmap method
KVM: IOMMU: hva align mapping page size
KVM: x86: trace cpuid emulation when called from emulator
KVM: emulator: cleanup decode_register_operand() a bit
KVM: emulator: check rex prefix inside decode_register()
KVM: x86: fix emulation of "movzbl %bpl, %eax"
kvm_host: typo fix
KVM: x86: emulate SAHF instruction
MAINTAINERS: add tree for kvm.git
Documentation/kvm: add a 00-INDEX file
...
Pull btrfs update frm Chris Mason:
"This is our usual merge window set of bug fixes, performance
improvements and cleanups. Miao Xie has some really nice
optimizations for writeback.
Josef also expanded our sanity checks quite a bit; these make up a big
chunk of the new lines"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (98 commits)
Btrfs: rename btrfs_start_all_delalloc_inodes
Btrfs: don't wait for the completion of all the ordered extents
Btrfs: don't wait for all the async delalloc when shrinking delalloc
Btrfs: fix the confusion between delalloc bytes and metadata bytes
Btrfs: pick up the code for the item number calculation in flush_space()
Btrfs: wait for the ordered extent only when we want
Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
Btrfs: avoid unnecessary scrub workers allocation
Btrfs: check file extent type before anything else
btrfs: Remove useless variable in write_ctree_super()
btrfs: Fix checkpatch.pl warning of spacing issues
btrfs: Replace kmalloc with kmalloc_array
btrfs: Enclose macros with complex values within parenthesis
btrfs: Use WARN_ON()'s return value in place of WARN_ON(1)
btrfs: Remove redundant local zero structure
btrfs: Pack struct btrfs_device
btrfs: Replace multiple atomic_inc() with atomic_add()
btrfs: Add helper function for free_root_pointers()
Btrfs: fix a crash when running balance and defrag concurrently
Btrfs: do not run snapshot-aware defragment on error
...
Fix whitespace, capitalization, and spelling errors. No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.
Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Improve reliability of buffer allocations for dm messages with a small
number of arguments, a couple path group initialization fixes for dm
multipath, a fix for resizing a dm array, various fixes and
optimizations for dm cache, a fix for device mapper's Kconfig menu
indentation.
Features added include:
- dm crypt support for activating legacy CBC TrueCrypt containers
(useful for forensics of these old TCRYPT containers)
- reduced dm-cache memory requirements for each block in the cache
- basic support for shrinking a dm-cache's cache (fast) device
- most notably, dm-cache support for managing cache coherency when
deploying dm-cache with sophisticated origin volumes (that support
hardware snapshots and/or clustering): these changes come in the form
of a new passthrough operation mode and a cache block invalidation
interface.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSgt+QAAoJEMUj8QotnQNapcEIALC6U1rmw08PRMSanqg4/aVu
pTahzPtai9jXchQV6q5XsglJryrhD9MoNqrZgHd2drdnmEKTKfVX+/iCXGiE4hQ5
I5QUZf5myEXSd60pCgZwNam+VHMuAuSPQW6LWqRTJjDLHixGF+AoHZGxkEsYgj6M
p686OOpga1nmT2w072xLIh9z2tsv/tm+UN7GSbyklM+/1ItcXxq+/J8rsuth7IqT
k0I60jexq+Q3OaYuJY7vxhdE7PhBCw1fGmtuCcjekqsSVpAdCgDz3FFOEZmyXcUs
YLFE3GcclYQpIPjNjVGTLDFHdoIMWdKiibs/ScBUtegqxWvqP7c87YFhbL+VHDM=
=lLxo
-----END PGP SIGNATURE-----
Merge tag 'dm-3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper changes from Mike Snitzer:
"A set of device-mapper changes for 3.13.
Improve reliability of buffer allocations for dm messages with a small
number of arguments, a couple path group initialization fixes for dm
multipath, a fix for resizing a dm array, various fixes and
optimizations for dm cache, a fix for device mapper's Kconfig menu
indentation.
Features added include:
- dm crypt support for activating legacy CBC TrueCrypt containers
(useful for forensics of these old TCRYPT containers)
- reduced dm-cache memory requirements for each block in the cache
- basic support for shrinking a dm-cache's cache (fast) device
- most notably, dm-cache support for managing cache coherency when
deploying dm-cache with sophisticated origin volumes (that support
hardware snapshots and/or clustering): these changes come in the
form of a new passthrough operation mode and a cache block
invalidation interface"
* tag 'dm-3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (32 commits)
dm cache: resolve small nits and improve Documentation
dm cache: add cache block invalidation support
dm cache: add remove_cblock method to policy interface
dm cache policy mq: reduce memory requirements
dm cache metadata: check the metadata version when reading the superblock
dm cache: add passthrough mode
dm cache: cache shrinking support
dm cache: promotion optimisation for writes
dm cache: be much more aggressive about promoting writes to discarded blocks
dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty()
dm cache: optimize commit_if_needed
dm space map disk: optimise sm_disk_dec_block
MAINTAINERS: add reference to device-mapper's linux-dm.git tree
dm: fix Kconfig menu indentation
dm: allow remove to be deferred
dm table: print error on preresume failure
dm crypt: add TCW IV mode for old CBC TCRYPT containers
dm crypt: properly handle extra key string in initialization
dm cache: log error message if dm_kcopyd_copy() fails
dm cache: use cell_defer() boolean argument consistently
...
* Unify some compile-time differences so that we have fewer uses of
#ifdef CONFIG_OF in atmel_nand
* Other general cleanups (removing unused functions, options, variables,
fields; use correct interfaces)
* Fix BUG() for new odd-sized NAND, which report non-power-of-2 dimensions via
ONFI
* Miscellaneous driver fixes (SPI NOR flash; BCM47xx NAND flash; etc.)
* Improve differentiation between SLC and MLC NAND -- this clarifies an ABI
issue regarding the MTD "type" (in sysfs and in ioctl(MEMGETINFO)), where
the MTD_MLCNANDFLASH type was present but inconsistently used
* Extend GPMI NAND to support multi-chip-select NAND for some platforms
* Many improvements to the OMAP2/3 NAND driver, including an expanded DT
binding to bring us closer to mainline support for some OMAP systems
* Fix a deadlock in the error path of the Atmel NAND driver probe
* Correct the error codes from MTD mmap() to conform to POSIX and the Linux
Programmer's Manual. This is an acknowledged change in the MTD ABI, but I
can't imagine somebody relying on the non-standard -ENOSYS error code
specifically. Am I just being unimaginative? :)
* Fix a few important GPMI NAND bugs (one regression from 3.12 and one
long-standing race condition)
* More? Read the log!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSgzYRAAoJEFySrpd9RFgtv8EP/3ZIS1w4fHyWafVSdgVFGR0Y
urlVDhg7iBauh9admN9xxBz6CYRwhjby8GnN87Q1qzu95Xp63RVx31nNfdBW3DGd
92vSyskijYJcUtanBxYqGp1i3EbQcpF4mumqxnre3C4KTLNije41t/wNVqnXAstU
DWho2iymZdkweKJ0DqBA7WF4l/YscdFyNDanO9JWiwII05Rh3Acv7FPMFm3Clblw
Nvfwzgp4XycYMeIQtkmQgQ3GgeWtxPgQwqMofn97MVH4zeTsmUP317ohIMukLGJD
db33J2xBdrIbk9P4D3RvjOCYyAyonu9y6/p+B1Vmj+R4CAUvQOIljhklHFoT3UZW
OzUHPxB6T0+NZyQ/5IRQIYH9As++vdb/bzsUXm/cXceI4o4I0QCPy/8adifakBOF
IUX9/BCdUOfKXvdOXY5dXMR2sY1IBg/1WfI+qcAoITsS/EVrUTrOcfSLyGqF0ERU
c7mAzXiyp4D51x66/QnfJ4aJjlioQSoa3mK1j4fXqH08YB5Zclpz938Bo1AO3lWy
/n+NYSbeXJoi4rVkNawjrRVs+0OTby2XQ5OqBlUMH6f30fqjUefPm66ZBMhbxzYu
5QFDctUbnHCyAPpOtM/WR3/NOkIqVhQl1331A+dG2TzLK0vTHs+kbt/YmIITpjI+
yn70XJGhk1F4gy8zhD+V
=z5qO
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20131112' of git://git.infradead.org/linux-mtd
Pull MTD changes from Brian Norris:
- Unify some compile-time differences so that we have fewer uses of
#ifdef CONFIG_OF in atmel_nand
- Other general cleanups (removing unused functions, options,
variables, fields; use correct interfaces)
- Fix BUG() for new odd-sized NAND, which report non-power-of-2
dimensions via ONFI
- Miscellaneous driver fixes (SPI NOR flash; BCM47xx NAND flash; etc.)
- Improve differentiation between SLC and MLC NAND -- this clarifies an
ABI issue regarding the MTD "type" (in sysfs and in the MEMGETINFO
ioctl), where the MTD_MLCNANDFLASH type was present but
inconsistently used
- Extend GPMI NAND to support multi-chip-select NAND for some platforms
- Many improvements to the OMAP2/3 NAND driver, including an expanded
DT binding to bring us closer to mainline support for some OMAP
systems
- Fix a deadlock in the error path of the Atmel NAND driver probe
- Correct the error codes from MTD mmap() to conform to POSIX and the
Linux Programmer's Manual. This is an acknowledged change in the MTD
ABI, but I can't imagine somebody relying on the non-standard -ENOSYS
error code specifically. Am I just being unimaginative? :)
- Fix a few important GPMI NAND bugs (one regression from 3.12 and one
long-standing race condition)
- More? Read the log!
* tag 'for-linus-20131112' of git://git.infradead.org/linux-mtd: (98 commits)
mtd: gpmi: fix the NULL pointer
mtd: gpmi: fix kernel BUG due to racing DMA operations
mtd: mtdchar: return expected errors on mmap() call
mtd: gpmi: only scan two chips for imx6
mtd: gpmi: Use devm_kzalloc()
mtd: atmel_nand: fix bug driver will in a dead lock if no nand detected
mtd: nand: use a local variable to simplify the nand_scan_tail
mtd: nand: remove deprecated IRQF_DISABLED
mtd: dataflash: Say if we find a device we don't support
mtd: nand: omap: fix error return code in omap_nand_probe()
mtd: nand_bbt: kill NAND_BBT_SCANALLPAGES
mtd: m25p80: fixup device removal failure path
mtd: mxc_nand: Include linux/of.h header
mtd: remove duplicated include from mtdcore.c
mtd: m25p80: add support for Macronix mx25l3255e
mtd: nand: omap: remove selection of BCH ecc-scheme via KConfig
mtd: nand: omap: updated devm_xx for all resource allocation and free calls
mtd: nand: omap: use drivers/mtd/nand/nand_bch.c wrapper for BCH ECC instead of lib/bch.c
mtd: nand: omap: clean-up ecc layout for BCH ecc schemes
mtd: nand: omap2: clean-up BCHx_HW and BCHx_SW ECC configurations in device_probe
...
Pull networking updates from David Miller:
1) The addition of nftables. No longer will we need protocol aware
firewall filtering modules, it can all live in userspace.
At the core of nftables is a, for lack of a better term, virtual
machine that executes byte codes to inspect packet or metadata
(arriving interface index, etc.) and make verdict decisions.
Besides support for loading packet contents and comparing them, the
interpreter supports lookups in various datastructures as
fundamental operations. For example sets are supports, and
therefore one could create a set of whitelist IP address entries
which have ACCEPT verdicts attached to them, and use the appropriate
byte codes to do such lookups.
Since the interpreted code is composed in userspace, userspace can
do things like optimize things before giving it to the kernel.
Another major improvement is the capability of atomically updating
portions of the ruleset. In the existing netfilter implementation,
one has to update the entire rule set in order to make a change and
this is very expensive.
Userspace tools exist to create nftables rules using existing
netfilter rule sets, but both kernel implementations will need to
co-exist for quite some time as we transition from the old to the
new stuff.
Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
worked so hard on this.
2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
to our pseudo-random number generator, mostly used for things like
UDP port randomization and netfitler, amongst other things.
In particular the taus88 generater is updated to taus113, and test
cases are added.
3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
and Yang Yingliang.
4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
Sujir.
5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
Neal Cardwell, and Yuchung Cheng.
6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
control message data, much like other socket option attributes.
From Francesco Fusco.
7) Allow applications to specify a cap on the rate computed
automatically by the kernel for pacing flows, via a new
SO_MAX_PACING_RATE socket option. From Eric Dumazet.
8) Make the initial autotuned send buffer sizing in TCP more closely
reflect actual needs, from Eric Dumazet.
9) Currently early socket demux only happens for TCP sockets, but we
can do it for connected UDP sockets too. Implementation from Shawn
Bohrer.
10) Refactor inet socket demux with the goal of improving hash demux
performance for listening sockets. With the main goals being able
to use RCU lookups on even request sockets, and eliminating the
listening lock contention. From Eric Dumazet.
11) The bonding layer has many demuxes in it's fast path, and an RCU
conversion was started back in 3.11, several changes here extend the
RCU usage to even more locations. From Ding Tianhong and Wang
Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
Falico.
12) Allow stackability of segmentation offloads to, in particular, allow
segmentation offloading over tunnels. From Eric Dumazet.
13) Significantly improve the handling of secret keys we input into the
various hash functions in the inet hashtables, TCP fast open, as
well as syncookies. From Hannes Frederic Sowa. The key fundamental
operation is "net_get_random_once()" which uses static keys.
Hannes even extended this to ipv4/ipv6 fragmentation handling and
our generic flow dissector.
14) The generic driver layer takes care now to set the driver data to
NULL on device removal, so it's no longer necessary for drivers to
explicitly set it to NULL any more. Many drivers have been cleaned
up in this way, from Jingoo Han.
15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.
16) Improve CRC32 interfaces and generic SKB checksum iterators so that
SCTP's checksumming can more cleanly be handled. Also from Daniel
Borkmann.
17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
using the interface MTU value. This helps avoid PMTU attacks,
particularly on DNS servers. From Hannes Frederic Sowa.
18) Use generic XPS for transmit queue steering rather than internal
(re-)implementation in virtio-net. From Jason Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
random32: add test cases for taus113 implementation
random32: upgrade taus88 generator to taus113 from errata paper
random32: move rnd_state to linux/random.h
random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
random32: add periodic reseeding
random32: fix off-by-one in seeding requirement
PHY: Add RTL8201CP phy_driver to realtek
xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
macmace: add missing platform_set_drvdata() in mace_probe()
ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
ixgbe: add warning when max_vfs is out of range.
igb: Update link modes display in ethtool
netfilter: push reasm skb through instead of original frag skbs
ip6_output: fragment outgoing reassembled skb properly
MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
net_sched: tbf: support of 64bit rates
ixgbe: deleting dfwd stations out of order can cause null ptr deref
ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
...
glibc recently changed the error string for ESTALE to remove "NFS" -
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=96945714ec61951cc748da2b4b8a80cf02127ee9
from: [ERR_REMAP (ESTALE)] = N_("Stale NFS file handle"),
to: [ERR_REMAP (ESTALE)] = N_("Stale file handle"),
And some have expressed concern that the kernel's errno.h
comments still refer to NFS.
So make that change... note that this is a comment-only change,
and has no functional difference.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are no too intrusive changes in this update batch. The biggest
LOC is found in the new DICE driver, and other small changes are
scattered over the whole sound subtree (which is a common pattern).
Below are highlights:
- ALSA core:
* Memory allocation support with genpool
* Fix blocking in drain ioctl of compress_offload
- HD-audio:
* Improved AMD HDMI supports
* Intel HDMI detection improvements
* thinkpad_acpi mute-key integration
* New PCI ID, New ALC255,285,293 codecs, CX20952
- USB-audio:
* New buffer size management
* Clean up endpoint handling codes
- ASoC:
* Further work on the dmaengine helpers, including support for
configuring the parameters for DMA by reading the capabilities of
the DMA controller which removes some guesswork and magic numbers
from drivers.
* A refresh of the documentation.
* Conversions of many drivers to direct regmap API usage in order to
allow the ASoC level register I/O code to be removed, this will
hopefully be completed by v3.14.
* Support for using async register I/O in DAPM, reducing the time
taken to implement power transitions on systems that support it.
- Fireiwre: DICE driver
- Lots of small fixes for bugs reported by Coverity
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJSf2ycAAoJEGwxgFQ9KSmkVPcQAIenO8wxmHFyxHStQEt4GkM/
1BNk3V9MqAVv+ecjNPWrak+IUFY48gelUISfL1qIvlSl5pZ+FS+UEVSObczeI5Fp
aY1WDCypC3nfsIm4JCIF/Mv3CpE3eY0Gcxqy6OO87mEVs14rLl/Q0NUw2UVrxRQp
tu0dh6/C3Bjh8+qSnVnPVcLQG6tQsl7Wv71TyowL4ywom9yrx3uBT1qmqLftG8AH
Wjm2mpxj0dCGAqTcgiu4DMyTJw7kuTmLduDbhExqIApiaeB2o5ilZny/uQBrP32z
rdUiJm6cSmQ1jv7L0C0xR3vXv73rS73jXMYh2Qt/9iEZIZkwAhTy0Z7Jr5bMfPjP
I9hICYRGhfa0S2UJa7yd6Jy3qlnUSyCAU9StQlLIiA+e3Xg0a8yoTZFQ/qWSWzwL
UK584Wst/lCG8QWUwKV/3n/75ALcKZ1cVrBlcCvcKJwv6OKua7DK0XtDfGpsM5sz
tiXjyY6T8nh87x62z3/IGMHD43xRp6zmadgwvCzYLkcBbsDNQSQHqzvly0XXtLYb
4N0cEJjHjHDbiQXkWEreDZ/y9cUSv129GZWsnUQAsO1OoHQaf8hUQt5PxBeYGu9B
E60pERBNVvicajitdwL+GJ1WeqTkl3VnU8s/ucLXGoGb92Z0aWhqtrMAHCj9MybP
S2aL7q6otZ4k+Wgh3VKj
=lxuj
-----END PGP SIGNATURE-----
Merge tag 'sound-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"There are no too intrusive changes in this update batch. The biggest
LOC is found in the new DICE driver, and other small changes are
scattered over the whole sound subtree (which is a common pattern).
Below are highlights:
- ALSA core:
* Memory allocation support with genpool
* Fix blocking in drain ioctl of compress_offload
- HD-audio:
* Improved AMD HDMI supports
* Intel HDMI detection improvements
* thinkpad_acpi mute-key integration
* New PCI ID, New ALC255,285,293 codecs, CX20952
- USB-audio:
* New buffer size management
* Clean up endpoint handling codes
- ASoC:
* Further work on the dmaengine helpers, including support for
configuring the parameters for DMA by reading the capabilities of
the DMA controller which removes some guesswork and magic numbers
from drivers.
* A refresh of the documentation.
* Conversions of many drivers to direct regmap API usage in order
to allow the ASoC level register I/O code to be removed, this
will hopefully be completed by v3.14.
* Support for using async register I/O in DAPM, reducing the time
taken to implement power transitions on systems that support it.
- Firewire: DICE driver
- Lots of small fixes for bugs reported by Coverity"
* tag 'sound-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (382 commits)
ALSA: hda/realtek - Add new codec ALC255/ALC3234 UAJ supported
ALSA: hda - Apply MacBook fixups for CS4208 correctly
ASoC: fsl: imx-wm8962: remove an unneeded check
ASoC: fsl: imx-pcm-fiq: Remove unused 'runtime' variable
ALSA: hda/realtek - Make fixup regs persist after resume
ALSA: hda_intel: ratelimit "spurious response" message
ASoC: generic-dmaengine-pcm: Use SNDRV_DMA_TYPE_DEV_IRAM as default
ASoC: dapm: Use WARN_ON() instead of BUG_ON()
ASoC: wm_adsp: Fix BUG_ON() and WARN_ON() usages
ASoC: Replace BUG() with WARN()
ASoC: wm_hubs: Replace BUG() with WARN()
ASoC: wm8996: Replace BUG() with WARN()
ASoC: wm8962: Replace BUG() with WARN()
ASoC: wm8958: Replace BUG() with WARN()
ASoC: wm8904: Replace BUG() with WARN()
ASoC: wm8900: Replace BUG() with WARN()
ASoC: wm8350: Replace BUG() with WARN()
ASoC: txx9: Use WARN_ON() instead of BUG_ON()
ASoC: sh: Use WARN_ON() instead of BUG_ON()
ASoC: rcar: Use WARN_ON() instead of BUG_ON()
...
not compiled for ages, and recent versions of gcc for it are broken.
Remove support for it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJSev31AAoJEMsfJm/On5mBzSAQAKRBYLqtf3nJGm9pXGDhZPGG
7KSQ8S11pg/wnXYW6P/XJhFRBrYkOOCeqVKQHtmxG8MmXQkkOz95rsIvBbUzU/FT
yJAPKpOHdh1yLhBGgCj3WhGtjVwpbut1/y9n2M5SpGautUgxfLj9fJiswSJx0n7t
VRWKwfIpBFPLPs9w6hdDf94tIXhSX8Me2gd3LDCPBEQ2SZYd8rtBasYtDeC2+FLa
Xow4ZQrCU7hpYscSUFzJpok35hl7weGhJ9jjXwtic4byFHvdiyHUwCOaEWC0hqNi
fOLWFbvBogqjyAktfZhfyL9R9/7lGlLshLQNmJWR3bO+nCJ21h9ATw0R4gLBdT4/
lzLRnJ/4GdtbvmdqRxNjxxR4zHkZ+tE8HmaCmUzvqGfQyA5sJNBRrBDcWLUOVlO9
0iIZsJBZjSQXKXSk9P5xH4G0tlbAFEUnEHKsrt/mgsD9Z3SgbPKAIWSBAJA0AMQk
DXZaXrBRilXOPUCZASZfmK8AQFC1GYB0tz7nT4x1mjT2/JClgG2kHCAGhNmI+CbK
l9VRIgBydppLFPOGhZLSNGQp29xBhw9JgOVns4a1k7kJQEw9ht38h8Q2ckRYxXhP
/z53eZKMQk62quWlyLRgR9mWqZc2CIifLVdFjiOELMh7wKPwL6eGrrrGBDbPtctS
PX5K26geb0oA3ZMjpBLr
=V6n6
-----END PGP SIGNATURE-----
Merge tag 'h8300-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull h8300 platform removal from Guenter Roeck:
"The patch series has been in -next for more than one relase cycle. I
did get a number of Acks, and no objections.
H8/300 has been dead for several years, the kernel for it has not
compiled for ages, and recent versions of gcc for it are broken.
Remove support for it"
* tag 'h8300-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
CREDITS: Add Yoshinori Sato for h8300
fs/minix: Drop dependency on H8300
Drop remaining references to H8/300 architecture
Drop MAINTAINERS entry for H8/300
watchdog: Drop references to H8300 architecture
net/ethernet: Drop H8/300 Ethernet driver
net/ethernet: smsc9194: Drop conditional code for H8/300
ide: Drop H8/300 driver
Drop support for Renesas H8/300 (h8300) architecture
So both Liu and I made huge messes of find_lock_delalloc_range trying to fix
stuff, me first by fixing extent size, then him by fixing something I broke and
then me again telling him to fix it a different way. So this is obviously a
candidate for some testing. This patch adds a pseudo fs so we can allocate fake
inodes for tests that need an inode or pages. Then it addes a bunch of tests to
make sure find_lock_delalloc_range is acting the way it is supposed to. With
this patch and all of our previous patches to find_lock_delalloc_range I am sure
it is working as expected now. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Pull perf updates from Ingo Molnar:
"As a first remark I'd like to note that the way to build perf tooling
has been simplified and sped up, in the future it should be enough for
you to build perf via:
cd tools/perf/
make install
(ie without the -j option.) The build system will figure out the
number of CPUs and will do a parallel build+install.
The various build system inefficiencies and breakages Linus reported
against the v3.12 pull request should now be resolved - please
(re-)report any remaining annoyances or bugs.
Main changes on the perf kernel side:
* Performance optimizations:
. perf ring-buffer code optimizations, by Peter Zijlstra
. perf ring-buffer code optimizations, by Oleg Nesterov
. x86 NMI call-stack processing optimizations, by Peter Zijlstra
. perf context-switch optimizations, by Peter Zijlstra
. perf sampling speedups, by Peter Zijlstra
. x86 Intel PEBS processing speedups, by Peter Zijlstra
* Enhanced hardware support:
. for Intel Ivy Bridge-EP uncore PMUs, by Zheng Yan
. for Haswell transactions, by Andi Kleen, Peter Zijlstra
* Core perf events code enhancements and fixes by Oleg Nesterov:
. for uprobes, if fork() is called with pending ret-probes
. for uprobes platform support code
* New ABI details by Andi Kleen:
. Report x86 Haswell TSX transaction abort cost as weight
Main changes on the perf tooling side (some of these tooling changes
utilize the above kernel side changes):
* 'perf report/top' enhancements:
. Convert callchain children list to rbtree, greatly reducing the
time taken for callchain processing, from Namhyung Kim.
. Add new COMM infrastructure, further improving histogram
processing, from Frédéric Weisbecker, one fix from Namhyung Kim.
. Add /proc/kcore based live-annotation improvements, including
build-id cache support, multi map 'call' instruction navigation
fixes, kcore address validation, objdump workarounds. From
Adrian Hunter.
. Show progress on histogram collapsing, that can take a long
time, from Namhyung Kim.
. Add --max-stack option to limit callchain stack scan in 'top'
and 'report', improving callchain processing when reducing the
stack depth is an option, from Waiman Long.
. Add new option --ignore-vmlinux for perf top, from Willy
Tarreau.
* 'perf trace' enhancements:
. 'perf trace' now can can use a 'perf probe' dynamic tracepoints
to hook into the userspace -> kernel pathname copy so that it
can map fds to pathnames without reading /proc/pid/fd/ symlinks.
From Arnaldo Carvalho de Melo.
. Show VFS path associated with fd in live sessions, using a
'vfs_getname' 'perf probe' created dynamic tracepoint or by
looking at /proc/pid/fd, from Arnaldo Carvalho de Melo.
. Add 'trace' beautifiers for lots of syscall arguments, from
Arnaldo Carvalho de Melo.
. Implement more compact 'trace' output by suppressing zeroed
args, from Arnaldo Carvalho de Melo.
. Show thread COMM by default in 'trace', from Arnaldo Carvalho de
Melo.
. Add option to show full timestamp in 'trace', from David Ahern.
. Add 'record' command in 'trace', to record raw_syscalls:*, from
David Ahern.
. Add summary option to dump syscall statistics in 'trace', from
David Ahern.
. Improve error messages in 'trace', providing hints about system
configuration steps needed for using it, from Ramkumar
Ramachandra.
. 'perf trace' now emits hints as to why tracing is not possible,
helping the user to setup the system to allow tracing in the
desired permission granularity, telling if the problem is due to
debugfs not being mounted or with not enough permission for
!root, /proc/sys/kernel/perf_event_paranoit value, etc. From
Arnaldo Carvalho de Melo.
* 'perf record' enhancements:
. Check maximum frequency rate for record/top, emitting better
error messages, from Jiri Olsa.
. 'perf record' code cleanups, from David Ahern.
. Improve write_output error message in 'perf record', from Adrian
Hunter.
. Allow specifying B/K/M/G unit to the --mmap-pages arguments,
from Jiri Olsa.
. Fix command line callchain attribute tests to handle the new
-g/--call-chain semantics, from Arnaldo Carvalho de Melo.
* 'perf kvm' enhancements:
. Disable live kvm command if timerfd is not supported, from David
Ahern.
. Fix detection of non-core features, from David Ahern.
* 'perf list' enhancements:
. Add usage to 'perf list', from David Ahern.
. Show error in 'perf list' if tracepoints not available, from
Pekka Enberg.
* 'perf probe' enhancements:
. Support "$vars" meta argument syntax for local variables,
allowing asking for all possible variables at a given probe
point to be collected when it hits, from Masami Hiramatsu.
* 'perf sched' enhancements:
. Address the root cause of that 'perf sched' stack initialization
build slowdown, by programmatically setting a big array after
moving the global variable back to the stack. Fix from Adrian
Hunter.
* 'perf script' enhancements:
. Set up output options for in-stream attributes, from Adrian
Hunter.
. Print addr by default for BTS in 'perf script', from Adrian
Juntmer
* 'perf stat' enhancements:
. Improved messages when doing profiling in all or a subset of
CPUs using a workload as the session delimitator, as in:
'perf stat --cpu 0,2 sleep 10s'
from Arnaldo Carvalho de Melo.
. Add units to nanosec-based counters in 'perf stat', from David
Ahern.
. Remove bogus info when using 'perf stat' -e cycles/instructions,
from Ramkumar Ramachandra.
* 'perf lock' enhancements:
. 'perf lock' fixes and cleanups, from Davidlohr Bueso.
* 'perf test' enhancements:
. Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing
and 'perf test', from Adrian Hunter.
. Clarify the "sample parsing" test entry, from Arnaldo Carvalho
de Melo.
. Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test,
from Arnaldo Carvalho de Melo.
. Memory leak fixes in 'perf test', from Felipe Pena.
* 'perf bench' enhancements:
. Change the procps visible command-name of invididual benchmark
tests plus cleanups, from Ingo Molnar.
* Generic perf tooling infrastructure/plumbing changes:
. Separating data file properties from session, code
reorganization from Jiri Olsa.
. Fix version when building out of tree, as when using one of
these:
$ make help | grep perf
perf-tar-src-pkg - Build perf-3.12.0.tar source tarball
perf-targz-src-pkg - Build perf-3.12.0.tar.gz source tarball
perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball
perf-tarxz-src-pkg - Build perf-3.12.0.tar.xz source tarball
$
from David Ahern.
. Enhance option parse error message, showing just the help lines
of the options affected, from Namhyung Kim.
. libtraceevent updates from upstream trace-cmd repo, from Steven
Rostedt.
. Always use perf_evsel__set_sample_bit to set sample_type, from
Adrian Hunter.
. Memory and mmap leak fixes from Chenggang Qin.
. Assorted build fixes for from David Ahern and Jiri Olsa.
. Speed up and prettify the build system, from Ingo Molnar.
. Implement addr2line directly using libbfd, from Roberto Vitillo.
. Separate the GTK support in a separate libperf-gtk.so DSO, that
is only loaded when --gtk is specified, from Namhyung Kim.
. perf bash completion fixes and improvements from Ramkumar
Ramachandra.
. Support for Openembedded/Yocto -dbg packages, from Ricardo
Ribalda Delgado.
And lots and lots of other fixes and code reorganizations that did not
make it into the list, see the shortlog, diffstat and the Git log for
details!"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (300 commits)
uprobes: Fix the memory out of bound overwrite in copy_insn()
uprobes: Fix the wrong usage of current->utask in uprobe_copy_process()
perf tools: Remove unneeded include
perf record: Remove post_processing_offset variable
perf record: Remove advance_output function
perf record: Refactor feature handling into a separate function
perf trace: Don't relookup fields by name in each sample
perf tools: Fix version when building out of tree
perf evsel: Ditch evsel->handler.data field
uprobes: Export write_opcode() as uprobe_write_opcode()
uprobes: Introduce arch_uprobe->ixol
uprobes: Kill module_init() and module_exit()
uprobes: Move function declarations out of arch
perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
perf: Factor out strncpy() in perf_event_mmap_event()
tools/perf: Add required memory barriers
perf: Fix arch_perf_out_copy_user default
perf: Update a stale comment
perf: Optimize perf_output_begin() -- address calculation
...