This does add an IPMI BMC server-side driver, to allow a Linux
system to act as an IPMI controller. That's the biggest change,
but it is just a new driver that is fairly narrow in use.
The other largish change is removing ACPI SPMI probe support,
which should have never really been there in the beginning.
-corey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaw8XOAAoJEGHzjJCRm/+BeoQQAIBokkgHSAl6y8jlmmTmu9fV
gXl0GT97yAMyuTUaTvkWLdWm8GWCrSSUTAqjbcTzMWya87KRODsUg9eTO58fFw8m
ODbD7ruTGmBQggJEqDaS5NHAiU22ce2aDn5I0WjJkH8vaJ3nQSa6nEgBOzXDMC1A
OmaVQE6BnEAIA+nmlzbUgcuduB3iv7gTua2gy7MQSG/5jBPIAOQtpsgHtCZMA+Dt
bk9LqDpOa6dZry7eIn5/u62YRVLX+iVO/DoxCgZyOxVfgnV/ykKT3PvWO9vviBQV
QXXLAzLO3niDQ/ZwwpqrUWjR7oGWDKe4ntMATfnxCP7LQNrSF7LnAxz8oCcwuBxq
ELrwME8LEQmMYGFg3SYcLA0k1teTP2UVWURUH6zBTft2xzo8bJN0JU7J52tTL4xa
yC7JgTVeq8uQB8gkYKf43LGHlXlaxFQsnRKeCJ2JG+1yD+sOmxbZshfiLroMZ5DM
nPKOuMdoqPCd7JVUMs/c0IgcDHd6vtkiDRA0jh38JdIdp9B/i+dFYs61bSzrkBD9
tXUua56QTVl5WEWy2lGX+ZH/JUJbQnTxjiaW99Q9Soml6qXqIVjR5ZvLtls0PeW0
ZytaQEGuV8758YEmOGk/1pUys/oC37mDlyjTpONGYzvAyzJqU258/YU9eJMkcQ/G
FFEEkWILLuGgtFnyxwN/
=p993
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.17' of git://github.com/cminyard/linux-ipmi
Pull IPMI updates from Corey Minyard:
"Mostly small changes, as usual.
This does add an IPMI BMC server-side driver, to allow a Linux system
to act as an IPMI controller. That's the biggest change, but it is
just a new driver that is fairly narrow in use.
The other largish change is removing ACPI SPMI probe support, which
should have never really been there in the beginning"
* tag 'for-linus-4.17' of git://github.com/cminyard/linux-ipmi:
ipmi/parisc: Add IPMI chassis poweroff for certain HP PA-RISC and IA-64 servers
ipmi_ssif: Fix kernel panic at msg_done_handler
ipmi:pci: Blacklist a Realtek "IPMI" device
ipmi: Remove ACPI SPMI probing from the system interface driver
ipmi: Remove ACPI SPMI probing from the SSIF (I2C) driver
ipmi: missing error code in try_smi_init()
ipmi: use ARRAY_SIZE for poweroff_functions array sizing calculation
ipmi: Consolidate cleanup code
ipmi: Remove some unnecessary initializations
ipmi: Fix some error cleanup issues
ipmi: Add or fix SPDX-License-Identifier in all files
ipmi: Re-use existing macros for built-in properties
ipmi:pci: Make the PCI defines consistent with normal Linux ones
ipmi: kcs_bmc: coding-style fixes and use new poll type
char/ipmi: add documentation for sysfs interface
ipmi: kcs_bmc: mark expected switch fall-through in kcs_bmc_handle_data
ipmi: add an Aspeed KCS IPMI BMC driver
ipmi: add a KCS IPMI BMC driver
Could be useful for a target to return stats or other information.
If a target does DMEMIT() anything to @result from its .message method
then it must return 1 to the caller.
Signed-off-By: Mike Snitzer <snitzer@redhat.com>
This removes the entire architecture code for blackfin, cris, frv, m32r,
metag, mn10300, score, and tile, including the associated device drivers.
I have been working with the (former) maintainers for each one to ensure
that my interpretation was right and the code is definitely unused in
mainline kernels. Many had fond memories of working on the respective
ports to start with and getting them included in upstream, but also saw
no point in keeping the port alive without any users.
In the end, it seems that while the eight architectures are extremely
different, they all suffered the same fate: There was one company
in charge of an SoC line, a CPU microarchitecture and a software
ecosystem, which was more costly than licensing newer off-the-shelf
CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems
that all the SoC product lines are still around, but have not used the
custom CPU architectures for several years at this point. In contrast,
CPU instruction sets that remain popular and have actively maintained
kernel ports tend to all be used across multiple licensees.
The removal came out of a discussion that is now documented at
https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
marking any ports as deprecated but remove them all at once after I made
sure that they are all unused. Some architectures (notably tile, mn10300,
and blackfin) are still being shipped in products with old kernels,
but those products will never be updated to newer kernel releases.
After this series, we still have a few architectures without mainline
gcc support:
- unicore32 and hexagon both have very outdated gcc releases, but the
maintainers promised to work on providing something newer. At least
in case of hexagon, this will only be llvm, not gcc.
- openrisc, risc-v and nds32 are still in the process of finishing their
support or getting it added to mainline gcc in the first place.
They all have patched gcc-7.3 ports that work to some degree, but
complete upstream support won't happen before gcc-8.1. Csky posted
their first kernel patch set last week, their situation will be similar.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJawdL2AAoJEGCrR//JCVInuH0P/RJAZh1nTD+TR34ZhJq2TBoo
PgygwDU7Z2+tQVU+EZ453Gywz9/NMRFk1RWAZqrLix4ZtyIMvC6A1qfT2yH1Y7Fb
Qh6tccQeLe4ezq5u4S/46R/fQXu3Txr92yVwzJJUuPyU0arF9rv5MmI8e6p7L1en
yb74kSEaCe+/eMlsEj1Cc1dgthDNXGKIURHkRsILoweysCpesjiTg4qDcL+yTibV
FP2wjVbniKESMKS6qL71tiT5sexvLsLwMNcGiHPj94qCIQuI7DLhLdBVsL5Su6gI
sbtgv0dsq4auRYAbQdMaH1hFvu6WptsuttIbOMnz2Yegi2z28H8uVXkbk2WVLbqG
ZESUwutGh8MzOL2RJ4jyyQq5sfo++CRGlfKjr6ImZRv03dv0pe/W85062cK5cKNs
cgDDJjGRorOXW7dyU6jG2gRqODOQBObIv3w5efdq5OgzOWlbI4EC+Y5u1Z0JF/76
pSwtGXA6YhwC+9LLAlnVTHG+yOwuLmAICgoKcTbzTVDKA2YQZG/cYuQfI5S1wD8e
X6urPx3Md2GCwLXQ9mzKBzKZUpu/Tuhx0NvwF4qVxy6x1PELjn68zuP7abDHr46r
57/09ooVN+iXXnEGMtQVS/OPvYHSa2NgTSZz6Y86lCRbZmUOOlK31RDNlMvYNA+s
3iIVHovno/JuJnTOE8LY
=fQ8z
-----END PGP SIGNATURE-----
Merge tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pul removal of obsolete architecture ports from Arnd Bergmann:
"This removes the entire architecture code for blackfin, cris, frv,
m32r, metag, mn10300, score, and tile, including the associated device
drivers.
I have been working with the (former) maintainers for each one to
ensure that my interpretation was right and the code is definitely
unused in mainline kernels. Many had fond memories of working on the
respective ports to start with and getting them included in upstream,
but also saw no point in keeping the port alive without any users.
In the end, it seems that while the eight architectures are extremely
different, they all suffered the same fate: There was one company in
charge of an SoC line, a CPU microarchitecture and a software
ecosystem, which was more costly than licensing newer off-the-shelf
CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
seems that all the SoC product lines are still around, but have not
used the custom CPU architectures for several years at this point. In
contrast, CPU instruction sets that remain popular and have actively
maintained kernel ports tend to all be used across multiple licensees.
[ See the new nds32 port merged in the previous commit for the next
generation of "one company in charge of an SoC line, a CPU
microarchitecture and a software ecosystem" - Linus ]
The removal came out of a discussion that is now documented at
https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
marking any ports as deprecated but remove them all at once after I
made sure that they are all unused. Some architectures (notably tile,
mn10300, and blackfin) are still being shipped in products with old
kernels, but those products will never be updated to newer kernel
releases.
After this series, we still have a few architectures without mainline
gcc support:
- unicore32 and hexagon both have very outdated gcc releases, but the
maintainers promised to work on providing something newer. At least
in case of hexagon, this will only be llvm, not gcc.
- openrisc, risc-v and nds32 are still in the process of finishing
their support or getting it added to mainline gcc in the first
place. They all have patched gcc-7.3 ports that work to some
degree, but complete upstream support won't happen before gcc-8.1.
Csky posted their first kernel patch set last week, their situation
will be similar
[ Palmer Dabbelt points out that RISC-V support is in mainline gcc
since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]"
This really says it all:
2498 files changed, 95 insertions(+), 467668 deletions(-)
* tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
MAINTAINERS: UNICORE32: Change email account
staging: iio: remove iio-trig-bfin-timer driver
tty: hvc: remove tile driver
tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
serial: remove tile uart driver
serial: remove m32r_sio driver
serial: remove blackfin drivers
serial: remove cris/etrax uart drivers
usb: Remove Blackfin references in USB support
usb: isp1362: remove blackfin arch glue
usb: musb: remove blackfin port
usb: host: remove tilegx platform glue
pwm: remove pwm-bfin driver
i2c: remove bfin-twi driver
spi: remove blackfin related host drivers
watchdog: remove bfin_wdt driver
can: remove bfin_can driver
mmc: remove bfin_sdh driver
input: misc: remove blackfin rotary driver
input: keyboard: remove bf54x driver
...
Pull perf updates from Ingo Molnar:
"The main kernel side changes were:
- Modernize the kprobe and uprobe creation/destruction tooling ABIs:
The existing text based APIs (kprobe_events and uprobe_events in
tracefs), are naive, limited ABIs in that they require user-space
to clean up after themselves, which is both difficult and fragile
if the tool is buggy or exits unexpectedly. In other words they are
not really suited for modern, robust tooling.
So introduce a modern, file descriptor based ABI that does not have
these limitations: introduce the 'perf_kprobe' and 'perf_uprobe'
PMUs and extend the perf_event_open() syscall to create events with
a kprobe/uprobe attached to them. These [k,u]probe are associated
with this file descriptor, so they are not available in tracefs.
(Song Liu)
- Intel Cannon Lake CPU support (Harry Pan)
- Intel PT cleanups (Alexander Shishkin)
- Improve the performance of pinned/flexible event groups by using RB
trees (Alexey Budankov)
- Add PERF_EVENT_IOC_MODIFY_ATTRIBUTES which allows the modification
of hardware breakpoints, which new ABI variant massively speeds up
existing tooling that uses hardware breakpoints to instrument (and
debug) memory usage.
(Milind Chabbi, Jiri Olsa)
- Various Intel PEBS handling fixes and improvements, and other Intel
PMU improvements (Kan Liang)
- Various perf core improvements and optimizations (Peter Zijlstra)
- ... misc cleanups, fixes and updates.
There's over 200 tooling commits, here's an (imperfect) list of
highlights:
- 'perf annotate' improvements:
* Recognize and handle jumps to other functions as calls, which
improves the navigation along jumps and back. (Arnaldo Carvalho
de Melo)
* Add the 'P' hotkey in TUI annotation to dump annotation output
into a file, to ease e-mail reporting of annotation details.
(Arnaldo Carvalho de Melo)
* Add an IPC/cycles column to the TUI (Jin Yao)
* Improve s390 assembly annotation (Thomas Richter)
* Refactor the output formatting logic to better separate it into
interactive and non-interactive features and add the --stdio2
output variant to demonstrate this. (Arnaldo Carvalho de Melo)
- 'perf script' improvements:
* Add Python 3 support (Jaroslav Škarvada)
* Add --show-round-event (Jiri Olsa)
- 'perf c2c' improvements:
* Add NUMA analysis support (Jiri Olsa)
- 'perf trace' improvements:
* Improve PowerPC support (Ravi Bangoria)
- 'perf inject' improvements:
* Integrate ARM CoreSight traces (Robert Walker)
- 'perf stat' improvements:
* Add the --interval-count option (yuzhoujian)
* Add the --timeout option (yuzhoujian)
- 'perf sched' improvements (Changbin Du)
- Vendor events improvements :
* Add IBM s390 vendor events (Thomas Richter)
* Add and improve arm64 vendor events (John Garry, Ganapatrao
Kulkarni)
* Update POWER9 vendor events (Sukadev Bhattiprolu)
- Intel PT tooling improvements (Adrian Hunter)
- PMU handling improvements (Agustin Vega-Frias)
- Record machine topology in perf.data (Jiri Olsa)
- Various overwrite related cleanups (Kan Liang)
- Add arm64 dwarf post unwind support (Kim Phillips, Jean Pihet)
- ... and lots of other changes, cleanups and fixes, see the shortlog
and Git history for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf/x86/intel: Enable C-state residency events for Cannon Lake
perf/x86/intel: Add Cannon Lake support for RAPL profiling
perf/x86/pt, coresight: Clean up address filter structure
perf vendor events s390: Add JSON files for IBM z14
perf vendor events s390: Add JSON files for IBM z13
perf vendor events s390: Add JSON files for IBM zEC12 zBC12
perf vendor events s390: Add JSON files for IBM z196
perf vendor events s390: Add JSON files for IBM z10EC z10BC
perf mmap: Be consistent when checking for an unmaped ring buffer
perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done()
perf build: Fix check-headers.sh opts assignment
perf/x86: Update rdpmc_always_available static key to the modern API
perf annotate: Use absolute addresses to calculate jump target offsets
perf annotate: Defer searching for comma in raw line till it is needed
perf annotate: Support jumping from one function to another
perf annotate: Add "_local" to jump/offset validation routines
perf python: Reference Py_None before returning it
perf annotate: Mark jumps to outher functions with the call arrow
perf annotate: Pass function descriptor to its instruction parsing routines
perf annotate: No need to calculate notes->start twice
...
This is a *very* big release for ASoC. Not much change in the core but
there s the transition of all the individual drivers over to components
which is intended to support further core work. The goal is to make it
easier to do further core work by removing the need to special case all
the different driver classes in the core, many of the devices end up
being used in multiple roles in modern systems.
We also have quite a lot of new drivers added this month of all kinds,
quite a few for simple devices but also some more advanced ones with
more substantial code.
- The biggest thing is the huge series from Morimoto-san which
converted everything over to components. This is a huge change by
code volume but was fairly mechanical
- Many fixes for some of the Realtek based Baytrail systems covering
both the CODECs and the CPUs, contributed by Hans de Goode.
- Lots of cleanups for Samsung based Odroid systems from Sylwester
Nawrocki.
- The Freescale SSI driver also got a lot of cleanups from Nicolin
Chen.
- The Blackfin drivers have been removed as part of the removal of the
architecture.
- New drivers for AKM AK4458 and AK5558, several AMD based machines,
several Intel based machines, Maxim MAX9759, Motorola CPCAP,
Socionext Uniphier SoCs, and TI PCM1789 and TDA7419
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlrCUIYTHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0A29B/sGkDyeoSTkvAIIu1cmVAIdpxz/MniC
2/KOVlZkIPV2WqS7wdzadJhTw8Xv/yX+By6w5dYQZyBsw9elYr/AvDomqetEwJfo
229jJGWxFbxNxgSo0gNeo5bL44ISjLK8TUw72YN3M1a15XvxF4NQwxmw3/5FYLHB
i3bxUd+nBTtshnnBTZFCvraF7kgm2OT1wQJgOiD6fWD4eSrIUrnp5kmUzvkrtMEA
PjKWV3k8d4xc1r5IDraX/saUYeoXQ/3cGkktWtc/AmqEf+mLI1iYpdhbAeiEqyNU
mkhcuMwF4E1qaMP0GgifhWnDgEyp4GvMUYkM21EjgKrOxgraMw3NcgX9
=JfFJ
-----END PGP SIGNATURE-----
Merge tag 'asoc-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v4.17
This is a *very* big release for ASoC. Not much change in the core but
there s the transition of all the individual drivers over to components
which is intended to support further core work. The goal is to make it
easier to do further core work by removing the need to special case all
the different driver classes in the core, many of the devices end up
being used in multiple roles in modern systems.
We also have quite a lot of new drivers added this month of all kinds,
quite a few for simple devices but also some more advanced ones with
more substantial code.
- The biggest thing is the huge series from Morimoto-san which
converted everything over to components. This is a huge change by
code volume but was fairly mechanical
- Many fixes for some of the Realtek based Baytrail systems covering
both the CODECs and the CPUs, contributed by Hans de Goode.
- Lots of cleanups for Samsung based Odroid systems from Sylwester
Nawrocki.
- The Freescale SSI driver also got a lot of cleanups from Nicolin
Chen.
- The Blackfin drivers have been removed as part of the removal of the
architecture.
- New drivers for AKM AK4458 and AK5558, several AMD based machines,
several Intel based machines, Maxim MAX9759, Motorola CPCAP,
Socionext Uniphier SoCs, and TI PCM1789 and TDA7419
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c,
we had some overlapping changes:
1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE -->
MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE
2) In 'net-next' params->log_rq_size is renamed to be
params->log_rq_mtu_frames.
3) In 'net-next' params->hard_mtu is added.
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-03-31
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add raw BPF tracepoint API in order to have a BPF program type that
can access kernel internal arguments of the tracepoints in their
raw form similar to kprobes based BPF programs. This infrastructure
also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which
returns an anon-inode backed fd for the tracepoint object that allows
for automatic detach of the BPF program resp. unregistering of the
tracepoint probe on fd release, from Alexei.
2) Add new BPF cgroup hooks at bind() and connect() entry in order to
allow BPF programs to reject, inspect or modify user space passed
struct sockaddr, and as well a hook at post bind time once the port
has been allocated. They are used in FB's container management engine
for implementing policy, replacing fragile LD_PRELOAD wrapper
intercepting bind() and connect() calls that only works in limited
scenarios like glibc based apps but not for other runtimes in
containerized applications, from Andrey.
3) BPF_F_INGRESS flag support has been added to sockmap programs for
their redirect helper call bringing it in line with cls_bpf based
programs. Support is added for both variants of sockmap programs,
meaning for tx ULP hooks as well as recv skb hooks, from John.
4) Various improvements on BPF side for the nfp driver, besides others
this work adds BPF map update and delete helper call support from
the datapath, JITing of 32 and 64 bit XADD instructions as well as
offload support of bpf_get_prandom_u32() call. Initial implementation
of nfp packet cache has been tackled that optimizes memory access
(see merge commit for further details), from Jakub and Jiong.
5) Removal of struct bpf_verifier_env argument from the print_bpf_insn()
API has been done in order to prepare to use print_bpf_insn() soon
out of perf tool directly. This makes the print_bpf_insn() API more
generic and pushes the env into private data. bpftool is adjusted
as well with the print_bpf_insn() argument removal, from Jiri.
6) Couple of cleanups and prep work for the upcoming BTF (BPF Type
Format). The latter will reuse the current BPF verifier log as
well, thus bpf_verifier_log() is further generalized, from Martin.
7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read
and write support has been added in similar fashion to existing
IPv6 IPV6_TCLASS socket option we already have, from Nikita.
8) Fixes in recent sockmap scatterlist API usage, which did not use
sg_init_table() for initialization thus triggering a BUG_ON() in
scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and
uses a small helper sg_init_marker() to properly handle the affected
cases, from Prashant.
9) Let the BPF core follow IDR code convention and therefore use the
idr_preload() and idr_preload_end() helpers, which would also help
idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory
pressure, from Shaohua.
10) Last but not least, a spelling fix in an error message for the
BPF cookie UID helper under BPF sample code, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc points out that the combined length of the fixed-length inputs to
l->name is larger than the destination buffer size:
net/tipc/link.c: In function 'tipc_link_create':
net/tipc/link.c:465:26: error: '%s' directive writing up to 32 bytes
into a region of size between 26 and 58 [-Werror=format-overflow=]
sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str);
net/tipc/link.c:465:2: note: 'sprintf' output 11 or more bytes
(assuming 75) into a destination of size 60
sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str);
A detailed analysis reveals that the theoretical maximum length of
a link name is:
max self_str + 1 + max if_name + 1 + max peer_str + 1 + max if_name =
16 + 1 + 15 + 1 + 16 + 1 + 15 = 65
Since we also need space for a trailing zero we now set MAX_LINK_NAME
to 68.
Just to be on the safe side we also replace the sprintf() call with
snprintf().
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address
hash values")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The three address type structs in the user API have names that in
reality reflect the specific, non-Linux environment where they were
originally created.
We now give them more intuitive names, in accordance with how TIPC is
described in the current documentation.
struct tipc_portid -> struct tipc_socket_addr
struct tipc_name -> struct tipc_service_addr
struct tipc_name_seq -> struct tipc_service_range
To avoid confusion, we also update some commmets and macro names to
match the new terminology.
For compatibility, we add macros that map all old names to the new ones.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"Post-hooks" are hooks that are called right before returning from
sys_bind. At this time IP and port are already allocated and no further
changes to `struct sock` can happen before returning from sys_bind but
BPF program has a chance to inspect the socket and change sys_bind
result.
Specifically it can e.g. inspect what port was allocated and if it
doesn't satisfy some policy, BPF program can force sys_bind to fail and
return EPERM to user.
Another example of usage is recording the IP:port pair to some map to
use it in later calls to sys_connect. E.g. if some TCP server inside
cgroup was bound to some IP:port_n, it can be recorded to a map. And
later when some TCP client inside same cgroup is trying to connect to
127.0.0.1:port_n, BPF hook for sys_connect can override the destination
and connect application to IP:port_n instead of 127.0.0.1:port_n. That
helps forcing all applications inside a cgroup to use desired IP and not
break those applications if they e.g. use localhost to communicate
between each other.
== Implementation details ==
Post-hooks are implemented as two new attach types
`BPF_CGROUP_INET4_POST_BIND` and `BPF_CGROUP_INET6_POST_BIND` for
existing prog type `BPF_PROG_TYPE_CGROUP_SOCK`.
Separate attach types for IPv4 and IPv6 are introduced to avoid access
to IPv6 field in `struct sock` from `inet_bind()` and to IPv4 field from
`inet6_bind()` since those fields might not make sense in such cases.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
== The problem ==
See description of the problem in the initial patch of this patch set.
== The solution ==
The patch provides much more reliable in-kernel solution for the 2nd
part of the problem: making outgoing connecttion from desired IP.
It adds new attach types `BPF_CGROUP_INET4_CONNECT` and
`BPF_CGROUP_INET6_CONNECT` for program type
`BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that can be used to override both
source and destination of a connection at connect(2) time.
Local end of connection can be bound to desired IP using newly
introduced BPF-helper `bpf_bind()`. It allows to bind to only IP though,
and doesn't support binding to port, i.e. leverages
`IP_BIND_ADDRESS_NO_PORT` socket option. There are two reasons for this:
* looking for a free port is expensive and can affect performance
significantly;
* there is no use-case for port.
As for remote end (`struct sockaddr *` passed by user), both parts of it
can be overridden, remote IP and remote port. It's useful if an
application inside cgroup wants to connect to another application inside
same cgroup or to itself, but knows nothing about IP assigned to the
cgroup.
Support is added for IPv4 and IPv6, for TCP and UDP.
IPv4 and IPv6 have separate attach types for same reason as sys_bind
hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields
when user passes sockaddr_in since it'd be out-of-bound.
== Implementation notes ==
The patch introduces new field in `struct proto`: `pre_connect` that is
a pointer to a function with same signature as `connect` but is called
before it. The reason is in some cases BPF hooks should be called way
before control is passed to `sk->sk_prot->connect`. Specifically
`inet_dgram_connect` autobinds socket before calling
`sk->sk_prot->connect` and there is no way to call `bpf_bind()` from
hooks from e.g. `ip4_datagram_connect` or `ip6_datagram_connect` since
it'd cause double-bind. On the other hand `proto.pre_connect` provides a
flexible way to add BPF hooks for connect only for necessary `proto` and
call them at desired time before `connect`. Since `bpf_bind()` is
allowed to bind only to IP and autobind in `inet_dgram_connect` binds
only port there is no chance of double-bind.
bpf_bind() sets `force_bind_address_no_port` to bind to only IP despite
of value of `bind_address_no_port` socket field.
bpf_bind() sets `with_lock` to `false` when calling to __inet_bind()
and __inet6_bind() since all call-sites, where bpf_bind() is called,
already hold socket lock.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
== The problem ==
There is a use-case when all processes inside a cgroup should use one
single IP address on a host that has multiple IP configured. Those
processes should use the IP for both ingress and egress, for TCP and UDP
traffic. So TCP/UDP servers should be bound to that IP to accept
incoming connections on it, and TCP/UDP clients should make outgoing
connections from that IP. It should not require changing application
code since it's often not possible.
Currently it's solved by intercepting glibc wrappers around syscalls
such as `bind(2)` and `connect(2)`. It's done by a shared library that
is preloaded for every process in a cgroup so that whenever TCP/UDP
server calls `bind(2)`, the library replaces IP in sockaddr before
passing arguments to syscall. When application calls `connect(2)` the
library transparently binds the local end of connection to that IP
(`bind(2)` with `IP_BIND_ADDRESS_NO_PORT` to avoid performance penalty).
Shared library approach is fragile though, e.g.:
* some applications clear env vars (incl. `LD_PRELOAD`);
* `/etc/ld.so.preload` doesn't help since some applications are linked
with option `-z nodefaultlib`;
* other applications don't use glibc and there is nothing to intercept.
== The solution ==
The patch provides much more reliable in-kernel solution for the 1st
part of the problem: binding TCP/UDP servers on desired IP. It does not
depend on application environment and implementation details (whether
glibc is used or not).
It adds new eBPF program type `BPF_PROG_TYPE_CGROUP_SOCK_ADDR` and
attach types `BPF_CGROUP_INET4_BIND` and `BPF_CGROUP_INET6_BIND`
(similar to already existing `BPF_CGROUP_INET_SOCK_CREATE`).
The new program type is intended to be used with sockets (`struct sock`)
in a cgroup and provided by user `struct sockaddr`. Pointers to both of
them are parts of the context passed to programs of newly added types.
The new attach types provides hooks in `bind(2)` system call for both
IPv4 and IPv6 so that one can write a program to override IP addresses
and ports user program tries to bind to and apply such a program for
whole cgroup.
== Implementation notes ==
[1]
Separate attach types for `AF_INET` and `AF_INET6` are added
intentionally to prevent reading/writing to offsets that don't make
sense for corresponding socket family. E.g. if user passes `sockaddr_in`
it doesn't make sense to read from / write to `user_ip6[]` context
fields.
[2]
The write access to `struct bpf_sock_addr_kern` is implemented using
special field as an additional "register".
There are just two registers in `sock_addr_convert_ctx_access`: `src`
with value to write and `dst` with pointer to context that can't be
changed not to break later instructions. But the fields, allowed to
write to, are not available directly and to access them address of
corresponding pointer has to be loaded first. To get additional register
the 1st not used by `src` and `dst` one is taken, its content is saved
to `bpf_sock_addr_kern.tmp_reg`, then the register is used to load
address of pointer field, and finally the register's content is restored
from the temporary field after writing `src` value.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
== The problem ==
There are use-cases when a program of some type can be attached to
multiple attach points and those attach points must have different
permissions to access context or to call helpers.
E.g. context structure may have fields for both IPv4 and IPv6 but it
doesn't make sense to read from / write to IPv6 field when attach point
is somewhere in IPv4 stack.
Same applies to BPF-helpers: it may make sense to call some helper from
some attach point, but not from other for same prog type.
== The solution ==
Introduce `expected_attach_type` field in in `struct bpf_attr` for
`BPF_PROG_LOAD` command. If scenario described in "The problem" section
is the case for some prog type, the field will be checked twice:
1) At load time prog type is checked to see if attach type for it must
be known to validate program permissions correctly. Prog will be
rejected with EINVAL if it's the case and `expected_attach_type` is
not specified or has invalid value.
2) At attach time `attach_type` is compared with `expected_attach_type`,
if prog type requires to have one, and, if they differ, attach will
be rejected with EINVAL.
The `expected_attach_type` is now available as part of `struct bpf_prog`
in both `bpf_verifier_ops->is_valid_access()` and
`bpf_verifier_ops->get_func_proto()` () and can be used to check context
accesses and calls to helpers correspondingly.
Initially the idea was discussed by Alexei Starovoitov <ast@fb.com> and
Daniel Borkmann <daniel@iogearbox.net> here:
https://marc.info/?l=linux-netdev&m=152107378717201&w=2
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The `__u64 time` field of the blk_io_trace struct refers to
the time in nanoseconds, not in microseconds. It is set in
__blk_add_trace, which does the following:
t->time = ktime_to_ns(ktime_get());
ktime_to_ns returns ktime_t in nanoseconds, not microseconds.
Signed-off-by: Souvik Banerjee <souvik1997@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for your net-next
tree. This batch comes with more input sanitization for xtables to
address bug reports from fuzzers, preparation works to the flowtable
infrastructure and assorted updates. In no particular order, they are:
1) Make sure userspace provides a valid standard target verdict, from
Florian Westphal.
2) Sanitize error target size, also from Florian.
3) Validate that last rule in basechain matches underflow/policy since
userspace assumes this when decoding the ruleset blob that comes
from the kernel, from Florian.
4) Consolidate hook entry checks through xt_check_table_hooks(),
patch from Florian.
5) Cap ruleset allocations at 512 mbytes, 134217728 rules and reject
very large compat offset arrays, so we have a reasonable upper limit
and fuzzers don't exercise the oom-killer. Patches from Florian.
6) Several WARN_ON checks on xtables mutex helper, from Florian.
7) xt_rateest now has a hashtable per net, from Cong Wang.
8) Consolidate counter allocation in xt_counters_alloc(), from Florian.
9) Earlier xt_table_unlock() call in {ip,ip6,arp,eb}tables, patch
from Xin Long.
10) Set FLOW_OFFLOAD_DIR_* to IP_CT_DIR_* definitions, patch from
Felix Fietkau.
11) Consolidate code through flow_offload_fill_dir(), also from Felix.
12) Inline ip6_dst_mtu_forward() just like ip_dst_mtu_maybe_forward()
to remove a dependency with flowtable and ipv6.ko, from Felix.
13) Cache mtu size in flow_offload_tuple object, this is safe for
forwarding as f87c10a8aa describes, from Felix.
14) Rename nf_flow_table.c to nf_flow_table_core.o, to simplify too
modular infrastructure, from Felix.
15) Add rt0, rt2 and rt4 IPv6 routing extension support, patch from
Ahmed Abdelsalam.
16) Remove unused parameter in nf_conncount_count(), from Yi-Hung Wei.
17) Support for counting only to nf_conncount infrastructure, patch
from Yi-Hung Wei.
18) Add strict NFT_CT_{SRC_IP,DST_IP,SRC_IP6,DST_IP6} key datatypes
to nft_ct.
19) Use boolean as return value from ipt_ah and from IPVS too, patch
from Gustavo A. R. Silva.
20) Remove useless parameters in nfnl_acct_overquota() and
nf_conntrack_broadcast_help(), from Taehee Yoo.
21) Use ipv6_addr_is_multicast() from xt_cluster, also from Taehee Yoo.
22) Statify nf_tables_obj_lookup_byhandle, patch from Fengguang Wu.
23) Fix typo in xt_limit, from Geert Uytterhoeven.
24) Do no use VLAs in Netfilter code, again from Gustavo.
25) Use ADD_COUNTER from ebtables, from Taehee Yoo.
26) Bitshift support for CONNMARK and MARK targets, from Jack Ma.
27) Use pr_*() and add pr_fmt(), from Arushi Singhal.
28) Add synproxy support to ctnetlink.
29) ICMP type and IGMP matching support for ebtables, patches from
Matthias Schiffer.
30) Support for the revision infrastructure to ebtables, from
Bernie Harris.
31) String match support for ebtables, also from Bernie.
32) Documentation for the new flowtable infrastructure.
33) Use generic comparison functions in ebt_stp, from Joe Perches.
34) Demodularize filter chains in nftables.
35) Register conntrack hooks in case nftables NAT chain is added.
36) Merge assignments with return in a couple of spots in the
Netfilter codebase, also from Arushi.
37) Document that xtables percpu counters are stored in the same
memory area, from Ben Hutchings.
38) Revert mark_source_chains() sanity checks that break existing
rulesets, from Florian Westphal.
39) Use is_zero_ether_addr() in the ipset codebase, from Joe Perches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently ebtables assumes that the revision number of all match
modules is 0, which is an issue when trying to use existing
xtables matches with ebtables. The solution is to modify ebtables
to allow extensions to specify a revision number, similar to
iptables. This gets passed down to the kernel, which is then able
to find the match module correctly.
To main binary backwards compatibility, the size of the ebt_entry
structures is not changed, only the size of the name field is
decreased by 1 byte to make room for the revision field.
Signed-off-by: Bernie Harris <bernie.harris@alliedtelesis.co.nz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
first bullet here:
* EAPoL-over-nl80211 from Denis - this will let us fix
some long-standing issues with bridging, races with
encryption and more
* DFS offload support from the qtnfmac folks
* regulatory database changes for the new ETSI adaptivity
requirements
* various other fixes and small enhancements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlq85QQACgkQB8qZga/f
l8Sdbg//bc8C/4F1TUdJqZWGK1j9Lwd6nZpP0iFqH5Ees3MMnti5XdV2dCL31ivz
c8DOuRAU8ZG/RLPgSTHVZHwh+f7S5/TxSKg8WOBvrYk7a0C1uvVVhe5XZQEmqE7g
eqM0+UQ5DyzUYnu0kSUrFKPV7BqDa2YzVDdK8e8iozqZmAnvGN9k8H7EDEeUxtxk
LEl+bEcmhDSfIssU2Iaksl+9qoZP6BkoVGAOmDzIL654WV4XVKorxRX6vndqSQGu
0cCz2Occ+/0hfvszONBRR4M/gtI/Yyn3u+D1Q0YD3X40Q9gJE11fcodmMT61l5C7
rGcu94RIGilvRvjZScK3giiU2L7DD+VETUa+YGnjd8gLpmrYd6cURxlm4yuHbw8C
UScLCApAUuY+skmPLeuyHW9mnzHaC336vzVjk8OhdNRhX23/rB8nk1yIywgqETVW
g/iub8/Xp6TRfdyh76I18wlfqCp1It2JAeICgKH5NPlwUA6U0xFR0/ddSR8FuAcK
ZLY8mgsc2kIH6r4x5sjeH+Yb6tGi/Z3HMZM2hna+t4vSpn6Q5+GPsA6yuHuBUhJb
8QswMiLDSux8I4guKgQyROiHaCzE3zOigJ4o1z9XITKsgluZVxnKr+ETKdr88WFp
II8U0qH/kejXIfxUjbv5Wla70J9wi/hjxR6vOfSkEtYNvIApdfc=
=hI0F
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2018-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
We have a fair number of patches, but many of them are from the
first bullet here:
* EAPoL-over-nl80211 from Denis - this will let us fix
some long-standing issues with bridging, races with
encryption and more
* DFS offload support from the qtnfmac folks
* regulatory database changes for the new ETSI adaptivity
requirements
* various other fixes and small enhancements
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
These are:
* Mass conversion to GPL-2 SPDX header
* Moved "hwtracing" to now its own submenu, to uncrowd the parent menu a bit
* Added MAINTAINERS entry for drivers/hwtracing
* Somewhat small Trace Hub fixes
* Added ACPI glue layer for the Trace Hub
* Added more module parameters to dummy_stm for better test coverage
-----BEGIN PGP SIGNATURE-----
iJkEABEIAEEWIQQSviFCoXpKPDNATbnrxfYkYwVX/wUCWrzG7yMcYWxleGFuZGVy
LnNoaXNoa2luQGxpbnV4LmludGVsLmNvbQAKCRDrxfYkYwVX/0jcAQCE/maw6L5P
NdT3Ck3EuDWzij9uimq3Y0UyI9F4Cd+vYgD+N8Skyiua+NqfBGB0oXNsJt2acirg
O3G990RK9dX4H2c=
=CuGw
-----END PGP SIGNATURE-----
Merge tag 'stm-intel_th-for-greg-20180329' of git://git.kernel.org/pub/scm/linux/kernel/git/ash/stm into char-misc-next
Alexander writes:
stm class/intel_th: Updates for 4.17
These are:
* Mass conversion to GPL-2 SPDX header
* Moved "hwtracing" to now its own submenu, to uncrowd the parent menu a bit
* Added MAINTAINERS entry for drivers/hwtracing
* Somewhat small Trace Hub fixes
* Added ACPI glue layer for the Trace Hub
* Added more module parameters to dummy_stm for better test coverage
This commit implements the TX side of NL80211_CMD_CONTROL_PORT_FRAME.
Userspace provides the raw EAPoL frame using NL80211_ATTR_FRAME.
Userspace should also provide the destination address and the protocol
type to use when sending the frame. This is used to implement TX of
Pre-authentication frames. If CONTROL_PORT_ETHERTYPE_NO_ENCRYPT is
specified, then the driver will be asked not to encrypt the outgoing
frame.
A new EXT_FEATURE flag is introduced so that nl80211 code can check
whether a given wiphy has capability to pass EAPoL frames over nl80211.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This commit also adds cfg80211_rx_control_port function. This is used
to generate a CMD_CONTROL_PORT_FRAME event out to userspace. The
conn_owner_nlportid is used as the unicast destination. This means that
userspace must specify NL80211_ATTR_SOCKET_OWNER flag if control port
over nl80211 routing is requested in NL80211_CMD_CONNECT,
NL80211_CMD_ASSOCIATE, NL80211_CMD_START_AP or IBSS/mesh join.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
[johannes: fix return value of cfg80211_rx_control_port()]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
[johannes: fix race with wdev lock/unlock by just acquiring once]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
[johannes: fix race with wdev lock/unlock by just acquiring once]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Introduce BPF_PROG_TYPE_RAW_TRACEPOINT bpf program type to access
kernel internal arguments of the tracepoints in their raw form.
>From bpf program point of view the access to the arguments look like:
struct bpf_raw_tracepoint_args {
__u64 args[0];
};
int bpf_prog(struct bpf_raw_tracepoint_args *ctx)
{
// program can read args[N] where N depends on tracepoint
// and statically verified at program load+attach time
}
kprobe+bpf infrastructure allows programs access function arguments.
This feature allows programs access raw tracepoint arguments.
Similar to proposed 'dynamic ftrace events' there are no abi guarantees
to what the tracepoints arguments are and what their meaning is.
The program needs to type cast args properly and use bpf_probe_read()
helper to access struct fields when argument is a pointer.
For every tracepoint __bpf_trace_##call function is prepared.
In assembler it looks like:
(gdb) disassemble __bpf_trace_xdp_exception
Dump of assembler code for function __bpf_trace_xdp_exception:
0xffffffff81132080 <+0>: mov %ecx,%ecx
0xffffffff81132082 <+2>: jmpq 0xffffffff811231f0 <bpf_trace_run3>
where
TRACE_EVENT(xdp_exception,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp, u32 act),
The above assembler snippet is casting 32-bit 'act' field into 'u64'
to pass into bpf_trace_run3(), while 'dev' and 'xdp' args are passed as-is.
All of ~500 of __bpf_trace_*() functions are only 5-10 byte long
and in total this approach adds 7k bytes to .text.
This approach gives the lowest possible overhead
while calling trace_xdp_exception() from kernel C code and
transitioning into bpf land.
Since tracepoint+bpf are used at speeds of 1M+ events per second
this is valuable optimization.
The new BPF_RAW_TRACEPOINT_OPEN sys_bpf command is introduced
that returns anon_inode FD of 'bpf-raw-tracepoint' object.
The user space looks like:
// load bpf prog with BPF_PROG_TYPE_RAW_TRACEPOINT type
prog_fd = bpf_prog_load(...);
// receive anon_inode fd for given bpf_raw_tracepoint with prog attached
raw_tp_fd = bpf_raw_tracepoint_open("xdp_exception", prog_fd);
Ctrl-C of tracing daemon or cmdline tool that uses this feature
will automatically detach bpf program, unload it and
unregister tracepoint probe.
On the kernel side the __bpf_raw_tp_map section of pointers to
tracepoint definition and to __bpf_trace_*() probe function is used
to find a tracepoint with "xdp_exception" name and
corresponding __bpf_trace_xdp_exception() probe function
which are passed to tracepoint_probe_register() to connect probe
with tracepoint.
Addition of bpf_raw_tracepoint doesn't interfere with ftrace and perf
tracepoint mechanisms. perf_event_open() can be used in parallel
on the same tracepoint.
Multiple bpf_raw_tracepoint_open("xdp_exception", prog_fd) are permitted.
Each with its own bpf program. The kernel will execute
all tracepoint probes and all attached bpf programs.
In the future bpf_raw_tracepoints can be extended with
query/introspection logic.
__bpf_raw_tp_map section logic was contributed by Steven Rostedt
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
To allow for more flexible testing of the stm class, make it possible
to specify the ranges of masters and channels that the dummy_stm devices
cover. This is done via module parameters.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
This adds SPDX GPL-2.0 header to to stm core files and removes the
GPLv2 boilerplate text.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
- GPUVM support for dGPUs
- KFD events support for dGPUs
- Fix live-lock situation when restoring multiple evicted processes
- Fix VM page table allocation on large-bar systems
- Fix for build failure on frv architecture
* tag 'drm-amdkfd-next-2018-03-27' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: Use ordered workqueue to restore processes
drm/amdgpu: Fix acquiring VM on large-BAR systems
drm/amdkfd: Add module option for testing large-BAR functionality
drm/amdkfd: Kmap event page for dGPUs
drm/amdkfd: Add ioctls for GPUVM memory management
drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
drm/amdkfd: Allocate CWSR trap handler memory for dGPUs
drm/amdkfd: Add per-process IDR for buffer handles
drm/amdkfd: Aperture setup for dGPUs
drm/amdkfd: Remove limit on number of GPUs
drm/amdkfd: Populate DRM render device minor
drm/amdkfd: Create KFD VMs on demand
drm/amdgpu: Add kfd2kgd interface to acquire an existing VM
drm/amdgpu: Add helper to turn an existing VM into a compute VM
drm/amdgpu: Fix initial validation of PD BO for KFD VMs
drm/amdgpu: Move KFD-specific fields into struct amdgpu_vm
drm/amdkfd: fix uninitialized variable use
drm/amdkfd: add missing include of mm.h
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJauCZfAAoJEHm+PkMAQRiGWGUH/2rhdQDkoJpYWnjaQkolECG8
MxpGE7nmIIHxQcbSDdHTGJ8IhVm6Z5wZ7ym/PwCDTT043Y1y341sJrIwL2/nTG6d
HVidk8hFvgN6QzlzVAHT3ZZMII/V9Zt+VV5SUYLGnPAVuJNHo/6uzWlTU5g+NTFo
IquFDdQUaGBlkKqby+NoAFnkV1UAIkW0g22cfvPnlO5GMer0gusGyVNvVp7TNj3C
sqj4Hvt3RMDLMNe9RZ2pFTiOD096n8FWpYftZneUTxFImhRV3Jg5MaaYZm9SI3HW
tXrv/LChT/F1mi5Pkx6tkT5Hr8WvcrwDMJ4It1kom10RqWAgjxIR3CMm448ileY=
=YKUG
-----END PGP SIGNATURE-----
Backmerge tag 'v4.16-rc7' into drm-next
Linux 4.16-rc7
This was requested by Daniel, and things were getting
a bit hard to reconcile, most of the conflicts were
trivial though.
In the event where the device unexpectedly becomes unresponsive
for a long period of time, flow control mechanism may propagate
pause frames which will cause congestion spreading to the entire
network.
To prevent this scenario, when the device is stalled for a period
longer than a pre-configured timeout, flow control mechanisms are
automatically disabled.
This patch adds support for the ETHTOOL_PFC_STALL_PREVENTION
as a tunable.
This API provides support for configuring flow control storm prevention
timeout (msec).
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Cc: Michal Kubecek <mkubecek@suse.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The ioeventfd here is actually irqfd handling of an ioeventfd such as
supported in KVM. A user is able to pre-program a device write to
occur when the eventfd triggers. This is yet another instance of
eventfd-irqfd triggering between KVM and vfio. The impetus for this
is high frequency writes to pages which are virtualized in QEMU.
Enabling this near-direct write path for selected registers within
the virtualized page can improve performance and reduce overhead.
Specifically this is initially targeted at NVIDIA graphics cards where
the driver issues a write to an MMIO register within a virtualized
region in order to allow the MSI interrupt to re-trigger.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJatQDcAAoJEAhfPr2O5OEVHe8P/1Ci/5hJ0p8VfEIYXmLtyqiB
3SnXZdvvNifAeSqXfsm7650C+Aamp4E6iNgbn6+DnQ2ZMVrq9LsIbVQHtJI76F5U
v1949Not30aD922jCS1M1yTth/HCZapDfch+6qRCyRX9o9Nap0OlePoeUwbeI7Rb
Mwxf8rOQbEN1Mj5Hfajh9j8O/9StJdIHk7KLebAqwoUlu/YJ0z89H540UKHbPwN+
Z5G+xVTKUp+qztpo4Y7JVcuDL4K012gPVEzepM0aUuqLvs4drV2qJ0OeWW0dKllE
JxEVXHIxUymKriWOq340XeJ0GiLY2UVy2iqr21Gza4hd4kIunm3g3utnGBQgjQtp
CkgVORBGVkpu/74drecl5SqMpPKiiKelY7e2LJCb9uxEWu8h8BBPaiYOM4V3MGIU
jNTKZgGyJPwxH3dv6vv7jJJ87HjYHHZC+/HVQSRRhyLdONzPKHRKjuZbmmXT4l7o
RJ670mMD1m0FM5paYJ7jGZJ0jMYNh8UyehJ65lHcWsnGh7aTgQ7Rv/xnGGxiKhkl
ByclFqLTXfsVdBsi18Jno+l7s8vhowp+DCL8aqpt44wIE4W2gkRY1bD9/MihGiZy
ZaICuMAh/cp3QGC+cAJGY06jjXbnM6TfzQsBlTii31QqBhlezdoaeEq+7LmMjVfP
GB1yHdlOWMLPIIcQQ4wG
=sSiA
-----END PGP SIGNATURE-----
Merge tag 'media/v4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"Three fixes:
- dvb: fix a Kconfig typo on a help text
- tegra-cec: reset rx_buf_cnt when start bit detected
- rc: lirc does not use LIRC_CAN_SEND_SCANCODE feature"
* tag 'media/v4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: dvb: fix a Kconfig typo
media: tegra-cec: reset rx_buf_cnt when start bit detected
media: rc: lirc does not use LIRC_CAN_SEND_SCANCODE feature
Things look calming down, but people were still busy to plaster over
small holes:
- Two fixes to harden against races in aloop driver
- A correction of a long-standing bug in USB-audio UAC2 processing
unit parser
- As usual suspects, HD-audio: a workaround for Coffee Lake
controller and a few other device-specific fixes
All small and for stable.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAlq0ytkOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE8KKg/+NriAIsRozvpJB7/dYwEwTI6cAtEug/9MuNkw
efB0MOT3y5Cpkf0j9wOxEOUXhRRGb1qoYfaIOp+FansZZmddW2q8PNUPgUScYgrD
zU35wW1sOoPccJjPRPn/2Ck9CRbX8voZ84Ahz7se7WcZFk4zgO4Fso0MmPilQnHx
NQVLnztSkOOb4KirbYAiICG30U5ZrukAM9UaHVTrH0IUppihfSSOTncLULIiGPkq
iIlY8HrpZw6un/5BbBH8bHdeAB9GSPrEEQo6r5UAUSYOT0jfpmN8h/inZyID4evF
CFvGcTqG7pr5A3Nsqi6igrniaVsyFQNxHVEHfh5M3M10t8wBrk7uazWapTMunAcY
hx/2VmmkPuZoovRVz2ZPsioeOiJAIS2DSTXrimNI6AUrrPTh3tElEVt9fbxOWAyr
FeLjg/LGCy5yK9UfDMsLy5Zhe5d3hZG/pqbQ1F81B4c/LzL0sQ+6KKvqpjDMdEjH
svcABHZZlHK06bzc9ykQE+9X9VO/0LrSE+koMpfa1L0AdXxogUEozigG5jUwYYjg
eMlkT6WxGhPuCU2RUVvU/Aa3VWX6EbaGswxpTqJl6DyjP2WWp827hCEWJJgzx6SR
pkZsFASO6HdXJObep6pYyEFFa11KZo1sJPa8SGMo9PC8XQwDjjW+oPqlo/QB9hJb
nYVAnc4=
=xZMj
-----END PGP SIGNATURE-----
Merge tag 'sound-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Things look calming down, but people were still busy to plaster over
small holes:
- Two fixes to harden against races in aloop driver
- A correction of a long-standing bug in USB-audio UAC2 processing
unit parser
- As usual suspects, HD-audio: a workaround for Coffee Lake
controller and a few other device-specific fixes
All small and for stable"
* tag 'sound-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: aloop: Fix access to not-yet-ready substream via cable
ALSA: aloop: Sync stale timer before release
ALSA: hda/realtek - Fix speaker no sound after system resume
ALSA: hda/realtek - Fix Dell headset Mic can't record
ALSA: hda - Force polling mode on CFL for fixing codec communication
ALSA: usb-audio: Fix parsing descriptor of UAC2 processing unit
ALSA: hda/realtek - Always immediately update mute LED with pin VREF
We add a 128-bit node identity, as an alternative to the currently used
32-bit node address.
For the sake of compatibility and to minimize message header changes
we retain the existing 32-bit address field. When not set explicitly by
the user, this field will be filled with a hash value generated from the
much longer node identity, and be used as a shorthand value for the
latter.
We permit either the address or the identity to be set by configuration,
but not both, so when the address value is set by a legacy user the
corresponding 128-bit node identity is generated based on the that value.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add rx path for tls software implementation.
recvmsg, splice_read, and poll implemented.
An additional sockopt TLS_RX is added, with the same interface as
TLS_TX. Either TLX_RX or TLX_TX may be provided separately, or
together (with two different setsockopt calls with appropriate keys).
Control messages are passed via CMSG in a similar way to transmit.
If no cmsg buffer is passed, then only application data records
will be passed to userspace, and EIO is returned for other types of
alerts.
EBADMSG is passed for decryption errors, and EMSGSIZE is passed for
framing too big, and EBADMSG for framing too small (matching openssl
semantics). EINVAL is returned for TLS versions that do not match the
original setsockopt call. All are unrecoverable.
strparser is used to parse TLS framing. Decryption is done directly
in to userspace buffers if they are large enough to support it, otherwise
sk_cow_data is called (similar to ipsec), and buffers are decrypted in
place and copied. splice_read always decrypts in place, since no
buffers are provided to decrypt in to.
sk_poll is overridden, and only returns POLLIN if a full TLS message is
received. Otherwise we wait for strparser to finish reading a full frame.
Actual decryption is only done during recvmsg or splice_read calls.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fun set of conflict resolutions here...
For the mac80211 stuff, these were fortunately just parallel
adds. Trivially resolved.
In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
function phy_disable_interrupts() earlier in the file, whilst in
'net-next' the phy_error() call from this function was removed.
In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
'rt_table_id' member of rtable collided with a bug fix in 'net' that
added a new struct member "rt_mtu_locked" which needs to be copied
over here.
The mlxsw driver conflict consisted of net-next separating
the span code and definitions into separate files, whilst
a 'net' bug fix made some changes to that moved code.
The mlx5 infiniband conflict resolution was quite non-trivial,
the RDMA tree's merge commit was used as a guide here, and
here are their notes:
====================
Due to bug fixes found by the syzkaller bot and taken into the for-rc
branch after development for the 4.17 merge window had already started
being taken into the for-next branch, there were fairly non-trivial
merge issues that would need to be resolved between the for-rc branch
and the for-next branch. This merge resolves those conflicts and
provides a unified base upon which ongoing development for 4.17 can
be based.
Conflicts:
drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f95
(IB/mlx5: Fix cleanup order on unload) added to for-rc and
commit b5ca15ad7e (IB/mlx5: Add proper representors support)
add as part of the devel cycle both needed to modify the
init/de-init functions used by mlx5. To support the new
representors, the new functions added by the cleanup patch
needed to be made non-static, and the init/de-init list
added by the representors patch needed to be modified to
match the init/de-init list changes made by the cleanup
patch.
Updates:
drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
prototypes added by representors patch to reflect new function
names as changed by cleanup patch
drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
stage list to match new order from cleanup patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Define new netlink attributes for rmnet mux_id and flags. These
flags / mux_id were earlier using vlan flags / id respectively.
The flag bits are also moved to uapi and are renamed with
prefix RMNET_FLAG_*.
Also add the rmnet policy to handle the new netlink attributes.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when tipc is unable to queue a received message on a
socket, the message is rejected back to the sender with error
TIPC_ERR_OVERLOAD. However, the application on this socket
has no knowledge about these discards.
In this commit, we try to step the sk_drops counter when tipc
is unable to queue a received message. Export sk_drops
using tipc socket diagnostics.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds socket diagnostics capability for AF_TIPC in netlink
family NETLINK_SOCK_DIAG in a new kernel module (diag.ko).
The following are key design considerations:
- config TIPC_DIAG has default y, like INET_DIAG.
- only requests with flag NLM_F_DUMP is supported (dump all).
- tipc_sock_diag_req message is introduced to send filter parameters.
- the response attributes are of TLV, some nested.
To avoid exposing data structures between diag and tipc modules and
avoid code duplication, the following additions are required:
- export tipc_nl_sk_walk function to reuse socket iterator.
- export tipc_sk_fill_sock_diag to fill the tipc diag attributes.
- create a sock_diag response message in __tipc_add_sock_diag defined
in diag.c and use the above exported tipc_sk_fill_sock_diag
to fill response.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- avoid redundant multicast TT entries, by Linus Luessing
- add netlink support for distributed arp table cache and multicast flags,
by Linus Luessing (2 patches)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlqv59kWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoSBaD/9r/oJC+Q/3Eu6DTAAiS7Jx2IpQ
kOwU7l4hkGK8mBZ098CkmHTBY+zurqYwcokCHhKJO5mqJEpvlM27PuQxzqWSBMMO
FWFax2YlKPpJp+/f/rSD9HS73RTY7npv6l5/eFg6+0WSQET04PjLB1KxPrO5u1+Z
JujrAxp0GEyMoVQgMy9uloedkpizhADyYSZzDDXnHhq1NiAPU87cjrTLv/xdtdp7
TvbNfobhZUmKZ951yaRlDmE+mH8IoTQoY7HD/JANnduYeFJAlIPnHQEQa8+5tLwO
qWUeLa4Acv5MhO2KjbKQpu5r2dFbs0x+jmsja8xBmgNWO5meKh/aE8TKGJeDVQEW
TTynEivf82suiquCIZ573fBnliJkipffg32ZHgtNGrh54hh+YU7Sts0t9Lsou4ar
aOU6lup3MHFysf3s9hK6y9TzSqwFJ8Mak0UFsa03r0Ub8am6bKHTqMFaCgN0aK9P
vBL4atSvJVguwPlzxLMi44K4NxqEVfa41dZ7nQ99P3BFzWwSvph3i4lu+cxGxwI7
4kgWU5Cz8T51dH7g8j3vUish36DzwQtUsKLWZVpV+DV4BaHJ/rLyqeug3ffUrWRk
p0bFU7wBAv8rKeFPd30m2tdJ/nMo+rDbN6Tm9n43tK4NWKOGBndhCoNhjfrzhM8U
un6Iy7taISgeElZ5fQ==
=HVxO
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20180319' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- avoid redundant multicast TT entries, by Linus Luessing
- add netlink support for distributed arp table cache and multicast flags,
by Linus Luessing (2 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
PCIe 4.0 defines the 16.0 GT/s link speed. Links can run at that speed
without any Linux changes, but previously their sysfs "max_link_speed" and
"current_link_speed" files contained "Unknown speed", not the expected
"16.0 GT/s".
Add decoding for the new 16 GT/s link speed.
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
[bhelgaas: add PCI_EXP_LNKCAP2_SLS_16_0GB]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-03-21
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add a BPF hook for sendmsg and sendfile by reusing the ULP infrastructure
and sockmap. Three helpers are added along with this, bpf_msg_apply_bytes(),
bpf_msg_cork_bytes(), and bpf_msg_pull_data(). The first is used to tell
for how many bytes the verdict should be applied to, the second to tell
that x bytes need to be queued first to retrigger the BPF program for a
verdict, and the third helper is mainly for the sendfile case to pull in
data for making it private for reading and/or writing, from John.
2) Improve address to symbol resolution of user stack traces in BPF stackmap.
Currently, the latter stores the address for each entry in the call trace,
however to map these addresses to user space files, it is necessary to
maintain the mapping from these virtual addresses to symbols in the binary
which is not practical for system-wide profiling. Instead, this option for
the stackmap rather stores the ELF build id and offset for the call trace
entries, from Song.
3) Add support that allows BPF programs attached to perf events to read the
address values recorded with the perf events. They are requested through
PERF_SAMPLE_ADDR via perf_event_open(). Main motivation behind it is to
support building memory or lock access profiling and tracing tools with
the help of BPF, from Teng.
4) Several improvements to the tools/bpf/ Makefiles. The 'make bpf' in the
tools directory does not provide the standard quiet output except for
bpftool and it also does not respect specifying a build output directory.
'make bpf_install' command neither respects specified destination nor
prefix, all from Jiri. In addition, Jakub fixes several other minor issues
in the Makefiles on top of that, e.g. fixing dependency paths, phony
targets and more.
5) Various doc updates e.g. add a comment for BPF fs about reserved names
to make the dentry lookup from there a bit more obvious, and a comment
to the bpf_devel_QA file in order to explain the diff between native
and bpf target clang usage with regards to pointer size, from Quentin
and Daniel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes it possible to use the various iMON remotes with any raw IR
RC device.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Recently released USB Audio Class 3.0 specification
introduces many significant changes comparing to
previous versions, like
- new Power Domains, support for LPM/L1
- new Cluster descriptor
- changed layout of all class-specific descriptors
- new High Capability descriptors
- New class-specific String descriptors
- new and removed units
- additional sources for interrupts
- removed Type II Audio Data Formats
- ... and many other things (check spec)
It also provides backward compatibility through
multiple configurations, as well as requires
mandatory support for BADD (Basic Audio Device
Definition) on each ADC3.0 compliant device
This patch adds initial support of UAC3 specification
that is enough for Generic I/O Profile (BAOF, BAIF)
device support from BADD document.
Signed-off-by: Ruslan Bilovol <ruslan.bilovol@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Add wiphy EXT_FEATURE flag to indicate that HW or driver does
all DFS actions by itself.
User-space functionality already implemented in hostapd using
vendor-specific (QCA) OUI to advertise DFS offload support.
Need to introduce generic flag to inform about DFS offload support.
For devices with DFS_OFFLOAD flag set user-space will no longer
need to issue CAC or do any actions in response to
"radar detected" events. HW will do everything by itself and send
events to user-space to indicate that CAC was started/finished, etc.
Signed-off-by: Dmitrii Lebed <dlebed@quantenna.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
CAC_STARTED event is needed for DFS offload feature and
should be generated by driver/HW if DFS_OFFLOAD is enabled.
Signed-off-by: Dmitry Lebed <dlebed@quantenna.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
We already have ICMPv6 type/code matches (which can be used to distinguish
different types of MLD packets). Add support for IPv4 IGMP matches in the
same way.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We already have ICMPv6 type/code matches. This adds support for IPv4 ICMP
matches in the same way.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently the userspace has no way of knowing whether the fuse
connection ended because of umount or abort via sysfs. It makes it hard
for filesystems to free the mountpoint after abort without worrying
about removing some new mount.
The patch fixes it by returning different errors when userspace reads
from /dev/fuse (-ENODEV for umount and -ECONNABORTED for abort).
Add a new capability flag FUSE_ABORT_ERROR. If set and the connection is
gone because of sysfs abort, reading from the device will return
-ECONNABORTED.
Signed-off-by: Szymon Lukasz <noh4hss@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
This patch exposes synproxy information per-conntrack. Moreover, send
sequence adjustment events once server sends us the SYN,ACK packet, so
we can synchronize the sequence adjustment too for packets going as
reply from the server, as part of the synproxy logic.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch introduces a new feature that allows bitshifting (left
and right) operations to co-operate with existing iptables options.
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jack Ma <jack.ma@alliedtelesis.co.nz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
All existing keys, except the NFT_CT_SRC and NFT_CT_DST are assumed to
have strict datatypes. This is causing problems with sets and
concatenations given the specific length of these keys is not known.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Florian Westphal <fw@strlen.de>
If the "etc/vmcoreinfo" fw_cfg file is present and we are not running
the kdump kernel, write the addr/size of the vmcoreinfo ELF note.
The DMA operation is expected to run synchronously with today qemu,
but the specification states that it may become async, so we run
"control" field check in a loop for eventual changes.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Create a common header file for well-known values and structures to be
shared by the Linux kernel with qemu or other projects.
It is based from qemu/docs/specs/fw_cfg.txt which references
qemu/include/hw/nvram/fw_cfg_keys.h "for the most up-to-date and
authoritative list" & vmcoreinfo.txt. Those files don't have an
explicit license, but qemu/hw/nvram/fw_cfg.c is BSD-license, so
Michael S. Tsirkin suggested to use the same license.
The patch intentionally left out DMA & vmcoreinfo structures &
defines, which are added in the commits making usage of it.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, if a bpf sk msg program is run the program
can only parse data that the (start,end) pointers already
consumed. For sendmsg hooks this is likely the first
scatterlist element. For sendpage this will be the range
(0,0) because the data is shared with userspace and by
default we want to avoid allowing userspace to modify
data while (or after) BPF verdict is being decided.
To support pulling in additional bytes for parsing use
a new helper bpf_sk_msg_pull(start, end, flags) which
works similar to cls tc logic. This helper will attempt
to point the data start pointer at 'start' bytes offest
into msg and data end pointer at 'end' bytes offset into
message.
After basic sanity checks to ensure 'start' <= 'end' and
'end' <= msg_length there are a few cases we need to
handle.
First the sendmsg hook has already copied the data from
userspace and has exclusive access to it. Therefor, it
is not necessesary to copy the data. However, it may
be required. After finding the scatterlist element with
'start' offset byte in it there are two cases. One the
range (start,end) is entirely contained in the sg element
and is already linear. All that is needed is to update the
data pointers, no allocate/copy is needed. The other case
is (start, end) crosses sg element boundaries. In this
case we allocate a block of size 'end - start' and copy
the data to linearize it.
Next sendpage hook has not copied any data in initial
state so that data pointers are (0,0). In this case we
handle it similar to the above sendmsg case except the
allocation/copy must always happen. Then when sending
the data we have possibly three memory regions that
need to be sent, (0, start - 1), (start, end), and
(end + 1, msg_length). This is required to ensure any
writes by the BPF program are correctly transmitted.
Lastly this operation will invalidate any previous
data checks so BPF programs will have to revalidate
pointers after making this BPF call.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In the case where we need a specific number of bytes before a
verdict can be assigned, even if the data spans multiple sendmsg
or sendfile calls. The BPF program may use msg_cork_bytes().
The extreme case is a user can call sendmsg repeatedly with
1-byte msg segments. Obviously, this is bad for performance but
is still valid. If the BPF program needs N bytes to validate
a header it can use msg_cork_bytes to specify N bytes and the
BPF program will not be called again until N bytes have been
accumulated. The infrastructure will attempt to coalesce data
if possible so in many cases (most my use cases at least) the
data will be in a single scatterlist element with data pointers
pointing to start/end of the element. However, this is dependent
on available memory so is not guaranteed. So BPF programs must
validate data pointer ranges, but this is the case anyways to
convince the verifier the accesses are valid.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
A single sendmsg or sendfile system call can contain multiple logical
messages that a BPF program may want to read and apply a verdict. But,
without an apply_bytes helper any verdict on the data applies to all
bytes in the sendmsg/sendfile. Alternatively, a BPF program may only
care to read the first N bytes of a msg. If the payload is large say
MB or even GB setting up and calling the BPF program repeatedly for
all bytes, even though the verdict is already known, creates
unnecessary overhead.
To allow BPF programs to control how many bytes a given verdict
applies to we implement a bpf_msg_apply_bytes() helper. When called
from within a BPF program this sets a counter, internal to the
BPF infrastructure, that applies the last verdict to the next N
bytes. If the N is smaller than the current data being processed
from a sendmsg/sendfile call, the first N bytes will be sent and
the BPF program will be re-run with start_data pointing to the N+1
byte. If N is larger than the current data being processed the
BPF verdict will be applied to multiple sendmsg/sendfile calls
until N bytes are consumed.
Note1 if a socket closes with apply_bytes counter non-zero this
is not a problem because data is not being buffered for N bytes
and is sent as its received.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This implements a BPF ULP layer to allow policy enforcement and
monitoring at the socket layer. In order to support this a new
program type BPF_PROG_TYPE_SK_MSG is used to run the policy at
the sendmsg/sendpage hook. To attach the policy to sockets a
sockmap is used with a new program attach type BPF_SK_MSG_VERDICT.
Similar to previous sockmap usages when a sock is added to a
sockmap, via a map update, if the map contains a BPF_SK_MSG_VERDICT
program type attached then the BPF ULP layer is created on the
socket and the attached BPF_PROG_TYPE_SK_MSG program is run for
every msg in sendmsg case and page/offset in sendpage case.
BPF_PROG_TYPE_SK_MSG Semantics/API:
BPF_PROG_TYPE_SK_MSG supports only two return codes SK_PASS and
SK_DROP. Returning SK_DROP free's the copied data in the sendmsg
case and in the sendpage case leaves the data untouched. Both cases
return -EACESS to the user. Returning SK_PASS will allow the msg to
be sent.
In the sendmsg case data is copied into kernel space buffers before
running the BPF program. The kernel space buffers are stored in a
scatterlist object where each element is a kernel memory buffer.
Some effort is made to coalesce data from the sendmsg call here.
For example a sendmsg call with many one byte iov entries will
likely be pushed into a single entry. The BPF program is run with
data pointers (start/end) pointing to the first sg element.
In the sendpage case data is not copied. We opt not to copy the
data by default here, because the BPF infrastructure does not
know what bytes will be needed nor when they will be needed. So
copying all bytes may be wasteful. Because of this the initial
start/end data pointers are (0,0). Meaning no data can be read or
written. This avoids reading data that may be modified by the
user. A new helper is added later in this series if reading and
writing the data is needed. The helper call will do a copy by
default so that the page is exclusively owned by the BPF call.
The verdict from the BPF_PROG_TYPE_SK_MSG applies to the entire msg
in the sendmsg() case and the entire page/offset in the sendpage case.
This avoids ambiguity on how to handle mixed return codes in the
sendmsg case. Again a helper is added later in the series if
a verdict needs to apply to multiple system calls and/or only
a subpart of the currently being processed message.
The helper msg_redirect_map() can be used to select the socket to
send the data on. This is used similar to existing redirect use
cases. This allows policy to redirect msgs.
Pseudo code simple example:
The basic logic to attach a program to a socket is as follows,
// load the programs
bpf_prog_load(SOCKMAP_TCP_MSG_PROG, BPF_PROG_TYPE_SK_MSG,
&obj, &msg_prog);
// lookup the sockmap
bpf_map_msg = bpf_object__find_map_by_name(obj, "my_sock_map");
// get fd for sockmap
map_fd_msg = bpf_map__fd(bpf_map_msg);
// attach program to sockmap
bpf_prog_attach(msg_prog, map_fd_msg, BPF_SK_MSG_VERDICT, 0);
Adding sockets to the map is done in the normal way,
// Add a socket 'fd' to sockmap at location 'i'
bpf_map_update_elem(map_fd_msg, &i, fd, BPF_ANY);
After the above any socket attached to "my_sock_map", in this case
'fd', will run the BPF msg verdict program (msg_prog) on every
sendmsg and sendpage system call.
For a complete example see BPF selftests or sockmap samples.
Implementation notes:
It seemed the simplest, to me at least, to use a refcnt to ensure
psock is not lost across the sendmsg copy into the sg, the bpf program
running on the data in sg_data, and the final pass to the TCP stack.
Some performance testing may show a better method to do this and avoid
the refcnt cost, but for now use the simpler method.
Another item that will come after basic support is in place is
supporting MSG_MORE flag. At the moment we call sendpages even if
the MSG_MORE flag is set. An enhancement would be to collect the
pages into a larger scatterlist and pass down the stack. Notice that
bpf_tcp_sendmsg() could support this with some additional state saved
across sendmsg calls. I built the code to support this without having
to do refactoring work. Other features TBD include ZEROCOPY and the
TCP_RECV_QUEUE/TCP_NO_QUEUE support. This will follow initial series
shortly.
Future work could improve size limits on the scatterlist rings used
here. Currently, we use MAX_SKB_FRAGS simply because this was being
used already in the TLS case. Future work could extend the kernel sk
APIs to tune this depending on workload. This is a trade-off
between memory usage and throughput performance.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Currently, the offsets in the UAC2 processing unit descriptor are
calculated incorrectly. It causes an issue when connecting the device which
provides such a feature:
~~~~
[84126.724420] usb 1-1.3.1: invalid Processing Unit descriptor (id 18)
~~~~
After this patch is applied, the UAC2 processing unit inits w/o this error.
Fixes: 23caaf19b1 ("ALSA: usb-mixer: Add support for Audio Class v2.0")
Signed-off-by: Kirill Marinushkin <k.marinushkin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Dealing with 'struct timeval' users in the y2038 series is a bit tricky:
We have two definitions of timeval that are visible to user space,
one comes from glibc (or some other C library), the other comes from
linux/time.h. The kernel copy is what we want to be used for a number of
structures defined by the kernel itself, e.g. elf_prstatus (used it core
dumps), sysinfo and rusage (used in system calls). These generally tend
to be used for passing time intervals rather than absolute (epoch-based)
times, so they do not suffer from the y2038 overflow. Some of them
could be changed to use 64-bit timestamps by creating new system calls,
others like the core files cannot easily be changed.
An application using these interfaces likely also uses gettimeofday()
or other interfaces that use absolute times, and pass 'struct timeval'
pointers directly into kernel interfaces, so glibc must redefine their
timeval based on a 64-bit time_t when they introduce their y2038-safe
interfaces.
The only reasonable way forward I see is to remove the 'timeval'
definion from the kernel's uapi headers, and change the interfaces that
we do not want to (or cannot) duplicate for 64-bit times to use a new
__kernel_old_timeval definition instead. This type should be avoided
for all new interfaces (those can use 64-bit nanoseconds, or the 64-bit
version of timespec instead), and should be used with great care when
converting existing interfaces from timeval, to be sure they don't suffer
from the y2038 overflow, and only with consensus for the particular user
that using __kernel_old_timeval is better than moving to a 64-bit based
interface. The structure name is intentionally chosen to not conflict
with user space types, and to be ugly enough to discourage its use.
Note that ioctl based interfaces that pass a bare 'timeval' pointer
cannot change to '__kernel_old_timeval' because the user space source
code refers to 'timeval' instead, and we don't want to modify the user
space sources if possible. However, any application that relies on a
structure to contain an embedded 'timeval' (e.g. by passing a pointer
to the member into a function call that expects a timeval pointer) is
broken when that structure gets converted to __kernel_old_timeval. I
don't see any way around that, and we have to rely on the compiler to
produce a warning or compile failure that will alert users when they
recompile their sources against a new libc.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/20180315161739.576085-1-arnd@arndb.de
Publications for TIPC_CLUSTER_SCOPE and TIPC_ZONE_SCOPE are in all
aspects handled the same way, both on the publishing node and on the
receiving nodes.
Despite previous ambitions to the contrary, this is never going to change,
so we take the conseqeunce of this and obsolete TIPC_ZONE_SCOPE and related
macros/functions. Whenever a user is doing a bind() or a sendmsg() attempt
using ZONE_SCOPE we translate this internally to CLUSTER_SCOPE, while we
remain compatible with users and remote nodes still using ZONE_SCOPE.
Furthermore, the non-formalized scope value 0 has always been permitted
for use during lookup, with the same meaning as ZONE_SCOPE/CLUSTER_SCOPE.
We now permit it even as binding scope, but for compatibility reasons we
choose to not change the value of TIPC_CLUSTER_SCOPE.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It happens often while I'm preparing a patch for a block driver that
I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
available for this driver? Do I have to introduce definitions of these
constants before I can use these constants? To avoid this confusion,
move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
<linux/blkdev.h> header file such that these become available for all
block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
header file conditional to avoid that including that header file after
<linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE
redefinition.
Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
not been removed from uapi header files nor from NAND drivers in
which these constants are used for another purpose than converting
block layer offsets and sizes into a number of sectors.
Cc: David S. Miller <davem@davemloft.net>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Allowing a guest to execute MWAIT without interception enables a guest
to put a (physical) CPU into a power saving state, where it takes
longer to return from than what may be desired by the host.
Don't give a guest that power over a host by default. (Especially,
since nothing prevents a guest from using MWAIT even when it is not
advertised via CPUID.)
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds TCP_NLA_SND_SSTHRESH stat into SCM_TIMESTAMPING_OPT_STATS
that reports tcp_sock.snd_ssthresh.
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we have a bridge with vlan_filtering on and a vlan device on top of
it, packets would be corrupted in skb_vlan_untag() called from
br_dev_xmit().
The problem sits in skb_reorder_vlan_header() used in skb_vlan_untag(),
which makes use of skb->mac_len. In this function mac_len is meant for
handling rx path with vlan devices with reorder_header disabled, but in
tx path mac_len is typically 0 and cannot be used, which is the problem
in this case.
The current code even does not properly handle rx path (skb_vlan_untag()
called from __netif_receive_skb_core()) with reorder_header off actually.
In rx path single tag case, it works as follows:
- Before skb_reorder_vlan_header()
mac_header data
v v
+-------------------+-------------+------+----
| ETH | VLAN | ETH |
| ADDRS | TPID | TCI | TYPE |
+-------------------+-------------+------+----
<-------- mac_len --------->
<------------->
to be removed
- After skb_reorder_vlan_header()
mac_header data
v v
+-------------------+------+----
| ETH | ETH |
| ADDRS | TYPE |
+-------------------+------+----
<-------- mac_len --------->
This is ok, but in rx double tag case, it corrupts packets:
- Before skb_reorder_vlan_header()
mac_header data
v v
+-------------------+-------------+-------------+------+----
| ETH | VLAN | VLAN | ETH |
| ADDRS | TPID | TCI | TPID | TCI | TYPE |
+-------------------+-------------+-------------+------+----
<--------------- mac_len ---------------->
<------------->
should be removed
<--------------------------->
actually will be removed
- After skb_reorder_vlan_header()
mac_header data
v v
+-------------------+------+----
| ETH | ETH |
| ADDRS | TYPE |
+-------------------+------+----
<--------------- mac_len ---------------->
So, two of vlan tags are both removed while only inner one should be
removed and mac_header (and mac_len) is broken.
skb_vlan_untag() is meant for removing the vlan header at (skb->data - 2),
so use skb->data and skb->mac_header to calculate the right offset.
Reported-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Fixes: a6e18ff111 ("vlan: Fix untag operations of stacked vlans with REORDER_HEADER off")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
v2:
* Fix error handling after kfd_bind_process_to_device in
kfd_ioctl_map_memory_to_gpu
v3:
* Add ioctl to acquire VM from a DRM FD
v4:
* Return number of successful map/unmap operations in failure cases
* Facilitate partial retry after failed map/unmap
* Added comments with parameter descriptions to new APIs
* Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Currently the number of GPUs is limited by aperture placement options
available on GFX7 and GFX8 hardware. This limitation is not necessary.
Scratch and LDS represent per-work-item and per-work-group storage
respectively. Different work-items and work-groups use the same virtual
address to access their own data. Work running on different GPUs is by
definition in different work-groups (different dispatches, in fact).
That means the same virtual addresses can be used for these apertures
on different GPUs.
Add a new AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl that removes the
artificial limitation on the number of GPUs that can be supported. The
new ioctl allows user mode to query the number of GPUs to allocate
enough memory for all GPUs to be reported.
This deprecates AMDKFD_IOC_GET_PROCESS_APERTURES.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
ixjuser.h includes the telephony.h header. Other than that no kernel
code uses any of these headers. The last user of the ixjuser.h header
has been removed in commit 7326446c72 (Staging: remove telephony
drivers), more than 5 years ago.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, bpf stackmap store address for each entry in the call trace.
To map these addresses to user space files, it is necessary to maintain
the mapping from these virtual address to symbols in the binary. Usually,
the user space profiler (such as perf) has to scan /proc/pid/maps at the
beginning of profiling, and monitor mmap2() calls afterwards. Given the
cost of maintaining the address map, this solution is not practical for
system wide profiling that is always on.
This patch tries to solve this problem with a variation of stackmap. This
variation is enabled by flag BPF_F_STACK_BUILD_ID. Instead of storing
addresses, the variation stores ELF file build_id + offset.
Build ID is a 20-byte unique identifier for ELF files. The following
command shows the Build ID of /bin/bash:
[user@]$ readelf -n /bin/bash
...
Build ID: XXXXXXXXXX
...
With BPF_F_STACK_BUILD_ID, bpf_get_stackid() tries to parse Build ID
for each entry in the call trace, and translate it into the following
struct:
struct bpf_stack_build_id_offset {
__s32 status;
unsigned char build_id[BPF_BUILD_ID_SIZE];
union {
__u64 offset;
__u64 ip;
};
};
The search of build_id is limited to the first page of the file, and this
page should be in page cache. Otherwise, we fallback to store ip for this
entry (ip field in struct bpf_stack_build_id_offset). This requires the
build_id to be stored in the first page. A quick survey of binary and
dynamic library files in a few different systems shows that almost all
binary and dynamic library files have build_id in the first page.
Build_id is only meaningful for user stack. If a kernel stack is added to
a stackmap with BPF_F_STACK_BUILD_ID, it will automatically fallback to
only store ip (status == BPF_STACK_BUILD_ID_IP). Similarly, if build_id
lookup failed for some reason, it will also fallback to store ip.
User space can access struct bpf_stack_build_id_offset with bpf
syscall BPF_MAP_LOOKUP_ELEM. It is necessary for user space to
maintain mapping from build id to binary files. This mostly static
mapping is much easier to maintain than per process address maps.
Note: Stackmap with build_id only works in non-nmi context at this time.
This is because we need to take mm->mmap_sem for find_vma(). If this
changes, we would like to allow build_id lookup in nmi context.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch is to add SCTP_AUTH_NO_AUTH type for AUTHENTICATION_EVENT,
as described in section 6.1.8 of RFC6458.
SCTP_AUTH_NO_AUTH: This report indicates that the peer does not
support SCTP authentication as defined in [RFC4895].
Note that the implementation is quite similar as that of
SCTP_ADAPTATION_INDICATION.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to add SCTP_AUTH_FREE_KEY type for AUTHENTICATION_EVENT,
as described in section 6.1.8 of RFC6458.
SCTP_AUTH_FREE_KEY: This report indicates that the SCTP
implementation will no longer use the key identifier specified
in auth_keynumber.
After deactivating a key, it would never be used again, which means
it's refcnt can't be held/increased by new chunks. But there may be
some chunks in out queue still using it. So only when refcnt is 1,
which means no chunk in outqueue is using/holding this key either,
this EVENT would be sent.
When users receive this notification, they could do DEL_KEY sockopt to
remove this shkey, and also tell the peer that this key won't be used
in any chunk thoroughly from now on, then the peer can remove it as
well safely.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to add sockopt SCTP_AUTH_DEACTIVATE_KEY, as described in
section 8.3.4 of RFC6458.
This set option indicates that the application will no longer send user
messages using the indicated key identifier.
Note that RFC requires that only deactivated keys that are no longer used
by an association can be deleted, but for the backward compatibility, it
is not to check deactivated when deleting or replacing one sh_key.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to add support for SCTP AUTH Information for sendmsg,
as described in section 5.3.8 of RFC6458.
With this option, you can provide shared key identifier used for
sending the user message.
It's also a necessary send info for sctp_sendv.
Note that it reuses sinfo->sinfo_tsn to indicate if this option is
set and sinfo->sinfo_ssn to save the shkey ID which can be 0.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No one has publicly stepped up to maintain this broken codebase for
devices that no one uses anymore, so let's just drop the whole thing.
If someone really wants/needs it, we can revert this and they can fix
the code up to work properly.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dump the list of multicast flags entries via the netlink socket.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Dump the list of DAT cache entries via the netlink socket.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Problem and motivation: Once a breakpoint perf event (PERF_TYPE_BREAKPOINT)
is created, there is no flexibility to change the breakpoint type
(bp_type), breakpoint address (bp_addr), or breakpoint length (bp_len). The
only option is to close the perf event and configure a new breakpoint
event. This inflexibility has a significant performance overhead. For
example, sampling-based, lightweight performance profilers (and also
concurrency bug detection tools), monitor different addresses for a short
duration using PERF_TYPE_BREAKPOINT and change the address (bp_addr) to
another address or change the kind of breakpoint (bp_type) from "write" to
a "read" or vice-versa or change the length (bp_len) of the address being
monitored. The cost of these modifications is prohibitive since it involves
unmapping the circular buffer associated with the perf event, closing the
perf event, opening another perf event and mmaping another circular buffer.
Solution: The new ioctl flag for perf events,
PERF_EVENT_IOC_MODIFY_ATTRIBUTES, introduced in this patch takes a pointer
to a struct perf_event_attr as an argument to update an old breakpoint
event with new address, type, and size. This facility allows retaining a
previous mmaped perf events ring buffer and avoids having to close and
reopen another perf event.
This patch supports only changing PERF_TYPE_BREAKPOINT event type; future
implementations can extend this feature. The patch replicates some of its
functionality of modify_user_hw_breakpoint() in
kernel/events/hw_breakpoint.c. modify_user_hw_breakpoint cannot be called
directly since perf_event_ctx_lock() is already held in _perf_ioctl().
Evidence: Experiments show that the baseline (not able to modify an already
created breakpoint) costs an order of magnitude (~10x) more than the
suggested optimization (having the ability to dynamically modifying a
configured breakpoint via ioctl). When the breakpoints typically do not
trap, the speedup due to the suggested optimization is ~10x; even when the
breakpoints always trap, the speedup is ~4x due to the suggested
optimization.
Testing: tests posted at
https://github.com/linux-contrib/perf_event_modify_bp demonstrate the
performance significance of this patch. Tests also check the functional
correctness of the patch.
Signed-off-by: Milind Chabbi <chabbi.milind@gmail.com>
[ Using modify_user_hw_breakpoint_check function. ]
[ Reformated PERF_EVENT_IOC_*, so the values are all in one column. ]
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180312134548.31532-8-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The planned change to unify the behaviour of the MONOTONIC and BOOTTIME
clocks vs. suspend removes the ability to retrieve the active
non-suspended time of a system.
Provide a new CLOCK_MONOTONIC_ACTIVE clock which returns the active
non-suspended time of the system via clock_gettime().
This preserves the old behaviour of CLOCK_MONOTONIC before the
BOOTTIME/MONOTONIC unification.
This new clock also allows applications to detect programmatically that
the MONOTONIC and BOOTTIME clocks are identical.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kevin Easton <kevin@guarana.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20180301165149.965235774@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The Nuvoton UART is almost compatible with the 8250 driver when probed
via the 8250_of driver, however it requires some extra configuration
at startup.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We use a two-step process to configure a filter with RSS spreading. First,
the RSS context is allocated and configured using ETHTOOL_SRSSH; this
returns an identifier (rss_context) which can then be passed to subsequent
invocations of ETHTOOL_SRXCLSRLINS to specify that the offset from the RSS
indirection table lookup should be added to the queue number (ring_cookie)
when delivering the packet. Drivers for devices which can only use the
indirection table entry directly (not add it to a base queue number)
should reject rule insertions combining RSS with a nonzero ring_cookie.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow setting firstfrag as matching option in tc flower classifier.
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_flags firstfrag
action mirred egress redirect dev eth1
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds new field "addr" to bpf_perf_event_data which could be
read and used by bpf programs attached to perf events. The value of the
field is copied from bpf_perf_event_data_kern.addr and contains the
address value recorded by specifying sample_type with PERF_SAMPLE_ADDR
when calling perf_event_open.
Signed-off-by: Teng Qin <qinteng@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
These patches remove the metag architecture and tightly dependent
drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4
based metag toolchain we have been using is hitting compiler bugs, so
now seems a good time to drop it altogether.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAlqdcgQACgkQbAtpk944
dno/1BAAvaiRcKcNxMrYkxG+Wn4r68odu7+E1dy99AaUnvPFT42R5XLMOv4BCu/Y
bhMQ14lMJ9ZBKdYg9E97ulTV0YFhCBHuEWDyDnk/G3CVAEvdPuAQ6ktHDZxRQBFK
JoTUKky53OZbWU9KhLeWpFg4F4E64FBm1kyAkqhs8pPM/LwmrxwIG2sxdTTqkhkc
b+6ABf2NKtmQwHXWmKWCB8rmXMzulYth2ePC/r9MVj92xGKxADsiFArZk4kmoIUb
H5eZ8FkemtUEfZp600dsGR/ffaTBwZJ3SULSkAklUnrcvdIRM+Fu8osG8O8yQKTd
H7xnmtTJ2kCnhhuUMxt6v8WrDbKB8JdFxFOpXW93YKpKAkiGMvoUEZjlwPYIqWxL
xtnDb9Rv+uZ4RpqZf9AtE4Td8lHTH7OZ78RDs9eMo6n1ZIr5CwcLaM2k5skAeyPr
yt1lXePhXFqSS+OpOV6hn95ROqlkuZgvPfkcdNpCJPfM4SpfRLlUjIVqiVK0LDRk
FAkk0VIfzjjNuyV9yr2XXuw90DerhFUgUl6ZYggkgf6umOHhZQdDTFr8gsfvaLm1
1k1banUEF1tpDcUeShylDvqNmVSZZC6siTQMA7T0zjbjYJD25hJWLpFEcPkx/Anp
4oGQNNoe4WgJIrJAoTJTiBVwC/xLDeZV6b5t2pOXBlH+v2eKgMg=
=zDIl
-----END PGP SIGNATURE-----
Merge tag 'metag_remove_2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag into asm-generic
Remove metag architecture
These patches remove the metag architecture and tightly dependent
drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4
based metag toolchain we have been using is hitting compiler bugs, so
now seems a good time to drop it altogether.
* tag 'metag_remove_2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
i2c: img-scb: Drop METAG dependency
media: img-ir: Drop METAG dependency
watchdog: imgpdc: Drop METAG dependency
MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE
tty: Remove metag DA TTY and console driver
clocksource: Remove metag generic timer driver
irqchip: Remove metag irqchip drivers
Drop a bunch of metag references
docs: Remove remaining references to metag
docs: Remove metag docs
metag: Remove arch/metag/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch is to add support for snd flag SCTP_SENDALL process
in sendmsg, as described in section 5.3.4 of RFC6458.
With this flag, you can send the same data to all the asocs of
this sk once.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to add support for Destination IPv4/6 Address options
for sendmsg, as described in section 5.3.9/10 of RFC6458.
With this option, you can provide more than one destination addrs
to sendmsg when creating asoc, like sctp_connectx.
It's also a necessary send info for sctp_sendv.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to add support for PR-SCTP Information for sendmsg,
as described in section 5.3.7 of RFC6458.
With this option, you can specify pr_policy and pr_value for user
data in sendmsg.
It's also a necessary send info for sctp_sendv.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace hardcoded padding size value for struct kvm_sync_regs
with #define SYNC_REGS_SIZE_BYTES.
Also update the value specified in api.txt from outdated hardcoded
value to SYNC_REGS_SIZE_BYTES.
Signed-off-by: Ken Hofsass <hofsass@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
In Hyper-V, the fast guest->host notification mechanism is the
SIGNAL_EVENT hypercall, with a single parameter of the connection ID to
signal.
Currently this hypercall incurs a user exit and requires the userspace
to decode the parameters and trigger the notification of the potentially
different I/O context.
To avoid the costly user exit, process this hypercall and signal the
corresponding eventfd in KVM, similar to ioeventfd. The association
between the connection id and the eventfd is established via the newly
introduced KVM_HYPERV_EVENTFD ioctl, and maintained in an
(srcu-protected) IDR.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[asm/hyperv.h changes approved by KY Srinivasan. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
All of the conflicts were cases of overlapping changes.
In net/core/devlink.c, we have to make care that the
resouce size_params have become a struct member rather
than a pointer to such an object.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a generic netlink family for NCSI. This supports three commands;
NCSI_CMD_PKG_INFO which returns information on packages and their
associated channels, NCSI_CMD_SET_INTERFACE which allows a specific
package or package/channel combination to be set as the preferred
choice, and NCSI_CMD_CLEAR_INTERFACE which clears any preferred setting.
Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds TCP_NLA_CA_STATE stat into SCM_TIMESTAMPING_OPT_STATS.
It reports ca_state of socket, when timestamp is generated.
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds TCP_NLA_SENDQ_SIZE stat into SCM_TIMESTAMPING_OPT_STATS.
It reports no. of bytes present in send queue, when timestamp is
generated.
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 02d742f4b2 ("media: lirc: lirc daemon fails to detect raw
IR device"), the feature LIRC_CAN_SEND_SCANCODE is no longer used as it
tripped up lircd. The ability to send scancodes for IR Tx is implied by
LIRC_CAN_SEND_PULSE (i.e. any device that can send can use IR Tx encoders).
So, remove LIRC_CAN_SEND_SCANCODE since it never used. This fixes:
Documentation/output/lirc.h.rst:6: WARNING: undefined label:
lirc-can-send-scancode (if the link has no caption the label must precede
a section header
As this flag was added for kernel 4.16, let's remove it, while not too
late.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
* commit 'v4.16-rc4~0': (900 commits)
Linux 4.16-rc4
memremap: fix softlockup reports at teardown
libnvdimm: re-enable deep flush for pmem devices via fsync()
MAINTAINERS: take over Kconfig maintainership
vfio: disable filesystem-dax page pinning
kconfig: fix line number in recursive inclusion error message
Coccinelle: memdup: Fix typo in warning messages
i2c: octeon: Prevent error message on bus error
parisc: Reduce irq overhead when run in qemu
parisc: Use cr16 interval timers unconditionally on qemu
parisc: Check if secondary CPUs want own PDC calls
parisc: Hide virtual kernel memory layout
parisc: Fix ordering of cache and TLB flushes
kconfig: Update ncurses package names for menuconfig
kbuild/kallsyms: trivial typo fix
kbuild: test --build-id linker flag by ld-option instead of cc-ldoption
kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment
kconfig: Don't leak choice names during parsing
sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE
kconfig: set SYMBOL_AUTO to the symbol marked with defconfig_list
...
- bump version strings, by Simon Wunderlich
- bump copyright years, by Sven Eckelmann
- fix macro indendation for checkpatch, by Sven Eckelmann
- fix comparison operator for bool returning functions,
by Sven Eckelmann
- assume 2-byte packet alignments for all packet types,
by Matthias Schiffer
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlqZkD8WHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoaQeD/0cHQz/x6ZuXF6MDrwOf78oDvjw
AksSt2e/055Ommhzplz0Aa0hthK7Xb7+dBD+caAOxBwrXtWsWc6H5fsfeiBCTntH
PxDyS68o+iZDonquaOKz+gtILiPbUE0hG3hMizWdF95nDtsJ3rHoL3fU7Wo9eZNq
WtlwCCloyEZeGfPtpJXCo7QmGOvTLk+Qt+pkGWKHFIIXNaxa9yfqp7JCFRfyrkl3
7wb4+YaZCCUDSUiDW6SQY0rTx3bxJPfHOnrG7rB6chmGVyyssH+tjhLP52331/oI
dloVSz96pilAcKivbhLRENcIcUNuj9mHyV6rFM4/paYeDOGotoA5DCkEi3NSTYdN
p2UiDBJZfiA6M40AxmLYK5mLPWYLGpyrZNBNskemrQdlrXDLU/D6UWMUKbFhouPx
AjgX+Yk7giqXdxUjCcaFsjDBf2SxC/Xjv39qvPR0P4rW4xD/Y3xqHDJ8yLrPj7Si
M1NJv6E+gBkIQg+JuoWeQb3kvbtNQsu49XBbrYlUrdPgkJhVC6DIP5jie/TutAGz
9OU1cdDZNUVkI6+iuGP8B3Aj0Mj+zlpXHvhBa5R9duAumdt0uiqwU98k4h89yix+
GK1K9dPyW/r9qmwtemaZH8RQ6iqoxBPVZ43PoM7W/xe04IwslPJWyZ8GT64Yqekl
9m5HhMLLiJx9pklR9w==
=KJdN
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20180302' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- bump copyright years, by Sven Eckelmann
- fix macro indendation for checkpatch, by Sven Eckelmann
- fix comparison operator for bool returning functions,
by Sven Eckelmann
- assume 2-byte packet alignments for all packet types,
by Matthias Schiffer
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently GRE sequence number can only be used in native
tunnel mode. This patch adds sequence number support for
gre collect metadata mode. RFC2890 defines GRE sequence
number to be specific to the traffic flow identified by the
key. However, this patch does not implement per-key seqno.
The sequence number is shared in the same tunnel device.
That is, different tunnel keys using the same collect_md
tunnel share single sequence number.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJamnteAAoJEAhfPr2O5OEVjKUP/03nggoRa29J66geB3LtTnsC
lnQa2Dr6+fRji5eLsXC0su8o3Bf3P5KsyP9gfRrFQcsn68x6LiO6Ud4dL8yM6ZFh
4ShF8IYD1oNMT0CTq5bgfr34lyJRYsF0/FbXuO+XApOlL74JfstN4g3MFjBiojAQ
B7URIw4snzLrpiJ+yHELL71tfPnzgUllWXKSDkql6Te+Gx1mYymDRyXdexlVKSiH
hD/5mQ/6YJdAmFTHl0efvKGFEhfXK2l59I+gpRXgsupaSG+rV2RhjTeVPcWUcvCn
P91/hpahtv2gHWZyEYpN9h1zkWFYFBt8exgnMAuo4YtFzAyMueeUowCrHvTL8P9V
TSkLgFFsZBtrNmWsX1pw2JQJxq2qUXWZNBeUf326TQV9Z2WBeiikXiStdemR/IBQ
ueVOH9yuLjEX9die8PCxU+O/hN0VTfKa0A59D+rTLBbj76LG+iAaK/NAVedJzv5v
U3N0lPL0U8BIJ00VevktBfpB7FcsTBv2Wa/GHOgyCYZHW2NgUqj0cK4jfMogFNID
MM5SgM4fv0D03p4KVLtu3d2+QDKmPjYBs6rAjEtU2eXvaoTzOPtciSb8As5MFxin
xseliOBSVchJDRcKic0vMyMv2jDdvRcAEaDE/w+OsUyWPTixDNN1qOThp8OGEc6P
5P0nL89lvpf9StOl3WM5
=ypdv
-----END PGP SIGNATURE-----
Merge tag 'media/v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- some build fixes with randconfigs
- an m88ds3103 fix to prevent an OOPS if the chip doesn't provide the
right version during probe (with can happen if the hardware hangs)
- a potential out of array bounds reference in tvp5150
- some fixes and improvements in the DVB memory mapped API (added for
kernel 4.16)
* tag 'media/v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vb2: Makefile: place vb2-trace together with vb2-core
media: Don't let tvp5150_get_vbi() go out of vbi_ram_default array
media: dvb: update buffer mmaped flags and frame counter
media: dvb: add continuity error indicators for memory mapped buffers
media: dmxdev: Fix the logic that enables DMA mmap support
media: dmxdev: fix error code for invalid ioctls
media: m88ds3103: don't call a non-initalized function
media: au0828: add VIDEO_V4L2 dependency
media: dvb: fix DVB_MMAP dependency
media: dvb: fix DVB_MMAP symbol name
media: videobuf2: fix build issues with vb2-trace
media: videobuf2: Add VIDEOBUF2_V4L2 Kconfig option for VB2 V4L2 part
x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation features
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJambvzAAoJEED/6hsPKofo9HwH/2il8xNSLIYf9pJtxZo/puyQ
ZSwByGdeLKBZ1GP1dhdZ8kMk3eBoci0a/sQJmhDiEG6GDf1Mrgri/xj3p60sWwXT
iReG+ZhBKg4QMj/IgOJQrh+53JT73QQP14wIhzc/DSi0Fo0ziqDA/lINxqMKc7oF
b5qratjsb4xF1db4d1g8Ii1VRk64UoBEVpEoP37OOyAu1rgXgDr+9C832KkP0rb+
pVYT8hLFiaYiwVN+WN52/NrIkqBlMvMp3ouRtMAajCQ9OznnraDJE6eTkPGDSDBM
RizSuQbev5R7pcmzCAP9/XKfTbeSUZQei2ZFXQXAOvIXMrQd/ITjbPvnObsceSE=
=wtQd
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation feature"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix vcpu initialization with userspace lapic
KVM: X86: Allow userspace to define the microcode version
KVM: X86: Introduce kvm_get_msr_feature()
KVM: SVM: Add MSR-based feature support for serializing LFENCE
KVM: x86: Add a framework for supporting MSR-based features
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJamXImAAoJEPfTWPspceCmEP4P/3kkm0JIXtbNZFMb1JZtsjwE
t4OUVEDj4jjRfmZfUVkajPnczM4MSPiXm43PbcOi4NF53mv8k76jyIPhlZREzYzq
MBknibvpqyiWpbii9tBRrR6FGDR/N51//ya9vdPaYBcBssTg6Aqtt4BE5oPfo011
PleGROe1jtrBUNBy2dMy4sHb/MvZ0vRuNPxMsD8Agy5UiVeItAelY/lDn1Hw41BY
O+muE5bw6+yKqB9vGXhV3O4WRh8BofJi1YdzbwbbIzH40ZZK5VTDQc5o19/CFEZ/
uZ8BStOFEWA0LNuarME5fknWcogiedEtszweyiWBbVZo4VqCsfxPoaRCibY/Wg5F
a0UNJ4iSzglhfSMoHJlhvlCAMCyubFSeMSdJjrrpIcyBrziJXpcEXcUnWI43yi4P
FoM8zUni22XnfLWxIdTjVkMRytjtqTLcXOHXdP5N/ESa80jBq3Q76TLmzIKW+kK5
sAre+hgr52NdgovP/NSxsdvsckAolWNe40JI8wLbwNo+lMHr0ckzOG+sAdz1iPRK
iVL0CAlby4A94Wcu+OHCwfY7B9lBrMuMfHsesEM6x1cxgAhd3YNfEJ8g2QolCUEV
KmZizXbV9nnmJfegVC06SgM+D7AR26dwsBG2aoibShuvdxX6jMdUHygyu5DCJdg/
JS+q71jmxb/r1TWe/62r
=AMhV
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20180302' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A collection of fixes for this series. This is a little larger than
usual at this time, but that's mainly because I was out on vacation
last week. Nothing in here is major in any way, it's just two weeks of
fixes. This contains:
- NVMe pull from Keith, with a set of fixes from the usual suspects.
- mq-deadline zone unlock fix from Damien, fixing an issue with the
SMR zone locking added for 4.16.
- two bcache fixes sent in by Michael, with changes from Coly and
Tang.
- comment typo fix from Eric for blktrace.
- return-value error handling fix for nbd, from Gustavo.
- fix a direct-io case where we don't defer to a completion handler,
making us sleep from IRQ device completion. From Jan.
- a small series from Jan fixing up holes around handling of bdev
references.
- small set of regression fixes from Jiufei, mostly fixing problems
around the gendisk pointer -> partition index change.
- regression fix from Ming, fixing a boundary issue with the discard
page cache invalidation.
- two-patch series from Ming, fixing both a core blk-mq-sched and
kyber issue around token freeing on a requeue condition"
* tag 'for-linus-20180302' of git://git.kernel.dk/linux-block: (24 commits)
block: fix a typo
block: display the correct diskname for bio
block: fix the count of PGPGOUT for WRITE_SAME
mq-deadline: Make sure to always unlock zones
nvmet: fix PSDT field check in command format
nvme-multipath: fix sysfs dangerously created links
nbd: fix return value in error handling path
bcache: fix kcrashes with fio in RAID5 backend dev
bcache: correct flash only vols (check all uuids)
blktrace_api.h: fix comment for struct blk_user_trace_setup
blockdev: Avoid two active bdev inodes for one device
genhd: Fix BUG in blkdev_open()
genhd: Fix use after free in __blkdev_get()
genhd: Add helper put_disk_and_module()
genhd: Rename get_disk() to get_disk_and_module()
genhd: Fix leaked module reference for NVME devices
direct-io: Fix sleep in atomic due to sync AIO
nvme-pci: Fix nvme queue cleanup if IRQ setup fails
block: kyber: fix domain token leak during requeue
blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
...
Provide a new KVM capability that allows bits within MSRs to be recognized
as features. Two new ioctls are added to the /dev/kvm ioctl routine to
retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops
callback is used to determine support for the listed MSR-based features.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Tweaked documentation. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
uapi for ip_proto, sport and dport range match
in fib rules.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is an optimization over commit 01883eda72
("rds: support for zcopy completion notification") for PF_RDS sockets.
RDS applications are predominantly request-response transactions, so
it is more efficient to reduce the number of system calls and have
zerocopy completion notification delivered as ancillary data on the
POLLIN channel.
Cookies are passed up as ancillary data (at level SOL_RDS) in a
struct rds_zcopy_cookies when the returned value of recvmsg() is
greater than, or equal to, 0. A max of RDS_MAX_ZCOOKIES may be passed
with each message.
This commit removes support for zerocopy completion notification on
MSG_ERRQUEUE for PF_RDS sockets.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And get rid of the license text that is no longer necessary.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Rocky Craig <rocky.craig@hp.com>
NIC drivers generally try to ensure that the "network header" is aligned
to a 4-byte boundary. This is not always possible: When Ethernet frames are
encapsulated in other packets with 4-byte aligned headers, the inner
Ethernet header will have 4-byte alignment, and in consequence, the inner
network header is aligned to 2, but not to 4 bytes.
Most parts of batman-adv only care about 2-byte alignment; in particular,
no unaligned accesses occur in performance-critical paths that handle
actual payload data. This is not true for OGM handling: the seqno and crc
fields are accessed as 32-bit values. To avoid these unaligned accesses,
this patch reduces the expected packet alignment to 2 bytes for all of
batadv's packet types.
As no unaligned accesses existed on the performance-critical paths anyways,
this chance does have any (positive or negative) effect on performance, but
it still makes sense to avoid these accesses to prevent log noise when
examining other unaligned accesses in the kernel while batman-adv is
active.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Add security hooks allowing security modules to exercise access control
over SCTP.
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
'struct blk_user_trace_setup' is passed to BLKTRACESETUP, not
BLKTRACESTART.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- optimization for the exitless interrupt support that was merged in 4.16-rc1
- improve the branch prediction blocking for nested KVM
- replace some jump tables with switch statements to improve expoline performance
- fixes for multiple epoch facility
ARM:
- fix the interaction of userspace irqchip VMs with in-kernel irqchip VMs
- make sure we can build 32-bit KVM/ARM with gcc-8.
x86:
- fixes for AMD SEV
- fixes for Intel nested VMX, emulated UMIP and a dump_stack() on VM startup
- fixes for async page fault migration
- small optimization to PV TLB flush (new in 4.16-rc1)
- syzkaller fixes
Generic:
- compiler warning fixes
- syzkaller fixes
- more improvements to the kvm_stat tool
Two more small Spectre fixes are going to reach you via Ingo.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEbBAABAgAGBQJakL/fAAoJEL/70l94x66Dzp4H9j6qMzgOTAQ0bYmupQp81tad
V8lNabVSNi0UBYwk2D44oNigtNjQckE18KGnjuJ4tZW+GZ+D7zrrHrKXWtATXgxP
SIfHj+raSd/lgJoy6HLu/N0oT6wS+PdZMYFgSu600Vi618lGKGX1SIAwBhjoxdMX
7QKKAuPcDZ1qgGddhWaLnof28nQQEWcCAVfFeVojmM0TyhvSbgSysh/Gq10ydybh
NVUfgP3fzLtT9gVngX/ZtbogNkltPYmucpI+wT3nWfsgBic783klfWrfpnC/GM85
OeXLVhHwVLG6tXUGhb4ULO+F9HwRGX31+er6iIxmwH9PvqnQMRcQ0Xxf2gbNXg==
=YmH6
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"s390:
- optimization for the exitless interrupt support that was merged in 4.16-rc1
- improve the branch prediction blocking for nested KVM
- replace some jump tables with switch statements to improve expoline performance
- fixes for multiple epoch facility
ARM:
- fix the interaction of userspace irqchip VMs with in-kernel irqchip VMs
- make sure we can build 32-bit KVM/ARM with gcc-8.
x86:
- fixes for AMD SEV
- fixes for Intel nested VMX, emulated UMIP and a dump_stack() on VM startup
- fixes for async page fault migration
- small optimization to PV TLB flush (new in 4.16-rc1)
- syzkaller fixes
Generic:
- compiler warning fixes
- syzkaller fixes
- more improvements to the kvm_stat tool
Two more small Spectre fixes are going to reach you via Ingo"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (40 commits)
KVM: SVM: Fix SEV LAUNCH_SECRET command
KVM: SVM: install RSM intercept
KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE command
include: psp-sev: Capitalize invalid length enum
crypto: ccp: Fix sparse, use plain integer as NULL pointer
KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled
x86/kvm: Make parse_no_xxx __init for kvm
KVM: x86: fix backward migration with async_PF
kvm: fix warning for non-x86 builds
kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds
tools/kvm_stat: print 'Total' line for multiple events only
tools/kvm_stat: group child events indented after parent
tools/kvm_stat: separate drilldown and fields filtering
tools/kvm_stat: eliminate extra guest/pid selection dialog
tools/kvm_stat: mark private methods as such
tools/kvm_stat: fix debugfs handling
tools/kvm_stat: print error on invalid regex
tools/kvm_stat: fix crash when filtering out all non-child trace events
tools/kvm_stat: avoid 'is' for equality checks
tools/kvm_stat: use a more pythonic way to iterate over dictionaries
...
Many for coding-style fixes, and update the poll API with the new
type '__poll_t', this is new commit from linux-4.16-rc1.
Signed-off-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Provides a device driver for the KCS (Keyboard Controller Style)
IPMI interface which meets the requirement of the BMC (Baseboard
Management Controllers) side for handling the IPMI request from
host system software.
Signed-off-by: Haiyue Wang <haiyue.wang@linux.intel.com>
[Removed the selectability of IPMI_KCS_BMC, as it doesn't do much
good to have it by itself.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
The media.h public header is very messy. It mixes legacy and 'new' defines
and it is not easy to figure out what should and what shouldn't be used. It
also contains confusing comment that are either out of date or completely
uninteresting for anyone that needs to use this header.
The patch groups all entity functions together, including the 'old' defines
based on the old range base. The reader just wants to know about the available
functions and doesn't care about what range is used.
All legacy defines are moved to the end of the header, so it is easier to
locate them and just ignore them.
The legacy structs in the struct media_entity_desc are put under
also a much more effective signal to the reader that they shouldn't be used
compared to the old method of relying on '#if 1' followed by a comment.
The unused MEDIA_INTF_T_ALSA_* defines are also moved to the end of the header
in the legacy area. They are also dropped from intf_type() in media-entity.c.
All defines are also aligned at the same tab making the header easier to read.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
[mchehab@s-opensource.com: removed lots of spaces before tabs; typo changes ->change ]
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Subdevs are initialized with MEDIA_ENT_F_V4L2_SUBDEV_UNKNOWN, not
MEDIA_ENT_T_V4L2_SUBDEV_UNKNOWN.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The v4l2_mbus_fmt width and height corresponds directly with the
v4l2_pix_format definitions, yet the differences in documentation make
it ambiguous what to do in the event of field heights.
Clarify this using the same text as is provided for the v4l2_pix_format
which is explicit on the matter, and by matching the terminology of
'image height' rather than the misleading 'frame height'.
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
For ages iproute2 has used `struct rtmsg` as the ancillary header for
FIB rules and in the process set the protocol value to RTPROT_BOOT.
Until ca56209a66 ("net: Allow a rule to track originating protocol")
the kernel rules code ignored the protocol value sent from userspace
and always returned 0 in notifications. To avoid incompatibility with
existing iproute2, send the protocol as a new attribute.
Fixes: cac56209a6 ("net: Allow a rule to track originating protocol")
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that arch/metag/ has been removed, drop a bunch of metag references
in various codes across the whole tree:
- VM_GROWSUP and __VM_ARCH_SPECIFIC_1.
- MT_METAG_* ELF note types.
- METAG Kconfig dependencies (FRAME_POINTER) and ranges
(MAX_STACK_SIZE_MB).
- metag cases in tools (checkstack.pl, recordmcount.c, perf).
Signed-off-by: James Hogan <jhogan@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-mm@kvack.org
Cc: linux-metag@vger.kernel.org
While userspace can detect discontinuity errors, it is useful to
also let Kernelspace reporting discontinuity, as it can help to
identify if the data loss happened either at Kernel or userspace side.
Update documentation accordingly.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
One thing to note: I've included a new ethertype
that wireless uses (ETH_P_PREAUTH) in if_ether.h.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlqPJKcACgkQB8qZga/f
l8TAUw/7BMKG4ofFYRgujmqS+mnSDJx9vgtFHIV3Ymcm9vgdQ6wbNe85ME8J6TpN
Z/HqtWVhn7BEWqpiDgq0ADTEmU/Vt6AQvy6fJX80+Lz4yup8Dq9dI9/CJ7BlKP+t
O7/jXkv2RykFv1IAG9US3Xx9rIwLJRP6XndksZMsK4QihdUYOqAqjZ+pLWCHQ7+a
vFewlUV6t7IMq3R9scL4nf5EmgLWNDNCSOZ6xWDxfDgLHsErbCD9ojRsfAnQWPN0
1rwPC5kGm9tzGtiPVhA0/a4D0dgiYdv723ubs/waSYX5fimXDPSXsRizVp06ZWC+
lFW+Mw52WvxriO61MD99xH1O1/svy+YMMgECoPMjGk7QzgY+2xQ/8hUbo91fsj07
05+rGX3O0SJvB0une3m3ZsZz00DkZDU4Fw0kvO0aSCmE++O4vt/04wmMWxGVfnSo
RtdrQNSAYrYqHSc+1kIzDAH2jCBBLj0cWdlZPYciYMTRUHOFZmMApyyXVLoOl3yn
eqLgK8QBNpkjkf5FbF+m0ccHtQ8lkKiZDcqqIVN+dxKuuO9FEfblDty5bZEp8AaT
Q2soararYeUcNVA2A+Gi1l3qtcj+wWay+CdDcU/QmYbXoxdRh64FBs//y6akIH9p
p/cNgRP/MyEUEkvGmpva+MdtzCToWR4Asm4eErlsvvbIvKEFjPI=
=A6XK
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2018-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Various updates across wireless.
One thing to note: I've included a new ethertype
that wireless uses (ETH_P_PREAUTH) in if_ether.h.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new media entity function definition for digital TV decoders:
MEDIA_ENT_F_DTV_DECODER
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Commit 26500475ac ("ptrace, seccomp: add support for retrieving seccomp
metadata") introduced `struct seccomp_metadata`, which contained unsigned
longs that should be arch independent. The type of the flags member was
chosen to match the corresponding argument to seccomp(), and so we need
something at least as big as unsigned long. My understanding is that __u64
should fit the bill, so let's switch both types to that.
While this is userspace facing, it was only introduced in 4.16-rc2, and so
should be safe assuming it goes in before then.
Reported-by: "Dmitry V. Levin" <ldv@altlinux.org>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
CC: Kees Cook <keescook@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Dmitry V. Levin" <ldv@altlinux.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Allow a rule that is being added/deleted/modified or
dumped to contain the originating protocol's id.
The protocol is handled just like a routes originating
protocol is. This is especially useful because there
is starting to be a plethora of different user space
programs adding rules.
Allow the vrf device to specify that the kernel is the originator
of the rule created for this device.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit a new tc ematch for using netfilter xtable matches.
This allows early classification as well as mirroning/redirecting traffic
based on logic implemented in netfilter extensions.
Current supported use case is classification based on the incoming IPSec
state used during decpsulation using the 'policy' iptables extension
(xt_policy).
The module dynamically fetches the netfilter match module and calls
it using a fake xt_action_param structure based on validated userspace
provided parameters.
As the xt_policy match does not access skb->data, no skb modifications
are needed on match.
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides support to get ack signal in probe client response
and in station info from user.
Signed-off-by: Venkateswara Naralasetty <vnaralas@codeaurora.org>
[squash in compilation fixes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the MSG_ZEROCOPY flag is specified with rds_sendmsg(), and,
if the SO_ZEROCOPY socket option has been set on the PF_RDS socket,
application pages sent down with rds_sendmsg() are pinned.
The pinning uses the accounting infrastructure added by
Commit a91dbff551 ("sock: ulimit on MSG_ZEROCOPY pages")
The payload bytes in the message may not be modified for the
duration that the message has been pinned. A multi-threaded
application using this infrastructure may thus need to be notified
about send-completion so that it can free/reuse the buffers
passed to rds_sendmsg(). Notification of send-completion will
identify each message-buffer by a cookie that the application
must specify as ancillary data to rds_sendmsg().
The ancillary data in this case has cmsg_level == SOL_RDS
and cmsg_type == RDS_CMSG_ZCOPY_COOKIE.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RDS removes a datagram (rds_message) from the retransmit queue when
an ACK is received. The ACK indicates that the receiver has queued
the RDS datagram, so that the sender can safely forget the datagram.
When all references to the rds_message are quiesced, rds_message_purge
is called to release resources used by the rds_message
If the datagram to be removed had pinned pages set up, add
an entry to the rs->rs_znotify_queue so that the notifcation
will be sent up via rds_rm_zerocopy_callback() when the
rds_message is eventually freed by rds_message_purge.
rds_rm_zerocopy_callback() attempts to batch the number of cookies
sent with each notification to a max of SO_EE_ORIGIN_MAX_ZCOOKIES.
This is achieved by checking the tail skb in the sk_error_queue:
if this has room for one more cookie, the cookie from the
current notification is added; else a new skb is added to the
sk_error_queue. Every invocation of rds_rm_zerocopy_callback() will
trigger a ->sk_error_report to notify the application.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the old license information with the corresponding SPDX
license for those headers that I authored.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Watch descriptor is id of the watch created by inotify_add_watch().
It is allocated in inotify_add_to_idr(), and takes the numbers
starting from 1. Every new inotify watch obtains next available
number (usually, old + 1), as served by idr_alloc_cyclic().
CRIU (Checkpoint/Restore In Userspace) project supports inotify
files, and restores watched descriptors with the same numbers,
they had before dump. Since there was no kernel support, we
had to use cycle to add a watch with specific descriptor id:
while (1) {
int wd;
wd = inotify_add_watch(inotify_fd, path, mask);
if (wd < 0) {
break;
} else if (wd == desired_wd_id) {
ret = 0;
break;
}
inotify_rm_watch(inotify_fd, wd);
}
(You may find the actual code at the below link:
https://github.com/checkpoint-restore/criu/blob/v3.7/criu/fsnotify.c#L577)
The cycle is suboptiomal and very expensive, but since there is no better
kernel support, it was the only way to restore that. Happily, we had met
mostly descriptors with small id, and this approach had worked somehow.
But recent time containers with inotify with big watch descriptors
begun to come, and this way stopped to work at all. When descriptor id
is something about 0x34d71d6, the restoring process spins in busy loop
for a long time, and the restore hungs and delay of migration from node
to node could easily be watched.
This patch aims to solve this problem. It introduces new ioctl
INOTIFY_IOC_SETNEXTWD, which allows to request the number of next created
watch descriptor from userspace. It simply calls idr_set_cursor() primitive
to populate idr::idr_next, so that next idr_alloc_cyclic() allocation
will return this id, if it is not occupied. This is the way which is
used to restore some other resources from userspace. For example,
/proc/sys/kernel/ns_last_pid works the same for task pids.
The new code is under CONFIG_CHECKPOINT_RESTORE #define, so small system
may exclude it.
v2: Use INT_MAX instead of custom definition of max id,
as IDR subsystem guarantees id is between 0 and INT_MAX.
CC: Jan Kara <jack@suse.cz>
CC: Matthew Wilcox <willy@infradead.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jan Kara <jack@suse.cz>
This fixes a compile problem of some user space applications by not
including linux/libc-compat.h in uapi/if_ether.h.
linux/libc-compat.h checks which "features" the header files, included
from the libc, provide to make the Linux kernel uapi header files only
provide no conflicting structures and enums. If a user application mixes
kernel headers and libc headers it could happen that linux/libc-compat.h
gets included too early where not all other libc headers are included
yet. Then the linux/libc-compat.h would not prevent all the
redefinitions and we run into compile problems.
This patch removes the include of linux/libc-compat.h from
uapi/if_ether.h to fix the recently introduced case, but not all as this
is more or less impossible.
It is no problem to do the check directly in the if_ether.h file and not
in libc-compat.h as this does not need any fancy glibc header detection
as glibc never provided struct ethhdr and should define
__UAPI_DEF_ETHHDR by them self when they will provide this.
The following test program did not compile correctly any more:
#include <linux/if_ether.h>
#include <netinet/in.h>
#include <linux/in.h>
int main(void)
{
return 0;
}
Fixes: 6926e041a8 ("uapi/if_ether.h: prevent redefinition of struct ethhdr")
Reported-by: Guillaume Nault <g.nault@alphalink.fr>
Cc: <stable@vger.kernel.org> # 4.15
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull more poll annotation updates from Al Viro:
"This is preparation to solving the problems you've mentioned in the
original poll series.
After this series, the kernel is ready for running
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
as a for bulk search-and-replace.
After that, the kernel is ready to apply the patch to unify
{de,}mangle_poll(), and then get rid of kernel-side POLL... uses
entirely, and we should be all done with that stuff.
Basically, that's what you suggested wrt KPOLL..., except that we can
use EPOLL... instead - they already are arch-independent (and equal to
what is currently kernel-side POLL...).
After the preparations (in this series) switch to returning EPOLL...
from ->poll() instances is completely mechanical and kernel-side
POLL... can go away. The last step (killing kernel-side POLL... and
unifying {de,}mangle_poll() has to be done after the
search-and-replace job, since we need userland-side POLL... for
unified {de,}mangle_poll(), thus the cherry-pick at the last step.
After that we will have:
- POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of
->poll() still using those will be caught by sparse.
- eventpoll.c and select.c warning-free wrt __poll_t
- no more kernel-side definitions of POLL... - userland ones are
visible through the entire kernel (and used pretty much only for
mangle/demangle)
- same behavior as after the first series (i.e. sparc et.al. epoll(2)
working correctly)"
* 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
annotate ep_scan_ready_list()
ep_send_events_proc(): return result via esed->res
preparation to switching ->poll() to returning EPOLL...
add EPOLLNVAL, annotate EPOLL... and event_poll->event
use linux/poll.h instead of asm/poll.h
xen: fix poll misannotation
smc: missing poll annotations
ARM:
- Include icache invalidation optimizations, improving VM startup time
- Support for forwarded level-triggered interrupts, improving
performance for timers and passthrough platform devices
- A small fix for power-management notifiers, and some cosmetic changes
PPC:
- Add MMIO emulation for vector loads and stores
- Allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
requiring the complex thread synchronization of older CPU versions
- Improve the handling of escalation interrupts with the XIVE interrupt
controller
- Support decrement register migration
- Various cleanups and bugfixes.
s390:
- Cornelia Huck passed maintainership to Janosch Frank
- Exitless interrupts for emulated devices
- Cleanup of cpuflag handling
- kvm_stat counter improvements
- VSIE improvements
- mm cleanup
x86:
- Hypervisor part of SEV
- UMIP, RDPID, and MSR_SMI_COUNT emulation
- Paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
- Allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more AVX512
features
- Show vcpu id in its anonymous inode name
- Many fixes and cleanups
- Per-VCPU MSR bitmaps (already merged through x86/pti branch)
- Stable KVM clock when nesting on Hyper-V (merged through x86/hyperv)
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJafvMtAAoJEED/6hsPKofo6YcH/Rzf2RmshrWaC3q82yfIV0Qz
Z8N8yJHSaSdc3Jo6cmiVj0zelwAxdQcyjwlT7vxt5SL2yML+/Q0st9Hc3EgGGXPm
Il99eJEl+2MYpZgYZqV8ff3mHS5s5Jms+7BITAeh6Rgt+DyNbykEAvzt+MCHK9cP
xtsIZQlvRF7HIrpOlaRzOPp3sK2/MDZJ1RBE7wYItK3CUAmsHim/LVYKzZkRTij3
/9b4LP1yMMbziG+Yxt1o682EwJB5YIat6fmDG9uFeEVI5rWWN7WFubqs8gCjYy/p
FX+BjpOdgTRnX+1m9GIj0Jlc/HKMXryDfSZS07Zy4FbGEwSiI5SfKECub4mDhuE=
=C/uD
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"ARM:
- icache invalidation optimizations, improving VM startup time
- support for forwarded level-triggered interrupts, improving
performance for timers and passthrough platform devices
- a small fix for power-management notifiers, and some cosmetic
changes
PPC:
- add MMIO emulation for vector loads and stores
- allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
requiring the complex thread synchronization of older CPU versions
- improve the handling of escalation interrupts with the XIVE
interrupt controller
- support decrement register migration
- various cleanups and bugfixes.
s390:
- Cornelia Huck passed maintainership to Janosch Frank
- exitless interrupts for emulated devices
- cleanup of cpuflag handling
- kvm_stat counter improvements
- VSIE improvements
- mm cleanup
x86:
- hypervisor part of SEV
- UMIP, RDPID, and MSR_SMI_COUNT emulation
- paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
- allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
AVX512 features
- show vcpu id in its anonymous inode name
- many fixes and cleanups
- per-VCPU MSR bitmaps (already merged through x86/pti branch)
- stable KVM clock when nesting on Hyper-V (merged through
x86/hyperv)"
* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
KVM: PPC: Book3S HV: Branch inside feature section
KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
KVM: PPC: Book3S PR: Fix broken select due to misspelling
KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
KVM: PPC: Book3S HV: Drop locks before reading guest memory
kvm: x86: remove efer_reload entry in kvm_vcpu_stat
KVM: x86: AMD Processor Topology Information
x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
kvm: embed vcpu id to dentry of vcpu anon inode
kvm: Map PFN-type memory regions as writable (if possible)
x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
KVM: arm/arm64: Fixup userspace irqchip static key optimization
KVM: arm/arm64: Fix userspace_irqchip_in_use counting
KVM: arm/arm64: Fix incorrect timer_is_pending logic
MAINTAINERS: update KVM/s390 maintainers
MAINTAINERS: add Halil as additional vfio-ccw maintainer
MAINTAINERS: add David as a reviewer for KVM/s390
...
Spectre v1 mitigation:
- back-end version of array_index_mask_nospec()
- masking of the syscall number to restrict speculation through the
syscall table
- masking of __user pointers prior to deference in uaccess routines
Spectre v2 mitigation update:
- using the new firmware SMC calling convention specification update
- removing the current PSCI GET_VERSION firmware call mitigation as
vendors are deploying new SMCCC-capable firmware
- additional branch predictor hardening for synchronous exceptions and
interrupts while in user mode
Meltdown v3 mitigation update for Cavium Thunder X: unaffected but
hardware erratum gets in the way. The kernel now starts with the page
tables mapped as global and switches to non-global if kpti needs to be
enabled.
Other:
- Theoretical trylock bug fixed
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlp8lqcACgkQa9axLQDI
XvH2lxAAnsYqthpGQ11MtDJB+/UiBAFkg9QWPDkwrBDvNhgpll+J0VQuCN1QJ2GX
qQ8rkv8uV+y4Fqr8hORGJy5At+0aI63ZCJ72RGkZTzJAtbFbFGIDHP7RhAEIGJBS
Lk9kDZ7k39wLEx30UXIFYTTVzyHar397TdI7vkTcngiTzZ8MdFATfN/hiKO906q3
14pYnU9Um4aHUdcJ+FocL3dxvdgniuuMBWoNiYXyOCZXjmbQOnDNU2UrICroV8lS
mB+IHNEhX1Gl35QzNBtC0ET+aySfHBMJmM5oln+uVUljIGx6En1WLj6mrHYcx8U2
rIBm5qO/X/4iuzYPGkxwQtpjq3wPYxsSUnMdKJrsUZqAfy2QeIhFx6XUtJsZPB2J
/lgls5xSXMOS7oiOQtmVjcDLBURDmYXGwljXR4n4jLm4CT1V9qSLcKHu1gdFU9Mq
VuMUdPOnQub1vqKndi154IoYDTo21jAib2ktbcxpJfSJnDYoit4Gtnv7eWY+M3Pd
Toaxi8htM2HSRwbvslHYGW8ZcVpI79Jit+ti7CsFg7m9Lvgs0zxcnNui4uPYDymT
jh2JYxuirIJbX9aGGhnmkNhq9REaeZJg9LA2JM8S77FCHN3bnlSdaG6wy899J6EI
lK4anCuPQKKKhUia/dc1MeKwrmmC18EfPyGUkOzywg/jGwGCmZM=
=Y0TT
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull more arm64 updates from Catalin Marinas:
"As I mentioned in the last pull request, there's a second batch of
security updates for arm64 with mitigations for Spectre/v1 and an
improved one for Spectre/v2 (via a newly defined firmware interface
API).
Spectre v1 mitigation:
- back-end version of array_index_mask_nospec()
- masking of the syscall number to restrict speculation through the
syscall table
- masking of __user pointers prior to deference in uaccess routines
Spectre v2 mitigation update:
- using the new firmware SMC calling convention specification update
- removing the current PSCI GET_VERSION firmware call mitigation as
vendors are deploying new SMCCC-capable firmware
- additional branch predictor hardening for synchronous exceptions
and interrupts while in user mode
Meltdown v3 mitigation update:
- Cavium Thunder X is unaffected but a hardware erratum gets in the
way. The kernel now starts with the page tables mapped as global
and switches to non-global if kpti needs to be enabled.
Other:
- Theoretical trylock bug fixed"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (38 commits)
arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
arm/arm64: smccc: Make function identifiers an unsigned quantity
firmware/psci: Expose SMCCC version through psci_ops
firmware/psci: Expose PSCI conduit
arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
arm/arm64: KVM: Turn kvm_psci_version into a static inline
arm/arm64: KVM: Advertise SMCCC v1.1
arm/arm64: KVM: Implement PSCI 1.0 support
arm/arm64: KVM: Add smccc accessors to PSCI code
arm/arm64: KVM: Add PSCI_VERSION helper
arm/arm64: KVM: Consolidate the PSCI include files
arm64: KVM: Increment PC after handling an SMC trap
arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
arm64: entry: Apply BP hardening for suspicious interrupts from EL0
arm64: entry: Apply BP hardening for high-priority synchronous exceptions
arm64: futex: Mask __user pointers prior to dereference
...
This includes the disk/cache memory stats for for the virtio balloon,
as well as multiple fixes and cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaempgAAoJECgfDbjSjVRpGOIIAIWiarIFrjjcE+hxcFdKkKvC
T8YQzfvxHTuBqD8m1jd/9R/U0RHYRM4MX+cg6tVz9J2VVhQ2Hjfrs7HExqoZKul8
sOOzk0d+ii0iQvlgnmmXamlceizP4IuNZcia7FDAZK0hWfHE84dPrG3/hhWOGruN
NDR1v7k8GBIMS+7lExQwzmy6gs4zGJftJUF9Fnb4CVT26wWbOKYS2exC8UJerZAE
2puWx71Fd/C27x/iOsxZOME0tOmzU7MIRksSjDcT7YQLesIuJSbjpQd5yfvXW0qC
7iEQzueZuYP4VroHvpDbKDxvFsIxhjFGKR4/sSG+a0Zso1ejS9fyuDL9a627PJI=
=/qQR
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost updates from Michael Tsirkin:
"virtio, vhost: fixes, cleanups, features
This includes the disk/cache memory stats for for the virtio balloon,
as well as multiple fixes and cleanups"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: don't hold onto file pointer for VHOST_SET_LOG_FD
vhost: don't hold onto file pointer for VHOST_SET_VRING_ERR
vhost: don't hold onto file pointer for VHOST_SET_VRING_CALL
ringtest: ring.c malloc & memset to calloc
virtio_vop: don't kfree device on register failure
virtio_pci: don't kfree device on register failure
virtio: split device_register into device_initialize and device_add
vhost: remove unused lock check flag in vhost_dev_cleanup()
vhost: Remove the unused variable.
virtio_blk: print capacity at probe time
virtio: make VIRTIO a menuconfig to ease disabling it all
virtio/ringtest: virtio_ring: fix up need_event math
virtio/ringtest: fix up need_event math
virtio: virtio_mmio: make of_device_ids const.
firmware: Use PTR_ERR_OR_ZERO()
virtio-mmio: Use PTR_ERR_OR_ZERO()
vhost/scsi: Improve a size determination in four functions
virtio_balloon: include disk/file caches memory statistics
Pull scheduler updates from Ingo Molnar:
- membarrier updates (Mathieu Desnoyers)
- SMP balancing optimizations (Mel Gorman)
- stats update optimizations (Peter Zijlstra)
- RT scheduler race fixes (Steven Rostedt)
- misc fixes and updates
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Use a recently used CPU as an idle candidate and the basis for SIS
sched/fair: Do not migrate if the prev_cpu is idle
sched/fair: Restructure wake_affine*() to return a CPU id
sched/fair: Remove unnecessary parameters from wake_affine_idle()
sched/rt: Make update_curr_rt() more accurate
sched/rt: Up the root domain ref count when passing it around via IPIs
sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()
sched/core: Optimize update_stats_*()
sched/core: Optimize ttwu_stat()
membarrier/selftest: Test private expedited sync core command
membarrier/arm64: Provide core serializing command
membarrier/x86: Provide core serializing command
membarrier: Provide core serializing command, *_SYNC_CORE
lockin/x86: Implement sync_core_before_usermode()
locking: Introduce sync_core_before_usermode()
membarrier/selftest: Test global expedited command
membarrier: Provide GLOBAL_EXPEDITED command
membarrier: Document scheduler barrier requirements
powerpc, membarrier: Skip memory barrier in switch_mm()
membarrier/selftest: Test private expedited command
Exported header doesn't use anything from <linux/string.h>,
it is <linux/uuid.h> which uses memcmp().
Link: http://lkml.kernel.org/r/20171225171121.GA22754@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use __u32 and __u64 instead of POSIX types that may not be defined
in user mode builds.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
New model support added for Dell, Ideapad, Acer, Asus, Thinkpad, and GPD
laptops. Improvements to the common intel-vbtn driver, including tablet
mode, rotate, and front button support. Intel CPU support added for
Cannonlake and platform support for Dollar Cove power button.
Overhaul of the mellanox platform driver, creating a new
platform/mellanox directory for the newly multi-architecture regmap
interface.
Significant Intel PMC update with CannonLake support, Coffeelake update,
CPUID enumeration, module support, new read64 API, refactoring and
cleanups.
Revert the apple-gmux iGP IO lock, addressing reported issues with
non-binary drivers, leaving Nvidia binary driver users to comment out
conflicting code.
Miscellaneous fixes and cleanups.
Previously merged during the 4.15-rc cycle:
- e20a8e771d platform/x86: dell-laptop: Fix keyboard max lighting for Dell Latitude E6410
- 9cd5cf3710 platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes
- 91c73e8092 platform/x86: dell-wmi: check for kmalloc() errors
- 9a1a625918 platform/x86: wmi: Call acpi_wmi_init() later
The following is an automated git shortlog grouped by driver:
ACPI / LPIT:
- Export lpit_read_residency_count_address()
Input:
- add KEY_ROTATE_LOCK_TOGGLE
MAINTAINERS:
- Update tree for platform-drivers-x86
x86/cpu:
- Add Cannonlake to Intel family
acer-wireless:
- Add Acer Wireless Radio Control driver
intel_chtdc_ti_pwrbtn:
- Add support for Dollar Cove TI power button
GPD pocket fan:
- Add driver for GPD pocket custom fan controller
- Stop work on suspend
- Use a min-speed of 2 while charging
- Set speed to max on get_temp failure
apple-gmux:
- Revert: lock iGP IO to protect from vgaarb changes
alienware-wmi:
- lightbar LED support for Dell Inspiron 5675
asus-nb-wmi:
- Support ALS on the Zenbook UX430UQ
dell-laptop:
- Allocate buffer on heap rather than globally
- Add 2-in-1 devices to the DMI whitelist
- Filter out spurious keyboard backlight change events
- make some local functions static
- Use bool in struct quirk_entry for true/false fields
dell-smbios:
- Correct notation for filtering
dell-wmi:
- Add an event created by Dell Latitude 5495
Kconfig
- have ACPI_CMPC use depends instead of select for INPUT
ideapad-laptop:
- Add Y720-15IKB to no_hw_rfkill
- add lenovo RESCUER R720-15IKBN to no_hw_rfkill_list
- Use __func__ instead of write_ec_cmd in pr_err
- Remove unnecessary else
intel-hid:
- add a DMI quirk to support Wacom MobileStudio Pro
intel-vbtn:
- Replace License by SDPX identifier
- Remove redundant inclusions
- Support tablet mode switch
- Simplify autorelease logic
- support panel front button
- support KEY_ROTATE_LOCK_TOGGLE
- Support separate press/release events
- support SW_TABLET_MODE
intel_int0002_vgpio:
- Remove IRQF_NO_THREAD irq flag
intel_pmc_core:
- Special case for Coffeelake
- Add CannonLake PCH support
- Read base address from LPIT
- Remove unused header file
- Convert to ICPU macro
- Substitute PCI with CPUID enumeration
- Refactor debugfs entries
- Update Kconfig
- Fix file permission warnings
- Change driver to a module
- Fix kernel doc for pmc_dev
- Remove unused variable
- Remove unused EXPORTED API
intel_pmc_ipc:
- Add read64 API
intel_telemetry:
- Remove redundancies
- Improve S0ix logs
- Fix suspend stats
mlx-platform:
- Fix an ERR_PTR vs NULL issue
- Add hotplug device unregister to error path
- fix module aliases
- Add IO access verification callbacks
- Document pdev_hotplug field
- Allow compilation for 32 bit arch
platform/mellanox:
- mlxreg-hotplug: Add check for negative adapter number
- mlxreg-hotplug: Enable building for ARM
- mlxreg-hotplug: Modify to use a regmap interface
- Group create/destroy with attribute functions
- Rename i2c bus to nr
- mlxreg-hotplug: Remove unused wait.h include
- Move Mellanox platform hotplug driver to platform/mellanox
pmc_atom:
- introduce DEFINE_SHOW_ATTRIBUTE() macro
samsung-laptop:
- Grammar s/are can/can/
silead_dmi:
- Add Teclast X3 Plus tablet support
- Add entry for newer BIOS for Trekstor Surftab 7.0
- Add entry for the Teclast X98 Plus II
- Add entry for the Trekstor Primebook C13
- Add entry for the Chuwi Vi8 tablet
- add entry for Chuwi Hi8 tablet
- Add support for the Onda oBook 20 Plus tablet
- Add touchscreen info for SurfTab twin 10.1
thinkpad_acpi:
- suppress warning about palm detection
- Accept flat mode for type 4 multi mode status
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaegMmAAoJEKbMaAwKp364TvUH/3D9qNtsbXpZuc3ZMNHjIysU
hdW6hOVfBN0Rk049mjw7nWv/udhWZ/6ChJDlXHX0ZugtNGnRnzbdtWGg4y38pDF1
LRuKjWfDeyMeJ11itD2xcxEaE6YsseWCKGZJ5D3T+sN4+1jgS4RLAa9cUJMl8QAo
xZsT1MKpmGuj5eTLf5GgOVL2yfMZhZHabt3kGRY0eQqNqZBgpJw/GQNI1l6v4nAH
MHPA7Gtj4HXHK8jGviZXpD9tg/iwahiUjGugG4HcxbMcpJ96a8CGyeaXmq2FlfNC
/PpmVvhVVqzLuXKWAI+DZFLAiwIvPpxzVfOKF2Lty5Rejxf7pdmHq7aCNcALys0=
=cKm9
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v4.16-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform-driver updates from Darren Hart:
"New model support added for Dell, Ideapad, Acer, Asus, Thinkpad, and
GPD laptops. Improvements to the common intel-vbtn driver, including
tablet mode, rotate, and front button support. Intel CPU support added
for Cannonlake and platform support for Dollar Cove power button.
Overhaul of the mellanox platform driver, creating a new
platform/mellanox directory for the newly multi-architecture regmap
interface.
Significant Intel PMC update with CannonLake support, Coffeelake
update, CPUID enumeration, module support, new read64 API, refactoring
and cleanups.
Revert the apple-gmux iGP IO lock, addressing reported issues with
non-binary drivers, leaving Nvidia binary driver users to comment out
conflicting code.
Miscellaneous fixes and cleanups"
* tag 'platform-drivers-x86-v4.16-1' of git://git.infradead.org/linux-platform-drivers-x86: (81 commits)
platform/x86: mlx-platform: Fix an ERR_PTR vs NULL issue
platform/x86: intel_pmc_core: Special case for Coffeelake
platform/x86: intel_pmc_core: Add CannonLake PCH support
x86/cpu: Add Cannonlake to Intel family
platform/x86: intel_pmc_core: Read base address from LPIT
ACPI / LPIT: Export lpit_read_residency_count_address()
platform/x86: intel-vbtn: Replace License by SDPX identifier
platform/x86: intel-vbtn: Remove redundant inclusions
platform/x86: intel-vbtn: Support tablet mode switch
platform/x86: dell-laptop: Allocate buffer on heap rather than globally
platform/x86: intel_pmc_core: Remove unused header file
platform/x86: mlx-platform: Add hotplug device unregister to error path
platform/x86: mlx-platform: fix module aliases
platform/mellanox: mlxreg-hotplug: Add check for negative adapter number
platform/x86: mlx-platform: Add IO access verification callbacks
platform/x86: mlx-platform: Document pdev_hotplug field
platform/x86: mlx-platform: Allow compilation for 32 bit arch
platform/mellanox: mlxreg-hotplug: Enable building for ARM
platform/mellanox: mlxreg-hotplug: Modify to use a regmap interface
platform/mellanox: Group create/destroy with attribute functions
...
As we're about to trigger a PSCI version explosion, it doesn't
hurt to introduce a PSCI_VERSION helper that is going to be
used everywhere.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJacX62AAoJEAhfPr2O5OEVjKYP/R3v+c8ztiHzaeibcZZ8IFNl
58E0Y0yGa8OpoGJx9uqtEOamQmZoHhACfId7joIp/Jv38bgWAdbxOmk3Y4FDCFqG
1bRrpnnmvlfabiMMfLpURLqKhf7rJMtErZkrnmmqg9P/lEMohaZUJAsgBZNfJM8l
fZeacSnCSpzlxVcUb9Bf4vWhLk39R+xFzvFrwzbVUIHf3bDVpf4S4kNorMkhSZSF
HaISYXqVMhpKca7CngVKytbfacUStUY01cXcjdMuB/sD7ySwdtKogbPMvrOSaexz
G/8MB+sGT1JKUgIlh6Qv8hX805KuxBgfP19XSOH46nNU8KbYegdGhN5QXlokwI1m
dAOiozkU93r5yBZl6QzkN3uwXe492PoLgczifg97pzAJP0BfWeFStkYqlugLTwwC
Slmr7g3FZVJajbPl6WyioAGW7xfqBF7ftScZOHYxmhy41CWCGKJctmsJOjncyz5O
GInEIP3KR4CgjR+iM1LoKvE+OvVo4kRc7hrcUsjQNsbfBn6xiixjwH+5M+UVvezA
6UQpmtWGg4pX1djb8j8f6mKF8KZM12Pp3jb4Rl1cLsytN5BOBKaMEKdV3rgL+19P
Yo0x/1wK/unkI20Om71vYyQ0nXVF9j7Tpeij5u0M57TeTVYCwloQgHmrcvQJdo8+
Pqw5XEUiDpAIjvKp0XGh
=H9AS
-----END PGP SIGNATURE-----
Merge tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- videobuf2 was moved to a media/common dir, as it is now used by the
DVB subsystem too
- Digital TV core memory mapped support interface
- new sensor driver: ov7740
- several improvements at ddbridge driver
- new V4L2 driver: IPU3 CIO2 CSI-2 receiver unit, found on some Intel
SoCs
- new tuner driver: tda18250
- finally got rid of all LIRC staging drivers
- as we don't have old lirc drivers anymore, restruct the lirc device
code
- add support for UVC metadata
- add a new staging driver for NVIDIA Tegra Video Decoder Engine
- DVB kAPI headers moved to include/media
- synchronize the kAPI and uAPI for the DVB subsystem, removing the gap
for non-legacy APIs
- reduce the kAPI gap for V4L2
- lots of other driver enhancements, cleanups, etc.
* tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (407 commits)
media: v4l2-compat-ioctl32.c: make ctrl_is_pointer work for subdevs
media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic
media: v4l2-compat-ioctl32.c: don't copy back the result for certain errors
media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type
media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32
media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer
media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32
media: v4l2-compat-ioctl32.c: avoid sizeof(type)
media: v4l2-compat-ioctl32.c: move 'helper' functions to __get/put_v4l2_format32
media: v4l2-compat-ioctl32.c: fix the indentation
media: v4l2-compat-ioctl32.c: add missing VIDIOC_PREPARE_BUF
media: v4l2-ioctl.c: don't copy back the result for -ENOTTY
media: v4l2-ioctl.c: use check_fmt for enum/g/s/try_fmt
media: vivid: fix module load error when enabling fb and no_error_inj=1
media: dvb_demux: improve debug messages
media: dvb_demux: Better handle discontinuity errors
media: cxusb, dib0700: ignore XC2028_I2C_FLUSH
media: ts2020: avoid integer overflows on 32 bit machines
media: i2c: ov7740: use gpio/consumer.h instead of gpio.h
media: entity: Add a nop variant of media_entity_cleanup
...
* Require struct page by default for filesystem DAX to remove a number of
surprising failure cases. This includes failures with direct I/O, gdb and
fork(2).
* Add support for the new Platform Capabilities Structure added to the NFIT in
ACPI 6.2a. This new table tells us whether the platform supports flushing
of CPU and memory controller caches on unexpected power loss events.
* Revamp vmem_altmap and dev_pagemap handling to clean up code and better
support future future PCI P2P uses.
* Deprecate the ND_IOCTL_SMART_THRESHOLD command whose payload has become
out-of-sync with recent versions of the NVDIMM_FAMILY_INTEL spec, and
instead rely on the generic ND_CMD_CALL approach used by the two other IOCTL
families, NVDIMM_FAMILY_{HPE,MSFT}.
* Enhance nfit_test so we can test some of the new things added in version 1.6
of the DSM specification. This includes testing firmware download and
simulating the Last Shutdown State (LSS) status.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaeOg0AAoJEJ/BjXdf9fLBAFoQAI/IgcgJ2h9lfEpgjBRTC44t
2p8dxwT1Ofw3Y1aR/tI8nYRXjRtAGuP4UIeRVnb1CL/N7PagJyoMGU+6hmzg+ptY
c7cEDvw6nZOhrFwXx/xn7R53sYG8zH+UE6+jTR/PP/G4mQJfFCg4iF9R72Y7z0n7
aurf82Kz137NPUy6dNr4V9bmPMJWAaOci9WOj5SKddR5ZSNbjoxylTwQRvre5y4r
7HQTScEkirABOdSf1JoXTSUXCH/RC9UFFXR03ScHstGb1HjCj3KdcicVc50Q++Ub
qsEudhE6i44PEW1Hh4Qkg6hjHMEa8qHP+ShBuRuVaUmlghYTQn66niJAYLZilwdz
EVjE7vR+toHA5g3YCalEmYVutUEhIDkh/xfpd7vM6ZorUGJy95a2elEJs2fHBffC
gEhnCip7FROPcK5RDNUM8hBgnG/q5wwWPQMKY+6rKDZQx3mXssCrKp2Vlx7kBwMG
rpblkEpYjPonbLEHxsSU8yTg9Uq55ciIWgnOToffcjZvjbihi8WUVlHcwHUMPf/o
DWElg+4qmG0Sdd4S2NeAGwTl1Ewrf2RrtUGMjHtH4OUFs1wo6ZmfrxFzzMfoZ1Od
ko/s65v4uwtTzECh2o+XQaNsReR5YETXxmA40N/Jpo7/7twABIoZ/ASvj/3ZBYj+
sie+u2rTod8/gQWSfHpJ
=MIMX
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Ross Zwisler:
- Require struct page by default for filesystem DAX to remove a number
of surprising failure cases. This includes failures with direct I/O,
gdb and fork(2).
- Add support for the new Platform Capabilities Structure added to the
NFIT in ACPI 6.2a. This new table tells us whether the platform
supports flushing of CPU and memory controller caches on unexpected
power loss events.
- Revamp vmem_altmap and dev_pagemap handling to clean up code and
better support future future PCI P2P uses.
- Deprecate the ND_IOCTL_SMART_THRESHOLD command whose payload has
become out-of-sync with recent versions of the NVDIMM_FAMILY_INTEL
spec, and instead rely on the generic ND_CMD_CALL approach used by
the two other IOCTL families, NVDIMM_FAMILY_{HPE,MSFT}.
- Enhance nfit_test so we can test some of the new things added in
version 1.6 of the DSM specification. This includes testing firmware
download and simulating the Last Shutdown State (LSS) status.
* tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (37 commits)
libnvdimm, namespace: remove redundant initialization of 'nd_mapping'
acpi, nfit: fix register dimm error handling
libnvdimm, namespace: make min namespace size 4K
tools/testing/nvdimm: force nfit_test to depend on instrumented modules
libnvdimm/nfit_test: adding support for unit testing enable LSS status
libnvdimm/nfit_test: add firmware download emulation
nfit-test: Add platform cap support from ACPI 6.2a to test
libnvdimm: expose platform persistence attribute for nd_region
acpi: nfit: add persistent memory control flag for nd_region
acpi: nfit: Add support for detect platform CPU cache flush on power loss
device-dax: Fix trailing semicolon
libnvdimm, btt: fix uninitialized err_lock
dax: require 'struct page' by default for filesystem dax
ext2: auto disable dax instead of failing mount
ext4: auto disable dax instead of failing mount
mm, dax: introduce pfn_t_special()
mm: Fix devm_memremap_pages() collision handling
mm: Fix memory size alignment in devm_memremap_pages_release()
memremap: merge find_dev_pagemap into get_dev_pagemap
memremap: change devm_memremap_pages interface to use struct dev_pagemap
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJad5lgAAoJEFmIoMA60/r8s2kQAI3PztawDpaCP9Z12pkbBHSt
Ho0xTyk9rCZi9kQJbNjc+a+QrlA3QmTHXIXerB3LSWoh7M+XhsECjem92eHpgLNS
JvYPhTfOrCr0vdiAmOz6hD0AqN/psrbfzgiJhSwomsGEFS77k7kERSJckRv81sxb
Aj5F/WjucAgLorwm4auveAJEQ7atE7/6pkXzoqYm4G6NLOb46jUcRGndrnvXZBlz
fws8fBM4BHyi7i25CYQl24tFq1CGax1rIPgLg+4KnH76bQk/N6Ju0sGVSzfh+hG8
SIerK9bJbzGRAuNKoxB3aO1dyzsK3x9WztE2mG98w5trOISPIR1FqnvC/225FWAU
d6eIXiC7wKnEx+DElNTzCjzfHc7SAJoupO32H7CoiTe5zPUlWlxJ1zLYkK1gt50q
m8PRBiYTglxyznzrO0drtcdjEzvbdZNRrsYnul4wi1vSHzjk6F6XLtzT10XWM1M1
1pXLB8384FTj0Hu4bq6Y3Aivkmz0Sf+eQM2NaOwe+Zj7/1VV0d3lvi4LUXkqzLCA
FoXPJSMxG2Qu+iflCeYRQBJjExaZH3eNLZ3dT6QpcJrjaFVedd9u5DeeFqNL27zV
bhr8TdqrR4p4rc8EBAGoCapw96IxLZROKB3gxbrZVOpfIZpzthwHbElHX6aqUgF4
w/EV1JWs36WXWaxFk8wd
=ttq9
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- skip AER driver error recovery callbacks for correctable errors
reported via ACPI APEI, as we already do for errors reported via the
native path (Tyler Baicar)
- fix DPC shared interrupt handling (Alex Williamson)
- print full DPC interrupt number (Keith Busch)
- enable DPC only if AER is available (Keith Busch)
- simplify DPC code (Bjorn Helgaas)
- calculate ASPM L1 substate parameter instead of hardcoding it (Bjorn
Helgaas)
- enable Latency Tolerance Reporting for ASPM L1 substates (Bjorn
Helgaas)
- move ASPM internal interfaces out of public header (Bjorn Helgaas)
- allow hot-removal of VGA devices (Mika Westerberg)
- speed up unplug and shutdown by assuming Thunderbolt controllers
don't support Command Completed events (Lukas Wunner)
- add AtomicOps support for GPU and Infiniband drivers (Felix Kuehling,
Jay Cornwall)
- expose "ari_enabled" in sysfs to help NIC naming (Stuart Hayes)
- clean up PCI DMA interface usage (Christoph Hellwig)
- remove PCI pool API (replaced with DMA pool) (Romain Perier)
- deprecate pci_get_bus_and_slot(), which assumed PCI domain 0 (Sinan
Kaya)
- move DT PCI code from drivers/of/ to drivers/pci/ (Rob Herring)
- add PCI-specific wrappers for dev_info(), etc (Frederick Lawler)
- remove warnings on sysfs mmap failure (Bjorn Helgaas)
- quiet ROM validation messages (Alex Deucher)
- remove redundant memory alloc failure messages (Markus Elfring)
- fill in types for compile-time VGA and other I/O port resources
(Bjorn Helgaas)
- make "pci=pcie_scan_all" work for Root Ports as well as Downstream
Ports to help AmigaOne X1000 (Bjorn Helgaas)
- add SPDX tags to all PCI files (Bjorn Helgaas)
- quirk Marvell 9128 DMA aliases (Alex Williamson)
- quirk broken INTx disable on Ceton InfiniTV4 (Bjorn Helgaas)
- fix CONFIG_PCI=n build by adding dummy pci_irqd_intx_xlate() (Niklas
Cassel)
- use DMA API to get MSI address for DesignWare IP (Niklas Cassel)
- fix endpoint-mode DMA mask configuration (Kishon Vijay Abraham I)
- fix ARTPEC-6 incorrect IS_ERR() usage (Wei Yongjun)
- add support for ARTPEC-7 SoC (Niklas Cassel)
- add endpoint-mode support for ARTPEC (Niklas Cassel)
- add Cadence PCIe host and endpoint controller driver (Cyrille
Pitchen)
- handle multiple INTx status bits being set in dra7xx (Vignesh R)
- translate dra7xx hwirq range to fix INTD handling (Vignesh R)
- remove deprecated Exynos PHY initialization code (Jaehoon Chung)
- fix MSI erratum workaround for HiSilicon Hip06/Hip07 (Dongdong Liu)
- fix NULL pointer dereference in iProc BCMA driver (Ray Jui)
- fix Keystone interrupt-controller-node lookup (Johan Hovold)
- constify qcom driver structures (Julia Lawall)
- rework Tegra config space mapping to increase space available for
endpoints (Vidya Sagar)
- simplify Tegra driver by using bus->sysdata (Manikanta Maddireddy)
- remove PCI_REASSIGN_ALL_BUS usage on Tegra (Manikanta Maddireddy)
- add support for Global Fabric Manager Server (GFMS) event to
Microsemi Switchtec switch driver (Logan Gunthorpe)
- add IDs for Switchtec PSX 24xG3 and PSX 48xG3 (Kelvin Cao)
* tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller
PCI: endpoint: Fix EPF device name to support multi-function devices
PCI: endpoint: Add the function number as argument to EPC ops
PCI: cadence: Add host driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller
PCI: Add vendor ID for Cadence
PCI: Add generic function to probe PCI host controllers
PCI: generic: fix missing call of pci_free_resource_list()
PCI: OF: Add generic function to parse and allocate PCI resources
PCI: Regroup all PCI related entries into drivers/pci/Makefile
PCI/DPC: Reformat DPC register definitions
PCI/DPC: Add and use DPC Status register field definitions
PCI/DPC: Squash dpc_rp_pio_get_info() into dpc_process_rp_pio_error()
PCI/DPC: Remove unnecessary RP PIO register structs
PCI/DPC: Push dpc->rp_pio_status assignment into dpc_rp_pio_get_info()
PCI/DPC: Squash dpc_rp_pio_print_error() into dpc_rp_pio_get_info()
PCI/DPC: Make RP PIO log size check more generic
PCI/DPC: Rename local "status" to "dpc_status"
PCI/DPC: Squash dpc_rp_pio_print_tlp_header() into dpc_rp_pio_print_error()
...
Two new perf types, perf_kprobe and perf_uprobe, will be added to allow
creating [k,u]probe with perf_event_open. These [k,u]probe are associated
with the file decriptor created by perf_event_open(), thus are easy to
clean when the file descriptor is destroyed.
kprobe_func and uprobe_path are added to union config1 for pointers to
function name for kprobe or binary path for uprobe.
kprobe_addr and probe_offset are added to union config2 for kernel
address (when kprobe_func is NULL), or [k,u]probe offset.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: <daniel@iogearbox.net>
Cc: <davem@davemloft.net>
Cc: <kernel-team@fb.com>
Cc: <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171206224518.3598254-4-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Provide core serializing membarrier command to support memory reclaim
by JIT.
Each architecture needs to explicitly opt into that support by
documenting in their architecture code how they provide the core
serializing instructions required when returning from the membarrier
IPI, and after the scheduler has updated the curr->mm pointer (before
going back to user-space). They should then select
ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on
their architecture.
Architectures selecting this feature need to either document that
they issue core serializing instructions when returning to user-space,
or implement their architecture-specific sync_core_before_usermode().
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Andrew Hunter <ahh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Avi Kivity <avi@scylladb.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: David Sehr <sehr@google.com>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maged Michael <maged.michael@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-9-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Allow expedited membarrier to be used for data shared between processes
through shared memory.
Processes wishing to receive the membarriers register with
MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. Those which want to issue
membarrier invoke MEMBARRIER_CMD_GLOBAL_EXPEDITED.
This allows extremely simple kernel-level implementation: we have almost
everything we need with the PRIVATE_EXPEDITED barrier code. All we need
to do is to add a flag in the mm_struct that will be used to check
whether we need to send the IPI to the current thread of each CPU.
There is a slight downside to this approach compared to targeting
specific shared memory users: when performing a membarrier operation,
all registered "global" receivers will get the barrier, even if they
don't share a memory mapping with the sender issuing
MEMBARRIER_CMD_GLOBAL_EXPEDITED.
This registration approach seems to fit the requirement of not
disturbing processes that really deeply care about real-time: they
simply should not register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED.
In order to align the membarrier command names, the "MEMBARRIER_CMD_SHARED"
command is renamed to "MEMBARRIER_CMD_GLOBAL", keeping an alias of
MEMBARRIER_CMD_SHARED to MEMBARRIER_CMD_GLOBAL for UAPI header backward
compatibility.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Andrew Hunter <ahh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Avi Kivity <avi@scylladb.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: David Sehr <sehr@google.com>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maged Michael <maged.michael@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-5-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull i2c updates from Wolfram Sang:
"I2C has the following changes for you:
- new flag to mark DMA safe buffers in i2c_msg. Also, some
infrastructure around it. And docs.
- huge refactoring of the at24 driver led by the new maintainer
Bartosz
- update I2C bus recovery to send STOP after recovery
- conversion from gpio to gpiod for I2C bus recovery
- adding a fault-injector to the i2c-gpio driver
- lots of small driver improvements, and bigger ones to
i2c-sh_mobile"
* 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (99 commits)
i2c: mv64xxx: Add myself as maintainer for this driver
i2c: mv64xxx: Fix clock resource by adding an optional bus clock
i2c: mv64xxx: Remove useless test before clk_disable_unprepare
i2c: mxs: use true and false for boolean values
i2c: meson: update doc description to fix build warnings
i2c: meson: add configurable divider factors
dt-bindings: i2c: update documentation for the Meson-AXG
i2c: imx-lpi2c: add runtime pm support
i2c: rcar: fix some trivial typos in comments
i2c: davinci: fix the cpufreq transition
i2c: rk3x: add proper kerneldoc header
i2c: rk3x: account for const type of of_device_id.data
i2c: acorn: remove outdated path from file header
i2c: acorn: add MODULE_LICENSE tag
i2c: rcar: implement bus recovery
i2c: send STOP after successful bus recovery
i2c: ensure SDA is released in recovery if SDA is controllable
i2c: add 'set_sda' to bus_recovery_info
i2c: add identifier in declarations for i2c_bus_recovery
i2c: make kerneldoc about bus recovery more precise
...
Highlights:
- Enable support for memory protection keys aka "pkeys" on Power7/8/9 when
using the hash table MMU.
- Extend our interrupt soft masking to support masking PMU interrupts as well
as "normal" interrupts, and then use that to implement local_t for a ~4x
speedup vs the current atomics-based implementation.
- A new driver "ocxl" for "Open Coherent Accelerator Processor Interface
(OpenCAPI)" devices.
- Support for new device tree properties on PowerVM to describe hotpluggable
memory and devices.
- Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 64-bit VDSO.
- Freescale updates from Scott:
"Contains fixes for CPM GPIO and an FSL PCI erratum workaround, plus a
minor cleanup patch."
As well as quite a lot of other changes all over the place, and small fixes and
cleanups as always.
Thanks to:
Alan Modra, Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andreas
Schwab, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman
Khandual, Anton Blanchard, Arnd Bergmann, Balbir Singh, Benjamin
Herrenschmidt, Bhaktipriya Shridhar, Bryant G. Ly, Cédric Le Goater,
Christophe Leroy, Christophe Lombard, Cyril Bur, David Gibson, Desnes A. Nunes
do Rosario, Dmitry Torokhov, Frederic Barrat, Geert Uytterhoeven, Guilherme G.
Piccoli, Gustavo A. R. Silva, Gustavo Romero, Ivan Mikhaylov, Joakim
Tjernlund, Joe Perches, Josh Poimboeuf, Juan J. Alvarez, Julia Cartwright,
Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre,
Michael Bringmann, Michael Hanselmann, Michael Neuling, Nathan Fontenot,
Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Philippe Bergheaud, Ram Pai,
Russell Currey, Santosh Sivaraj, Scott Wood, Seth Forshee, Simon Guo, Stewart
Smith, Sukadev Bhattiprolu, Thiago Jung Bauermann, Vaibhav Jain, Vasyl
Gomonovych.
-----BEGIN PGP SIGNATURE-----
iQIwBAABCAAaBQJadF6wExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYA2
nBAAnguCEyAIYpc+ffE3WU9xJEWxa6bKuVufHcUFVntGiGD+igmMS+SHp4ay3Aos
HcA4WFrpzNb2KZ++kmFWtAKWnMfCiW9xuYJNicjr7X5ZiVBEhLWN/mQCwBKs3p6L
5+HhvytcdkKVbEcyVjEGvRL40AyxXNOI02o6Co9X8vanHsmWB4q0eWe4PHstZqlg
6K6kazMp+NTvEFYwKNXDOvuHouKSL57l14SLROH7CpJkNTOQ9s+W59/LmnuCjRlu
o70b7iWOAEbF9tvMma1ksDZVNj7mSyaymLYCyOXu4CkuuleJacZYJ9oQGNddoIbC
wk7l93vPT/yze7DYg8x3uXpKcaDEvEepPuQ/ubz+UXFQWuJtl5ej6Cv+0eOmyZIs
+bjWhGHKdNttnsiPlTRCX/gWD13RE1dB6xXJlfOJ7Oz9OnXXK8ZKc1NTREbQXRWM
8tClAwf9upWpm86GHPVnyrgYbgZo5b1os4SoS8e3kESzakrQVQP7J376u2DtccRq
2AGqjJ+tl5tYPnhm8zG1cNrpqHHpgkNGqLS7DvWRg3EPmEKVQcltN1b/0aKaAjHA
aTRofjrVo+jJ4MX1uyEo59yNCEQPfjkmHRQdLwm+xjWTzEPfIMzpWyXm14tawDQf
OjcAe90W/qQ18brw4z+2BI14J76XziOSX/QcunOn1u/sqaM=
=3rYn
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- Enable support for memory protection keys aka "pkeys" on Power7/8/9
when using the hash table MMU.
- Extend our interrupt soft masking to support masking PMU interrupts
as well as "normal" interrupts, and then use that to implement
local_t for a ~4x speedup vs the current atomics-based
implementation.
- A new driver "ocxl" for "Open Coherent Accelerator Processor
Interface (OpenCAPI)" devices.
- Support for new device tree properties on PowerVM to describe
hotpluggable memory and devices.
- Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 64-bit
VDSO.
- Freescale updates from Scott: fixes for CPM GPIO and an FSL PCI
erratum workaround, plus a minor cleanup patch.
As well as quite a lot of other changes all over the place, and small
fixes and cleanups as always.
Thanks to: Alan Modra, Alastair D'Silva, Alexey Kardashevskiy,
Alistair Popple, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V,
Anju T Sudhakar, Anshuman Khandual, Anton Blanchard, Arnd Bergmann,
Balbir Singh, Benjamin Herrenschmidt, Bhaktipriya Shridhar, Bryant G.
Ly, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Cyril Bur,
David Gibson, Desnes A. Nunes do Rosario, Dmitry Torokhov, Frederic
Barrat, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo A. R. Silva,
Gustavo Romero, Ivan Mikhaylov, Joakim Tjernlund, Joe Perches, Josh
Poimboeuf, Juan J. Alvarez, Julia Cartwright, Kamalesh Babulal,
Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre, Michael
Bringmann, Michael Hanselmann, Michael Neuling, Nathan Fontenot,
Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Philippe Bergheaud,
Ram Pai, Russell Currey, Santosh Sivaraj, Scott Wood, Seth Forshee,
Simon Guo, Stewart Smith, Sukadev Bhattiprolu, Thiago Jung Bauermann,
Vaibhav Jain, Vasyl Gomonovych"
* tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (199 commits)
powerpc/mm/radix: Fix build error when RADIX_MMU=n
macintosh/ams-input: Use true and false for boolean values
macintosh: change some data types from int to bool
powerpc/watchdog: Print the NIP in soft_nmi_interrupt()
powerpc/watchdog: regs can't be null in soft_nmi_interrupt()
powerpc/watchdog: Tweak watchdog printks
powerpc/cell: Remove axonram driver
rtc-opal: Fix handling of firmware error codes, prevent busy loops
powerpc/mpc52xx_gpt: make use of raw_spinlock variants
macintosh/adb: Properly mark continued kernel messages
powerpc/pseries: Fix cpu hotplug crash with memoryless nodes
powerpc/numa: Ensure nodes initialized for hotplug
powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
powerpc/kernel: Block interrupts when updating TIDR
powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn
powerpc/mm/nohash: do not flush the entire mm when range is a single page
powerpc/pseries: Add Initialization of VF Bars
powerpc/pseries/pci: Associate PEs to VFs in configure SR-IOV
powerpc/eeh: Add EEH notify resume sysfs
powerpc/eeh: Add EEH operations to notify resume
...
The arbitrary 4MB minimum namespace size turns out to be too large for
some environments. Quoting Cheng-mean Liu:
In the case of emulated NVDIMM devices in the VM environment, there
are scenarios that NVDIMM device with much smaller sizes are
desired, for example, we might use a single enumerated NVDIMM DAX
device for representing each container layer, which in some cases
could be just a few KBs size.
PAGE_SIZE is the minimum where we can still support DAX of at least
a single page.
Cc: Matthew Wilcox <willy@infradead.org>
Reported-by: Cheng-mean Liu <soccerl@microsoft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJacnVwAAoJEAx081l5xIa+HhIP/0yDg5tuco0QN3YskE/bIa3o
4VDWsLi+WCoSZoV4uWLKYK8OHiNzKdnGfNoUNWqRqaYilWDtpgBX86Wjg5hxnGwA
/6jGfU1nhb0teG9clGBbzgxHXW6iKvT+p/Pp1pC8HXU+zEUaungJcWY120hITwMD
NqUGK6kYRsJVYj+4b+5Ho7Fvv912bbjK0YAptD6RdzX4rDPN0D+XrtXlYsg1PJYx
jv/NNWEP5mCesYKsS8JzHYcfOF/vdQpPwAV4C3LKaQy5k3pVVIDOEuOycIZTKMf3
K/fSsbvhHMH3Ck+lPcK+etcoQbkLCcmKbw+3uvM/7njkn7Dp24Ryk9FXB3dXXOgb
3kLs7f0gY9j/NAi3uKAMvACPvXNA7eptIvAmN/VKzmEiqgx+l0sveSuU73DVoe/x
Jko8ijyiKchcN+/CTgZ7FNyEd0UWO06+9B0RMrlEezE8f14EhR51wIQQTNFJRJn/
kqRM1hC2Cvb00vAwq7jjZcDa7hRCI0OoVU9N37smtPuTJY94tR/CUbq10g4pSlu8
h8FiHnLuhlyh1DQNNS19HQfOSh0yYgEGRQcIKy3vqshsO3/hbe8bQD5UerqMZPZB
ZpMEWe5VHSWIVjAxgzHNXFd9F/jSeWDVkCztKfx0CLmzHZNLNjw+/zgbIdF3vj9T
S1cwFZLWr/ngf5mbyR88
=pLN1
-----END PGP SIGNATURE-----
Merge tag 'drm-for-v4.16' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"This seems to have been a comparatively quieter merge window, I assume
due to holidays etc. The "biggest" change is AMD header cleanups, which
merge/remove a bunch of them. The AMD gpu scheduler is now being made generic
with the etnaviv driver wanting to reuse the code, hopefully other drivers
can go in the same direction.
Otherwise it's the usual lots of stuff in i915/amdgpu, not so much stuff
elsewhere.
Core:
- Add .last_close and .output_poll_changed helpers to reduce driver footprints
- Fix plane clipping
- Improved debug printing support
- Add panel orientation property
- Update edid derived properties at edid setting
- Reduction in fbdev driver footprint
- Move amdgpu scheduler into core for other drivers to use.
i915:
- Selftest and IGT improvements
- Fast boot prep work on IPS, pipe config
- HW workarounds for Cannonlake, Geminilake
- Cannonlake clock and HDMI2.0 fixes
- GPU cache invalidation and context switch improvements
- Display planes cleanup
- New PMU interface for perf queries
- New firmware support for KBL/SKL
- Geminilake HW workaround for perforamce
- Coffeelake stolen memory improvements
- GPU reset robustness work
- Cannonlake horizontal plane flipping
- GVT work
amdgpu/radeon:
- RV and Vega header file cleanups (lots of lines gone!)
- TTM operation context support
- 48-bit GPUVM support for Vega/RV
- ECC support for Vega
- Resizeable BAR support
- Multi-display sync support
- Enable swapout for reserved BOs during allocation
- S3 fixes on Raven
- GPU reset cleanup and fixes
- 2+1 level GPU page table
amdkfd:
- GFX7/8 SDMA user queues support
- Hardware scheduling for multiple processes
- dGPU prep work
rcar:
- Added R8A7743/5 support
- System suspend/resume support
sun4i:
- Multi-plane support for YUV formats
- A83T and LVDS support
msm:
- Devfreq support for GPU
tegra:
- Prep work for adding Tegra186 support
- Tegra186 HDMI support
- HDMI2.0 and zpos support by using generic helpers
tilcdc:
- Misc fixes
omapdrm:
- Support memory bandwidth limits
- DSI command mode panel cleanups
- DMM error handling
exynos:
- drop the old IPP subdriver.
etnaviv:
- Occlusion query fixes
- Job handling fixes
- Prep work for hooking in gpu scheduler
armada:
- Move closer to atomic modesetting
- Allow disabling primary plane if overlay is full screen
imx:
- Format modifier support
- Add tile prefetch to PRE
- Runtime PM support for PRG
ast:
- fix LUT loading"
* tag 'drm-for-v4.16' of git://people.freedesktop.org/~airlied/linux: (1471 commits)
drm/ast: Load lut in crtc_commit
drm: Check for lessee in DROP_MASTER ioctl
drm: fix gpu scheduler link order
drm/amd/display: Demote error print to debug print when ATOM impl missing
dma-buf: fix reservation_object_wait_timeout_rcu once more v2
drm/amdgpu: Avoid leaking PM domain on driver unbind (v2)
drm/amd/amdgpu: Add Polaris version check
drm/amdgpu: Reenable manual GPU reset from sysfs
drm/amdgpu: disable MMHUB power gating on raven
drm/ttm: Don't unreserve swapped BOs that were previously reserved
drm/ttm: Don't add swapped BOs to swap-LRU list
drm/amdgpu: only check for ECC on Vega10
drm/amd/powerplay: Fix smu_table_entry.handle type
drm/ttm: add VADDR_FLAG_UPDATED_COUNT to correctly update dma_page global count
drm: Fix PANEL_ORIENTATION_QUIRKS breaking the Kconfig DRM menuconfig
drm/radeon: fill in rb backend map on evergreen/ni.
drm/amdgpu/gfx9: fix ngg enablement to clear gds reserved memory (v2)
drm/ttm: only free pages rather than update global memory count together
drm/amdgpu: fix CPU based VM updates
drm/amdgpu: fix typo in amdgpu_vce_validate_bo
...
A number of new drivers get added this time, along with many low-priority
bugfixes. The most interesting changes by subsystem are:
bus drivers:
- Updates to the Broadcom bus interface driver to support newer SoC types
- The TI OMAP sysc driver now supports updated DT bindings
memory controllers:
- A new driver for Tegra186 gets added
- A new driver for the ti-emif sram, to allow relocating
suspend/resume handlers there
SoC specific:
- A new driver for Qualcomm QMI, the interface to the modem on MSM SoCs
- A new driver for power domains on the actions S700 SoC
- A driver for the Xilinx Zynq VCU logicoreIP
reset controllers:
- A new driver for Amlogic Meson-AGX
- various bug fixes
tee subsystem:
- A new user interface got added to enable asynchronous communication
with the TEE supplicant.
- A new method of using user space memory for communication with
the TEE is added
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJac0fQAAoJEGCrR//JCVIntjEQALc6kflEGJc/FPundbx9V3F/
b+3+EX/uMnBnKsgZprz9ACPhx5eBH9QWja3A1zmIarb5c+q7zbBZDwhzUb8J8Yg8
xEb0im7Wx/GcKjUYZVKYBxtz9KjkXDzhrq8IAvPg6ShNcIy/8hq7ZO3iOkGsTDcy
/PyioWKC5g0dhJgtp91X1kgog5tuTaWOg39uUOqyEzwVu1vYVa4w+eeCzjEd6I//
68R/zDQ52+hWw6WZGoYOsNYzuriOflnJRnNpwuGhMhLNULBJfWnd4hkqGm4E+hFa
5dzW6vVAdIqjemFqPzCBT2WB4UG871aZX8DJ9HgnfX+g970nlsm1JY8Ck9MJNJum
aDkqZG41ArUYzDFWu8vJ2SKsue5lEZp6TEO2mLEVYrdOjOgedj0Zxqmq2DYeigxd
+ccOVgKJ9SsYw9ft1LkQ5BHCgOh3C7y9Kcg7oBnaEI5OTVvtf5PwEkT2cwbvgxYl
EVKLhlJ0Af+QXOW8E5JbNQETpYw52DMm6UKHiYn/JCGxB/8J0bgJzImDJI4Dtu2h
zqJITKJeTepqbfA5pmNfKa+20RhmsktdRCw2NN/QynY7EEtGjHAUVnlpZT2mrDco
0m62b7Erek/776vJN5ECzE5e6XCs2N0MDE6Anp121C5zEmig/SMBrUosMzP7Jnis
IDVC/QWkb3u85wK20Vc1
=yz0k
-----END PGP SIGNATURE-----
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Arnd Bergmann:
"A number of new drivers get added this time, along with many
low-priority bugfixes. The most interesting changes by subsystem are:
bus drivers:
- Updates to the Broadcom bus interface driver to support newer SoC
types
- The TI OMAP sysc driver now supports updated DT bindings
memory controllers:
- A new driver for Tegra186 gets added
- A new driver for the ti-emif sram, to allow relocating
suspend/resume handlers there
SoC specific:
- A new driver for Qualcomm QMI, the interface to the modem on MSM
SoCs
- A new driver for power domains on the actions S700 SoC
- A driver for the Xilinx Zynq VCU logicoreIP
reset controllers:
- A new driver for Amlogic Meson-AGX
- various bug fixes
tee subsystem:
- A new user interface got added to enable asynchronous communication
with the TEE supplicant.
- A new method of using user space memory for communication with the
TEE is added"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (84 commits)
of: platform: fix OF node refcount leak
soc: fsl: guts: Add a NULL check for devm_kasprintf()
bus: ti-sysc: Fix smartreflex sysc mask
psci: add CPU_IDLE dependency
soc: xilinx: Fix Kconfig alignment
soc: xilinx: xlnx_vcu: Use bitwise & rather than logical && on clkoutdiv
soc: xilinx: xlnx_vcu: Depends on HAS_IOMEM for xlnx_vcu
soc: bcm: brcmstb: Be multi-platform compatible
soc: brcmstb: biuctrl: exit without warning on non brcmstb platforms
Revert "soc: brcmstb: Only register SoC device on STB platforms"
bus: omap: add MODULE_LICENSE tags
soc: brcmstb: Only register SoC device on STB platforms
tee: shm: Potential NULL dereference calling tee_shm_register()
soc: xilinx: xlnx_vcu: Add Xilinx ZYNQMP VCU logicoreIP init driver
dt-bindings: soc: xilinx: Add DT bindings to xlnx_vcu driver
soc: xilinx: Create folder structure for soc specific drivers
of: platform: populate /firmware/ node from of_platform_default_populate_init()
soc: samsung: Add SPDX license identifiers
soc: qcom: smp2p: Use common error handling code in qcom_smp2p_probe()
tee: shm: don't put_page on null shm->pages
...
- Mask INTx from user if pdev->irq is zero (Alexey Kardashevskiy)
- Capability helper cleanup (Alex Williamson)
- Allow mmaps overlapping MSI-X vector table with region capability
exposing this feature (Alexey Kardashevskiy)
- mdev static cleanups (Xiongwei Song)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJac2tQAAoJECObm247sIsiUr8P/2zrK0G/mVrHWGEjlnlvgYjg
FSozgN0fmc9mJ+Hg8ntfXn4GXuTaCq0uz96eMl9Wy1tMpLs4hoQf20IRojIpqgpb
aVUHKg+7sJUWNsd+u1jSBb64SvQLdTbesfGgL8WLcJB90jWxzaZjEZ3mgKO4Lb88
rNEpiTVoOURDgJo+bOMEnJJ6okmpRLBgw5pqrdLT2BButZg3QfLtcuoY18pFYxc7
INy4YPWPe93aiDGloUrjj6xKRKfTaL7L8KGZBlk4FR5JENDRoOtGEguGcQRl0u8w
IYLIkIE9p172S5bkeCYqawyxAPgQQIk5Wd4buFArg9w7tdWJOkEiCMwSu/LCocR8
CiwZHKOWp7mJWv3XxJGY+rU3nHIuB+IaeKknE6rLXgkLVXED5Ta4pPzrpoUZezsT
yyI5U2BnWNL5ISaWY3i+YQKvFUgPe8Az9Zw4p3zGUJ+zu2QheN7x+BVXT0xENMfk
2sdNFeZwkxmB18pdsJdb+/DpL9yCrS7VxP3IDnAdfR8VIyLE3QFRTnsfNX7F1Uvr
zKBZChuah2osiiM3k3ncwkovsM6iN1ZLVnHUg7xRBmp6WQnBYN02wrTXM/e1UqsS
nKIBYA9fIcpLZ4hXMuiMmk7LFlScl/HngW+h+jWAyoP/j3X0gw/YM5hrPEJEV3fN
WuIPpBYN36+0V9cUDMnN
=Ww3n
-----END PGP SIGNATURE-----
Merge tag 'vfio-v4.16-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Mask INTx from user if pdev->irq is zero (Alexey Kardashevskiy)
- Capability helper cleanup (Alex Williamson)
- Allow mmaps overlapping MSI-X vector table with region capability
exposing this feature (Alexey Kardashevskiy)
- mdev static cleanups (Xiongwei Song)
* tag 'vfio-v4.16-rc1' of git://github.com/awilliam/linux-vfio:
vfio: mdev: make a couple of functions and structure vfio_mdev_driver static
vfio-pci: Allow mapping MSIX BAR
vfio: Simplify capability helper
vfio-pci: Mask INTx if a device is not capabable of enabling it
Pull input layer updates from Dmitry Torokhov:
- evdev interface has been adjusted to extend the life of timestamps on
32 bit systems to the year of 2108
- Synaptics RMI4 driver's PS/2 guest handling ha beed updated to
improve chances of detecting trackpoints on the pass-through port
- mms114 touchcsreen controller driver has been updated to support
generic device properties and work with mms152 cntrollers
- Goodix driver now supports generic touchscreen properties
- couple of drivers for AVR32 architecture are gone as the architecture
support has been removed from the kernel
- gpio-tilt driver has been removed as there are no mainline users and
the driver itself is using legacy APIs and relies on platform data
- MODULE_LINECSE/MODULE_VERSION cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (45 commits)
Input: goodix - use generic touchscreen_properties
Input: mms114 - fix typo in definition
Input: mms114 - use BIT() macro instead of explicit shifting
Input: mms114 - replace mdelay with msleep
Input: mms114 - add support for mms152
Input: mms114 - drop platform data and use generic APIs
Input: mms114 - mark as direct input device
Input: mms114 - do not clobber interrupt trigger
Input: edt-ft5x06 - fix error handling for factory mode on non-M06
Input: stmfts - set IRQ_NOAUTOEN to the irq flag
Input: auo-pixcir-ts - delete an unnecessary return statement
Input: auo-pixcir-ts - remove custom log for a failed memory allocation
Input: da9052_tsi - remove unused mutex
Input: docs - use PROPERTY_ENTRY_U32() directly
Input: synaptics-rmi4 - log when we create a guest serio port
Input: synaptics-rmi4 - unmask F03 interrupts when port is opened
Input: synaptics-rmi4 - do not delete interrupt memory too early
Input: ad7877 - use managed resource allocations
Input: stmfts,s6sy671 - add SPDX identifier
Input: remove atmel-wm97xx touchscreen driver
...
Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android binder
fixes, the usual handful of hyper-v updates, and lots of other smaller
driver updates.
All of these have been in linux-next for a long time, with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWnLuZw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynS4QCcCrPmwfD5PJwaF+q2dPfyKaflkQMAn0x6Wd+u
Gw3Z2scgjETUpwJ9ilnL
=xcQ0
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android
binder fixes, the usual handful of hyper-v updates, and lots of other
smaller driver updates.
All of these have been in linux-next for a long time, with no reported
issues"
* tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (155 commits)
char: lp: use true or false for boolean values
android: binder: use VM_ALLOC to get vm area
android: binder: Use true and false for boolean values
lkdtm: fix handle_irq_event symbol for INT_HW_IRQ_EN
EISA: Delete error message for a failed memory allocation in eisa_probe()
EISA: Whitespace cleanup
misc: remove AVR32 dependencies
virt: vbox: Add error mapping for VERR_INVALID_NAME and VERR_NO_MORE_FILES
soundwire: Fix a signedness bug
uio_hv_generic: fix new type mismatch warnings
uio_hv_generic: fix type mismatch warnings
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
uio_hv_generic: add rescind support
uio_hv_generic: check that host supports monitor page
uio_hv_generic: create send and receive buffers
uio: document uio_hv_generic regions
doc: fix documentation about uio_hv_generic
vmbus: add monitor_id and subchannel_id to sysfs per channel
vmbus: fix ABI documentation
uio_hv_generic: use ISR callback method
...
Here is the big USB and PHY driver update for 4.16-rc1.
Along with the normally expected XHCI, MUSB, and Gadget driver patches,
there are some PHY driver fixes, license cleanups, sysfs attribute
cleanups, usbip changes, and a raft of other smaller fixes and
additions.
Full details are in the shortlog.
All of these have been in the linux-next tree for a long time with no
reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWnL0Bg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymg8gCeLg/FMtc0S/xRR/56N/sbthEebcUAnROr9Sg3
55hDLdkyi93o9R86YOAJ
=8d2q
-----END PGP SIGNATURE-----
Merge tag 'usb-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/PHY updates from Greg KH:
"Here is the big USB and PHY driver update for 4.16-rc1.
Along with the normally expected XHCI, MUSB, and Gadget driver
patches, there are some PHY driver fixes, license cleanups, sysfs
attribute cleanups, usbip changes, and a raft of other smaller fixes
and additions.
Full details are in the shortlog.
All of these have been in the linux-next tree for a long time with no
reported issues"
* tag 'usb-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (137 commits)
USB: serial: pl2303: new device id for Chilitag
USB: misc: fix up some remaining DEVICE_ATTR() usages
USB: musb: fix up one odd DEVICE_ATTR() usage
USB: atm: fix up some remaining DEVICE_ATTR() usage
USB: move many drivers to use DEVICE_ATTR_WO
USB: move many drivers to use DEVICE_ATTR_RO
USB: move many drivers to use DEVICE_ATTR_RW
USB: misc: chaoskey: Use true and false for boolean values
USB: storage: remove old wording about how to submit a change
USB: storage: remove invalid URL from drivers
usb: ehci-omap: don't complain on -EPROBE_DEFER when no PHY found
usbip: list: don't list devices attached to vhci_hcd
usbip: prevent bind loops on devices attached to vhci_hcd
USB: serial: remove redundant initializations of 'mos_parport'
usb/gadget: Fix "high bandwidth" check in usb_gadget_ep_match_desc()
usb: gadget: compress return logic into one line
usbip: vhci_hcd: update 'status' file header and format
USB: serial: simple: add Motorola Tetra driver
CDC-ACM: apply quirk for card reader
usb: option: Add support for FS040U modem
...
Pull networking updates from David Miller:
1) Significantly shrink the core networking routing structures. Result
of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf
2) Add netdevsim driver for testing various offloads, from Jakub
Kicinski.
3) Support cross-chip FDB operations in DSA, from Vivien Didelot.
4) Add a 2nd listener hash table for TCP, similar to what was done for
UDP. From Martin KaFai Lau.
5) Add eBPF based queue selection to tun, from Jason Wang.
6) Lockless qdisc support, from John Fastabend.
7) SCTP stream interleave support, from Xin Long.
8) Smoother TCP receive autotuning, from Eric Dumazet.
9) Lots of erspan tunneling enhancements, from William Tu.
10) Add true function call support to BPF, from Alexei Starovoitov.
11) Add explicit support for GRO HW offloading, from Michael Chan.
12) Support extack generation in more netlink subsystems. From Alexander
Aring, Quentin Monnet, and Jakub Kicinski.
13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
Russell King.
14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.
15) Many improvements and simplifications to the NFP driver bpf JIT,
from Jakub Kicinski.
16) Support for ipv6 non-equal cost multipath routing, from Ido
Schimmel.
17) Add resource abstration to devlink, from Arkadi Sharshevsky.
18) Packet scheduler classifier shared filter block support, from Jiri
Pirko.
19) Avoid locking in act_csum, from Davide Caratti.
20) devinet_ioctl() simplifications from Al viro.
21) More TCP bpf improvements from Lawrence Brakmo.
22) Add support for onlink ipv6 route flag, similar to ipv4, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
tls: Add support for encryption using async offload accelerator
ip6mr: fix stale iterator
net/sched: kconfig: Remove blank help texts
openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
tcp_nv: fix potential integer overflow in tcpnv_acked
r8169: fix RTL8168EP take too long to complete driver initialization.
qmi_wwan: Add support for Quectel EP06
rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
ipmr: Fix ptrdiff_t print formatting
ibmvnic: Wait for device response when changing MAC
qlcnic: fix deadlock bug
tcp: release sk_frag.page in tcp_disconnect
ipv4: Get the address of interface correctly.
net_sched: gen_estimator: fix lockdep splat
net: macb: Handle HRESP error
net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
ipv6: addrconf: break critical section in addrconf_verify_rtnl()
ipv6: change route cache aging logic
i40e/i40evf: Update DESC_NEEDED value to reflect larger value
bnxt_en: cleanup DIM work on device shutdown
...
Pull seccomp updates from James Morris:
"Add support for retrieving seccomp metadata"
* 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
ptrace, seccomp: add support for retrieving seccomp metadata
seccomp: hoist out filter resolving logic
Pull misc vfs updates from Al Viro:
"All kinds of misc stuff, without any unifying topic, from various
people.
Neil's d_anon patch, several bugfixes, introduction of kvmalloc
analogue of kmemdup_user(), extending bitfield.h to deal with
fixed-endians, assorted cleanups all over the place..."
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
alpha: osf_sys.c: use timespec64 where appropriate
alpha: osf_sys.c: fix put_tv32 regression
jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
dcache: delete unused d_hash_mask
dcache: subtract d_hash_shift from 32 in advance
fs/buffer.c: fold init_buffer() into init_page_buffers()
fs: fold __inode_permission() into inode_permission()
fs: add RWF_APPEND
sctp: use vmemdup_user() rather than badly open-coding memdup_user()
snd_ctl_elem_init_enum_names(): switch to vmemdup_user()
replace_user_tlv(): switch to vmemdup_user()
new primitive: vmemdup_user()
memdup_user(): switch to GFP_USER
eventfd: fold eventfd_ctx_get() into eventfd_ctx_fileget()
eventfd: fold eventfd_ctx_read() into eventfd_read()
eventfd: convert to use anon_inode_getfd()
nfs4file: get rid of pointless include of btrfs.h
uvc_v4l2: clean copyin/copyout up
vme_user: don't use __copy_..._user()
usx2y: don't bother with memdup_user() for 16-byte structure
...
five categories: (1) code cleanups, (2) patches related to adding
PUNCH_HOLE support to GFS2, (3) support for new fields in resource group
headers, (4) a few bug fixes, and (5) support for new fields in journal
log headers. These new fields, which were previously unused, are designed
to make it easier to track down file system corruption, and allow fsck.gfs2
to make more intelligent decisions when finding and fixing file system
corruption.
1. Abhi Das contributed a patch to trim the ordered writes list, which
used to grow uncontrollably until unmount.
2. Abhi also added a second patch to trim the ordered writes list.
3. Andreas Gruenbacher removed an unused parameter from function
gfs2_write_jdata_pagevec.
4. Andreas also removed a pointless BUG_ON.
5. Andreas cleaned up an error patch in trunc_start.
6. Andreas removed some unused parameters from truncate.
7. Andreas made gfs2_journaled_truncate more efficient.
8. Andreas cleaned up the support functions for truncate.
9. Andreas fixed metadata read-ahead for truncate to make it faster.
10. Andreas fixed up the non-recursive truncate code.
11. Andreas reworked and renamed function gfs2_block_truncate_page.
12. Andreas generalized the non-recursive truncate code so it can
take a range of values for punch_hole support.
13. Andreas introduced new PUNCH_HOLE support that take advantage
of the previous patches.
14. Andreas contributed a patch to add fallocate with PUNCH_HOLE.
15. Andreas fixed some typos in the comments.
16. Andreas added function gfs2_max_stuffed_size to replace a piece
of code that was needlessly repeated throughout GFS2.
17. Andreas made a minor cleanup to function gfs2_page_add_databufs.
18. Andreas got rid of function gfs2_log_header_in in preparation for
the new log header fields.
19. Andreas also fixed up some missing newlines in kernel messages.
20. Andy Price added a new field to resource groups to indicate where
the next one should be, to allow fsck.gfs2 to make better repairs.
21. Andy also added new rindex fields for consistency checking.
22. Andy added a crc field to resource group headers for consistency
checking.
23. I added a patch to reduce redundancy in functions common to
freeing dinodes.
24. I added a patch to reduce redundancy when writing log headers
between the journalling code and journal recovery code.
25. I added a patch to add new fields to journal log headers based
on a prototype from Steve Whitehouse.
26. I added a patch to log the source of journal log headers so we
can better track down journal corruption.
27. I added a patch to fix a minor comment typo.
28. I also added a patch to fix a BUG in an unlink error path.
29. Steve Whitehouse contributed a patch to fix an incorrect use
of the gfs2_blk2rgrpd function.
30. Tetsuo Handa contributed a patch that fixes incorrect error
handling in function init_gfs2_fs.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaccZcAAoJENeLYdPf93o7zCYH/RgLn8alarI1IJ3633NHNisE
thIvlcDVb3ffGuxMIAEnGwWEFt8qr8niEGc6GP+zvkWpO6Ym2PNmrB85DwB8brTv
h1u7DTWrDsf9Qm1s6AVsw68+dn+Kq3eGwTficVFZYcJEox7NnFb1Jo6o3B0jVYei
gn9ccELNMyTq8R8W2zaIk43TCbQYWWy5hT8uSMVSzr+Hz0m/fSlKKy8rizYsB1zA
ut8IFUpn4dt6UG9DJ9HHg8MPw5mxr5f6wK/w/wNaaUdtZNMZyWYPiJGTrOk5YpEy
EmGFljNV5iT9rwCmZfFWaXJZC9mYhWRw5zrWkX9bh4/I3TSzul6DUiUo3r6hyNA=
=zDD/
-----END PGP SIGNATURE-----
Merge tag 'gfs2-4.16.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull GFS2 updates from Bob Peterson:
"We've got 30 patches for this merge window. These generally fall into
five categories:
- code cleanups
- patches related to adding PUNCH_HOLE support to GFS2
- support for new fields in resource group headers
- a few bug fixes
- support for new fields in journal log headers. These new fields,
which were previously unused, are designed to make it easier to
track down file system corruption, and allow fsck.gfs2 to make more
intelligent decisions when finding and fixing file system
corruption.
Details:
- Two patches from Abhi Das, to trim the ordered writes list, which
used to grow uncontrollably until unmount.
- Several patches from Andreas Gruenbacher: remove an unused
parameter from function gfs2_write_jdata_pagevec, remove a
pointless BUG_ON, clean up an error patch in trunc_start, remove
some unused parameters from truncate, make gfs2_journaled_truncate
more efficient, clean up the support functions for truncate, fix
metadata read-ahead for truncate to make it faster, fix up the
non-recursive truncate code, rework and rename
gfs2_block_truncate_page, generalize the non-recursive truncate
code so it can take a range of values for punch_hole support,
introduce new PUNCH_HOLE support that take advantage of the
previous patches, add fallocate support with PUNCH_HOLE, fix some
typos in the comments, add the function gfs2_max_stuffed_size to
replace a piece of code that was needlessly repeated throughout
GFS2, a minor cleanup to function gfs2_page_add_databufs, get rid
of function gfs2_log_header_in in preparation for the new log
header fields, and also fix up some missing newlines in kernel
messages.
- Andy Price added a new field to resource groups to indicate where
the next one should be, to allow fsck.gfs2 to make better repairs.
He also added new rindex fields for consistency checking, and added
a crc field to resource group headers for consistency checking.
- I reduced redundancy in functions common to freeing dinodes, and
when writing log headers between the journalling code and journal
recovery code. Also added new fields to journal log headers based
on a prototype from Steve Whitehouse, and log the source of journal
log headers so we can better track down journal corruption. Minor
comment typo fix and a fix for a BUG in an unlink error path.
- Steve Whitehouse contributed a patch to fix an incorrect use of the
gfs2_blk2rgrpd function.
- Tetsuo Handa contributed a patch that fixes incorrect error
handling in function init_gfs2_fs"
* tag 'gfs2-4.16.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (30 commits)
gfs2: Add a few missing newlines in messages
gfs2: Remove inode from ordered write list in gfs2_write_inode()
GFS2: Don't try to end a non-existent transaction in unlink
GFS2: Fix minor comment typo
GFS2: Log the reason for log flushes in every log header
GFS2: Introduce new gfs2_log_header_v2
gfs2: Get rid of gfs2_log_header_in
gfs2: Minor gfs2_page_add_databufs cleanup
gfs2: Add gfs2_max_stuffed_size
gfs2: Typo fixes
gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE)
gfs2: Turn trunc_dealloc into punch_hole
gfs2: Generalize truncate code
Turn gfs2_block_truncate_page into gfs2_block_zero_range
gfs2: Improve non-recursive delete algorithm
gfs2: Fix metadata read-ahead during truncate
gfs2: Clean up {lookup,fillup}_metapath
gfs2: Remove minor gfs2_journaled_truncate inefficiencies
gfs2: truncate: Remove unnecessary oldsize parameters
gfs2: Clean up trunc_start error path
...
* pci/misc:
PCI: Add dummy pci_irqd_intx_xlate() for CONFIG_PCI=n build
PCI: Add wrappers for dev_printk()
PCI: Remove unnecessary messages for memory allocation failures
PCI: Add #defines for Completion Timeout Disable feature
hinic: Replace PCI pool old API
net: e100: Replace PCI pool old API
block: DAC960: Replace PCI pool old API
MAINTAINERS: Include more PCI files
PCI: Remove unneeded kallsyms include
powerpc/pci: Unroll two pass loop when scanning bridges
powerpc/pci: Use for_each_pci_bridge() helper
* pci/enumeration:
RDMA/qedr: Use pci_enable_atomic_ops_to_root()
PCI: Add pci_enable_atomic_ops_to_root()
PCI: Make PCI_SCAN_ALL_PCIE_DEVS work for Root as well as Downstream Ports
ht/vht action frames will be sent to AP from station to notify
change of its ht/vht opmode(max bandwidth, smps mode or nss) modified
values. Currently these valuse used by driver/firmware for rate control
algorithm. This patch introduces NL80211_CMD_STA_OPMODE_CHANGED
command to notify those modified/current supported values(max bandwidth,
smps mode, max nss) to userspace application. This will be useful for the
application like steering, which closely monitoring station's capability
changes. Since the application has taken these values during station
association.
Signed-off-by: Tamizh chelvam <tamizhr@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This interface allows the host driver to offload the authentication to
user space. This is exclusively defined for host drivers that do not
define separate commands for authentication and association, but rely on
userspace SME (e.g., in wpa_supplicant for the ~WPA_DRIVER_FLAGS_SME
case) for the authentication to happen. This can be used to implement
SAE without full implementation in the kernel/firmware while still being
able to use NL80211_CMD_CONNECT with driver-based BSS selection.
Host driver sends NL80211_CMD_EXTERNAL_AUTH event to start/abort
authentication to the port on which connect is triggered and status
of authentication is further indicated by user space to host
driver through the same command response interface.
User space entities advertise this capability through the
NL80211_ATTR_EXTERNAL_AUTH_SUPP flag in the NL80211_CMD_CONNECT request.
Host drivers shall look at this capability to offload the authentication.
Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
[add socket connection ownership check]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This commit defines new scan flags (LOW_SPAN, LOW_POWER, HIGH_LATENCY)
to emphasize the requested scan behavior for the driver. These flags
are optional and are mutually exclusive. The implementation of the
respective functionality can be driver/hardware specific.
These flags can be used to control the compromise between how long
a scan takes, how much power it uses, and high accurate/complete
the scan is in finding the BSSs.
Signed-off-by: Sunil Dutt <usdutt@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Highlights include:
Stable bugfixes:
- Fix breakages in the nfsstat utility due to the inclusion of the NFSv4
LOOKUPP operation.
- Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall() due to
nfs_idmap_legacy_upcall() being called without an 'aux' parameter.
- Fix a refcount leak in the standard O_DIRECT error path.
- Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path.
- Fix CPU latency issues with nfs_commit_release_pages()
- Fix the LAYOUTUNAVAILABLE error case in the file layout type.
- NFS: Fix a race between mmap() and O_DIRECT
Features:
- Support the statx() mask and query flags to enable optimisations when
the user is requesting only attributes that are already up to date in
the inode cache, or is specifying the AT_STATX_DONT_SYNC flag.
- Add a module alias for the SCSI pNFS layout type.
Bugfixes:
- Automounting when resolving a NFSv4 referral should preserve the RDMA
transport protocol settings.
- Various other RDMA bugfixes from Chuck.
- pNFS block layout fixes.
- Always set NFS_LOCK_LOST when a lock is lost.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJacHcyAAoJEGcL54qWCgDy5WcP/Aw7GIAVZ+n5B+EIFVvEaWlC
C++eA8vej433ezKRj8IExeCX1C8OKrQZ4iQ3nqIg0mVVQS/Qk+469OhYP9jDagA+
tDZeOs3lbl1EUhUWT+GNxw8bZqKGn9fYzcWFjiYeFTjrcfLYfZTW50V0tofjgmlR
3nQmpPx/56rXE9ZO/EW66HRZWauw7a0hg3/5Ft+F5csqIb3yQOlW8Osp3WClzGBF
to4lS8/IwHvCn3qWAMuivRaMJDxeKrmoJNQh9Kw1Mw3+vurGAjmKo1a153qKPz4N
7wjeP+o3ujc/P7WsJLCIgQRimzSm9FZXMqEVmz07+cIhGbERt2yy0RbHev8bpa+U
3IMj70K9ciPuMZwrAtRAeZL+o9gxlUGUXvTaDUgo4DFgBw9Q5CnMnFn6a725l4h0
nSZsE+bR8d4l/yEjf77SbTrk7atMLfUG1XnKH20i1CUjtd4CaLLzjn81TlbQrfuI
XaFdJUUt63dPTIbhPEk7wHFcITkGZiyXhcepgbaXLiDH/3gyZmqTYzJ2EH14sOC5
NaTueE3ASTiFChvG7jvc89HJN5SN5W11PyzI+GHezx9VkFnPZM3/q2V7oEX2aSld
tkRlDMO4dVmpAA4LVAAarDr0a8ZsJQOdb+yLn21pbmKNAs1vol7tMfJe57ykEV6v
WNAgtKJLtZE0Lh1UyEq0
=5Vv2
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable bugfixes:
- Fix breakages in the nfsstat utility due to the inclusion of the
NFSv4 LOOKUPP operation
- Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall()
due to nfs_idmap_legacy_upcall() being called without an 'aux'
parameter
- Fix a refcount leak in the standard O_DIRECT error path
- Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path
- Fix CPU latency issues with nfs_commit_release_pages()
- Fix the LAYOUTUNAVAILABLE error case in the file layout type
- NFS: Fix a race between mmap() and O_DIRECT
Features:
- Support the statx() mask and query flags to enable optimisations
when the user is requesting only attributes that are already up to
date in the inode cache, or is specifying the AT_STATX_DONT_SYNC
flag
- Add a module alias for the SCSI pNFS layout type
Bugfixes:
- Automounting when resolving a NFSv4 referral should preserve the
RDMA transport protocol settings
- Various other RDMA bugfixes from Chuck
- pNFS block layout fixes
- Always set NFS_LOCK_LOST when a lock is lost"
* tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (69 commits)
NFS: Fix a race between mmap() and O_DIRECT
NFS: Remove a redundant call to unmap_mapping_range()
pnfs/blocklayout: Ensure disk address in block device map
pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors
lockd: Fix server refcounting
SUNRPC: Fix null rpc_clnt dereference in rpc_task_queued tracepoint
SUNRPC: Micro-optimize __rpc_execute
SUNRPC: task_run_action should display tk_callback
sunrpc: Format RPC events consistently for display
SUNRPC: Trace xprt_timer events
xprtrdma: Correct some documenting comments
xprtrdma: Fix "bytes registered" accounting
xprtrdma: Instrument allocation/release of rpcrdma_req/rep objects
xprtrdma: Add trace points to instrument QP and CQ access upcalls
xprtrdma: Add trace points in the client-side backchannel code paths
xprtrdma: Add trace points for connect events
xprtrdma: Add trace points to instrument MR allocation and recovery
xprtrdma: Add trace points to instrument memory invalidation
xprtrdma: Add trace points in reply decoder path
xprtrdma: Add trace points to instrument memory registration
..
Pull poll annotations from Al Viro:
"This introduces a __bitwise type for POLL### bitmap, and propagates
the annotations through the tree. Most of that stuff is as simple as
'make ->poll() instances return __poll_t and do the same to local
variables used to hold the future return value'.
Some of the obvious brainos found in process are fixed (e.g. POLLIN
misspelled as POLL_IN). At that point the amount of sparse warnings is
low and most of them are for genuine bugs - e.g. ->poll() instance
deciding to return -EINVAL instead of a bitmap. I hadn't touched those
in this series - it's large enough as it is.
Another problem it has caught was eventpoll() ABI mess; select.c and
eventpoll.c assumed that corresponding POLL### and EPOLL### were
equal. That's true for some, but not all of them - EPOLL### are
arch-independent, but POLL### are not.
The last commit in this series separates userland POLL### values from
the (now arch-independent) kernel-side ones, converting between them
in the few places where they are copied to/from userland. AFAICS, this
is the least disruptive fix preserving poll(2) ABI and making epoll()
work on all architectures.
As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
it will trigger only on what would've triggered EPOLLWRBAND on other
architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
at all on sparc. With this patch they should work consistently on all
architectures"
* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
make kernel-side POLL... arch-independent
eventpoll: no need to mask the result of epi_item_poll() again
eventpoll: constify struct epoll_event pointers
debugging printk in sg_poll() uses %x to print POLL... bitmap
annotate poll(2) guts
9p: untangle ->poll() mess
->si_band gets POLL... bitmap stored into a user-visible long field
ring_buffer_poll_wait() return value used as return value of ->poll()
the rest of drivers/*: annotate ->poll() instances
media: annotate ->poll() instances
fs: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
net: annotate ->poll() instances
apparmor: annotate ->poll() instances
tomoyo: annotate ->poll() instances
sound: annotate ->poll() instances
acpi: annotate ->poll() instances
crypto: annotate ->poll() instances
block: annotate ->poll() instances
x86: annotate ->poll() instances
...
Add a new field VIRTIO_BALLOON_S_CACHES to virtio_balloon memory
statistics protocol. The value represents all disk/file caches.
In this case it corresponds to the sum of values
Buffers+Cached+SwapCached from /proc/meminfo.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
- Security mitigations:
- variant 2: invalidating the branch predictor with a call to secure firmware
- variant 3: implementing KPTI for arm64
- 52-bit physical address support for arm64 (ARMv8.2)
- arm64 support for RAS (firmware first only) and SDEI (software
delegated exception interface; allows firmware to inject a RAS error
into the OS)
- Perf support for the ARM DynamIQ Shared Unit PMU
- CPUID and HWCAP bits updated for new floating point multiplication
instructions in ARMv8.4
- Removing some virtual memory layout printks during boot
- Fix initial page table creation to cope with larger than 32M kernel
images when 16K pages are enabled
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlpwxDMACgkQa9axLQDI
XvF55BAAniMpxPXnYNfv6l7/4O8eKo1lJIaG1wbej4JRZ/rT3K4Z3OBXW1dKHO8d
/PTbVmZ90IqIGROkoDrE+6xyjjn9yK3uuW4ytN2zQkBa8VFaHAnHlX+zKQcuwy9f
yxwiHk+C7vK5JR7mpXTazjRknsUv1MPtlTt7DQrSdq0KRDJVDNFC+grmbew2rz0X
cjQDqZqgzuFyrKxdiQVjDmc3zH9NsNBhDo0hlGHf2jK6bGJsAPtI8M2JcLrK8ITG
Ye/dD7BJp1mWD8ff0BPaMxu24qfAMNLH8f2dpTa986/H78irVz7i/t5HG0/1+5Jh
EE4OFRTKZ59Qgyo1zWcaJvdp8YjiaX/L4PWJg8CxM5OhP9dIac9ydcFQfWzpKpUs
xyZfmK6XliGFReAkVOOf5tEqFUDhMtsqhzPYmbmU1lp61wmSYIZ8CTenpWWCJSRO
NOGyG1X2uFBvP69+iPNlfTGz1r7tg1URY5iO8fUEIhY8LrgyORkiqw4OvPEgnMXP
Ngy+dXhyvnps2AAWbSX0O4puRlTgEYLT5KaMLzH/+gWsXATT0rzUCD/aOwUQq/Y7
SWXZHkb3jpmOZZnzZsLL2MNzEIPCFBwSUE9fSv4dA9d/N6tUmlmZALJjHkfzCDpj
+mPsSmAMTj72kUYzm0b5GCtOu/iQ2kDWOZjOM1m4+v/B+f7JoEE=
=iEjP
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The main theme of this pull request is security covering variants 2
and 3 for arm64. I expect to send additional patches next week
covering an improved firmware interface (requires firmware changes)
for variant 2 and way for KPTI to be disabled on unaffected CPUs
(Cavium's ThunderX doesn't work properly with KPTI enabled because of
a hardware erratum).
Summary:
- Security mitigations:
- variant 2: invalidate the branch predictor with a call to
secure firmware
- variant 3: implement KPTI for arm64
- 52-bit physical address support for arm64 (ARMv8.2)
- arm64 support for RAS (firmware first only) and SDEI (software
delegated exception interface; allows firmware to inject a RAS
error into the OS)
- perf support for the ARM DynamIQ Shared Unit PMU
- CPUID and HWCAP bits updated for new floating point multiplication
instructions in ARMv8.4
- remove some virtual memory layout printks during boot
- fix initial page table creation to cope with larger than 32M kernel
images when 16K pages are enabled"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (104 commits)
arm64: Fix TTBR + PAN + 52-bit PA logic in cpu_do_switch_mm
arm64: Turn on KPTI only on CPUs that need it
arm64: Branch predictor hardening for Cavium ThunderX2
arm64: Run enable method for errata work arounds on late CPUs
arm64: Move BP hardening to check_and_switch_context
arm64: mm: ignore memory above supported physical address size
arm64: kpti: Fix the interaction between ASID switching and software PAN
KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA
KVM: arm64: Handle RAS SErrors from EL2 on guest exit
KVM: arm64: Handle RAS SErrors from EL1 on guest exit
KVM: arm64: Save ESR_EL2 on guest SError
KVM: arm64: Save/Restore guest DISR_EL1
KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
KVM: arm/arm64: mask/unmask daif around VHE guests
arm64: kernel: Prepare for a DISR user
arm64: Unconditionally enable IESB on exception entry/return for firmware-first
arm64: kernel: Survive corrected RAS errors notified by SError
arm64: cpufeature: Detect CPU RAS Extentions
arm64: sysreg: Move to use definitions for all the SCTLR bits
arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
...
Reformat DPC register definitions to follow the convention that register
field masks indicate the register width, e.g., a field of a 16-bit register
uses a mask of 4 hex digits, with leading zeros included as needed.
No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sinan Kaya <okaya@codeaurora.org>
Add definitions for DPC Status register fields and use them in the code.
No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sinan Kaya <okaya@codeaurora.org>
Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:
- Implement frequency/CPU invariance and OPP selection for
SCHED_DEADLINE (Juri Lelli)
- Tweak the task migration logic for better multi-tasking
workload scalability (Mel Gorman)
- Misc cleanups, fixes and improvements"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/deadline: Make bandwidth enforcement scale-invariant
sched/cpufreq: Move arch_scale_{freq,cpu}_capacity() outside of #ifdef CONFIG_SMP
sched/cpufreq: Remove arch_scale_freq_capacity()'s 'sd' parameter
sched/cpufreq: Always consider all CPUs when deciding next freq
sched/cpufreq: Split utilization signals
sched/cpufreq: Change the worker kthread to SCHED_DEADLINE
sched/deadline: Move CPU frequency selection triggering points
sched/cpufreq: Use the DEADLINE utilization signal
sched/deadline: Implement "runtime overrun signal" support
sched/fair: Only immediately migrate tasks due to interrupts if prev and target CPUs share cache
sched/fair: Correct obsolete comment about cpufreq_update_util()
sched/fair: Remove impossible condition from find_idlest_group_cpu()
sched/cpufreq: Don't pass flags to sugov_set_iowait_boost()
sched/cpufreq: Initialize sg_cpu->flags to 0
sched/fair: Consider RT/IRQ pressure in capacity_spare_wake()
sched/fair: Use 'unsigned long' for utilization, consistently
sched/core: Rework and clarify prepare_lock_switch()
sched/fair: Remove unused 'curr' parameter from wakeup_gran
sched/headers: Constify object_is_on_stack()
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Clean up the x86 instruction decoder (Masami Hiramatsu)
- Add new uprobes optimization for PUSH instructions on x86 (Yonghong
Song)
- Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)
- Fix misc bugs, update documentation, plus various cleanups (Jiri
Olsa)
There's a large number of tooling side improvements:
- Intel-PT/BTS improvements (Adrian Hunter)
- Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)
- Introduce an errno code to string facility (Hendrik Brueckner)
- Various build system improvements (Jiri Olsa)
- Add support for CoreSight trace decoding by making the perf tools
use the external openCSD (Mathieu Poirier, Tor Jeremiassen)
- Add ARM Statistical Profiling Extensions (SPE) support (Kim
Phillips)
- libtraceevent updates (Steven Rostedt)
- Intel vendor event JSON updates (Andi Kleen)
- Introduce 'perf report --mmaps' and 'perf report --tasks' to show
info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)
- Add infrastructure to record first and last sample time to the
perf.data file header, so that when processing all samples in a
'perf record' session, such as when doing build-id processing, or
when specifically requesting that that info be recorded, use that
in 'perf report --time', that also got support for percent slices
in addition to absolute ones.
I.e. now it is possible to ask for the samples in the 10%-20% time
slice of a perf.data file (Jin Yao)
- Allow system wide 'perf stat --per-thread', sorting the result (Jin
Yao)
E.g.:
[root@jouet ~]# perf stat --per-thread --metrics IPC
^C
Performance counter stats for 'system wide':
make-22229 23,012,094,032 inst_retired.any # 0.8 IPC
cc1-22419 692,027,497 inst_retired.any # 0.8 IPC
gcc-22418 328,231,855 inst_retired.any # 0.9 IPC
cc1-22509 220,853,647 inst_retired.any # 0.8 IPC
gcc-22486 199,874,810 inst_retired.any # 1.0 IPC
as-22466 177,896,365 inst_retired.any # 0.9 IPC
cc1-22465 150,732,374 inst_retired.any # 0.8 IPC
gcc-22508 112,555,593 inst_retired.any # 0.9 IPC
cc1-22487 108,964,079 inst_retired.any # 0.7 IPC
qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC
systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC
docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC
dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC
make-22153 12,504,194,380 inst_retired.any # 0.8 IPC
python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC
<SNIP>
python2-22429 15,026,328,103 cpu_clk_unhalted.thread
cc1-22419 826,660,193 cpu_clk_unhalted.thread
gcc-22418 365,321,295 cpu_clk_unhalted.thread
cc1-22509 279,169,362 cpu_clk_unhalted.thread
gcc-22486 210,156,950 cpu_clk_unhalted.thread
<SNIP>
5.638075538 seconds time elapsed
[root@jouet ~]#
- Improve shell auto-completion of perf events (Jin Yao)
- 'perf probe' improvements (Masami Hiramatsu)
- Improve PMU infrastructure to support amp64's ThunderX2
implementation defined core events (Ganapatrao Kulkarni)
- Various annotation related improvements and fixes (Thomas Richter)
- Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
code, removing the 'overwrite' parameter from several functions as
it was always used it as 'false' (Wang Nan)
- Fix/improve 'perf record' reverse recording support (Wang Nan)
- Improve command line options documentation (Sihyeon Jang)
- Optimize sample parsing for ordering events, where we don't need to
parse all the PERF_SAMPLE_ bits, just the ones leading to the
timestamp needed to reorder events (Jiri Olsa)
- Generalize the annotation code to support other source information
besides objdump/DWARF obtained ones, starting with python scripts,
that will is slated to be merged soon (Jiri Olsa)
- ... and a lot more that I failed to list, see the shortlog and
changelog for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf trace beauty flock: Move to separate object file
perf evlist: Remove fcntl.h from evlist.h
perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
perf trace: Do not print from time delta for interrupted syscall lines
perf trace: Add --print-sample
perf bpf: Remove misplaced __maybe_unused attribute
MAINTAINERS: Adding entry for CoreSight trace decoding
perf tools: Add mechanic to synthesise CoreSight trace packets
perf tools: Add full support for CoreSight trace decoding
pert tools: Add queue management functionality
perf tools: Add functionality to communicate with the openCSD decoder
perf tools: Add support for decoding CoreSight trace data
perf tools: Add decoder mechanic to support dumping trace data
perf tools: Add processing of coresight metadata
perf tools: Add initial entry point for decoder CoreSight traces
perf tools: Integrating the CoreSight decoding library
perf vendor events intel: Update IvyTown files to V20
perf vendor events intel: Update IvyBridge files to V20
perf vendor events intel: Update BroadwellDE events to V7
perf vendor events intel: Update SkylakeX events to V1.06
...
- First part of an overhaul of the NuBus subsystem, to bring it up to
modern driver model standards,
- A race condition fix for Mac,
- Defconfig updates.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJabuYGAAoJEEgEtLw/Ve77uscQAKjtYt4xGksr7mQtCu+7I0F1
ZCpb9JWdQNNyKLYmjKQuPLEwMYVKsSam1gSqdKkmFI5xWDO7mT2+xOXA9oVBiqbm
gTkzuQtG+3/6CrKU7VFXgZaewYwK4zFQQ7+mrNkktPUyUuvya6xUqzgxyBX6oFoo
b796t/HISpoQMUKgPSYGGkqhkNlx3WTqh1yn8ZzaOSSg1ZylHZF0ntyHpPlp7nY8
lSyY1IwQAT80gebWN4huV3FU+Lyk8TZ8ic2EZkG/8SBqVJ/4+JDF4h8hWIB9DcAI
8h4CnYMOZZEenPmf7ptqWrUmcWy5OwYnIlKFEMEbBCoVC6853fdz/5LQd7fs57Vv
AOuIttyrn80NYDU2Mxg0o1kOqxDOXPbl2JrOXwSYyAoPSUQHK1FsesJfn96vRHEu
Y/rjVbURnOzMeaFUvczZkfgbh6dT4eU/5Zn5MPx71m7gKlqNGCJFYu7sORJq/0pI
9jrUtE+R6YP5rsrr77QfIUxPJSlnXLQQHdQUmzEVFjo+7A97Kh8FySpVWAR/zB5s
YsCc8EikLO5mxCyN3TmyUuo+TZJTwpPV0haxNAMoJKLRpNOVHTNrfuYFJR6JTOf8
a7Y9RiUud7iMrOy3NZEcK8jOxeVAaUHZJxurvnnrCsdaUcjRJRkkygCJjsJ37MMF
rRUZF/LDofLR8WWDrdXN
=PZNx
-----END PGP SIGNATURE-----
Merge tag 'm68k-for-v4.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
- first part of an overhaul of the NuBus subsystem, to bring it up to
modern driver model standards
- a race condition fix for Mac
- defconfig updates
* tag 'm68k-for-v4.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
MAINTAINERS: Add NuBus subsystem entry
m68k/mac: Fix race conditions in OSS interrupt dispatch
nubus: Add support for the driver model
nubus: Add expansion_type values for various Mac models
nubus: Adopt standard linked list implementation
nubus: Rename struct nubus_dev
nubus: Rework /proc/bus/nubus/s/ implementation
nubus: Generalize block resource handling
nubus: Clean up whitespace
nubus: Remove redundant code
nubus: Call proc_mkdir() not more than once per slot directory
nubus: Validate slot resource IDs
nubus: Fix log spam
nubus: Use static functions where possible
nubus: Fix up header split
nubus: Avoid array underflow and overflow
m68k/defconfig: Update defconfigs for v4.15-rc1
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlpvikQACgkQxWXV+ddt
WDs6qA//ZE7eEH0sKpD4Z+3gUevk/MMXwE9prRijEdjXz/K/UXtvpq0sI7HMQskZ
Ls9Wmzof+3WEQoa08RQZFzwuclW1Udm09SqE2oHP2gXQB5rC0BtWdrlMaKUJy03y
NUwxHetbE6TsFLU5HIVmi05NexNx9SVV6oJTWt00RlXTePw9Aoc88ikoXXUE2vqH
wbH9/ccmM9EkDFxdG+YG5QX054kQV8/5RXdqBJnIiGVRX5ZsAY84AN9x9YoRCVUw
wq9TfPu6XmeA6Uq6knpeLlXDms5w+FE3n5CduROk7Q7YNgpoZekF20c8uK8HzT4T
KF8hc0QpQgRCVBJ8I4MbPSMRIDf3IWfZmWSDEDda/6/ep6Bl99b8PFvdDKDBMUct
8wsgGrwGbHuz2l2QUIXjpBL9Cv9Tbu8vjmg0h2hFrpiH1c8JaXjKtJXAMtigWsZ1
DdX+5Y0zqvV/YLpzKF4aMDWXIteN4qaznvjdmj3B7BxgcnITOV/cmPCyMplNrtUa
Cs2fzGV5tpxhBzxE490v+frMULmLq2W1e6WPfFCqPKBCqulcR75TozDQS9M2/h4k
uAZzVKoguHrUPP1ONVas9aVC05K473nbYPd28eecYwkZ4z32hK4SAdnQY1aPreTe
axoV7p7a+i1bkzT5LK6gIfpddVWth8w45nz4P0lwxp0Z6XlkbJE=
=Irul
-----END PGP SIGNATURE-----
Merge tag 'for-4.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"Features or user visible changes:
- fallocate: implement zero range mode
- avoid losing data raid profile when deleting a device
- tree item checker: more checks for directory items and xattrs
Notable fixes:
- raid56 recovery: don't use cached stripes, that could be
potentially changed and a later RMW or recovery would lead to
corruptions or failures
- let raid56 try harder to rebuild damaged data, reading from all
stripes if necessary
- fix scrub to repair raid56 in a similar way as in the case above
Other:
- cleanups: device freeing, removed some call indirections, redundant
bio_put/_get, unused parameters, refactorings and renames
- RCU list traversal fixups
- simplify mount callchain, remove recursing back when mounting a
subvolume
- plug for fsync, may improve bio merging on multiple devices
- compression heurisic: replace heap sort with radix sort, gains some
performance
- add extent map selftests, buffered write vs dio"
* tag 'for-4.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (155 commits)
btrfs: drop devid as device_list_add() arg
btrfs: get device pointer from device_list_add()
btrfs: set the total_devices in device_list_add()
btrfs: move pr_info into device_list_add
btrfs: make btrfs_free_stale_devices() to match the path
btrfs: rename btrfs_free_stale_devices() arg to skip_dev
btrfs: make btrfs_free_stale_devices() argument optional
btrfs: make btrfs_free_stale_device() to iterate all stales
btrfs: no need to check for btrfs_fs_devices::seeding
btrfs: Use IS_ALIGNED in btrfs_truncate_block instead of opencoding it
Btrfs: noinline merge_extent_mapping
Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping
Btrfs: extent map selftest: dio write vs dio read
Btrfs: extent map selftest: buffered write vs dio read
Btrfs: add extent map selftests
Btrfs: move extent map specific code to extent_map.c
Btrfs: add helper for em merge logic
Btrfs: fix unexpected EEXIST from btrfs_get_extent
Btrfs: fix incorrect block_len in merge_extent_mapping
btrfs: Remove unused readahead spinlock
...
Pull block updates from Jens Axboe:
"This is the main pull request for block IO related changes for the
4.16 kernel. Nothing major in this pull request, but a good amount of
improvements and fixes all over the map. This contains:
- BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
Paolo.
- Support for SMR zones for deadline and mq-deadline from Damien and
Christoph.
- Set of fixes for bcache by way of Michael Lyle, including fixes
from himself, Kent, Rui, Tang, and Coly.
- Series from Matias for lightnvm with fixes from Hans Holmberg,
Javier, and Matias. Mostly centered around pblk, and the removing
rrpc 1.2 in preparation for supporting 2.0.
- A couple of NVMe pull requests from Christoph. Nothing major in
here, just fixes and cleanups, and support for command tracing from
Johannes.
- Support for blk-throttle for tracking reads and writes separately.
From Joseph Qi. A few cleanups/fixes also for blk-throttle from
Weiping.
- Series from Mike Snitzer that enables dm to register its queue more
logically, something that's alwways been problematic on dm since
it's a stacked device.
- Series from Ming cleaning up some of the bio accessor use, in
preparation for supporting multipage bvecs.
- Various fixes from Ming closing up holes around queue mapping and
quiescing.
- BSD partition fix from Richard Narron, fixing a problem where we
can't mount newer (10/11) FreeBSD partitions.
- Series from Tejun reworking blk-mq timeout handling. The previous
scheme relied on atomic bits, but it had races where we would think
a request had timed out if it to reused at the wrong time.
- null_blk now supports faking timeouts, to enable us to better
exercise and test that functionality separately. From me.
- Kill the separate atomic poll bit in the request struct. After
this, we don't use the atomic bits on blk-mq anymore at all. From
me.
- sgl_alloc/free helpers from Bart.
- Heavily contended tag case scalability improvement from me.
- Various little fixes and cleanups from Arnd, Bart, Corentin,
Douglas, Eryu, Goldwyn, and myself"
* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
block: remove smart1,2.h
nvme: add tracepoint for nvme_complete_rq
nvme: add tracepoint for nvme_setup_cmd
nvme-pci: introduce RECONNECTING state to mark initializing procedure
nvme-rdma: remove redundant boolean for inline_data
nvme: don't free uuid pointer before printing it
nvme-pci: Suspend queues after deleting them
bsg: use pr_debug instead of hand crafted macros
blk-mq-debugfs: don't allow write on attributes with seq_operations set
nvme-pci: Fix queue double allocations
block: Set BIO_TRACE_COMPLETION on new bio during split
blk-throttle: use queue_is_rq_based
block: Remove kblockd_schedule_delayed_work{,_on}()
blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
lib/scatterlist: Fix chaining support in sgl_alloc_order()
blk-throttle: track read and write request individually
block: add bdev_read_only() checks to common helpers
block: fail op_is_write() requests to read-only partitions
blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
...
The goal is to let the user follow an interface that moves to another
netns.
CC: Jiri Benc <jbenc@redhat.com>
CC: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2018-01-26
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) A number of extensions to tcp-bpf, from Lawrence.
- direct R or R/W access to many tcp_sock fields via bpf_sock_ops
- passing up to 3 arguments to bpf_sock_ops functions
- tcp_sock field bpf_sock_ops_cb_flags for controlling callbacks
- optionally calling bpf_sock_ops program when RTO fires
- optionally calling bpf_sock_ops program when packet is retransmitted
- optionally calling bpf_sock_ops program when TCP state changes
- access to tclass and sk_txhash
- new selftest
2) div/mod exception handling, from Daniel.
One of the ugly leftovers from the early eBPF days is that div/mod
operations based on registers have a hard-coded src_reg == 0 test
in the interpreter as well as in JIT code generators that would
return from the BPF program with exit code 0. This was basically
adopted from cBPF interpreter for historical reasons.
There are multiple reasons why this is very suboptimal and prone
to bugs. To name one: the return code mapping for such abnormal
program exit of 0 does not always match with a suitable program
type's exit code mapping. For example, '0' in tc means action 'ok'
where the packet gets passed further up the stack, which is just
undesirable for such cases (e.g. when implementing policy) and
also does not match with other program types.
After considering _four_ different ways to address the problem,
we adapt the same behavior as on some major archs like ARMv8:
X div 0 results in 0, and X mod 0 results in X. aarch64 and
aarch32 ISA do not generate any traps or otherwise aborts
of program execution for unsigned divides.
Given the options, it seems the most suitable from
all of them, also since major archs have similar schemes in
place. Given this is all in the realm of undefined behavior,
we still have the option to adapt if deemed necessary.
3) sockmap sample refactoring, from John.
4) lpm map get_next_key fixes, from Yonghong.
5) test cleanups, from Alexei and Prashant.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds support for openvswitch to configure erspan
v1 and v2. The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr is added
to uapi as a binary blob to support all ERSPAN v1 and v2's
fields. Note that Previous commit "openvswitch: Add erspan tunnel
support." was reverted since it does not design properly.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds a new uapi header file, erspan.h, and moves
the 'struct erspan_metadata' from internal erspan.h to it.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support for calling sock_ops BPF program when there is a TCP state
change. Two arguments are used; one for the old state and another for
the new state.
There is a new enum in include/uapi/linux/bpf.h that exports the TCP
states that prepends BPF_ to the current TCP state names. If it is ever
necessary to change the internal TCP state values (other than adding
more to the end), then it will become necessary to convert from the
internal TCP state value to the BPF value before calling the BPF
sock_ops function. There are a set of compile checks added in tcp.c
to detect if the internal and BPF values differ so we can make the
necessary fixes.
New op: BPF_SOCK_OPS_STATE_CB.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adds support for calling sock_ops BPF program when there is a
retransmission. Three arguments are used; one for the sequence number,
another for the number of segments retransmitted, and the last one for
the return value of tcp_transmit_skb (0 => success).
Does not include syn-ack retransmissions.
New op: BPF_SOCK_OPS_RETRANS_CB.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adds an optional call to sock_ops BPF program based on whether the
BPF_SOCK_OPS_RTO_CB_FLAG is set in bpf_sock_ops_flags.
The BPF program is passed 2 arguments: icsk_retransmits and whether the
RTO has expired.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adds field bpf_sock_ops_cb_flags to tcp_sock and bpf_sock_ops. Its primary
use is to determine if there should be calls to sock_ops bpf program at
various points in the TCP code. The field is initialized to zero,
disabling the calls. A sock_ops BPF program can set it, per connection and
as necessary, when the connection is established.
It also adds support for reading and writting the field within a
sock_ops BPF program. Reading is done by accessing the field directly.
However, writing is done through the helper function
bpf_sock_ops_cb_flags_set, in order to return an error if a BPF program
is trying to set a callback that is not supported in the current kernel
(i.e. running an older kernel). The helper function returns 0 if it was
able to set all of the bits set in the argument, a positive number
containing the bits that could not be set, or -EINVAL if the socket is
not a full TCP socket.
Examples of where one could call the bpf program:
1) When RTO fires
2) When a packet is retransmitted
3) When the connection terminates
4) When a packet is sent
5) When a packet is received
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adds support for passing up to 4 arguments to sock_ops bpf functions. It
reusues the reply union, so the bpf_sock_ops structures are not
increased in size.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This is the per-I/O equivalent of O_APPEND to support atomic append
operations on any open file.
If a file is opened with O_APPEND, pwrite() ignores the offset and
always appends data to the end of the file. RWF_APPEND enables atomic
append and pwrite() with offset on a single file descriptor.
Signed-off-by: Jürg Billeter <j@bitron.ch>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The Atomic Operations feature (PCIe r4.0, sec 6.15) allows atomic
transctions to be requested by, routed through and completed by PCIe
components. Routing and completion do not require software support.
Component support for each is detectable via the DEVCAP2 register.
A Requester may use AtomicOps only if its PCI_EXP_DEVCTL2_ATOMIC_REQ is
set. This should be set only if the Completer and all intermediate routing
elements support AtomicOps.
A concrete example is the AMD Fiji-class GPU (which is capable of making
AtomicOp requests), below a PLX 8747 switch (advertising AtomicOp routing)
with a Haswell host bridge (advertising AtomicOp completion support).
Add pci_enable_atomic_ops_to_root() for per-device control over AtomicOp
requests. This checks to be sure the Root Port supports completion of the
desired AtomicOp sizes and the path to the Root Port supports routing the
AtomicOps.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[bhelgaas: changelog, comments, whitespace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
en_rx_am.c was deleted in 'net-next' but had a bug fixed in it in
'net'.
The esp{4,6}_offload.c conflicts were overlapping changes.
The 'out' label is removed so we just return ERR_PTR(-EINVAL)
directly.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch just adds the capability for GFS2 to track which function
called gfs2_log_flush. This should make it easier to diagnose
problems based on the sequence of events found in the journals.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch adds a new structure called gfs2_log_header_v2 which is used
to store expanded fields into previously unused areas of the log headers
(i.e., this change is backwards compatible). Some of these are used for
debug purposes so we can backtrack when problems occur. Others are
reserved for future expansion.
This patch is based on a prototype from Steve Whitehouse.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Expose the number of times the link has been going UP or DOWN, and
update the "carrier_changes" counter to be the sum of these two events.
While at it, also update the sysfs-class-net documentation to cover:
carrier_changes (3.15), carrier_up_count (4.16) and carrier_down_count
(4.16)
Signed-off-by: David Decotigny <decot@googlers.com>
[Florian:
* rebase
* add documentation
* merge carrier_changes with up/down counters]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ccfdec9089 ("macsec: Add support for GCM-AES-256 cipher suite")
changed a few values in the uapi headers for MACsec.
Because of existing userspace implementations, we need to preserve the
value of MACSEC_DEFAULT_CIPHER_ID. Not doing that resulted in
wpa_supplicant segfaults when a secure channel was created using the
default cipher. Thus, swap MACSEC_DEFAULT_CIPHER_{ID,ALT} back to their
original values.
Changing the maximum length of the MACSEC_SA_ATTR_KEY attribute is
unnecessary, as the previous value (MACSEC_MAX_KEY_LEN, which was 128B)
is large enough to carry 32-bytes keys. This patch reverts
MACSEC_MAX_KEY_LEN to 128B and restores the old length check on
MACSEC_SA_ATTR_KEY.
Fixes: ccfdec9089 ("macsec: Add support for GCM-AES-256 cipher suite")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The UUID change by btrfstune sets SUPER_FLAG_CHANGING_FSID and resets it
only when changing fsid is complete. Its not a good idea to mount the
device anything in between, reading metadata blocks would fail with UUID
mismatch.
This patch doesn't add SUPER_FLAG_CHANGING_FSID into
BTRFS_SUPER_FLAG_SUPP list, so mount will fail (along with the fix in
the next patch) when SUPER_FLAG_CHANGING_FSID is set.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs-progs uses super flag bit BTRFS_SUPER_FLAG_METADUMP_V2 (1ULL << 34).
So just define that in kernel so that we know its been used.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just a code spatial rearrangement, no functional change.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for your net-next
tree. Basically, a new extension for ip6tables, simplification work of
nf_tables that saves us 500 LoC, allow raw table registration before
defragmentation, conversion of the SNMP helper to use the ASN.1 code
generator, unique 64-bit handle for all nf_tables objects and fixes to
address fallout from previous nf-next batch. More specifically, they
are:
1) Seven patches to remove family abstraction layer (struct nft_af_info)
in nf_tables, this simplifies our codebase and it saves us 64 bytes per
net namespace.
2) Add IPv6 segment routing header matching for ip6tables, from Ahmed
Abdelsalam.
3) Allow to register iptable_raw table before defragmentation, some
people do not want to waste cycles on defragmenting traffic that is
going to be dropped, hence add a new module parameter to enable this
behaviour in iptables and ip6tables. From Subash Abhinov
Kasiviswanathan. This patch needed a couple of follow up patches to
get things tidy from Arnd Bergmann.
4) SNMP helper uses the ASN.1 code generator, from Taehee Yoo. Several
patches for this helper to prepare this change are also part of this
patch series.
5) Add 64-bit handles to uniquely objects in nf_tables, from Harsha
Sharma.
6) Remove log message that several netfilter subsystems print at
boot/load time.
7) Restore x_tables module autoloading, that got broken in a previous
patch to allow singleton NAT hook callback registration per hook
spot, from Florian Westphal. Moreover, return EBUSY to report that
the singleton NAT hook slot is already in instead.
8) Several fixes for the new nf_tables flowtable representation,
including incorrect error check after nf_tables_flowtable_lookup(),
missing Kconfig dependencies that lead to build breakage and missing
initialization of priority and hooknum in flowtable object.
9) Missing NETFILTER_FAMILY_ARP dependency in Kconfig for the clusterip
target. This is due to recent updates in the core to shrink the hook
array size and compile it out if no specific family is enabled via
.config file. Patch from Florian Westphal.
10) Remove duplicated include header files, from Wei Yongjun.
11) Sparse warning fix for the NFPROTO_INET handling from the core
due to missing static function definition, also from Wei Yongjun.
12) Restore ICMPv6 Parameter Problem error reporting when
defragmentation fails, from Subash Abhinov Kasiviswanathan.
13) Remove obsolete owner field initialization from struct
file_operations, patch from Alexey Dobriyan.
14) Use boolean datatype where needed in the Netfilter codebase, from
Gustavo A. R. Silva.
15) Remove double semicolon in dynset nf_tables expression, from
Luis de Bethencourt.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2018-01-19
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) bpf array map HW offload, from Jakub.
2) support for bpf_get_next_key() for LPM map, from Yonghong.
3) test_verifier now runs loaded programs, from Alexei.
4) xdp cpumap monitoring, from Jesper.
5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.
6) user space can now retrieve HW JITed program, from Jiong.
Note there is a minor conflict between Russell's arm32 JIT fixes
and removal of bpf_jit_enable variable by Daniel which should
be resolved by keeping Russell's comment and removing that variable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ARM:
* fix incorrect huge page mappings on systems using the contiguous hint
for hugetlbfs
* support alternative GICv4 init sequence
* correctly implement the ARM SMCC for HVC and SMC handling
PPC:
* add KVM IOCTL for reporting vulnerability and workaround status
s390:
* provide userspace interface for branch prediction changes in firmware
x86:
* use correct macros for bits
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJaY3/eAAoJEED/6hsPKofo64kH/16SCSA9pKJTf39+jLoCPzbp
tlhzxoaqb9cPNMQBAk8Cj5xNJ6V4Clwnk8iRWaE6dRI5nWQxnxRHiWxnrobHwUbK
I0zSy+SywynSBnollKzLzQrDUBZ72fv3oLwiYEYhjMvs0zW6Q/vg10WERbav912Q
bv8nb5e8TbvU500ErndKTXOa8/B6uZYkMVjBNvAHwb+4AQ7bJgDQs5/qOeXllm8A
MT/SNYop/fkjRP7mQng5XYzoO+70tbe0hWpOQGgBnduzrbkNNvZtYtovusHYytLX
PAB7DDPbLZm5L2HBo4zvKgTHIoHTxU0X2yfUDzt7O151O2WSyqBRC3y1tpj6xa8=
=GnNJ
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- fix incorrect huge page mappings on systems using the contiguous
hint for hugetlbfs
- support alternative GICv4 init sequence
- correctly implement the ARM SMCC for HVC and SMC handling
PPC:
- add KVM IOCTL for reporting vulnerability and workaround status
s390:
- provide userspace interface for branch prediction changes in
firmware
x86:
- use correct macros for bits"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: wire up bpb feature
KVM: PPC: Book3S: Provide information about hardware/firmware CVE workarounds
KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs()
arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
KVM: arm64: Fix GICv4 init when called from vgic_its_create
KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
The new firmware interfaces for branch prediction behaviour changes
are transparently available for the guest. Nevertheless, there is
new state attached that should be migrated and properly resetted.
Provide a mechanism for handling reset, migration and VSIE.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[Changed capability number to 152. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The AMR/IAMR/UAMOR are part of the program context.
Allow it to be accessed via ptrace and through core files.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows deletion of objects via unique handle which can be
listed via '-a' option.
Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This adds a new ioctl, KVM_PPC_GET_CPU_CHAR, that gives userspace
information about the underlying machine's level of vulnerability
to the recently announced vulnerabilities CVE-2017-5715,
CVE-2017-5753 and CVE-2017-5754, and whether the machine provides
instructions to assist software to work around the vulnerabilities.
The ioctl returns two u64 words describing characteristics of the
CPU and required software behaviour respectively, plus two mask
words which indicate which bits have been filled in by the kernel,
for extensibility. The bit definitions are the same as for the
new H_GET_CPU_CHARACTERISTICS hypercall.
There is also a new capability, KVM_CAP_PPC_GET_CPU_CHAR, which
indicates whether the new ioctl is available.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Tell user space about device on which the map was created.
Unfortunate reality of user ABI makes sharing this code
with program offload difficult but the information is the
same.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Doc BPF ld/ldx size defines as comments in code, as it makes in
faster to lookup in a programming/review setting, than looking up
the sizes in Documentation/networking/filter.txt.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaW+iVAAoJEHm+PkMAQRiGCDsIAJALNpX7odTx/8y+yCSWbpBH
E57iwr4rmnI6tXJY6gqBUWTYnjAcf4b8IsHGCO6q3WIE3l/kt+m3eA21a32mF2Db
/bfPGTOWu5LoOnFqzgH2kiFuC3Y474toxpld2YtkQWYxi5W7SUtIHi/jGgkUprth
g15yPfwYgotJd/gpmPfBDMPlYDYvLlnPYbTG6ZWdMbg39m2RF2m0BdQ6aBFLHvbJ
IN0tjCM6hrLFBP0+6Zn60pevUW9/AFYotZn2ankNTk5QVCQm14rgQIP+Pfoa5WpE
I25r0DbkG2jKJCq+tlgIJjxHKD37GEDMc4T8/5Y8CNNeT9Q8si9EWvznjaAPazw=
=o5gx
-----END PGP SIGNATURE-----
BackMerge tag 'v4.15-rc8' into drm-next
Linux 4.15-rc8
Daniel requested this for so the intel CI won't fall over on drm-next
so often.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlpeCCQTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRAe/sog7Dgc9ti9B/4jAoCYTuYXnkRvS34jdgQQV99QyC8M
EpcAgzHo2kYNPDf5q8TEVheXxvA9XeGiA+TtjL9cNowxwMMtJev3/dmBtOmU6jek
RgOsYlR8guxBHOx8pj1IMl1xPoWTLfg4Kv1qnpXx3zvhOP610G/edBBSZt659MGF
SW6pBoNivbl+EYSM5x81QIfqJ4NlD5AKY4PeeSnGrnSthd8EFNp2zKkcY8nMOJ0D
Gm2YyxGJXh+lHse965DQRNg+owZxIWyheQplulVrw9v34LOjbFko3Cd+D9KLW5MG
LVVRJ3E0jm7W75AvxNcv2WP+lZVcDxXqsxFH0dP8WOJNZiKZeJ5aZfji
=ROXk
-----END PGP SIGNATURE-----
Merge tag 'linux-can-next-for-4.16-20180116' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2018-01-16
this is a pull request for net-next/master consisting of 9 patches.
This is a series of patches, some of them initially by Franklin S Cooper
Jr, which was picked up by Faiz Abbas. Faiz Abbas added some patches
while working on this series, I contributed one as well.
The first two patches add support to CAN device infrastructure to limit
the bitrate of a CAN adapter if the used CAN-transceiver has a certain
maximum bitrate.
The remaining patches improve the m_can driver. They add support for
bitrate limiting to the driver, clean up the driver and add support for
runtime PM.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows userspace to attach eBPF filter to tun. This will
allow to implement VM dataplane filtering in a more efficient way
compared to cBPF filter by allowing either qemu or libvirt to
attach eBPF filter to tun.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce two new attributes to be used for qdisc creation and dumping.
One for ingress block, one for egress block. Introduce a set of ops that
qdisc which supports block sharing would implement.
Passing block indexes in qdisc change is not supported yet and it is
checked and forbidded.
In future, these attributes are to be reused for specifying block
indexes for classes as well. As of this moment however, it is not
supported so a check is in place to forbid it.
Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the tcm_ifindex with value TCM_IFINDEX_MAGIC_BLOCK is invalid ifindex,
use it to indicate that we work with block, instead of qdisc.
So if tcm_ifindex is set to TCM_IFINDEX_MAGIC_BLOCK, tcm_parent is used
to carry block_index.
If the block is set to be shared between at least 2 qdiscs, it is
forbidden to use the qdisc handle to add/delete filters. In that case,
userspace has to pass block_index.
Also, for dump of the filters, in case the block is shared in between at
least 2 qdiscs, the each filter is dumped with tcm_ifindex value
TCM_IFINDEX_MAGIC_BLOCK and tcm_parent set to block_index. That gives
the user clear indication, that the filter belongs to a shared block
and not only to one qdisc under which it is dumped.
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-01-17
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add initial BPF map offloading for nfp driver. Currently only
programs were supported so far w/o being able to access maps.
Offloaded programs are right now only allowed to perform map
lookups, and control path is responsible for populating the
maps. BPF core infrastructure along with nfp implementation is
provided, from Jakub.
2) Various follow-ups to Josef's BPF error injections. More
specifically that includes: properly check whether the error
injectable event is on function entry or not, remove the percpu
bpf_kprobe_override and rather compare instruction pointer
with original one, separate error-injection from kprobes since
it's not limited to it, add injectable error types in order to
specify what is the expected type of failure, and last but not
least also support the kernel's fault injection framework, all
from Masami.
3) Various misc improvements and cleanups to the libbpf Makefile.
That is, fix permissions when installing BPF header files, remove
unused variables and functions, and also install the libbpf.h
header, from Jesper.
4) When offloading to nfp JIT and the BPF insn is unsupported in the
JIT, then reject right at verification time. Also fix libbpf with
regards to ELF section name matching by properly treating the
program type as prefix. Both from Quentin.
5) Add -DPACKAGE to bpftool when including bfd.h for the disassembler.
This is needed, for example, when building libfd from source as
bpftool doesn't supply a config.h for bfd.h. Fix from Jiong.
6) xdp_convert_ctx_access() is simplified since it doesn't need to
set target size during verification, from Jesper.
7) Let bpftool properly recognize BPF_PROG_TYPE_CGROUP_DEVICE
program types, from Roman.
8) Various functions in BPF cpumap were not declared static, from Wei.
9) Fix a double semicolon in BPF samples, from Luis.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware processes which are modeled via dpipe commonly use some
internal hardware resources. Such relation can improve the understanding
of hardware limitations. The number of resource's unit consumed per
table's entry are also provided for each table.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for performing driver hot reload.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for hardware resource abstraction over devlink. Each resource
is identified via id, furthermore it contains information regarding its
size and its related sub resources. Each resource can also provide its
current occupancy.
In some cases the sizes of some resources can be changed, yet for those
changes to take place a hot driver reload may be needed. The reload
capability will be introduced in the next patch.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the '#ifdef __KERNEL__' being located in the wrong place, some
definitions from the kernel API were placed in the UAPI header during
the scripted header split. Fix this. Also, remove the duplicate comment
which is only relevant to the UAPI header.
Fixes: 607ca46e97 ("UAPI: (Scripted) Disintegrate include/linux")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
This part of Secure Encrypted Virtualization (SEV) patch series focuses on KVM
changes required to create and manage SEV guests.
SEV is an extension to the AMD-V architecture which supports running encrypted
virtual machine (VMs) under the control of a hypervisor. Encrypted VMs have their
pages (code and data) secured such that only the guest itself has access to
unencrypted version. Each encrypted VM is associated with a unique encryption key;
if its data is accessed to a different entity using a different key the encrypted
guest's data will be incorrectly decrypted, leading to unintelligible data.
This security model ensures that hypervisor will no longer able to inspect or
alter any guest code or data.
The key management of this feature is handled by a separate processor known as
the AMD Secure Processor (AMD-SP) which is present on AMD SOCs. The SEV Key
Management Specification (see below) provides a set of commands which can be
used by hypervisor to load virtual machine keys through the AMD-SP driver.
The patch series adds a new ioctl in KVM driver (KVM_MEMORY_ENCRYPT_OP). The
ioctl will be used by qemu to issue SEV guest-specific commands defined in Key
Management Specification.
The following links provide additional details:
AMD Memory Encryption white paper:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
AMD64 Architecture Programmer's Manual:
http://support.amd.com/TechDocs/24593.pdf
SME is section 7.10
SEV is section 15.34
SEV Key Management:
http://support.amd.com/TechDocs/55766_SEV-KM API_Specification.pdf
KVM Forum Presentation:
http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
SEV Guest BIOS support:
SEV support has been add to EDKII/OVMF BIOS
https://github.com/tianocore/edk2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Various CAN or CAN-FD IP may be able to run at a faster rate than
what the transceiver the CAN node is connected to. This can lead to
unexpected errors. However, CAN transceivers typically have fixed
limitations and provide no means to discover these limitations at
runtime. Therefore, add support for a can-transceiver node that
can be reused by other CAN peripheral drivers to determine for both
CAN and CAN-FD what the max bitrate that can be used. If the user
tries to configure CAN to pass these maximum bitrates it will throw
an error.
Also add support for reading bitrate_max via the netlink interface.
Reviewed-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
[nsekhar@ti.com: fix build error with !CONFIG_OF]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This reverts commit ceaa001a17.
The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr should be designed
as a nested attribute to support all ERSPAN v1 and v2's fields.
The current attr is a be32 supporting only one field. Thus, this
patch reverts it and later patch will redo it using nested attr.
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Pravin Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NFS/RDMA port assignment is specified in Section 9 of RFC 8267.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
BPF map offload follow similar path to program offload. At creation
time users may specify ifindex of the device on which they want to
create the map. Map will be validated by the kernel's
.map_alloc_check callback and device driver will be called for the
actual allocation. Map will have an empty set of operations
associated with it (save for alloc and free callbacks). The real
device callbacks are kept in map->offload->dev_ops because they
have slightly different signatures. Map operations are called in
process context so the driver may communicate with HW freely,
msleep(), wait() etc.
Map alloc and free callbacks are muxed via existing .ndo_bpf, and
are always called with rtnl lock held. Maps and programs are
guaranteed to be destroyed before .ndo_uninit (i.e. before
unregister_netdev() returns). Map callbacks are invoked with
bpf_devs_lock *read* locked, drivers must take care of exclusive
locking if necessary.
All offload-specific branches are marked with unlikely() (through
bpf_map_is_dev_bound()), given that branch penalty will be
negligible compared to IO anyway, and we don't want to penalize
SW path unnecessarily.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The Software Delegated Exception Interface (SDEI) is an ARM standard
for registering callbacks from the platform firmware into the OS.
This is typically used to implement firmware notifications (such as
firmware-first RAS) or promote an IRQ that has been promoted to a
firmware-assisted NMI.
Add the code for detecting the SDEI version and the framework for
registering and unregistering events. Subsequent patches will add the
arch-specific backend code and the necessary power management hooks.
Only shared events are supported, power management, private events and
discovery for ACPI systems will be added by later patches.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As pointed out by Daniel Borkmann, using bpf_target_off() is not
necessary for xdp_rxq_info when extracting queue_index and
ifindex, as these members are u32 like BPF_W.
Also fix trivial spelling mistake introduced in same commit.
Fixes: 02dd3291b2 ("bpf: finally expose xdp_rxq_info to XDP bpf-programs")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
conntrack defrag is needed only if some module like CONNTRACK or NAT
explicitly requests it. For plain forwarding scenarios, defrag is
not needed and can be skipped if NOTRACK is set in a rule.
Since conntrack defrag is currently higher priority than raw table,
setting NOTRACK is not sufficient. We need to move raw to a higher
priority for iptables only.
This is achieved by introducing a module parameter "raw_before_defrag"
which allows to change the priority of raw table to place it before
defrag. By default, the parameter is disabled and the priority of raw
table is NF_IP_PRI_RAW to support legacy behavior. If the module
parameter is enabled, then the priority of the raw table is set to
NF_IP_PRI_RAW_BEFORE_DEFRAG.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add #defines for the Completion Timeout Disable feature and use them. No
functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Four patches from Or that add Hairpin support to mlx5:
===========================================================
From: Or Gerlitz <ogerlitz@mellanox.com>
We refer the ability of NIC HW to fwd packet received on one port to
the other port (also from a port to itself) as hairpin. The application API
is based
on ingress tc/flower rules set on the NIC with the mirred redirect
action. Other actions can apply to packets during the redirect.
Hairpin allows to offload the data-path of various SW DDoS gateways,
load-balancers, etc to HW. Packets go through all the required
processing in HW (header re-write, encap/decap, push/pop vlan) and
then forwarded, CPU stays at practically zero usage. HW Flow counters
are used by the control plane for monitoring and accounting.
Hairpin is implemented by pairing a receive queue (RQ) to send queue (SQ).
All the flows that share <recv NIC, mirred NIC> are redirected through
the same hairpin pair. Currently, only header-rewrite is supported as a
packet modification action.
I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing this
functionality
on HW simulator, before it was avail in the FW so the driver code could be
tested early.
===========================================================
From Feras three patches that provide very small changes that allow IPoIB
to support RX timestamping for child interfaces, simply by hooking the mlx5e
timestamping PTP ioctl to IPoIB child interface netdev profile.
One patch from Gal to fix a spilling mistake.
Two patches from Eugenia adds drop counters to VF statistics
to be reported as part of VF statistics in netlink (iproute2) and
implemented them in mlx5 eswitch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJaVF5WAAoJEEg/ir3gV/o+fRkH/0PxjwJRA3REqhi/H8HOdH9f
cBLrOzFdqTCYQWQFCLFbMQ/Zgoel3KglpJ0iQMjuVFfjMbybVXOe8FAEVdbWHnfL
C+2HRMe8dplKrsq5UkxJhbyKhFKhl2XeMFYWonw9dSM7Nz5DyowQ1y1r5SgMlMAv
t3mYAIa4kZHK18BjDoIsCoAXXwsHiztR2irMp5+DwataTGP7vC7AsrucDxLA/qFf
I3E15DZk9s1f53PUuY7CYnUnJfMMP3VJdxpyx4k6xt9J2IMuilF4YyD6wpAKsVQU
/LzRkWI9x/6QindffqlrACeeidimOeY4pC4txIhS5uXgFXulugDHq1/Ih1sgZS8=
=g5vr
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2018-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
mlx5-updates-2018-01-08
Four patches from Or that add Hairpin support to mlx5:
===========================================================
From: Or Gerlitz <ogerlitz@mellanox.com>
We refer the ability of NIC HW to fwd packet received on one port to
the other port (also from a port to itself) as hairpin. The application API
is based
on ingress tc/flower rules set on the NIC with the mirred redirect
action. Other actions can apply to packets during the redirect.
Hairpin allows to offload the data-path of various SW DDoS gateways,
load-balancers, etc to HW. Packets go through all the required
processing in HW (header re-write, encap/decap, push/pop vlan) and
then forwarded, CPU stays at practically zero usage. HW Flow counters
are used by the control plane for monitoring and accounting.
Hairpin is implemented by pairing a receive queue (RQ) to send queue (SQ).
All the flows that share <recv NIC, mirred NIC> are redirected through
the same hairpin pair. Currently, only header-rewrite is supported as a
packet modification action.
I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing this
functionality
on HW simulator, before it was avail in the FW so the driver code could be
tested early.
===========================================================
From Feras three patches that provide very small changes that allow IPoIB
to support RX timestamping for child interfaces, simply by hooking the mlx5e
timestamping PTP ioctl to IPoIB child interface netdev profile.
One patch from Gal to fix a spilling mistake.
Two patches from Eugenia adds drop counters to VF statistics
to be reported as part of VF statistics in netlink (iproute2) and
implemented them in mlx5 eswitch.
Signed-off-by: David S. Miller <davem@davemloft.net>
It allows matching packets based on Segment Routing Header
(SRH) information.
The implementation considers revision 7 of the SRH draft.
https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-07
Currently supported match options include:
(1) Next Header
(2) Hdr Ext Len
(3) Segments Left
(4) Last Entry
(5) Tag value of SRH
Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds the possibility of getting the delivery of a SIGXCPU
signal whenever there is a runtime overrun. The request is done through
the sched_flags field within the sched_attr structure.
Forward port of https://lkml.org/lkml/2009/10/16/170
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
Signed-off-by: Luca Abeni <luca.abeni@santannapisa.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Link: http://lkml.kernel.org/r/1513077024-25461-1-git-send-email-claudio@evidence.eu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The input events use struct timeval to store event time, unfortunately
this structure is not y2038 safe and is being replaced in kernel with
y2038 safe structures.
Because of ABI concerns we can not change the size or the layout of
structure input_event, so we opt to re-interpreting the 'seconds' part
of timestamp as an unsigned value, effectively doubling the range of
values, to year 2106.
Newer glibc that has support for 32 bit applications to use 64 bit
time_t supplies __USE_TIME_BITS64 define [1], that we can use to present
the userspace with updated input_event layout. The updated layout will
cause the compile time breakage, alerting applications and distributions
maintainers to the issue. Existing 32 binaries will continue working
without any changes until 2038.
Ultimately userspace applications should switch to using monotonic or
boot time clocks, as realtime clock is not very well suited for input
event timestamps as it can go backwards (see a80b83b7b8 "Input: evdev -
add CLOCK_BOOTTIME support" by by John Stultz). With monotonic clock the
practical range of reported times will always fit into the pair of 32
bit values, as we do not expect any system to stay up for a hundred
years without a single reboot.
[1] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Patchwork-Id: 10148083
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
When a member joins a group, it also indicates a binding scope. This
makes it possible to create both node local groups, invisible to other
nodes, as well as cluster global groups, visible everywhere.
In order to avoid that different members end up having permanently
differing views of group size and memberhip, we must inhibit locally
and globally bound members from joining the same group.
We do this by using the binding scope as an additional separator between
groups. I.e., a member must ignore all membership events from sockets
using a different scope than itself, and all lookups for message
destinations must require an exact match between the message's lookup
scope and the potential target's binding scope.
Apart from making it possible to create local groups using the same
identity on different nodes, a side effect of this is that it now also
becomes possible to create a cluster global group with the same identity
across the same nodes, without interfering with the local groups.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ability to set speed and duplex for virtio_net is useful in various
scenarios as described here:
16032be virtio_net: add ethtool support for set and get of settings
However, it would be nice to be able to set this from the hypervisor,
such that virtio_net doesn't require custom guest ethtool commands.
Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
the hypervisor to export a linkspeed and duplex setting. The user can
subsequently overwrite it later if desired via: 'ethtool -s'.
Note that VIRTIO_NET_F_SPEED_DUPLEX is defined as bit 63, the intention
is that device feature bits are to grow down from bit 63, since the
transports are starting from bit 24 and growing up.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for the GCM-AES-256 cipher suite as specified in
IEEE 802.1AEbn-2011. The prepared cipher suite selection mechanism is used,
with GCM-AES-128 being the default cipher suite as defined in the standard.
Signed-off-by: Felix Walter <felix.walter@cloudandheat.com>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
USB_DT_USB_SSP_CAP_SIZE(ssac) gives the size of the SSP capability
descriptor. The descriptor consists of 12 bytes plus a array of
SSA entries.
The number of SSA entries is stored in a SSAC value in the first 12 bytes,
The USB3.1 specification 9.6.2.5 defines SSAC as zero based:
"The number of Sublink Speed Attributes = SSAC + 1." This is not
intuitive and has already caused some confusion.
Make a small modifiaction to the USB_DT_USB_SSP_CAP_SIZE(ssac)
definition to make it a bit clearer
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Modern hardware can decide to drop packets going to/from a VF.
Add receive and transmit drop counters to be displayed at hypervisor
layer in iproute2 per VF statistics.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for your
net-next tree:
1) Free hooks via call_rcu to speed up netns release path, from
Florian Westphal.
2) Reduce memory footprint of hook arrays, skip allocation if family is
not present - useful in case decnet support is not compiled built-in.
Patches from Florian Westphal.
3) Remove defensive check for malformed IPv4 - including ihl field - and
IPv6 headers in x_tables and nf_tables.
4) Add generic flow table offload infrastructure for nf_tables, this
includes the netlink control plane and support for IPv4, IPv6 and
mixed IPv4/IPv6 dataplanes. This comes with NAT support too. This
patchset adds the IPS_OFFLOAD conntrack status bit to indicate that
this flow has been offloaded.
5) Add secpath matching support for nf_tables, from Florian.
6) Save some code bytes in the fast path for the nf_tables netdev,
bridge and inet families.
7) Allow one single NAT hook per point and do not allow to register NAT
hooks in nf_tables before the conntrack hook, patches from Florian.
8) Seven patches to remove the struct nf_af_info abstraction, instead
we perform direct calls for IPv4 which is faster. IPv6 indirections
are still needed to avoid dependencies with the 'ipv6' module, but
these now reside in struct nf_ipv6_ops.
9) Seven patches to handle NFPROTO_INET from the Netfilter core,
hence we can remove specific code in nf_tables to handle this
pseudofamily.
10) No need for synchronize_net() call for nf_queue after conversion
to hook arrays. Also from Florian.
11) Call cond_resched_rcu() when dumping large sets in ipset to avoid
softlockup. Again from Florian.
12) Pass lockdep_nfnl_is_held() to rcu_dereference_protected(), patch
from Florian Westphal.
13) Fix matching of counters in ipset, from Jozsef Kadlecsik.
14) Missing nfnl lock protection in the ip_set_net_exit path, also
from Jozsef.
15) Move connlimit code that we can reuse from nf_tables into
nf_conncount, from Florian Westhal.
And asorted cleanups:
16) Get rid of nft_dereference(), it only has one single caller.
17) Add nft_set_is_anonymous() helper function.
18) Remove NF_ARP_FORWARD leftover chain definition in nf_tables_arp.
19) Remove unnecessary comments in nf_conntrack_h323_asn1.c
From Varsha Rao.
20) Remove useless parameters in frag_safe_skb_hp(), from Gao Feng.
21) Constify layer 4 conntrack protocol definitions, function
parameters to register/unregister these protocol trackers, and
timeouts. Patches from Florian Westphal.
22) Remove nlattr_size indirection, from Florian Westphal.
23) Add fall-through comments as -Wimplicit-fallthrough needs this,
from Gustavo A. R. Silva.
24) Use swap() macro to exchange values in ipset, patch from
Gustavo A. R. Silva.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The "offset" option has been removed by
commit 900631ee6a ("l2tp: remove configurable payload offset").
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new instruction for the nf_tables VM that allows us to specify what
flows are offloaded into a given flow table via name. This new
instruction creates the flow entry and adds it to the flow table.
Only established flows, ie. we have seen traffic in both directions, are
added to the flow table. You can still decide to offload entries at a
later stage via packet counting or checking the ct status in case you
want to offload assured conntracks.
This new extension depends on the conntrack subsystem.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch introduces a netlink control plane to create, delete and dump
flow tables. Flow tables are identified by name, this name is used from
rules to refer to an specific flow table. Flow tables use the rhashtable
class and a generic garbage collector to remove expired entries.
This also adds the infrastructure to add different flow table types, so
we can add one for each layer 3 protocol family.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This new bit tells us that the conntrack entry is owned by the flow
table offload infrastructure.
# cat /proc/net/nf_conntrack
ipv4 2 tcp 6 src=10.141.10.2 dst=147.75.205.195 sport=36392 dport=443 src=147.75.205.195 dst=192.168.2.195 sport=443 dport=36392 [OFFLOAD] mark=0 zone=0 use=2
Note the [OFFLOAD] tag in the listing.
The timer of such conntrack entries look like stopped from userspace.
In practise, to make sure the conntrack entry does not go away, the
conntrack timer is periodically set to an arbitrary large value that
gets refreshed on every iteration from the garbage collector, so it
never expires- and they display no internal state in the case of TCP
flows. This allows us to save a bitcheck from the packet path via
nf_ct_is_expired().
Conntrack entries that have been offloaded to the flow table
infrastructure cannot be deleted/flushed via ctnetlink. The flow table
infrastructure is also responsible for releasing this conntrack entry.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This allows to reuse xt_connlimit infrastructure from nf_tables.
The upcoming nf_tables frontend can just pass in an nftables register
as input key, this allows limiting by any nft-supported key, including
concatenations.
For xt_connlimit, pass in the zone and the ip/ipv6 address.
With help from Yi-Hung Wei.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The kernel already has defines for this, but they are in uapi exposed
headers.
Including these from netns.h causes build errors and also adds unneeded
dependencies on heads that we don't need.
So move these defines to netfilter_defs.h and place the uapi ones
in ifndef __KERNEL__ to keep them for userspace.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The perf_event_header::misc bit 13 is shared on different events and
next patch is adding yet another bit 13 user. Updating the comment to
make it more structured and clear which events use bit 13.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-8-jolsa@kernel.org
[ Update the tools/include/uapi/linux copy ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-01-07
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add a start of a framework for extending struct xdp_buff without
having the overhead of populating every data at runtime. Idea
is to have a new per-queue struct xdp_rxq_info that holds read
mostly data (currently that is, queue number and a pointer to
the corresponding netdev) which is set up during rxqueue config
time. When a XDP program is invoked, struct xdp_buff holds a
pointer to struct xdp_rxq_info that the BPF program can then
walk. The user facing BPF program that uses struct xdp_md for
context can use these members directly, and the verifier rewrites
context access transparently by walking the xdp_rxq_info and
net_device pointers to load the data, from Jesper.
2) Redo the reporting of offload device information to user space
such that it works in combination with network namespaces. The
latter is reported through a device/inode tuple as similarly
done in other subsystems as well (e.g. perf) in order to identify
the namespace. For this to work, ns_get_path() has been generalized
such that the namespace can be retrieved not only from a specific
task (perf case), but also from a callback where we deduce the
netns (ns_common) from a netdevice. bpftool support using the new
uapi info and extensive test cases for test_offload.py in BPF
selftests have been added as well, from Jakub.
3) Add two bpftool improvements: i) properly report the bpftool
version such that it corresponds to the version from the kernel
source tree. So pick the right linux/version.h from the source
tree instead of the installed one. ii) fix bpftool and also
bpf_jit_disasm build with bintutils >= 2.9. The reason for the
build breakage is that binutils library changed the function
signature to select the disassembler. Given this is needed in
multiple tools, add a proper feature detection to the
tools/build/features infrastructure, from Roman.
4) Implement the BPF syscall command BPF_MAP_GET_NEXT_KEY for the
stacktrace map. It is currently unimplemented, but there are
use cases where user space needs to walk all stacktrace map
entries e.g. for dumping or deleting map entries w/o having to
close and recreate the map. Add BPF selftests along with it,
from Yonghong.
5) Few follow-up cleanups for the bpftool cgroup code: i) rename
the cgroup 'list' command into 'show' as we have it for other
subcommands as well, ii) then alias the 'show' command such that
'list' is accepted which is also common practice in iproute2,
and iii) remove couple of newlines from error messages using
p_err(), from Jakub.
6) Two follow-up cleanups to sockmap code: i) remove the unused
bpf_compute_data_end_sk_skb() function and ii) only build the
sockmap infrastructure when CONFIG_INET is enabled since it's
only aware of TCP sockets at this time, from John.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Now all XDP driver have been updated to setup xdp_rxq_info and assign
this to xdp_buff->rxq. Thus, it is now safe to enable access to some
of the xdp_rxq_info struct members.
This patch extend xdp_md and expose UAPI to userspace for
ingress_ifindex and rx_queue_index. Access happens via bpf
instruction rewrite, that load data directly from struct xdp_rxq_info.
* ingress_ifindex map to xdp_rxq_info->dev->ifindex
* rx_queue_index map to xdp_rxq_info->queue_index
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Revert commit f15bc54eee ("l2tp: add peer_offset parameter"). This
is removed because it is adding another configurable offset and
configurable offsets are being removed.
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow to set the over-provision percentage on target creation. In case
that the value is not provided, fall back to the default value set by
the target.
In pblk, set the default OP to 11% of the total size of the device
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Fix chain filtering when dumping rules via nf_tables_dump_rules().
2) Fix accidental change in NF_CT_STATE_UNTRACKED_BIT through uapi,
introduced when removing the untracked conntrack object, from
Florian Westphal.
3) Fix potential nul-dereference when releasing dump filter in
nf_tables_dump_obj_done(), patch from Hangbin Liu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
One thing is notable: I applied two patches and later
reverted them - we'll get back to that once all the driver
situation is sorted out.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlpOQXUACgkQB8qZga/f
l8Qpuw/+OclRyelfxh1v1xwFYDUAJZZmU9wr/Yx/ezZ8NoebA5bfJSXV/s+Tgw5E
oORx7LUkbxwreQtoEHtc9/IE7SCfXrB8kWoy5A/Q094SDglWOiQbRuYQ0gn4pMkV
zukm4O3+cHHGj1slnSOzQWNeF/5mbNwEMo5Id5ZnSjMfoPl+CWH8qvfu4oRFhmiG
tZ0gIGARX9FL3v+RyqEhugTxfCzAYRTinGQhG4r6LlkgCqTnza7VhG+3N+fPMkjS
4Rs9ucnMnunrbd9lbbpTb+vWAJ+McJfVw/Gtmjp/W8vyxZFEr0EHiY31btmMAhTO
ibZVZYCslL3WM2vIxxy0nGR6O28eCRzU4ETSOrInv4ZooplvmFHVnjms1hqiSaZO
4qy8Yb8cPrIPTcI3OYWvicBAAcHLqlEw8GC4rltf2bw6a0FdJ3igitFy9MPFhxBW
OZ0YS+exHAb9lBbk49qOM0Bqu7ug5MUTygX9RGTeWB0sRDmc5OQVsAqvfaapGts9
u+huQzO2Y1b8IDVAL/tTOoDz6A1Qc/S2BFDNilFKVeGOhB35jFA3BN4vJzmpp9Oy
cz8150ls6BbfHjrFiuHlQWwaoG6GTebSln9XnEqNXfh5GFj1H/FYTQOv4rIaIjrY
wdirSv6UopaRjnBSb062glmb9ZFHQEKBWDvRC7jTRaXMRTBQ2zo=
=M0gz
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2018-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
We have things all over the place, no point listing them.
One thing is notable: I applied two patches and later
reverted them - we'll get back to that once all the driver
situation is sorted out.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Musl provides its own ethhdr struct definition. Add a guard to prevent
its definition of the appropriate musl header has already been included.
glibc does not implement this header, but when glibc will implement this
they can just define __UAPI_DEF_ETHHDR 0 to make it work with the
kernel.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a lot of places where sequences of space/tabs are
found. Get rid of all spaces before tabs.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Some UVC video cameras contain metadata in their payload headers. This
patch extracts that data, adding more clock synchronisation information,
on both bulk and isochronous endpoints and makes it available to the user
space on a separate video node, using the V4L2_CAP_META_CAPTURE capability
and the V4L2_BUF_TYPE_META_CAPTURE buffer queue type. By default, only the
V4L2_META_FMT_UVC pixel format is available from those nodes. However,
cameras can be added to the device ID table to additionally specify their
own metadata format, in which case that format will also become available
from the metadata node.
[Use put_unaligned instead of __put_unaligned_cpu64]
[Use put_unaligned for the sof field as well]
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Add a pixel format, used by the UVC driver to stream metadata.
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAlpM+BAACgkQEacuoBRx
13KHtxAAj8uQ/ZEMz5vPQd4kQ3pn2a5Ts/yrAc8PPfk8SrBvMVk7KWNTRmbQgF2d
EcL0Xs8AtiOYdSd1KmEQrkHtjJsJxRW27IsDGHJuoEc3aBrmAStgGvUUFROncZWU
fHUm2Hy6Vw46Bx7HzxelmN5vjFHIZFdnx0sa6EfLL8Pui4VUVga7u9c8ezGL71MC
TnXUCl6roWsrQhxoEWwX+HnzT0OI/Q+eQdvycWVyAOJDNga0F7+5KFhP6G1IOzsR
QaUoowydRlSpot6dCRHeZ8+0aGtLXcZsMmQrg735ngdVT7ROZtY6dpXEpyn/KQHS
ysFtaxJ8PPiC6PY0sXmztWBgZMLTph555Uqp3PSIlpuP5qFMqiTi8BoCv+UvXwA1
QAyPd7QUrefLrhFb21v3RlDNdalMAU8NGKh8nEkaP664HJjw4TvRlWOcyxsmcsOI
7szGsF/hsqQeaVrVsHQbF+sEy16u98VPIGA1k+KCtY0334vsMKOtfndXnS2OeR5l
tJncNkfYD/0PeG479Wq4T42NOe6w90JmDKzWnLuDcY/zCJ7/AjJPp95AwyK/HPDc
F5GcwC1WmcbcYkuKCG9c2nMxidpH8QZ5sei1TWey6lM4lAAHnQrXhohQw4zJcjst
Q4ZRnXWGizSWOSTVQHVlmdUcPs/DlKPyDlcoTSEOD6JDhvl6AEY=
=C2BD
-----END PGP SIGNATURE-----
Merge tag 'at24-4.16-updates-for-wolfram' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-4.16
"AT24 updates for 4.16 merge window
The driver has been converted to using regmap instead of raw i2c and
smbus calls which shrank the code significantly.
Device tree binding document has been cleaned up. Device tree support in
the driver has been improved and we now support all at24 models as well
as two new DT properties (no-read-rollover and wp-gpios).
We no longer user unreadable magic values for driver data as the way it
was implemented caused problems for some EEPROM models - we switched to
regular structs.
Aside from that, there's a bunch of coding style fixes and minor
improvements all over the place."
libc-compat.h aims to prevent symbol collisions between uapi and libc
headers for each supported libc. This requires continuous coordination
between them.
The goal of this commit is to improve the situation for libcs (such as
musl) which are not yet supported and/or do not wish to be explicitly
supported, while not affecting supported libcs. More precisely, with
this commit, unsupported libcs can request the suppression of any
specific uapi definition by defining the correspondings _UAPI_DEF_*
macro as 0. This can fix symbol collisions for them, as long as the
libc headers are included before the uapi headers. Inclusion in the
other order is outside the scope of this commit.
All infrastructure in order to enable this fallback for unsupported
libcs is already in place, except that libc-compat.h unconditionally
defines all _UAPI_DEF_* macros to 1 for all unsupported libcs so that
any previous definitions are ignored. In order to fix this, this commit
merely makes these definitions conditional.
This commit together with the musl libc commit
http://git.musl-libc.org/cgit/musl/commit/?id=04983f2272382af92eb8f8838964ff944fbb8258
fixes for example the following compiler errors when <linux/in6.h> is
included after musl's <netinet/in.h>:
./linux/in6.h:32:8: error: redefinition of 'struct in6_addr'
./linux/in6.h:49:8: error: redefinition of 'struct sockaddr_in6'
./linux/in6.h:59:8: error: redefinition of 'struct ipv6_mreq'
The comments referencing glibc are still correct, but this file is not
only used for glibc any more.
Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaSWkXAAoJEHm+PkMAQRiGZ/wH/j4H3KWPOmjRPryIrJVC8f+7
LGm8RVko+J2MJzeBY6EScomPQDGLrUxy4CbEfrLs3yFaryXYb6fCCdoV+h5tSUMY
VDsa04u/SQ8T6KsnOwQApk1h06vQRVpXiCATkAxNni/T+GMoYWIwcDPyXN4oJYv8
mOWB9TS0Hb/mfCVyUrxjNkCP39lJTUGcJQSc3yxV6v9ZziVReAHYcRt3KkknDL2j
byNXZEMgwAwzyoZJfuSNkCF3DY4rcX5UnQCGmh7M6AAY9XQpMcESmW/HtVYPz2Js
9+4q52LKIMviPZQS7/fOj9uSA7IsJpPIWb9rMIvlWbuCxEaSREPLRC+yYowKOP0=
=tEHk
-----END PGP SIGNATURE-----
Merge tag 'v4.15-rc6' into patchwork
Linux 4.15-rc6
* tag 'v4.15-rc6': (734 commits)
Linux 4.15-rc6
MAINTAINERS: mark arch/blackfin/ and its gubbins as orphaned
x86/ldt: Make LDT pgtable free conditional
x86/ldt: Plug memory leak in error path
x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
x86/smpboot: Remove stale TLB flush invocations
objtool: Fix seg fault with clang-compiled objects
objtool: Fix seg fault caused by missing parameter
kbuild: add '-fno-stack-check' to kernel build options
timerqueue: Document return values of timerqueue_add/del()
timers: Invoke timer_start_debug() where it makes sense
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
timers: Reinitialize per cpu bases on hotplug
timers: Use deferrable base independent of base::nohz_active
genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
x86/vector: Use IRQD_CAN_RESERVE flag
genirq: Introduce IRQD_CAN_RESERVE flag
genirq/msi: Handle reactivation only on success
gpio: brcmstb: Make really use of the new lockdep class
...
inet_diag currently provides less/greater than or equal operators for
comparing ports when filtering sockets. An equal comparison can be
performed by combining the two existing operators, or a user can for
example request a port range and then do the final filtering in
userspace. However, these approaches both have drawbacks. Implementing
equal using LE/GE causes the size and complexity of a filter to grow
quickly as the number of ports increase, while it on busy machines would
be great if the kernel only returns information about relevant sockets.
This patch introduces source and destination port equal operators.
INET_DIAG_BC_S_EQ is used to match a source port, INET_DIAG_BC_D_EQ a
destination port, and usage is the same as for the existing port
operators. I.e., the port to match is stored in the no-member of the
next inet_diag_bc_op-struct in the filter.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Report to the user ifindex and namespace information of offloaded
programs. If device has disappeared return -ENODEV. Specify the
namespace using dev/inode combination.
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The "reserved" field was a way, used at V4L2 API, to add new
data to existing structs without breaking userspace. However,
there are now clever ways of doing that, without needing to add
an uneeded overhead. So, get rid of them.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Adds a new uAPI for DVB to use streaming I/O which is implemented
based on videobuf2, using those new ioctls:
- DMX_REQBUFS: Request kernel to allocate buffers which count and size
are dedicated by user.
- DMX_QUERYBUF: Get the buffer information like a memory offset which
will mmap() and be shared with user-space.
- DMX_EXPBUF: Just for testing whether buffer-exporting success or not.
- DMX_QBUF: Pass the buffer to kernel-space.
- DMX_DQBUF: Get back the buffer which may contain TS data.
Originally developed by: Junghak Sung <jh1009.sung@samsung.com>, as
seen at:
https://patchwork.linuxtv.org/patch/31613/https://patchwork.kernel.org/patch/7334301/
The original patch was written before merging VB2-core functionalities
upstream. When such series was added, several adjustments were made,
fixing some issues with V4L2, causing the original patch to be
non-trivially rebased.
After rebased, a few bugs in the patch were fixed. The patch was
also enhanced it and polling functionality got added.
The main changes over the original patch are:
dvb_vb2_fill_buffer():
- Set the size of the outgoing buffer after while loop using
vb2_set_plane_payload;
- Added NULL check for source buffer as per normal convention
of demux driver, this is called twice, first time with valid
buffer second time with NULL pointer, if its not handled,
it will result in crash
- Restricted spinlock for only list_* operations
dvb_vb2_init():
- Restricted q->io_modes to only VB2_MMAP as its the only
supported mode
dvb_vb2_release():
- Replaced the && in if condiion with &, because otherwise
it was always getting satisfied.
dvb_vb2_stream_off():
- Added list_del code for enqueud buffers upon stream off
dvb_vb2_poll():
- Added this new function in order to support polling
dvb_demux_poll() and dvb_dvr_poll()
- dvb_vb2_poll() is now called from these functions
- Ported this patch and latest videobuf2 to lower kernel versions and
tested auto scan.
Co-developed-by: Junghak Sung <jh1009.sung@samsung.com>
[mchehab@s-opensource.com: checkpatch fixes]
Signed-off-by: Junghak Sung <jh1009.sung@samsung.com>
Signed-off-by: Geunyoung Kim <nenggun.kim@samsung.com>
Acked-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Daniel Borkmann says:
====================
pull-request: bpf-next 2017-12-28
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Fix incorrect state pruning related to recognition of zero initialized
stack slots, where stacksafe exploration would mistakenly return a
positive pruning verdict too early ignoring other slots, from Gianluca.
2) Various BPF to BPF calls related follow-up fixes. Fix an off-by-one
in maximum call depth check, and rework maximum stack depth tracking
logic to fix a bypass of the total stack size check reported by Jann.
Also fix a bug in arm64 JIT where prog->jited_len was uninitialized.
Addition of various test cases to BPF selftests, from Alexei.
3) Addition of a BPF selftest to test_verifier that is related to BPF to
BPF calls which demonstrates a late caller stack size increase and
thus out of bounds access. Fixed above in 2). Test case from Jann.
4) Addition of correlating BPF helper calls, BPF to BPF calls as well
as BPF maps to bpftool xlated dump in order to allow for better
BPF program introspection and debugging, from Daniel.
5) Fixing several bugs in BPF to BPF calls kallsyms handling in order
to get it actually to work for subprogs, from Daniel.
6) Extending sparc64 JIT support for BPF to BPF calls and fix a couple
of build errors for libbpf on sparc64, from David.
7) Allow narrower context access for BPF dev cgroup typed programs in
order to adapt to LLVM code generation. Also adjust memlock rlimit
in the test_dev_cgroup BPF selftest, from Yonghong.
8) Add netdevsim Kconfig entry to BPF selftests since test_offload.py
relies on netdevsim device being available, from Jakub.
9) Reduce scope of xdp_do_generic_redirect_map() to being static,
from Xiongwei.
10) Minor cleanups and spelling fixes in BPF verifier, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce peer_offset parameter in order to add the capability
to specify two different values for payload offset on tx/rx side.
If just offset is provided by userspace use it for rx side as well
in order to maintain compatibility with older l2tp versions
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The header file is used by different userspace programs to inject packets
or to decode sniffed packets. It should therefore be available to them as
userspace header.
Also other components in the kernel (like the flow dissector) require
access to the packet definitions to be able to decode ETH_P_BATMAN ethernet
packets.
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
subsystem as a whole and in OP-TEE in particular.
Global Platform TEE specification [1] allows client applications
to register part of own memory as a shared buffer between
application and TEE. This allows fast zero-copy communication between
TEE and REE. But current implementation of TEE in Linux does not support
this feature.
Also, current implementation of OP-TEE transport uses fixed size
pre-shared buffer for all communications with OP-TEE OS. This is okay
in the most use cases. But this prevents use of OP-TEE in virtualized
environments, because:
a) We can't share the same buffer between different virtual machines
b) Physically contiguous memory as seen by VM can be non-contiguous
in reality (and as seen by OP-TEE OS) due to second stage of
MMU translation.
c) Size of this pre-shared buffer is limited.
So, first part of this pull request adds generic register/unregister
interface to tee subsystem. The second part adds necessary features into
OP-TEE driver, so it can use not only static pre-shared buffer, but
whole RAM to communicate with OP-TEE OS.
This change is backwards compatible allowing older secure world or
user space to work with newer kernels and vice versa.
[1] https://www.globalplatform.org/specificationsdevice.asp
-----BEGIN PGP SIGNATURE-----
iQI3BAABCgAhBQJaM8X7GhxqZW5zLndpa2xhbmRlckBsaW5hcm8ub3JnAAoJELWw
uEGXj+zThYsQAMPsMwvV977gLCnFxSZuIh1qnK5sXabpe4ITVOaUaxyCIoKAcROX
exFdo1l+4UrOaEA9o06IROnHczCEz7IvGcPVYCB13tHwyfPsuicrdM0b/hm2Mehx
MGYDsm3ZjnUTcZxGMNHYvCunNi84Rt1yOC8Mdx4kPhCI8ZCDqb9pV/Bb5wNLnkXS
lXP/+EAkF0ECj88JUhgunkvL96QyK/PROCNUMWansB1RwglvyWy7IS/r03BW9Cpi
4Mtiywmj/KZO9To4LvWhPiX5xvdxe+VxXUD6BW9hVVOxmXGSTEwr9YYr0f7qWH5q
HeTLzkOsRQ+uHkaSLZOJ1HkIsP0sYQ7tR6OaipAEMJIN87ktGr45uuxaMnJCV1Z/
tiKkGKJq9VISa7LA0Fv3nLhfYo8/jHiV/dV77FTreHhWimtVl3aiIkon+P/VSA7W
Qstkq/v+djZXSmJ+dAcaRdukufWLUB4xhl27isnmaVjToFUHJH36wM9smtgXFygv
DL8+5UBgsWPOlpJkIsTD/dwiQK+CeG4/SASgfe5DV7GVh+Z+71E2V40UQ9JoUROa
Y33tPFWg07gG3cHAZYugKG2ucf4Yy3GXh5xZnjIq0Ye1U3/TnbK543V1y2N45vx0
xBWJFFh2blKD04QPynBFqKPKNc5d//OgeK3m4PBTYk2GoGIvnc5YxPTq
=3iwl
-----END PGP SIGNATURE-----
Merge tag 'tee-drv-dynamic-shm-for-v4.16' of https://git.linaro.org/people/jens.wiklander/linux-tee into next/drivers
Pull "tee dynamic shm for v4.16" from Jens Wiklander:
This pull request enables dynamic shared memory support in the TEE
subsystem as a whole and in OP-TEE in particular.
Global Platform TEE specification [1] allows client applications
to register part of own memory as a shared buffer between
application and TEE. This allows fast zero-copy communication between
TEE and REE. But current implementation of TEE in Linux does not support
this feature.
Also, current implementation of OP-TEE transport uses fixed size
pre-shared buffer for all communications with OP-TEE OS. This is okay
in the most use cases. But this prevents use of OP-TEE in virtualized
environments, because:
a) We can't share the same buffer between different virtual machines
b) Physically contiguous memory as seen by VM can be non-contiguous
in reality (and as seen by OP-TEE OS) due to second stage of
MMU translation.
c) Size of this pre-shared buffer is limited.
So, first part of this pull request adds generic register/unregister
interface to tee subsystem. The second part adds necessary features into
OP-TEE driver, so it can use not only static pre-shared buffer, but
whole RAM to communicate with OP-TEE OS.
This change is backwards compatible allowing older secure world or
user space to work with newer kernels and vice versa.
[1] https://www.globalplatform.org/specificationsdevice.asp
* tag 'tee-drv-dynamic-shm-for-v4.16' of https://git.linaro.org/people/jens.wiklander/linux-tee:
tee: shm: inline tee_shm_get_id()
tee: use reference counting for tee_context
tee: optee: enable dynamic SHM support
tee: optee: add optee-specific shared pool implementation
tee: optee: store OP-TEE capabilities in private data
tee: optee: add registered buffers handling into RPC calls
tee: optee: add registered shared parameters handling
tee: optee: add shared buffer registration functions
tee: optee: add page list manipulation functions
tee: optee: Update protocol definitions
tee: shm: add page accessor functions
tee: shm: add accessors for buffer size and page offset
tee: add register user memory
tee: flexible shared memory pool creation
This pull request enables asynchronous communication with TEE supplicant
by introducing meta parameters in the user space API. The meta
parameters can be used to tag requests with an id that can be matched
against an asynchronous response as is done here in the OP-TEE driver.
Asynchronous supplicant communication is needed by OP-TEE to implement
GlobalPlatforms TEE Sockets API Specification v1.0.1. The specification
is available at https://www.globalplatform.org/specificationsdevice.asp.
This change is backwards compatible allowing older supplicants to work
with newer kernels and vice versa.
-----BEGIN PGP SIGNATURE-----
iQI3BAABCgAhBQJaHoIkGhxqZW5zLndpa2xhbmRlckBsaW5hcm8ub3JnAAoJELWw
uEGXj+zTdtoP/2/Og5YP0Xad0oRINgWJO/YnXT+kTHTHf35RuFA7LFbHdMNl4ZYx
h7vTiHdgA1Bper+2ljLZdu2R36/za6LLVnof4p9WPDLueSJgH71jDMC3jVO0r1C7
SydYuCUtcdNTNObSSy3SJVdMZtCyXA+p3Z1VLdP9E3fSVYg/36JH9clqkB/LrTC+
NTC1w/g3cBcNIGj6kToPDwyXtKOk/9fMK7e4ylinscX0LyDmWTZyYRGrzB2gaek0
Xk1fSEprX5jor99GdU1WtYuO/6hk1T1LglD/L0ROO++7RQYZiAyO6Lgfyi6mvlMi
3oxuZRI3JZRR5hdzBzwxTpQGVVxj8ukYD/PFX0pBbw1XA9W9+gK7Du5OLmp4WZ+b
t2Eg18pWivcRCmglpoS+VsirQkXT36ElCpHAK0QZ4gB2gVPFvNBjlFPD4xa5FsLw
urO35ZLlj7b9VuEQQSJjif5FDc3sCyBlRuFQ1XCMXOeBULel8gnuWwyJRwwYhxIJ
/moFkylFqOwewZui1oC1gmXCo/ud4xIw79+8HPN7ISves/bfJtDaMRK+YAFbPshl
MbYCqRsr+B5o0MrFZbx+ofUNeJeh3Wwk/zdYoxTfRDGJ9ILBixBKCtPEmjX2MJGi
Y66jiw73Yo0RsVTS3IO6sxFxsBi1mfGlpkUg4/pclr8Ijw71lc5msHkv
=jsDo
-----END PGP SIGNATURE-----
Merge tag 'tee-drv-async-supplicant-for-v4.16' of https://git.linaro.org/people/jens.wiklander/linux-tee into next/drivers
Pull "Enable async communication with tee supplicant" from Jens Wiklander:
This pull request enables asynchronous communication with TEE supplicant
by introducing meta parameters in the user space API. The meta
parameters can be used to tag requests with an id that can be matched
against an asynchronous response as is done here in the OP-TEE driver.
Asynchronous supplicant communication is needed by OP-TEE to implement
GlobalPlatforms TEE Sockets API Specification v1.0.1. The specification
is available at https://www.globalplatform.org/specificationsdevice.asp.
This change is backwards compatible allowing older supplicants to work
with newer kernels and vice versa.
* tag 'tee-drv-async-supplicant-for-v4.16' of https://git.linaro.org/people/jens.wiklander/linux-tee:
optee: support asynchronous supplicant requests
tee: add TEE_IOCTL_PARAM_ATTR_META
tee: add tee_param_is_memref() for driver use
nft_ct exposes this bit to userspace. This used to be
#define NF_CT_STATE_UNTRACKED_BIT (1 << (IP_CT_NUMBER + 1))
(IP_CT_NUMBER is 5, so this was 0x40)
.. but this got changed to 8 (0x100) when the untracked object got removed.
Replace this with a literal 6 to prevent further incompatible changes
in case IP_CT_NUMBER ever increases.
Fixes: cc41c84b7e ("netfilter: kill the fake untracked conntrack objects")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
- bump version strings, by Simon Wunderlich
- de-inline hash functions to save memory footprint, by Denys Vlasenko
- Add License information to various files, by Sven Eckelmann (3 patches)
- Change batman_adv.h from ISC to MIT, by Sven Eckelmann
- Improve various includes, by Sven Eckelmann (5 patches)
- Lots of kernel-doc work by Sven Eckelmann (8 patches)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlo6QfgWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeobWfEADPEOdxWS7nW4Xkhug+7vLbcloJ
Om1VDKFD4n5NfB6e+vh8kQAnnnQ/LzFSv53giNdnjjE9IPNKxNhzBQFS95H189EP
ebP0mKOesTadkTx+MjFxenhnaTnzK0hkngdxz/frvaq+i6ECMhnq8Bw0elVG0nSg
X9ts0x6BuNIw6EjIdPP0GvfOV1DvUmdMz1YLJy4yoJ8Tm671y86y07jTJaaEN2Ex
9dp0j3hEqtrYZvgRdQ/hzYLFJ9fvcF1FyA+duufK3FNbkJtZn89zqseIuN2saqqw
QN/nDBrduzW6SR9y0JfXlatI6FN6316jskLcpqorz5/88KrwQeTOg2ZXXn0iw6l1
r0tkBP5/eEu2Dcd3WRsNtTnMZmGfc2uuqmvD1Pz0wN7RAzIYhPvs9to6TVy3mK5p
0OyFJaZU9vurtIPqVYjNtSofkUHKMOZZ1H7LocWaINGIVsREx/i0DmX//M0ZiqbB
mS4ybkn8yOYfFUizg51RYo2nKVyw/ZXNqBYBPUWio91CPj0vOUFitvfxnxNitV/m
182HVdoOnGhorYbS3J/8Su3AyEyhJTVWnmq0z3u1CuvdMhppbAzvXvmERotAJN9e
5Wp26PE2a7zb4LJhMBQ4q+RnUnZV5ADhigrIEGPOQoY5KdIrEOcW65wrgfhWYlER
NVJc2okVZKhg3o7amA==
=5JUK
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20171220' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- de-inline hash functions to save memory footprint, by Denys Vlasenko
- Add License information to various files, by Sven Eckelmann (3 patches)
- Change batman_adv.h from ISC to MIT, by Sven Eckelmann
- Improve various includes, by Sven Eckelmann (5 patches)
- Lots of kernel-doc work by Sven Eckelmann (8 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
By default VFIO disables mapping of MSIX BAR to the userspace as
the userspace may program it in a way allowing spurious interrupts;
instead the userspace uses the VFIO_DEVICE_SET_IRQS ioctl.
In order to eliminate guessing from the userspace about what is
mmapable, VFIO also advertises a sparse list of regions allowed to mmap.
This works fine as long as the system page size equals to the MSIX
alignment requirement which is 4KB. However with a bigger page size
the existing code prohibits mapping non-MSIX parts of a page with MSIX
structures so these parts have to be emulated via slow reads/writes on
a VFIO device fd. If these emulated bits are accessed often, this has
serious impact on performance.
This allows mmap of the entire BAR containing MSIX vector table.
This removes the sparse capability for PCI devices as it becomes useless.
As the userspace needs to know for sure whether mmapping of the MSIX
vector containing data can succeed, this adds a new capability -
VFIO_REGION_INFO_CAP_MSIX_MAPPABLE - which explicitly tells the userspace
that the entire BAR can be mmapped.
This does not touch the MSIX mangling in the BAR read/write handlers as
we are doing this just to enable direct access to non MSIX registers.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw - fixup whitespace, trim function name]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This commit adds a new property DTV_SCRAMBLING_SEQUENCE_INDEX.
This 18 bit field, when present, carries the index of the DVB-S2 physical
layer scrambling sequence as defined in clause 5.5.4 of EN 302 307.
There is no explicit signalling method to convey scrambling sequence index
to the receiver. If S2 satellite delivery system descriptor is available
it can be used to read the scrambling sequence index (EN 300 468 table 41).
By default, gold scrambling sequence index 0 is used. The valid scrambling
sequence index range is from 0 to 262142.
Increase the DVB API version in order userspace to be aware of the changes.
Signed-off-by: Athanasios Oikonomou <athoik@gmail.com>
Acked-by: Ralph Metzler <rjkm@metzlerbros.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This commit enhances the scan results to report the per chain signal
strength based on the latest BSS update. This provides similar
information to what is already available through STA information.
Signed-off-by: Sunil Dutt <usdutt@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add a new event type that is newly exposed by recent firmware. The event
will never occur if the firmware is too old. If user space tries to use
this event in an older kernel, it will just get an EINVAL which is
perfectly acceptable in the existing user space code.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The tools/testing/selftests/bpf test program
test_dev_cgroup fails with the following error
when compiled with llvm 6.0. (I did not try
with earlier versions.)
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
0: (61) r2 = *(u32 *)(r1 +4)
1: (b7) r0 = 0
2: (55) if r2 != 0x1 goto pc+8
R0=inv0 R1=ctx(id=0,off=0,imm=0) R2=inv1 R10=fp0
3: (69) r2 = *(u16 *)(r1 +0)
invalid bpf_context access off=0 size=2
...
The culprit is the following statement in dev_cgroup.c:
short type = ctx->access_type & 0xFFFF;
This code is typical as the ctx->access_type is assigned
as below in kernel/bpf/cgroup.c:
struct bpf_cgroup_dev_ctx ctx = {
.access_type = (access << 16) | dev_type,
.major = major,
.minor = minor,
};
The compiler converts it to u16 access while
the verifier cgroup_dev_is_valid_access rejects
any non u32 access.
This patch permits the field access_type to be accessible
with type u16 and u8 as well.
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2017-12-18
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Allow arbitrary function calls from one BPF function to another BPF function.
As of today when writing BPF programs, __always_inline had to be used in
the BPF C programs for all functions, unnecessarily causing LLVM to inflate
code size. Handle this more naturally with support for BPF to BPF calls
such that this __always_inline restriction can be overcome. As a result,
it allows for better optimized code and finally enables to introduce core
BPF libraries in the future that can be reused out of different projects.
x86 and arm64 JIT support was added as well, from Alexei.
2) Add infrastructure for tagging functions as error injectable and allow for
BPF to return arbitrary error values when BPF is attached via kprobes on
those. This way of injecting errors generically eases testing and debugging
without having to recompile or restart the kernel. Tags for opting-in for
this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef.
3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper
call for XDP programs. First part of this work adds handling of BPF
capabilities included in the firmware, and the later patches add support
to the nfp verifier part and JIT as well as some small optimizations,
from Jakub.
4) The bpftool now also gets support for basic cgroup BPF operations such
as attaching, detaching and listing current BPF programs. As a requirement
for the attach part, bpftool can now also load object files through
'bpftool prog load'. This reuses libbpf which we have in the kernel tree
as well. bpftool-cgroup man page is added along with it, from Roman.
5) Back then commit e87c6bc385 ("bpf: permit multiple bpf attachments for
a single perf event") added support for attaching multiple BPF programs
to a single perf event. Given they are configured through perf's ioctl()
interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF
command in this work in order to return an array of one or multiple BPF
prog ids that are currently attached, from Yonghong.
6) Various minor fixes and cleanups to the bpftool's Makefile as well
as a new 'uninstall' and 'doc-uninstall' target for removing bpftool
itself or prior installed documentation related to it, from Quentin.
7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is
required for the test_dev_cgroup test case to run, from Naresh.
8) Fix reporting of XDP prog_flags for nfp driver, from Jakub.
9) Fix libbpf's exit code from the Makefile when libelf was not found in
the system, also from Jakub.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>