Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add Maglev hashing scheduler to IPVS, from Inju Song.
 2) Lots of new TC subsystem tests from Roman Mashak.
 3) Add TCP zero copy receive and fix delayed acks and autotuning with
    SO_RCVLOWAT, from Eric Dumazet.
 4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard Brouer.
 5) Add ttl inherit support to vxlan, from Hangbin Liu.
 6) Properly separate ipv6 routes into their logically independent
    components. fib6_info for the routing table, and fib6_nh for sets of
    nexthops, which thus can be shared. From David Ahern.
 7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
    messages from XDP programs. From Nikita V. Shirokov.
 8) Lots of long overdue cleanups to the r8169 driver, from Heiner Kallweit.
 9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.
10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.
11) Plumb extack down into fib_rules, from Roopa Prabhu.
12) Add Flower classifier offload support to igb, from Vinicius Costa Gomes.
13) Add UDP GSO support, from Willem de Bruijn.
14) Add documentation for eBPF helpers, from Quentin Monnet.
15) Add TLS tx offload to mlx5, from Ilya Lesokhin.
16) Allow applications to be given the number of bytes available to read
    on a socket via a control message returned from recvmsg(), from
    Soheil Hassas Yeganeh.
17) Add x86_32 eBPF JIT compiler, from Wang YanQing.
18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
    From Björn Töpel.
19) Remove indirect load support from all of the BPF JITs and handle
    these operations in the verifier by translating them into native BPF
    instead. From Daniel Borkmann.
20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.
21) Allow XDP programs to do lookups in the main kernel routing tables
    for forwarding. From David Ahern.
22) Allow drivers to store hardware state into an ELF section of kernel
    dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.
23) Various RACK and loss detection improvements in TCP, from Yuchung Cheng.
24) Add TCP SACK compression, from Eric Dumazet.
25) Add User Mode Helper support and basic bpfilter infrastructure, from
    Alexei Starovoitov.
26) Support ports and protocol values in RTM_GETROUTE, from Roopa Prabhu.
27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard Brouer.
28) Add lots of forwarding selftests, from Petr Machata.
29) Add generic network device failover driver, from Sridhar Samudrala.

* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
  strparser: Add __strp_unpause and use it in ktls.
  rxrpc: Fix terminal retransmission connection ID to include the channel
  net: hns3: Optimize PF CMDQ interrupt switching process
  net: hns3: Fix for VF mailbox receiving unknown message
  net: hns3: Fix for VF mailbox cannot receiving PF response
  bnx2x: use the right constant
  Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
  net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
  enic: fix UDP rss bits
  netdev-FAQ: clarify DaveM's position for stable backports
  rtnetlink: validate attributes in do_setlink()
  mlxsw: Add extack messages for port_{un, }split failures
  netdevsim: Add extack error message for devlink reload
  devlink: Add extack to reload and port_{un, }split operations
  net: metrics: add proper netlink validation
  ipmr: fix error path when ipmr_new_table fails
  ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
  net: hns3: remove unused hclgevf_cfg_func_mta_filter
  netfilter: provide udp*_lib_lookup for nf_tproxy
  qed*: Utilize FW 8.37.2.0
  ...
commit 1c8c5a9d38
@ -0,0 +1,36 @@
=================
BPF documentation
=================

This directory contains documentation for the BPF (Berkeley Packet
Filter) facility, with a focus on the extended BPF version (eBPF).

This kernel side documentation is still a work in progress. The main
textual documentation is (for historical reasons) found in
`Documentation/networking/filter.txt`_, which describes both the classic
and the extended BPF instruction sets.
The Cilium project also maintains a `BPF and XDP Reference Guide`_
that goes into great technical depth about the BPF architecture.

The primary info for the bpf syscall is available in the `man-pages`_
for `bpf(2)`_.



Frequently asked questions (FAQ)
================================

Two sets of Questions and Answers (Q&A) are maintained.

* Q&A for common questions about BPF, see: bpf_design_QA_

* Q&A for developers interacting with the BPF subsystem: bpf_devel_QA_


.. Links:
.. _bpf_design_QA: bpf_design_QA.rst
.. _bpf_devel_QA: bpf_devel_QA.rst
.. _Documentation/networking/filter.txt: ../networking/filter.txt
.. _man-pages: https://www.kernel.org/doc/man-pages/
.. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
.. _BPF and XDP Reference Guide: http://cilium.readthedocs.io/en/latest/bpf/
@ -0,0 +1,221 @@
==============
BPF Design Q&A
==============

BPF's extensibility and its applicability to networking, tracing, and security
in the Linux kernel, together with several user space implementations of the BPF
virtual machine, have led to a number of misunderstandings about what BPF actually is.
This short Q&A is an attempt to address that and to outline the direction
in which BPF is heading long term.

.. contents::
    :local:
    :depth: 3

Questions and Answers
=====================

Q: Is BPF a generic instruction set similar to x64 and arm64?
-------------------------------------------------------------
A: NO.

Q: Is BPF a generic virtual machine?
------------------------------------
A: NO.

BPF is a generic instruction set *with* C calling convention.
-------------------------------------------------------------

Q: Why was the C calling convention chosen?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A: Because BPF programs are designed to run in the Linux kernel,
which is written in C. Hence BPF defines an instruction set compatible
with the two most used architectures, x64 and arm64 (and takes into
consideration important quirks of other architectures), and
defines a calling convention that is compatible with the C calling
convention of the Linux kernel on those architectures.

Q: Can multiple return values be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF allows only register R0 to be used as the return value.

Q: Can more than 5 function arguments be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. The BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set
(unlike the x64 ISA, which allows msft, cdecl and other conventions).
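
The register rules in the answers above can be pictured with a small model. This is only an illustrative Python sketch, not kernel code: ``call_helper``, ``sum5`` and the dictionary used as a register file are hypothetical names, and the x86-64 register assignment shown is the one we understand the x86-64 JIT to use, so treat it as an assumption.

```python
# Illustrative model only; the names below are hypothetical, not kernel APIs.
MASK64 = (1 << 64) - 1

# Assumed BPF -> x86-64 assignment for the argument and return registers.
BPF_ARG_TO_X64 = {"R1": "rdi", "R2": "rsi", "R3": "rdx", "R4": "rcx", "R5": "r8"}
BPF_RET_TO_X64 = {"R0": "rax"}


def call_helper(regs, helper):
    """Call `helper` with arguments taken from R1-R5; store the result in R0."""
    args = [regs[f"R{i}"] for i in range(1, 6)]  # at most five arguments
    regs["R0"] = helper(*args) & MASK64          # the single return value
    return regs["R0"]


def sum5(a, b, c, d, e):
    """Stand-in for a kernel helper taking five arguments."""
    return a + b + c + d + e


regs = {f"R{i}": i for i in range(11)}  # R0..R10 preloaded with 0..10
result = call_helper(regs, sum5)
print(result)
```

Because the argument and return registers line up with registers that the native C calling convention already uses, a JIT can emit a plain call without shuffling values around, which is the "zero overhead" property this document relies on.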

Q: Can BPF programs access instruction pointer or return address?
-----------------------------------------------------------------
A: NO.

Q: Can BPF programs access stack pointer?
-----------------------------------------
A: NO.

Only the frame pointer (register R10) is accessible.
From the compiler's point of view it is necessary to have a stack pointer.
For example, LLVM defines register R11 as the stack pointer in its
BPF backend, but it makes sure that generated code never uses it.

Q: Does C-calling convention diminish possible use cases?
---------------------------------------------------------
A: YES.

The BPF design forces the addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps, with
seamless interoperability between them. It lets the kernel call into
BPF programs and programs call kernel helpers with zero overhead,
as if all of them were native C code. That is particularly the case
for JITed BPF programs, which are indistinguishable from
native kernel C code.

Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
------------------------------------------------------------------------
A: Soft yes.

At least for now, until the BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read-only sections, and all other normal constructs
that C code can produce.

Q: Can loops be supported in a safe way?
----------------------------------------
A: It's not clear yet.

BPF developers are trying to find a way to
support bounded loops where the verifier can guarantee that
the program terminates in fewer than 4096 instructions.

Instruction level questions
---------------------------

Q: LD_ABS and LD_IND instructions vs C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: How come LD_ABS and LD_IND instructions are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?

A: This is an artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.

Q: BPF instructions mapping not one-to-one to native CPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: It seems not all BPF instructions map one-to-one to native CPU instructions.
For example, why are BPF_JNE and other compare-and-jump instructions not CPU-like?

A: This was necessary to avoid introducing flags into the ISA, which are
impossible to make generic and efficient across CPU architectures.

Q: Why doesn't BPF_DIV instruction map to x64 div?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because if we had picked a one-to-one relationship to x64, it would have made
it more complicated to support on arm64 and other archs. Also, it
needs a div-by-zero runtime check.
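
As a rough illustration of the runtime check mentioned above, the Python sketch below models an unsigned 64-bit divide that guards against a zero divisor. The function name ``bpf_div64`` is hypothetical, and returning 0 for a zero divisor is an assumption of this sketch: the answer above only states that a check is needed, not what it produces.

```python
MASK64 = (1 << 64) - 1


def bpf_div64(dst, src):
    """Unsigned 64-bit divide with an explicit zero-divisor check."""
    dst &= MASK64  # registers are modelled as 64-bit unsigned values
    src &= MASK64
    if src == 0:
        # Assumption of this sketch: yield 0 instead of faulting.
        return 0
    return dst // src


print(bpf_div64(10, 3))
print(bpf_div64(7, 0))
```

A native x64 ``div`` faults on a zero divisor, so a JIT cannot simply emit it without such a guard; that is one way to read why a one-to-one mapping was not chosen.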

Q: Why is there no BPF_SDIV for signed divide operation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because it would be rarely used. LLVM errors out in such a case and
prints a suggestion to use unsigned divide instead.

Q: Why does BPF have implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows, and in general
there are enough subtle differences between architectures that naively
storing the return address on the stack won't work. Another reason is that
BPF has to be safe from division by zero (and from the legacy exception path
of the LD_ABS insn). Those instructions need to invoke the epilogue and
return implicitly.

Q: Why were BPF_JLT and BPF_JLE instructions not introduced in the beginning?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because classic BPF didn't have them and the BPF authors felt that a compiler
workaround would be acceptable. It turned out that programs lose performance
due to the lack of these compare instructions, so they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have a one-to-one mapping to HW instructions
will not be accepted.
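
To make the "compiler workaround" concrete, here is a hedged Python sketch of how an unsigned less-than jump could be rewritten using only a greater-than jump. The tuple encoding of instructions, the name ``lower_jlt`` and the choice of scratch register are all hypothetical; the point is only that a compare against an immediate costs an extra instruction and an extra register.

```python
def lower_jlt(insn, scratch="r9"):
    """Rewrite ("jlt", dst, src, target) using only "jgt" (a < b  <=>  b > a)."""
    op, dst, src, target = insn
    if op != "jlt":
        raise ValueError("expected a jlt instruction")
    if isinstance(src, int):
        # An immediate cannot become the left operand, so load it into a
        # register first; this is the extra work the workaround costs.
        return [("mov", scratch, src), ("jgt", scratch, dst, target)]
    # Register-register compare: swapping the operands is enough.
    return [("jgt", src, dst, target)]


print(lower_jlt(("jlt", "r1", 5, "out")))
print(lower_jlt(("jlt", "r1", "r2", "out")))
```

With a native less-than jump, the first case is a single instruction again, which matches the performance argument made in the answer.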

Q: BPF 32-bit subregister requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: BPF 32-bit subregisters have a requirement to zero the upper 32 bits of BPF
registers, which makes BPF an inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?

A: NO. The first step to improve performance on 32-bit archs is to teach
LLVM to generate code that uses 32-bit subregisters. The second step
is to teach the verifier to mark operations where zeroing the upper bits
is unnecessary. JITs can then take advantage of those markings to
drastically reduce the size of generated code and improve performance.
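
The zeroing requirement in the question can be modelled in a few lines. The following Python sketch is illustrative only: ``alu32_add``, ``alu64_add`` and ``split_for_32bit_host`` are hypothetical names, and representing a 64-bit BPF register as a (high, low) pair on a 32-bit host is an assumption about how such a JIT might store it.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def alu64_add(dst, src):
    """64-bit add: the full register is written."""
    return (dst + src) & MASK64


def alu32_add(dst, src):
    """32-bit add on the low subregister; the upper 32 bits become zero."""
    return (dst + src) & MASK32


def split_for_32bit_host(value):
    """Assumed layout on a 32-bit host: one BPF register as (high, low)."""
    return (value >> 32) & MASK32, value & MASK32


after = alu32_add(0xDEADBEEF00000001, 1)
print(hex(after))
print(split_for_32bit_host(after))
```

On a 32-bit host the high word has to be written as zero after every 32-bit operation unless something proves the store is unnecessary, which is the saving the verifier markings described above are meant to unlock.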

Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, the set of helper
functions and their arguments, and recognized return codes are all part
of the ABI. However, when tracing programs use the bpf_probe_read() helper
to walk kernel internal data structures and are compiled with kernel
internal headers, these accesses can and will break with newer
kernels. The union bpf_attr -> kern_version is checked at load time
to prevent accidentally loading kprobe-based BPF programs written
for a different kernel. Networking programs don't do the kern_version check.
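
The load-time rule described above can be sketched as follows. This is a simplified Python model rather than the kernel's implementation: ``passes_version_check`` and the program type strings are hypothetical, and the packing in ``kernel_version`` mirrors what we understand the kernel's ``KERNEL_VERSION()`` macro to do, so treat that as an assumption as well.

```python
def kernel_version(major, minor, patch):
    """Pack a version number (assumed to match KERNEL_VERSION())."""
    return (major << 16) + (minor << 8) + patch


def passes_version_check(prog_type, kern_version, running_version):
    """Only kprobe-based programs are compared against the running kernel."""
    if prog_type == "kprobe":
        return kern_version == running_version
    return True  # e.g. networking programs skip the check


running = kernel_version(4, 17, 0)
print(passes_version_check("kprobe", kernel_version(4, 16, 0), running))
print(passes_version_check("xdp", 0, running))
```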

Q: How much stack space does a BPF program use?
-----------------------------------------------
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used,
and both the interpreter and most JITed code consume only the necessary amount.
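
A simplified way to picture the limit and the "actual amount used" is sketched below in Python. It assumes stack accesses are expressed as negative offsets from the frame pointer (R10); the names ``stack_access_ok`` and ``stack_depth`` are hypothetical, and the real verifier tracks far more state than this.

```python
MAX_BPF_STACK = 512  # bytes, the limit stated in the answer above


def stack_access_ok(off, size):
    """Check a `size`-byte access at frame pointer (R10) + `off`."""
    return size > 0 and off < 0 and off >= -MAX_BPF_STACK and off + size <= 0


def stack_depth(accesses):
    """Deepest byte touched, i.e. how much stack the program really needs."""
    return max((-off for off, _size in accesses), default=0)


accesses = [(-8, 8), (-24, 4)]
print(all(stack_access_ok(off, size) for off, size in accesses))
print(stack_depth(accesses))
```

In this model a program that only touches the two slots above needs 24 bytes, not the full 512, which is the kind of saving the answer refers to.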

Q: Can BPF be offloaded to HW?
------------------------------
A: YES. BPF HW offload is supported by the NFP driver.

Q: Does classic BPF interpreter still exist?
--------------------------------------------
A: NO. Classic BPF programs are converted into extended BPF instructions.

Q: Can BPF call arbitrary kernel functions?
-------------------------------------------
A: NO. BPF programs can only call a set of helper functions which
is defined for every program type.

Q: Can BPF overwrite arbitrary kernel memory?
---------------------------------------------
A: NO.

Tracing BPF programs can *read* arbitrary memory with the bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.

Q: Can BPF overwrite arbitrary user memory?
-------------------------------------------
A: Sort-of.

Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such a
program is loaded, the kernel will print a warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.

Q: bpf_trace_printk() helper warning
------------------------------------
Q: When the bpf_trace_printk() helper is used, the kernel prints a nasty
warning message. Why is that?

A: This is done to nudge program authors toward better interfaces when
programs need to pass data to user space. For example, bpf_perf_event_output()
can be used to efficiently stream data via the perf ring buffer, and
BPF maps can be used for asynchronous data sharing between kernel
and user space. bpf_trace_printk() should only be used for debugging.

Q: New functionality via kernel modules?
----------------------------------------
Q: Can BPF functionality such as new program or map types, new
helpers, etc. be added out of kernel module code?

A: NO.
@ -1,156 +0,0 @@
BPF's extensibility and its applicability to networking, tracing, and security
in the Linux kernel, together with several user space implementations of the BPF
virtual machine, have led to a number of misunderstandings about what BPF actually is.
This short Q&A is an attempt to address that and to outline the direction
in which BPF is heading long term.

Q: Is BPF a generic instruction set similar to x64 and arm64?
A: NO.

Q: Is BPF a generic virtual machine?
A: NO.

BPF is a generic instruction set _with_ C calling convention.

Q: Why was the C calling convention chosen?
A: Because BPF programs are designed to run in the Linux kernel,
which is written in C. Hence BPF defines an instruction set compatible
with the two most used architectures, x64 and arm64 (and takes into
consideration important quirks of other architectures), and
defines a calling convention that is compatible with the C calling
convention of the Linux kernel on those architectures.

Q: Can multiple return values be supported in the future?
A: NO. BPF allows only register R0 to be used as the return value.

Q: Can more than 5 function arguments be supported in the future?
A: NO. The BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set
(unlike the x64 ISA, which allows msft, cdecl and other conventions).

Q: Can BPF programs access instruction pointer or return address?
A: NO.

Q: Can BPF programs access stack pointer?
A: NO. Only the frame pointer (register R10) is accessible.
From the compiler's point of view it is necessary to have a stack pointer.
For example, LLVM defines register R11 as the stack pointer in its
BPF backend, but it makes sure that generated code never uses it.

Q: Does C-calling convention diminish possible use cases?
A: YES. The BPF design forces the addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps, with
seamless interoperability between them. It lets the kernel call into
BPF programs and programs call kernel helpers with zero overhead,
as if all of them were native C code. That is particularly the case
for JITed BPF programs, which are indistinguishable from
native kernel C code.

Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
A: Soft yes. At least for now, until the BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read-only sections, and all other normal constructs
that C code can produce.

Q: Can loops be supported in a safe way?
A: It's not clear yet. BPF developers are trying to find a way to
support bounded loops where the verifier can guarantee that
the program terminates in fewer than 4096 instructions.

Q: How come LD_ABS and LD_IND instructions are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?
A: This is an artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.

Q: It seems not all BPF instructions map one-to-one to native CPU instructions.
For example, why are BPF_JNE and other compare-and-jump instructions not CPU-like?
A: This was necessary to avoid introducing flags into the ISA, which are
impossible to make generic and efficient across CPU architectures.

Q: Why doesn't BPF_DIV instruction map to x64 div?
A: Because if we had picked a one-to-one relationship to x64, it would have made
it more complicated to support on arm64 and other archs. Also, it
needs a div-by-zero runtime check.

Q: Why is there no BPF_SDIV for signed divide operation?
A: Because it would be rarely used. LLVM errors out in such a case and
prints a suggestion to use unsigned divide instead.

Q: Why does BPF have implicit prologue and epilogue?
A: Because architectures like sparc have register windows, and in general
there are enough subtle differences between architectures that naively
storing the return address on the stack won't work. Another reason is that
BPF has to be safe from division by zero (and from the legacy exception path
of the LD_ABS insn). Those instructions need to invoke the epilogue and
return implicitly.

Q: Why were BPF_JLT and BPF_JLE instructions not introduced in the beginning?
A: Because classic BPF didn't have them and the BPF authors felt that a compiler
workaround would be acceptable. It turned out that programs lose performance
due to the lack of these compare instructions, so they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have a one-to-one mapping to HW instructions
will not be accepted.

Q: BPF 32-bit subregisters have a requirement to zero the upper 32 bits of BPF
registers, which makes BPF an inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?
A: NO. The first step to improve performance on 32-bit archs is to teach
LLVM to generate code that uses 32-bit subregisters. The second step
is to teach the verifier to mark operations where zeroing the upper bits
is unnecessary. JITs can then take advantage of those markings to
drastically reduce the size of generated code and improve performance.

Q: Does BPF have a stable ABI?
A: YES. BPF instructions, arguments to BPF programs, the set of helper
functions and their arguments, and recognized return codes are all part
of the ABI. However, when tracing programs use the bpf_probe_read() helper
to walk kernel internal data structures and are compiled with kernel
internal headers, these accesses can and will break with newer
kernels. The union bpf_attr -> kern_version is checked at load time
to prevent accidentally loading kprobe-based BPF programs written
for a different kernel. Networking programs don't do the kern_version check.

Q: How much stack space does a BPF program use?
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used,
and both the interpreter and most JITed code consume only the necessary amount.

Q: Can BPF be offloaded to HW?
A: YES. BPF HW offload is supported by the NFP driver.

Q: Does classic BPF interpreter still exist?
A: NO. Classic BPF programs are converted into extended BPF instructions.

Q: Can BPF call arbitrary kernel functions?
A: NO. BPF programs can only call a set of helper functions which
is defined for every program type.

Q: Can BPF overwrite arbitrary kernel memory?
A: NO. Tracing BPF programs can _read_ arbitrary memory with the bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.

Q: Can BPF overwrite arbitrary user memory?
A: Sort-of. Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such a
program is loaded, the kernel will print a warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.

Q: When the bpf_trace_printk() helper is used, the kernel prints a nasty
warning message. Why is that?
A: This is done to nudge program authors toward better interfaces when
programs need to pass data to user space. For example, bpf_perf_event_output()
can be used to efficiently stream data via the perf ring buffer, and
BPF maps can be used for asynchronous data sharing between kernel
and user space. bpf_trace_printk() should only be used for debugging.

Q: Can BPF functionality such as new program or map types, new
helpers, etc. be added out of kernel module code?
A: NO.
@ -0,0 +1,640 @@
=================================
HOWTO interact with BPF subsystem
=================================

This document provides information for the BPF subsystem about various
workflows related to reporting bugs, submitting patches, and queueing
patches for stable kernels.

For general information about submitting patches, please refer to
`Documentation/process/`_. This document only describes additional specifics
related to BPF.

.. contents::
    :local:
    :depth: 2

Reporting bugs
==============

Q: How do I report bugs for BPF kernel code?
--------------------------------------------
A: Since all BPF kernel development as well as bpftool and iproute2 BPF
loader development happens through the netdev kernel mailing list,
please report any issues found around BPF to the following mailing
list:

  netdev@vger.kernel.org

This may also include issues related to XDP, BPF tracing, etc.

Given that netdev has a high volume of traffic, please also add the BPF
maintainers to Cc (from kernel MAINTAINERS_ file):

* Alexei Starovoitov <ast@kernel.org>
* Daniel Borkmann <daniel@iogearbox.net>

In case a buggy commit has already been identified, make sure to keep
the actual commit authors in Cc as well for the report. They can
typically be identified through the kernel's git tree.

**Please do NOT report BPF issues to bugzilla.kernel.org since it
is a guarantee that the reported issue will be overlooked.**

Submitting patches
==================

Q: To which mailing list do I need to submit my BPF patches?
------------------------------------------------------------
A: Please submit your BPF patches to the netdev kernel mailing list:

  netdev@vger.kernel.org

Historically, BPF came out of networking and has always been maintained
by the kernel networking community. Although these days BPF touches
many other subsystems as well, the patches are still routed mainly
through the networking community.

In case your patch has changes in various different subsystems (e.g.
tracing, security, etc), make sure to Cc the related kernel mailing
lists and maintainers from there as well, so they are able to review
the changes and provide their Acked-by's to the patches.

Q: Where can I find patches currently under discussion for BPF subsystem?
-------------------------------------------------------------------------
A: All patches that are Cc'ed to netdev are queued for review under the netdev
patchwork project:

  http://patchwork.ozlabs.org/project/netdev/list/

Patches that target BPF are assigned to a 'bpf' delegate for
further processing by the BPF maintainers. The current queue with
patches under review can be found at:

  https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147

Once the patches have been reviewed by the BPF community as a whole
and approved by the BPF maintainers, their status in patchwork will be
changed to 'Accepted' and the submitter will be notified by mail. This
means that the patches look good from a BPF perspective and have been
applied to one of the two BPF kernel trees.

In case feedback from the community requires a respin of the patches,
their status in patchwork will be set to 'Changes Requested', and they
will be purged from the current review queue. The same applies to patches
that get rejected or are not applicable to the BPF trees (but were
assigned to the 'bpf' delegate).

Q: How do the changes make their way into Linux?
------------------------------------------------
A: There are two BPF kernel trees (git repositories). Once patches have
been accepted by the BPF maintainers, they will be applied to one
of the two BPF trees:

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/

The bpf tree itself is for fixes only, whereas bpf-next is for features,
cleanups or other kinds of improvements ("next-like" content). This is
analogous to the net and net-next trees for networking. Both bpf and
bpf-next only have a master branch, so there is never any doubt about
which branch patches should be rebased against.

Accumulated BPF patches in the bpf tree will regularly get pulled
into the net kernel tree. Likewise, accumulated BPF patches accepted
into the bpf-next tree will make their way into the net-next tree. net and
net-next are both run by David S. Miller. From there, they will go
into the kernel mainline tree run by Linus Torvalds. To read up on the
process of net and net-next being merged into the mainline tree, see
the `netdev FAQ`_ under:

  `Documentation/networking/netdev-FAQ.txt`_

Occasionally, to prevent merge conflicts, we might send pull requests
to other trees (e.g. tracing) with a small subset of the patches, but
net and net-next are always the main trees targeted for integration.

The pull requests will contain a high-level summary of the accumulated
patches and can be searched on the netdev kernel mailing list through the
following subject lines (``yyyy-mm-dd`` is the date of the pull
request)::

  pull-request: bpf yyyy-mm-dd
  pull-request: bpf-next yyyy-mm-dd
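
For readers who want to script a search for these subjects, the pattern can be generated as in the Python sketch below. The helper name ``pull_request_subject`` and the example date are hypothetical; only the subject format itself comes from the text above.

```python
from datetime import date

BPF_TREES = ("bpf", "bpf-next")


def pull_request_subject(tree, day):
    """Build the pull request subject line for one of the two BPF trees."""
    if tree not in BPF_TREES:
        raise ValueError(f"unknown BPF tree: {tree!r}")
    return f"pull-request: {tree} {day.isoformat()}"  # isoformat is yyyy-mm-dd


print(pull_request_subject("bpf-next", date(2018, 6, 5)))
```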

Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be applied to?
---------------------------------------------------------------------------------

A: The process is the very same as described in the `netdev FAQ`_, so
please read up on it. The subject line must indicate whether the
patch is a fix or rather "next-like" content in order to let the
maintainers know whether it is targeted at bpf or bpf-next.

For fixes eventually landing in the bpf -> net tree, the subject must
look like::

  git format-patch --subject-prefix='PATCH bpf' start..finish

For features/improvements/etc that should eventually land in
bpf-next -> net-next, the subject must look like::

  git format-patch --subject-prefix='PATCH bpf-next' start..finish

If you are unsure whether the patch or patch series should go into bpf
or net directly, or bpf-next or net-next directly, it is not a
problem either if the subject line says net or net-next as target.
It is eventually up to the maintainers to do the delegation of
the patches.

If it is clear that patches should go into the bpf or bpf-next tree,
please make sure to rebase the patches against those trees in
order to reduce potential conflicts.

In case the patch or patch series has to be reworked and sent out
again in a second or later revision, it is also required to add a
version number (``v2``, ``v3``, ...) into the subject prefix::

  git format-patch --subject-prefix='PATCH net-next v2' start..finish

When changes have been requested to the patch series, always send the
whole patch series again with the feedback incorporated (never send
individual diffs on top of the old series).
|
||||
|
||||
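The prefix rules above can be folded into a small shell helper. This is a
minimal sketch, not an official tool: the function name
``bpf_subject_prefix`` and its arguments are assumptions made for
illustration; only the resulting prefix strings come from the text above.

```shell
# Hypothetical helper: print the subject prefix for a target tree and an
# optional revision number (1 or omitted means the first submission).
bpf_subject_prefix() {
    tree=$1
    rev=${2:-1}
    case "$tree" in
        bpf|bpf-next|net|net-next) ;;
        *) echo "unknown tree: $tree" >&2; return 1 ;;
    esac
    if [ "$rev" -gt 1 ]; then
        printf 'PATCH %s v%s\n' "$tree" "$rev"
    else
        printf 'PATCH %s\n' "$tree"
    fi
}

# Example: second revision of a series targeted at bpf-next.
# git format-patch --subject-prefix="$(bpf_subject_prefix bpf-next 2)" start..finish
```

Keeping the tree name and the revision in one place makes it harder to forget
the ``v2`` marker when a series is respun.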
Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
-----------------------------------------------------------------------
A: It means that the patch looks good for mainline inclusion from
a BPF point of view.

Be aware that this is not a final verdict that the patch will
automatically get accepted into net or net-next trees eventually:

On the netdev kernel mailing list reviews can come in at any point
in time. If discussions around a patch conclude that it cannot
be included as-is, we will either apply a follow-up fix or drop
it from the trees entirely. Therefore, we also reserve the right to
rebase the trees when deemed necessary. After all, the purpose of the
tree is to:

i) accumulate and stage BPF patches for integration into trees
   like net and net-next, and

ii) run extensive BPF test suite and
    workloads on the patches before they make their way any further.

Once the BPF pull request has been accepted by David S. Miller, the
patches end up in net or net-next tree, respectively, and
make their way from there further into mainline. Again, see the
`netdev FAQ`_ for additional information e.g. on how often they are
merged to mainline.

Q: How long do I need to wait for feedback on my BPF patches?
-------------------------------------------------------------
A: We try to keep the latency low. The usual time to feedback will
be around 2 or 3 business days. It may vary depending on the
complexity of changes and current patch load.

Q: How often do you send pull requests to major kernel trees like net or net-next?
----------------------------------------------------------------------------------

A: Pull requests will be sent out rather often in order to not
accumulate too many patches in bpf or bpf-next.

As a rule of thumb, expect pull requests for each tree regularly
at the end of the week. In some cases pull requests may also
come in the middle of the week, depending on the current patch
load or urgency.

Q: Are patches applied to bpf-next when the merge window is open?
-----------------------------------------------------------------
A: For the time when the merge window is open, bpf-next will not be
processed. This is roughly analogous to net-next patch processing,
so feel free to read up on the `netdev FAQ`_ about further details.

During those two weeks of merge window, we might ask you to resend
your patch series once bpf-next is open again. Once Linus has released
a ``v*-rc1`` after the merge window, we continue processing of bpf-next.

For non-subscribers to kernel mailing lists, there is also a status
page run by David S. Miller on net-next that provides guidance:

  http://vger.kernel.org/~davem/net-next.html

Q: Verifier changes and test cases
----------------------------------
Q: I made a BPF verifier change, do I need to add test cases for
BPF kernel selftests_?

A: If the patch has changes to the behavior of the verifier, then yes,
it is absolutely necessary to add test cases to the BPF kernel
selftests_ suite. If they are not present and we think they are
needed, then we might ask for them before accepting any changes.

In particular, test_verifier.c tracks a large number of BPF test
cases, including a lot of corner cases that LLVM BPF back end may
generate out of the restricted C code. Thus, adding test cases is
absolutely crucial to make sure future changes do not accidentally
affect prior use-cases. Treat those test cases as follows: verifier
behavior that is not tracked in test_verifier.c could potentially
be subject to change.

Q: samples/bpf preference vs selftests?
---------------------------------------
Q: When should I add code to `samples/bpf/`_ and when to BPF kernel
selftests_ ?

A: In general, we prefer additions to BPF kernel selftests_ rather than
`samples/bpf/`_. The rationale is very simple: kernel selftests are
regularly run by various bots to test for kernel regressions.

The more test cases we add to BPF selftests, the better the coverage
and the less likely it is that the tested features accidentally break.
BPF kernel selftests can also demonstrate how a specific feature is
used.

That said, `samples/bpf/`_ may be a good place for people to get started,
so simple demos of features may go into `samples/bpf/`_, while advanced
functional and corner-case testing belongs in kernel selftests.

If your sample looks like a test case, then go for BPF kernel selftests
instead!

Q: When should I add code to the bpftool?
-----------------------------------------
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
a central user space tool for debugging and introspection of BPF programs
and maps that are active in the kernel. If UAPI changes related to BPF
enable dumping additional information about programs or maps, then
bpftool should be extended as well to support dumping it.

Q: When should I add code to iproute2's BPF loader?
---------------------------------------------------
A: For UAPI changes related to the XDP or tc layer (e.g. ``cls_bpf``),
the convention is that those control-path related changes are added to
iproute2's BPF loader as well from user space side. This is not only
useful to have UAPI changes properly designed to be usable, but also
to make those changes available to a wider user base of major
downstream distributions.

Q: Do you accept patches as well for iproute2's BPF loader?
-----------------------------------------------------------
A: Patches for the iproute2's BPF loader have to be sent to:

  netdev@vger.kernel.org

While those patches are not processed by the BPF kernel maintainers,
please keep them in Cc as well, so they can be reviewed.

The official git repository for iproute2 is run by Stephen Hemminger
and can be found at:

  https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/

The patches need to have a subject prefix of '``[PATCH iproute2
master]``' or '``[PATCH iproute2 net-next]``'. '``master``' or
'``net-next``' describes the target branch where the patch should be
applied to. Meaning, if kernel changes went into the net-next kernel
tree, then the related iproute2 changes need to go into the iproute2
net-next branch, otherwise they can be targeted at master branch. The
iproute2 net-next branch will get merged into the master branch after
the current iproute2 version from master has been released.

Like BPF, the patches end up in patchwork under the netdev project and
are delegated to 'shemminger' for further processing:

  http://patchwork.ozlabs.org/project/netdev/list/?delegate=389

Q: What is the minimum requirement before I submit my BPF patches?
------------------------------------------------------------------
A: When submitting patches, always take the time and properly test your
patches *prior* to submission. Never rush them! If maintainers find
that your patches have not been properly tested, it is a good way to
get them grumpy. Testing patch submissions is a hard requirement!

Note, fixes that go to bpf tree *must* have a ``Fixes:`` tag included.
The same applies to fixes that target bpf-next, where the affected
commit is in net-next (or in some cases bpf-next). The ``Fixes:`` tag is
crucial in order to identify follow-up commits and tremendously helps
people who have to do backporting, so it is a must have!

We also don't accept patches with an empty commit message. Take your
time and properly write up a high quality commit message, it is
essential!

Think about it this way: other developers looking at your code a month
from now need to understand *why* a certain change has been done that
way, and whether there have been flaws in the analysis or assumptions
that the original author made. Thus providing a proper rationale and
describing the use-case for the changes is a must.

Patch submissions with >1 patch must have a cover letter which includes
a high level description of the series. This high level summary will
then be placed into the merge commit by the BPF maintainers such that
it is also accessible from the git log for future reference.

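The ``Fixes:`` tag mentioned above follows the general kernel convention of a
12-character abbreviated commit hash followed by the quoted subject of the
commit being fixed. The sketch below only illustrates that layout; the helper
name ``format_fixes_tag`` is an assumption made for this example.

```shell
# Hypothetical helper: format a Fixes: tag from a full commit hash and the
# subject line of the commit that introduced the bug.
format_fixes_tag() {
    # %.12s truncates the hash to its first 12 characters.
    printf 'Fixes: %.12s ("%s")\n' "$1" "$2"
}

# Inside a kernel git tree, git can produce the same line directly:
# git log -1 --abbrev=12 --format='Fixes: %h ("%s")' <commit>
```

The ``git log`` variant is usually preferable because it copies the subject
verbatim from the repository.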
Q: Features changing BPF JIT and/or LLVM
----------------------------------------
Q: What do I need to consider when adding a new instruction or feature
that would require BPF JIT and/or LLVM integration as well?

A: We try hard to keep all BPF JITs up to date such that the same user
experience can be guaranteed when running BPF programs on different
architectures without having the program punt to the less efficient
interpreter in case the in-kernel BPF JIT is enabled.

If you are unable to implement or test the required JIT changes for
certain architectures, please work together with the related BPF JIT
developers in order to get the feature implemented in a timely manner.
Please refer to the git log (``arch/*/net/``) to locate the necessary
people for helping out.

Also always make sure to add BPF test cases (e.g. test_bpf.c and
test_verifier.c) for new instructions, so that they can receive
broad test coverage and help run-time testing the various BPF JITs.

In case of new BPF instructions, once the changes have been accepted
into the Linux kernel, please implement support into LLVM's BPF back
end. See LLVM_ section below for further information.

Stable submission
=================

Q: I need a specific BPF commit in stable kernels. What should I do?
--------------------------------------------------------------------
A: In case you need a specific fix in stable kernels, first check whether
the commit has already been applied in the related ``linux-*.y`` branches:

  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/

If that is not the case, then drop an email to the BPF maintainers with
the netdev kernel mailing list in Cc and ask for the fix to be queued up:

  netdev@vger.kernel.org

The process in general is the same as on netdev itself, see also the
`netdev FAQ`_ document.

Q: Do you also backport to kernels not currently maintained as stable?
----------------------------------------------------------------------
A: No. If you need a specific BPF commit in kernels that are currently not
maintained by the stable maintainers, then you are on your own.

The current stable and longterm stable kernels are all listed here:

  https://www.kernel.org/

Q: The BPF patch I am about to submit needs to go to stable as well
-------------------------------------------------------------------
What should I do?

A: The same rules apply as with netdev patch submissions in general, see
`netdev FAQ`_ under:

  `Documentation/networking/netdev-FAQ.txt`_

Never add "``Cc: stable@vger.kernel.org``" to the patch description, but
ask the BPF maintainers to queue the patches instead. This can be done
with a note, for example, under the ``---`` part of the patch which does
not go into the git log. Alternatively, this can be done as a simple
request by mail instead.

Q: Queue stable patches
-----------------------
Q: Where do I find currently queued BPF patches that will be submitted
to stable?

A: Once patches that fix critical bugs have been applied to the bpf tree,
they are queued up for stable submission under:

  http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*

They will be on hold there at minimum until the related commit has made
its way into the mainline kernel tree.

After having been under broader exposure, the queued patches will be
submitted by the BPF maintainers to the stable maintainers.

Testing patches
===============

Q: How to run BPF selftests
---------------------------
A: After you have booted into the newly compiled kernel, navigate to
the BPF selftests_ suite in order to test BPF functionality (current
working directory points to the root of the cloned git tree)::

  $ cd tools/testing/selftests/bpf/
  $ make

To run the verifier tests::

  $ sudo ./test_verifier

The verifier tests print out all the current checks being
performed. The summary at the end of running all tests will dump
information of test successes and failures::

  Summary: 418 PASSED, 0 FAILED

In order to run through all BPF selftests, the following command is
needed::

  $ sudo make run_tests

See the kernel's selftest `Documentation/dev-tools/kselftest.rst`_
document for further documentation.

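When the verifier tests run from a script, the summary line shown above can
be checked automatically. The following is a rough sketch: the helper name
``verifier_summary_ok`` is hypothetical, and it assumes the summary has
exactly the ``Summary: N PASSED, M FAILED`` shape of the example output.

```shell
# Hypothetical helper: succeed only if a test_verifier summary line reports
# zero failures. Any line that does not match the expected shape is treated
# as a failure.
verifier_summary_ok() {
    failed=$(printf '%s\n' "$1" |
        sed -n 's/^Summary: [0-9][0-9]* PASSED, \([0-9][0-9]*\) FAILED.*$/\1/p')
    [ -n "$failed" ] && [ "$failed" -eq 0 ]
}

# Example, assuming the summary is the last line printed:
# verifier_summary_ok "$(sudo ./test_verifier | tail -n 1)" || echo "verifier tests failed"
```

Treating an unrecognized line as a failure is a deliberate choice here, so
that a crashed or truncated run is not mistaken for a clean one.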
Q: Which BPF kernel selftests version should I run my kernel against?
---------------------------------------------------------------------
A: If you run a kernel ``xyz``, then always run the BPF kernel selftests
from that kernel ``xyz`` as well. Do not expect that the BPF selftest
from the latest mainline tree will pass all the time.

In particular, test_bpf.c and test_verifier.c have a large number of
test cases and are constantly updated with new BPF test sequences, or
existing ones are adapted to verifier changes e.g. due to verifier
becoming smarter and being able to better track certain things.

LLVM
====

Q: Where do I find LLVM with BPF support?
-----------------------------------------
A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.

All major distributions these days ship LLVM with BPF back end enabled,
so for the majority of use-cases it is not required to compile LLVM by
hand anymore, just install the distribution provided package.

LLVM's static compiler lists the supported targets through
``llc --version``; make sure BPF targets are listed. Example::

     $ llc --version
     LLVM (http://llvm.org/):
       LLVM version 6.0.0svn
       Optimized build.
       Default target: x86_64-unknown-linux-gnu
       Host CPU: skylake

       Registered Targets:
         bpf    - BPF (host endian)
         bpfeb  - BPF (big endian)
         bpfel  - BPF (little endian)
         x86    - 32-bit X86: Pentium-Pro and above
         x86-64 - 64-bit X86: EM64T and AMD64

For developers, in order to utilize the latest features added to LLVM's
BPF back end, it is advisable to run the latest LLVM releases. Support
for new BPF kernel features such as additions to the BPF instruction
set are often developed together.

All LLVM releases can be found at: http://releases.llvm.org/

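The ``llc --version`` check described above can be scripted. This is a
minimal sketch under the assumption that the target list keeps the
``name - description`` layout of the example output; the helper name
``has_bpf_target`` is made up for illustration.

```shell
# Hypothetical helper: read `llc --version` output on stdin and succeed if
# the host-endian "bpf" target is registered.
has_bpf_target() {
    awk '$1 == "bpf" && $2 == "-" { found = 1 } END { exit !found }'
}

# Example:
# llc --version | has_bpf_target && echo "BPF back end available"
```

Matching on the first field avoids false positives from lines that merely
mention BPF elsewhere in the output.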
Q: Got it, so how do I build LLVM manually anyway?
--------------------------------------------------
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
that set up, proceed with building the latest LLVM and clang version
from the git repositories::

     $ git clone http://llvm.org/git/llvm.git
     $ cd llvm/tools
     $ git clone --depth 1 http://llvm.org/git/clang.git
     $ cd ..; mkdir build; cd build
     $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
                -DBUILD_SHARED_LIBS=OFF           \
                -DCMAKE_BUILD_TYPE=Release        \
                -DLLVM_BUILD_RUNTIME=OFF
     $ make -j $(getconf _NPROCESSORS_ONLN)

The built binaries can then be found in the build/bin/ directory, which
you can add to the PATH variable.

Q: Reporting LLVM BPF issues
----------------------------
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
generation back end or about LLVM generated code that the verifier
refuses to accept?

A: Yes, please do!

LLVM's BPF back end is a key piece of the whole BPF
infrastructure and it ties deeply into verification of programs from the
kernel side. Therefore, any issues on either side need to be investigated
and fixed whenever necessary.

Please make sure to bring them up on the netdev kernel mailing
list and Cc the BPF maintainers for LLVM and kernel bits:

* Yonghong Song <yhs@fb.com>
* Alexei Starovoitov <ast@kernel.org>
* Daniel Borkmann <daniel@iogearbox.net>

LLVM also has an issue tracker where BPF related bugs can be found:

  https://bugs.llvm.org/buglist.cgi?quicksearch=bpf

However, it is better to reach out through mailing lists with the
maintainers in Cc.

Q: New BPF instruction for kernel and LLVM
------------------------------------------
Q: I have added a new BPF instruction to the kernel, how can I integrate
it into LLVM?

A: LLVM has a ``-mcpu`` selector for the BPF back end in order to allow
the selection of BPF instruction set extensions. By default the
``generic`` processor target is used, which is the base instruction set
(v1) of BPF.

LLVM has an option to select ``-mcpu=probe`` where it will probe the host
kernel for supported BPF instruction set extensions and select the
optimal set automatically.

For cross-compilation, a specific version can be selected manually as well::

     $ llc -march bpf -mcpu=help
     Available CPUs for this target:

       generic - Select the generic processor.
       probe   - Select the probe processor.
       v1      - Select the v1 processor.
       v2      - Select the v2 processor.
     [...]

Newly added BPF instructions to the Linux kernel need to follow the same
scheme, bump the instruction set version and implement probing for the
extensions such that ``-mcpu=probe`` users can benefit from the
optimization transparently when upgrading their kernels.

If you are unable to implement support for the newly added BPF instruction
please reach out to BPF developers for help.

By the way, the BPF kernel selftests run with ``-mcpu=probe`` for better
test coverage.

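The ``-mcpu`` choices described above can be captured in a build script. The
sketch below is illustrative only: the helper name ``bpf_mcpu_flag`` and the
``native``/``cross`` keywords are assumptions, as are the file names in the
usage comment.

```shell
# Hypothetical helper: choose the -mcpu argument for llc.
#   bpf_mcpu_flag native     -> probe the running kernel
#   bpf_mcpu_flag cross v2   -> pin an instruction set version
#   bpf_mcpu_flag cross      -> fall back to the generic (v1) set
bpf_mcpu_flag() {
    case "$1" in
        native) echo "-mcpu=probe" ;;
        cross)  echo "-mcpu=${2:-generic}" ;;
        *)      echo "usage: bpf_mcpu_flag native|cross [version]" >&2
                return 1 ;;
    esac
}

# Example with placeholder file names:
# llc -march=bpf $(bpf_mcpu_flag cross v2) -filetype=obj -o prog.o prog.ll
```

Probing only makes sense when the build host runs the kernel the program will
be loaded on, which is why the cross-compilation branch pins a version.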
Q: clang flag for target bpf?
-----------------------------
Q: In some cases clang flag ``-target bpf`` is used but in other cases the
default clang target, which matches the underlying architecture, is used.
What is the difference and when should I use which?

A: Although LLVM IR generation and optimization try to stay architecture
independent, ``-target <arch>`` still has some impact on generated code:

- BPF program may recursively include header file(s) with file scope
  inline assembly code. The default target can handle this well,
  while ``bpf`` target may fail if bpf backend assembler does not
  understand this assembly code, which is true in most cases.

- When compiled without ``-g``, additional elf sections, e.g.,
  .eh_frame and .rela.eh_frame, may be present in the object file
  with default target, but not with ``bpf`` target.

- The default target may turn a C switch statement into a switch table
  lookup and jump operation. Since the switch table is placed
  in the global readonly section, the bpf program will fail to load.
  The bpf target does not support switch table optimization.
  The clang option ``-fno-jump-tables`` can be used to disable
  switch table generation.

- For clang ``-target bpf``, it is guaranteed that pointer or long /
  unsigned long types will always have a width of 64 bit, no matter
  whether underlying clang binary or default target (or kernel) is
  32 bit. However, when native clang target is used, then it will
  compile these types based on the underlying architecture's conventions,
  meaning in case of 32 bit architecture, pointer or long / unsigned
  long types e.g. in BPF context structure will have width of 32 bit
  while the BPF LLVM back end still operates in 64 bit. The native
  target is mostly needed in tracing for the case of walking ``pt_regs``
  or other kernel structures where CPU's register width matters.
  Otherwise, ``clang -target bpf`` is generally recommended.

You should use default target when:

- Your program includes a header file, e.g., ptrace.h, which eventually
  pulls in some header files containing file scope host assembly code.

- You can add ``-fno-jump-tables`` to work around the switch table issue.

Otherwise, you can use ``bpf`` target. Additionally, you *must* use bpf target
when:

- Your program uses data structures with pointer or long / unsigned long
  types that interface with BPF helpers or context data structures. Access
  into these structures is verified by the BPF verifier and may result
  in verification failures if the native architecture is not aligned with
  the BPF architecture, e.g. 64-bit. An example of this is
  ``BPF_PROG_TYPE_SK_MSG``, which requires ``-target bpf``.

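The target selection rules above can be summarized as a small decision
helper. This is a sketch under stated assumptions, not a definitive rule set:
the function name ``pick_clang_target`` and its two yes/no arguments are
hypothetical, and the handling of the case where both conditions hold is an
assumption, since the text does not cover it.

```shell
# Hypothetical helper: print clang flags from two yes/no answers.
#   $1: does the program pull in headers with file-scope host assembly?
#   $2: does it share pointer / long fields with BPF helpers or context?
pick_clang_target() {
    host_asm=$1
    ctx_ptrs=$2
    if [ "$host_asm" = yes ] && [ "$ctx_ptrs" = yes ]; then
        # Assumption: the two requirements conflict, so ask for a rework.
        echo "conflict: rework includes so that -target bpf can be used" >&2
        return 1
    elif [ "$host_asm" = yes ]; then
        # Default (native) target, with switch tables disabled.
        echo "-fno-jump-tables"
    else
        echo "-target bpf"
    fi
}

# Example with a placeholder source file:
# clang -O2 $(pick_clang_target no yes) -c prog.c -o prog.o
```

The helper prints only the flags that differ between the two cases, so it can
be dropped into an existing compile command.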
.. Links
.. _Documentation/process/: https://www.kernel.org/doc/html/latest/process/
.. _MAINTAINERS: ../../MAINTAINERS
.. _Documentation/networking/netdev-FAQ.txt: ../networking/netdev-FAQ.txt
.. _netdev FAQ: ../networking/netdev-FAQ.txt
.. _samples/bpf/: ../../samples/bpf/
.. _selftests: ../../tools/testing/selftests/bpf/
.. _Documentation/dev-tools/kselftest.rst:
   https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html

Happy BPF hacking!

@ -1,570 +0,0 @@
This document provides information for the BPF subsystem about various
workflows related to reporting bugs, submitting patches, and queueing
patches for stable kernels.

For general information about submitting patches, please refer to
Documentation/process/. This document only describes additional specifics
related to BPF.

Reporting bugs:
---------------

Q: How do I report bugs for BPF kernel code?

A: Since all BPF kernel development as well as bpftool and iproute2 BPF
loader development happens through the netdev kernel mailing list,
please report any found issues around BPF to the following mailing
list:

  netdev@vger.kernel.org

This may also include issues related to XDP, BPF tracing, etc.

Given netdev has a high volume of traffic, please also add the BPF
maintainers to Cc (from kernel MAINTAINERS file):

  Alexei Starovoitov <ast@kernel.org>
  Daniel Borkmann <daniel@iogearbox.net>

In case a buggy commit has already been identified, make sure to keep
the actual commit authors in Cc as well for the report. They can
typically be identified through the kernel's git tree.

Please do *not* report BPF issues to bugzilla.kernel.org since it
is a guarantee that the reported issue will be overlooked.

Submitting patches:
-------------------

Q: To which mailing list do I need to submit my BPF patches?

A: Please submit your BPF patches to the netdev kernel mailing list:

  netdev@vger.kernel.org

Historically, BPF came out of networking and has always been maintained
by the kernel networking community. Although these days BPF touches
many other subsystems as well, the patches are still routed mainly
through the networking community.

In case your patch has changes in various different subsystems (e.g.
tracing, security, etc), make sure to Cc the related kernel mailing
lists and maintainers from there as well, so they are able to review
the changes and provide their Acked-by's to the patches.

Q: Where can I find patches currently under discussion for BPF subsystem?

A: All patches that are Cc'ed to netdev are queued for review under netdev
patchwork project:

  http://patchwork.ozlabs.org/project/netdev/list/

Those patches which target BPF, are assigned to a 'bpf' delegate for
further processing from BPF maintainers. The current queue with
patches under review can be found at:

  https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147

Once the patches have been reviewed by the BPF community as a whole
and approved by the BPF maintainers, their status in patchwork will be
changed to 'Accepted' and the submitter will be notified by mail. This
means that the patches look good from a BPF perspective and have been
applied to one of the two BPF kernel trees.

In case feedback from the community requires a respin of the patches,
their status in patchwork will be set to 'Changes Requested', and purged
from the current review queue. Likewise for cases where patches would
get rejected or are not applicable to the BPF trees (but assigned to
the 'bpf' delegate).

Q: How do the changes make their way into Linux?

A: There are two BPF kernel trees (git repositories). Once patches have
been accepted by the BPF maintainers, they will be applied to one
of the two BPF trees:

  https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
  https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/

The bpf tree itself is for fixes only, whereas bpf-next for features,
cleanups or other kind of improvements ("next-like" content). This is
analogous to net and net-next trees for networking. Both bpf and
bpf-next will only have a master branch in order to simplify against
which branch patches should get rebased to.

Accumulated BPF patches in the bpf tree will regularly get pulled
into the net kernel tree. Likewise, accumulated BPF patches accepted
into the bpf-next tree will make their way into net-next tree. net and
net-next are both run by David S. Miller. From there, they will go
into the kernel mainline tree run by Linus Torvalds. To read up on the
process of net and net-next being merged into the mainline tree, see
the netdev FAQ under:

  Documentation/networking/netdev-FAQ.txt

Occasionally, to prevent merge conflicts, we might send pull requests
to other trees (e.g. tracing) with a small subset of the patches, but
net and net-next are always the main trees targeted for integration.

The pull requests will contain a high-level summary of the accumulated
patches and can be searched on netdev kernel mailing list through the
following subject lines (yyyy-mm-dd is the date of the pull request):

  pull-request: bpf yyyy-mm-dd
  pull-request: bpf-next yyyy-mm-dd

Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be
applied to?

A: The process is the very same as described in the netdev FAQ, so
please read up on it. The subject line must indicate whether the
patch is a fix or rather "next-like" content in order to let the
maintainers know whether it is targeted at bpf or bpf-next.

For fixes eventually landing in bpf -> net tree, the subject must
look like:

  git format-patch --subject-prefix='PATCH bpf' start..finish

For features/improvements/etc that should eventually land in
bpf-next -> net-next, the subject must look like:

  git format-patch --subject-prefix='PATCH bpf-next' start..finish

If unsure whether the patch or patch series should go into bpf
or net directly, or bpf-next or net-next directly, it is not a
problem either if the subject line says net or net-next as target.
It is eventually up to the maintainers to do the delegation of
the patches.

If it is clear that patches should go into bpf or bpf-next tree,
please make sure to rebase the patches against those trees in
order to reduce potential conflicts.

In case the patch or patch series has to be reworked and sent out
again in a second or later revision, it is also required to add a
version number (v2, v3, ...) into the subject prefix:

  git format-patch --subject-prefix='PATCH net-next v2' start..finish

When changes have been requested to the patch series, always send the
whole patch series again with the feedback incorporated (never send
individual diffs on top of the old series).

Q: What does it mean when a patch gets applied to bpf or bpf-next tree?

A: It means that the patch looks good for mainline inclusion from
a BPF point of view.

Be aware that this is not a final verdict that the patch will
automatically get accepted into net or net-next trees eventually:

On the netdev kernel mailing list reviews can come in at any point
in time. If discussions around a patch conclude that they cannot
get included as-is, we will either apply a follow-up fix or drop
them from the trees entirely. Therefore, we also reserve to rebase
the trees when deemed necessary. After all, the purpose of the tree
is to i) accumulate and stage BPF patches for integration into trees
like net and net-next, and ii) run extensive BPF test suite and
workloads on the patches before they make their way any further.

Once the BPF pull request was accepted by David S. Miller, then
the patches end up in net or net-next tree, respectively, and
make their way from there further into mainline. Again, see the
netdev FAQ for additional information e.g. on how often they are
merged to mainline.

Q: How long do I need to wait for feedback on my BPF patches?
|
||||
|
||||
A: We try to keep the latency low. The usual time to feedback will
|
||||
be around 2 or 3 business days. It may vary depending on the
|
||||
complexity of changes and current patch load.
|
||||
|
||||
Q: How often do you send pull requests to major kernel trees like
|
||||
net or net-next?
|
||||
|
||||
A: Pull requests will be sent out rather often in order to not
|
||||
accumulate too many patches in bpf or bpf-next.
|
||||
|
||||
As a rule of thumb, expect pull requests for each tree regularly
|
||||
at the end of the week. In some cases, pull requests may additionally
be sent in the middle of the week depending on the current patch
|
||||
load or urgency.
|
||||
|
||||
Q: Are patches applied to bpf-next when the merge window is open?
|
||||
|
||||
A: For the time when the merge window is open, bpf-next will not be
|
||||
processed. This is roughly analogous to net-next patch processing,
|
||||
so feel free to read up on the netdev FAQ about further details.
|
||||
|
||||
During those two weeks of merge window, we might ask you to resend
|
||||
your patch series once bpf-next is open again. Once Linus has released
|
||||
a v*-rc1 after the merge window, we continue processing of bpf-next.
|
||||
|
||||
For non-subscribers to kernel mailing lists, there is also a status
|
||||
page run by David S. Miller on net-next that provides guidance:
|
||||
|
||||
http://vger.kernel.org/~davem/net-next.html
|
||||
|
||||
Q: I made a BPF verifier change, do I need to add test cases for
|
||||
BPF kernel selftests?
|
||||
|
||||
A: If the patch has changes to the behavior of the verifier, then yes,
|
||||
it is absolutely necessary to add test cases to the BPF kernel
|
||||
selftests suite. If they are not present and we think they are
|
||||
needed, then we might ask for them before accepting any changes.
|
||||
|
||||
In particular, test_verifier.c is tracking a high number of BPF test
|
||||
cases, including a lot of corner cases that LLVM BPF back end may
|
||||
generate out of the restricted C code. Thus, adding test cases is
|
||||
absolutely crucial to make sure future changes do not accidentally
|
||||
affect prior use-cases. Put differently, treat those test cases as: verifier
|
||||
behavior that is not tracked in test_verifier.c could potentially
|
||||
be subject to change.
|
||||
|
||||
Q: When should I add code to samples/bpf/ and when to BPF kernel
|
||||
selftests?
|
||||
|
||||
A: In general, we prefer additions to BPF kernel selftests rather than
|
||||
samples/bpf/. The rationale is very simple: kernel selftests are
|
||||
regularly run by various bots to test for kernel regressions.
|
||||
|
||||
The more test cases we add to BPF selftests, the better the coverage
|
||||
and the less likely it is that those could accidentally break. It is
|
||||
not that BPF kernel selftests cannot demo how a specific feature can
|
||||
be used.
|
||||
|
||||
That said, samples/bpf/ may be a good place for people to get started,
|
||||
so it might be advisable that simple demos of features could go into
|
||||
samples/bpf/, but advanced functional and corner-case testing rather
|
||||
into kernel selftests.
|
||||
|
||||
If your sample looks like a test case, then go for BPF kernel selftests
|
||||
instead!
|
||||
|
||||
Q: When should I add code to the bpftool?
|
||||
|
||||
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
|
||||
a central user space tool for debugging and introspection of BPF programs
|
||||
and maps that are active in the kernel. If UAPI changes related to BPF
|
||||
enable dumping of additional information about programs or maps, then
|
||||
bpftool should be extended as well to support dumping them.
|
||||
|
||||
Q: When should I add code to iproute2's BPF loader?
|
||||
|
||||
A: For UAPI changes related to the XDP or tc layer (e.g. cls_bpf), the
|
||||
convention is that those control-path related changes are added to
|
||||
iproute2's BPF loader as well from user space side. This is not only
|
||||
useful to have UAPI changes properly designed to be usable, but also
|
||||
to make those changes available to a wider user base of major
|
||||
downstream distributions.
|
||||
|
||||
Q: Do you accept patches as well for iproute2's BPF loader?
|
||||
|
||||
A: Patches for the iproute2's BPF loader have to be sent to:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
While those patches are not processed by the BPF kernel maintainers,
|
||||
please keep them in Cc as well, so they can be reviewed.
|
||||
|
||||
The official git repository for iproute2 is run by Stephen Hemminger
|
||||
and can be found at:
|
||||
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
|
||||
|
||||
The patches need to have a subject prefix of '[PATCH iproute2 master]'
|
||||
or '[PATCH iproute2 net-next]'. 'master' or 'net-next' describes the
|
||||
target branch where the patch should be applied to. Meaning, if kernel
|
||||
changes went into the net-next kernel tree, then the related iproute2
|
||||
changes need to go into the iproute2 net-next branch, otherwise they
|
||||
can be targeted at master branch. The iproute2 net-next branch will get
|
||||
merged into the master branch after the current iproute2 version from
|
||||
master has been released.
|
||||
|
||||
Like BPF, the patches end up in patchwork under the netdev project and
|
||||
are delegated to 'shemminger' for further processing:
|
||||
|
||||
http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
|
||||
|
||||
Q: What is the minimum requirement before I submit my BPF patches?
|
||||
|
||||
A: When submitting patches, always take the time and properly test your
|
||||
patches *prior* to submission. Never rush them! If maintainers find
|
||||
that your patches have not been properly tested, it is a good way to
|
||||
get them grumpy. Testing patch submissions is a hard requirement!
|
||||
|
||||
Note, fixes that go to bpf tree *must* have a Fixes: tag included. The
|
||||
same applies to fixes that target bpf-next, where the affected commit
|
||||
is in net-next (or in some cases bpf-next). The Fixes: tag is crucial
|
||||
in order to identify follow-up commits and tremendously helps for people
|
||||
having to do backporting, so it is a must have!
|
||||
|
||||
We also don't accept patches with an empty commit message. Take your
|
||||
time and properly write up a high quality commit message, it is
|
||||
essential!
|
||||
|
||||
Think about it this way: other developers looking at your code a month
|
||||
from now need to understand *why* a certain change has been done that
|
||||
way, and whether there have been flaws in the analysis or assumptions
|
||||
that the original author made. Thus providing a proper rationale and
|
||||
describing the use-case for the changes is a must.
|
||||
|
||||
Patch submissions with >1 patch must have a cover letter which includes
|
||||
a high level description of the series. This high level summary will
|
||||
then be placed into the merge commit by the BPF maintainers such that
|
||||
it is also accessible from the git log for future reference.
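
The formal requirements above lend themselves to a simple pre-submission
check. The following Python sketch is an illustration only: the function
name presubmit_problems, its arguments, and the simplification that every
patch targeting the bpf tree is a fix are assumptions. It does not replace
actually testing the patches.

```python
import re

# "Fixes: <at least 12 hex digits of the commit id> ("<subject>")"
FIXES_RE = re.compile(r'^Fixes: [0-9a-f]{12,40} \(".+"\)$', re.MULTILINE)


def presubmit_problems(tree, commit_messages, has_cover_letter):
    """Return a list of human-readable problems; an empty list means OK.

    tree             -- target tree, e.g. "bpf" or "bpf-next"
    commit_messages  -- one full commit message per patch in the series
    has_cover_letter -- whether a cover letter accompanies the series
    """
    problems = []
    for index, message in enumerate(commit_messages, start=1):
        if not message.strip():
            problems.append(f"patch {index}: empty commit message")
            continue
        # Simplification: everything sent to the bpf tree is treated as a fix.
        if tree == "bpf" and not FIXES_RE.search(message):
            problems.append(f"patch {index}: missing Fixes: tag")
    if len(commit_messages) > 1 and not has_cover_letter:
        problems.append("series with more than one patch needs a cover letter")
    return problems
```

A caller would typically feed it the commit messages of the series and
refuse to send the patches while the returned list is non-empty.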
|
||||
|
||||
Q: What do I need to consider when adding a new instruction or feature
|
||||
that would require BPF JIT and/or LLVM integration as well?
|
||||
|
||||
A: We try hard to keep all BPF JITs up to date such that the same user
|
||||
experience can be guaranteed when running BPF programs on different
|
||||
architectures without having the program punt to the less efficient
|
||||
interpreter in case the in-kernel BPF JIT is enabled.
|
||||
|
||||
If you are unable to implement or test the required JIT changes for
|
||||
certain architectures, please work together with the related BPF JIT
|
||||
developers in order to get the feature implemented in a timely manner.
|
||||
Please refer to the git log (arch/*/net/) to locate the necessary
|
||||
people for helping out.
|
||||
|
||||
Also always make sure to add BPF test cases (e.g. test_bpf.c and
|
||||
test_verifier.c) for new instructions, so that they can receive
|
||||
broad test coverage and help run-time testing the various BPF JITs.
|
||||
|
||||
In case of new BPF instructions, once the changes have been accepted
|
||||
into the Linux kernel, please implement support into LLVM's BPF back
|
||||
end. See LLVM section below for further information.
|
||||
|
||||
Stable submission:
|
||||
------------------
|
||||
|
||||
Q: I need a specific BPF commit in stable kernels. What should I do?
|
||||
|
||||
A: In case you need a specific fix in stable kernels, first check whether
|
||||
the commit has already been applied in the related linux-*.y branches:
|
||||
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
|
||||
|
||||
If not the case, then drop an email to the BPF maintainers with the
|
||||
netdev kernel mailing list in Cc and ask for the fix to be queued up:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
The process in general is the same as on netdev itself, see also the
|
||||
netdev FAQ document.
|
||||
|
||||
Q: Do you also backport to kernels not currently maintained as stable?
|
||||
|
||||
A: No. If you need a specific BPF commit in kernels that are currently not
|
||||
maintained by the stable maintainers, then you are on your own.
|
||||
|
||||
The current stable and longterm stable kernels are all listed here:
|
||||
|
||||
https://www.kernel.org/
|
||||
|
||||
Q: The BPF patch I am about to submit needs to go to stable as well. What
|
||||
should I do?
|
||||
|
||||
A: The same rules apply as with netdev patch submissions in general, see
|
||||
netdev FAQ under:
|
||||
|
||||
Documentation/networking/netdev-FAQ.txt
|
||||
|
||||
Never add "Cc: stable@vger.kernel.org" to the patch description, but
|
||||
ask the BPF maintainers to queue the patches instead. This can be done
|
||||
with a note, for example, under the "---" part of the patch which does
|
||||
not go into the git log. Alternatively, this can be done as a simple
|
||||
request by mail instead.
|
||||
|
||||
Q: Where do I find currently queued BPF patches that will be submitted
|
||||
to stable?
|
||||
|
||||
A: Once patches that fix critical bugs got applied into the bpf tree, they
|
||||
are queued up for stable submission under:
|
||||
|
||||
http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
|
||||
|
||||
They will be on hold there at minimum until the related commit made its
|
||||
way into the mainline kernel tree.
|
||||
|
||||
After having been under broader exposure, the queued patches will be
|
||||
submitted by the BPF maintainers to the stable maintainers.
|
||||
|
||||
Testing patches:
|
||||
----------------
|
||||
|
||||
Q: Which BPF kernel selftests version should I run my kernel against?
|
||||
|
||||
A: If you run a kernel xyz, then always run the BPF kernel selftests from
|
||||
that kernel xyz as well. Do not expect that the BPF selftest from the
|
||||
latest mainline tree will pass all the time.
|
||||
|
||||
In particular, test_bpf.c and test_verifier.c have a large number of
|
||||
test cases and are constantly updated with new BPF test sequences, or
|
||||
existing ones are adapted to verifier changes e.g. due to verifier
|
||||
becoming smarter and being able to better track certain things.
|
||||
|
||||
LLVM:
|
||||
-----
|
||||
|
||||
Q: Where do I find LLVM with BPF support?
|
||||
|
||||
A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.
|
||||
|
||||
All major distributions these days ship LLVM with BPF back end enabled,
|
||||
so for the majority of use-cases it is not required to compile LLVM by
|
||||
hand anymore, just install the distribution provided package.
|
||||
|
||||
LLVM's static compiler lists the supported targets through 'llc --version',
|
||||
make sure BPF targets are listed. Example:
|
||||
|
||||
     $ llc --version
     LLVM (http://llvm.org/):
       LLVM version 6.0.0svn
       Optimized build.
       Default target: x86_64-unknown-linux-gnu
       Host CPU: skylake

       Registered Targets:
         bpf    - BPF (host endian)
         bpfeb  - BPF (big endian)
         bpfel  - BPF (little endian)
         x86    - 32-bit X86: Pentium-Pro and above
         x86-64 - 64-bit X86: EM64T and AMD64
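
To automate the check above, the output of 'llc --version' can be scanned
for the three BPF targets. This is a minimal Python sketch; the function
names are hypothetical, and the parsing assumes the "name - description"
layout shown in the example output.

```python
import subprocess

BPF_TARGETS = {"bpf", "bpfeb", "bpfel"}


def registered_targets(llc_version_output):
    """Extract the target names listed after 'Registered Targets:'."""
    targets = set()
    in_targets = False
    for line in llc_version_output.splitlines():
        stripped = line.strip()
        if stripped == "Registered Targets:":
            in_targets = True
            continue
        if in_targets and " - " in stripped:
            # Each entry looks like "<name> - <description>".
            targets.add(stripped.split(" - ", 1)[0].strip())
    return targets


def has_bpf_targets(llc_version_output):
    """True if bpf, bpfeb and bpfel are all registered."""
    return BPF_TARGETS <= registered_targets(llc_version_output)


def llc_has_bpf(llc="llc"):
    """Run 'llc --version' (llc must be in PATH) and check its target list."""
    result = subprocess.run(
        [llc, "--version"], capture_output=True, text=True, check=True
    )
    return has_bpf_targets(result.stdout)
```

Keeping the parsing separate from the subprocess call makes it possible to
test the check against saved output without having LLVM installed.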
|
||||
|
||||
For developers in order to utilize the latest features added to LLVM's
|
||||
BPF back end, it is advisable to run the latest LLVM releases. Support
|
||||
for new BPF kernel features such as additions to the BPF instruction
|
||||
set are often developed together.
|
||||
|
||||
All LLVM releases can be found at: http://releases.llvm.org/
|
||||
|
||||
Q: Got it, so how do I build LLVM manually anyway?
|
||||
|
||||
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
|
||||
that set up, proceed with building the latest LLVM and clang version
|
||||
from the git repositories:
|
||||
|
||||
     $ git clone http://llvm.org/git/llvm.git
     $ cd llvm/tools
     $ git clone --depth 1 http://llvm.org/git/clang.git
     $ cd ..; mkdir build; cd build
     $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
                -DBUILD_SHARED_LIBS=OFF           \
                -DCMAKE_BUILD_TYPE=Release        \
                -DLLVM_BUILD_RUNTIME=OFF
     $ make -j $(getconf _NPROCESSORS_ONLN)
|
||||
|
||||
The built binaries can then be found in the build/bin/ directory, where
|
||||
you can point the PATH variable to.
|
||||
|
||||
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
|
||||
generation back end or about LLVM generated code that the verifier
|
||||
refuses to accept?
|
||||
|
||||
A: Yes, please do! LLVM's BPF back end is a key piece of the whole BPF
|
||||
infrastructure and it ties deeply into verification of programs from the
|
||||
kernel side. Therefore, any issues on either side need to be investigated
|
||||
and fixed whenever necessary.
|
||||
|
||||
Therefore, please make sure to bring them up at netdev kernel mailing
|
||||
list and Cc BPF maintainers for LLVM and kernel bits:
|
||||
|
||||
Yonghong Song <yhs@fb.com>
|
||||
Alexei Starovoitov <ast@kernel.org>
|
||||
Daniel Borkmann <daniel@iogearbox.net>
|
||||
|
||||
LLVM also has an issue tracker where BPF related bugs can be found:
|
||||
|
||||
https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
|
||||
|
||||
However, it is better to reach out through the mailing lists with the
maintainers in Cc.
|
||||
|
||||
Q: I have added a new BPF instruction to the kernel, how can I integrate
|
||||
it into LLVM?
|
||||
|
||||
A: LLVM has a -mcpu selector for the BPF back end in order to allow the
|
||||
selection of BPF instruction set extensions. By default the 'generic'
|
||||
processor target is used, which is the base instruction set (v1) of BPF.
|
||||
|
||||
LLVM has an option to select -mcpu=probe where it will probe the host
|
||||
kernel for supported BPF instruction set extensions and selects the
|
||||
optimal set automatically.
|
||||
|
||||
For cross-compilation, a specific version can be selected manually as well.
|
||||
|
||||
     $ llc -march bpf -mcpu=help
     Available CPUs for this target:

       generic - Select the generic processor.
       probe   - Select the probe processor.
       v1      - Select the v1 processor.
       v2      - Select the v2 processor.
     [...]
|
||||
|
||||
Newly added BPF instructions to the Linux kernel need to follow the same
|
||||
scheme, bump the instruction set version and implement probing for the
|
||||
extensions such that -mcpu=probe users can benefit from the optimization
|
||||
transparently when upgrading their kernels.
|
||||
|
||||
If you are unable to implement support for the newly added BPF instruction
|
||||
please reach out to BPF developers for help.
|
||||
|
||||
By the way, the BPF kernel selftests run with -mcpu=probe for better
|
||||
test coverage.
|
||||
|
||||
Q: In some cases clang flag "-target bpf" is used but in other cases the
|
||||
default clang target, which matches the underlying architecture, is used.
|
||||
What is the difference and when should I use which?
|
||||
|
||||
A: Although LLVM IR generation and optimization try to stay architecture
|
||||
independent, "-target <arch>" still has some impact on generated code:
|
||||
|
||||
- BPF program may recursively include header file(s) with file scope
|
||||
inline assembly codes. The default target can handle this well,
|
||||
while bpf target may fail if bpf backend assembler does not
|
||||
understand these assembly codes, which is true in most cases.
|
||||
|
||||
- When compiled without -g, additional elf sections, e.g.,
|
||||
.eh_frame and .rela.eh_frame, may be present in the object file
|
||||
with default target, but not with bpf target.
|
||||
|
||||
- The default target may turn a C switch statement into a switch table
|
||||
lookup and jump operation. Since the switch table is placed
|
||||
in the global readonly section, the bpf program will fail to load.
|
||||
The bpf target does not support switch table optimization.
|
||||
The clang option "-fno-jump-tables" can be used to disable
|
||||
switch table generation.
|
||||
|
||||
- For clang -target bpf, it is guaranteed that pointer or long /
|
||||
unsigned long types will always have a width of 64 bit, no matter
|
||||
whether underlying clang binary or default target (or kernel) is
|
||||
32 bit. However, when native clang target is used, then it will
|
||||
compile these types based on the underlying architecture's conventions,
|
||||
meaning in case of 32 bit architecture, pointer or long / unsigned
|
||||
long types e.g. in BPF context structure will have width of 32 bit
|
||||
while the BPF LLVM back end still operates in 64 bit. The native
|
||||
target is mostly needed in tracing for the case of walking pt_regs
|
||||
or other kernel structures where CPU's register width matters.
|
||||
Otherwise, clang -target bpf is generally recommended.
|
||||
|
||||
You should use default target when:
|
||||
|
||||
- Your program includes a header file, e.g., ptrace.h, which eventually
|
||||
pulls in some header files containing file scope host assembly codes.
|
||||
- You can add "-fno-jump-tables" to work around the switch table issue.
|
||||
|
||||
Otherwise, you can use bpf target. Additionally, you _must_ use bpf target
|
||||
when:
|
||||
|
||||
- Your program uses data structures with pointer or long / unsigned long
|
||||
types that interface with BPF helpers or context data structures. Access
|
||||
into these structures is verified by the BPF verifier and may result
|
||||
in verification failures if the native architecture is not aligned with
|
||||
the BPF architecture, e.g. 64-bit. An example of this is
BPF_PROG_TYPE_SK_MSG, which requires '-target bpf'.
|
||||
|
||||
Happy BPF hacking!
|
|
@ -0,0 +1,14 @@
|
|||
STMicroelectronics STM32 Platforms System Controller
|
||||
|
||||
Properties:
|
||||
- compatible : should contain two values. The first value must be:
    - "st,stm32mp157-syscfg" - for stm32mp157 based SoCs,
  the second value must always be "syscon".
|
||||
- reg : offset and length of the register set.
|
||||
|
||||
Example:
|
||||
syscfg: syscon@50020000 {
|
||||
compatible = "st,stm32mp157-syscfg", "syscon";
|
||||
reg = <0x50020000 0x400>;
|
||||
};
|
||||
|
|
@ -82,8 +82,6 @@ linked into one DSA cluster.
|
|||
|
||||
switch0: switch0@0 {
|
||||
compatible = "marvell,mv88e6085";
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
reg = <0>;
|
||||
|
||||
dsa,member = <0 0>;
|
||||
|
@ -135,8 +133,6 @@ linked into one DSA cluster.
|
|||
|
||||
switch1: switch1@0 {
|
||||
compatible = "marvell,mv88e6085";
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
reg = <0>;
|
||||
|
||||
dsa,member = <0 1>;
|
||||
|
@ -204,8 +200,6 @@ linked into one DSA cluster.
|
|||
|
||||
switch2: switch2@0 {
|
||||
compatible = "marvell,mv88e6085";
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
reg = <0>;
|
||||
|
||||
dsa,member = <0 2>;
|
||||
|
|
|
@ -2,7 +2,10 @@
|
|||
|
||||
Required properties:
|
||||
|
||||
- compatible: should be "qca,qca8337"
|
||||
- compatible: should be one of:
|
||||
"qca,qca8334"
|
||||
"qca,qca8337"
|
||||
|
||||
- #size-cells: must be 0
|
||||
- #address-cells: must be 1
|
||||
|
||||
|
@ -14,6 +17,20 @@ port and PHY id, each subnode describing a port needs to have a valid phandle
|
|||
referencing the internal PHY connected to it. The CPU port of this switch is
|
||||
always port 0.
|
||||
|
||||
A CPU port node has the following optional node:
|
||||
|
||||
- fixed-link : Fixed-link subnode describing a link to a non-MDIO
|
||||
managed entity. See
|
||||
Documentation/devicetree/bindings/net/fixed-link.txt
|
||||
for details.
|
||||
|
||||
For QCA8K the 'fixed-link' sub-node supports only the following properties:
|
||||
|
||||
- 'speed' (integer, mandatory), to indicate the link speed. Accepted
|
||||
values are 10, 100 and 1000
|
||||
- 'full-duplex' (boolean, optional), to indicate that full duplex is
|
||||
used. When absent, half duplex is assumed.
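
The constraints on the 'fixed-link' sub-node can be summarised in a few
lines of Python. This is a sketch for illustration, not driver code: the
function name parse_fixed_link is hypothetical, and the sub-node is
modelled as a plain dict in which a boolean property is expressed by the
presence of its key.

```python
ACCEPTED_SPEEDS = (10, 100, 1000)


def parse_fixed_link(props):
    """Validate a 'fixed-link' sub-node and return (speed, duplex).

    props maps property names to values; a boolean property such as
    'full-duplex' counts as set when its key is present.
    """
    if "speed" not in props:
        raise ValueError("'speed' is mandatory")
    speed = props["speed"]
    if speed not in ACCEPTED_SPEEDS:
        raise ValueError(f"unsupported speed {speed}, expected one of {ACCEPTED_SPEEDS}")
    # 'full-duplex' is optional; half duplex is assumed when it is absent.
    duplex = "full" if "full-duplex" in props else "half"
    return speed, duplex
```

For instance, a sub-node with speed = <1000> and full-duplex maps to
(1000, "full"), while one carrying only speed = <100> maps to
(100, "half").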
|
||||
|
||||
Example:
|
||||
|
||||
|
||||
|
@ -53,6 +70,10 @@ Example:
|
|||
label = "cpu";
|
||||
ethernet = <&gmac1>;
|
||||
phy-mode = "rgmii";
|
||||
fixed-link {
|
||||
speed = 1000;
|
||||
full-duplex;
|
||||
};
|
||||
};
|
||||
|
||||
port@1 {
|
||||
|
|
|
@ -7,6 +7,7 @@ Required properties:
|
|||
- compatible: must be one of the following strings:
|
||||
"allwinner,sun8i-a83t-emac"
|
||||
"allwinner,sun8i-h3-emac"
|
||||
"allwinner,sun8i-r40-gmac"
|
||||
"allwinner,sun8i-v3s-emac"
|
||||
"allwinner,sun50i-a64-emac"
|
||||
- reg: address and length of the register for the device.
|
||||
|
@ -20,18 +21,18 @@ Required properties:
|
|||
- phy-handle: See ethernet.txt
|
||||
- #address-cells: shall be 1
|
||||
- #size-cells: shall be 0
|
||||
- syscon: A phandle to the syscon of the SoC with one of the following
|
||||
compatible string:
|
||||
- allwinner,sun8i-h3-system-controller
|
||||
- allwinner,sun8i-v3s-system-controller
|
||||
- allwinner,sun50i-a64-system-controller
|
||||
- allwinner,sun8i-a83t-system-controller
|
||||
- syscon: A phandle to the device containing the EMAC or GMAC clock register
|
||||
|
||||
Optional properties:
|
||||
- allwinner,tx-delay-ps: TX clock delay chain value in ps. Range value is 0-700. Default is 0)
|
||||
- allwinner,rx-delay-ps: RX clock delay chain value in ps. Range value is 0-3100. Default is 0)
|
||||
Both delay properties need to be a multiple of 100. They control the delay for
|
||||
external PHY.
|
||||
- allwinner,tx-delay-ps: TX clock delay chain value in ps.
|
||||
Range is 0-700. Default is 0.
|
||||
Unavailable for allwinner,sun8i-r40-gmac
|
||||
- allwinner,rx-delay-ps: RX clock delay chain value in ps.
|
||||
Range is 0-3100. Default is 0.
|
||||
Range is 0-700 for allwinner,sun8i-r40-gmac
|
||||
Both delay properties need to be a multiple of 100. They control the
|
||||
clock delay for external RGMII PHY. They do not apply to the internal
|
||||
PHY or external non-RGMII PHYs.
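
The delay constraints above can be expressed as a small validation
routine. The following Python sketch is an assumption-laden illustration
rather than the driver's logic: the function names are hypothetical, and
an absent property is modelled as None.

```python
R40_GMAC = "allwinner,sun8i-r40-gmac"


def _check_delay(name, value, maximum):
    """Enforce the range and the multiple-of-100 rule for one delay."""
    if not 0 <= value <= maximum:
        raise ValueError(f"{name}: {value} is outside 0-{maximum}")
    if value % 100:
        raise ValueError(f"{name}: {value} is not a multiple of 100")


def check_delays(compatible, tx_delay_ps=None, rx_delay_ps=None):
    """Return the effective (tx, rx) delays in ps, or raise ValueError."""
    if compatible == R40_GMAC:
        # The TX delay property is unavailable on the R40 GMAC.
        if tx_delay_ps is not None:
            raise ValueError("allwinner,tx-delay-ps is unavailable for " + R40_GMAC)
        tx = None
        rx_max = 700
    else:
        tx = 0 if tx_delay_ps is None else tx_delay_ps
        _check_delay("allwinner,tx-delay-ps", tx, 700)
        rx_max = 3100
    # Both delays default to 0 when the property is absent.
    rx = 0 if rx_delay_ps is None else rx_delay_ps
    _check_delay("allwinner,rx-delay-ps", rx, rx_max)
    return tx, rx
```

Such a check only covers the value rules; whether the delays take effect
at all still depends on an external RGMII PHY being used.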
|
||||
|
||||
Optional properties for the following compatibles:
|
||||
- "allwinner,sun8i-h3-emac",
|
||||
|
|
|
@ -86,70 +86,4 @@ Example:
|
|||
|
||||
* Gianfar PTP clock nodes
|
||||
|
||||
General Properties:
|
||||
|
||||
- compatible Should be "fsl,etsec-ptp"
|
||||
- reg Offset and length of the register set for the device
|
||||
- interrupts There should be at least two interrupts. Some devices
|
||||
have as many as four PTP related interrupts.
|
||||
|
||||
Clock Properties:
|
||||
|
||||
- fsl,cksel Timer reference clock source.
|
||||
- fsl,tclk-period Timer reference clock period in nanoseconds.
|
||||
- fsl,tmr-prsc Prescaler, divides the output clock.
|
||||
- fsl,tmr-add Frequency compensation value.
|
||||
- fsl,tmr-fiper1 Fixed interval period pulse generator.
|
||||
- fsl,tmr-fiper2 Fixed interval period pulse generator.
|
||||
- fsl,max-adj Maximum frequency adjustment in parts per billion.
|
||||
|
||||
These properties set the operational parameters for the PTP
|
||||
clock. You must choose these carefully for the clock to work right.
|
||||
Here is how to figure good values:
|
||||
|
||||
TimerOsc = selected reference clock MHz
|
||||
tclk_period = desired clock period nanoseconds
|
||||
NominalFreq = 1000 / tclk_period MHz
|
||||
FreqDivRatio = TimerOsc / NominalFreq (must be greater than 1.0)
|
||||
tmr_add = ceil(2^32 / FreqDivRatio)
|
||||
OutputClock = NominalFreq / tmr_prsc MHz
|
||||
PulseWidth = 1 / OutputClock microseconds
|
||||
FiperFreq1 = desired frequency in Hz
|
||||
FiperDiv1 = 1000000 * OutputClock / FiperFreq1
|
||||
tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period
|
||||
max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1
|
||||
|
||||
The calculation for tmr_fiper2 is the same as for tmr_fiper1. The
|
||||
driver expects that tmr_fiper1 will be correctly set to produce a 1
|
||||
Pulse Per Second (PPS) signal, since this will be offered to the PPS
|
||||
subsystem to synchronize the Linux clock.
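
The formulas above can be worked through with exact arithmetic. The
following Python sketch is only an illustration: the function name
ptp_clock_params is hypothetical, frequencies are passed in Hz rather than
MHz, and rounding max_adj down is an assumption since the formula does not
state a rounding rule.

```python
import math
from fractions import Fraction


def ptp_clock_params(timer_osc_hz, tclk_period_ns, tmr_prsc, fiper_freq_hz):
    """Derive tmr_add, tmr_fiper and max_adj from the chosen clock setup."""
    # NominalFreq = 1000 / tclk_period MHz, kept in Hz as an exact fraction.
    nominal_hz = Fraction(10**9, tclk_period_ns)
    freq_div_ratio = Fraction(timer_osc_hz) / nominal_hz
    if freq_div_ratio <= 1:
        raise ValueError("FreqDivRatio must be greater than 1.0")
    tmr_add = math.ceil(Fraction(2**32) / freq_div_ratio)
    output_clock_hz = nominal_hz / tmr_prsc
    # FiperDiv = 1000000 * OutputClock[MHz] / FiperFreq[Hz]
    #          = OutputClock[Hz] / FiperFreq[Hz]
    fiper_div = output_clock_hz / fiper_freq_hz
    tmr_fiper = tmr_prsc * tclk_period_ns * fiper_div - tclk_period_ns
    # Assumption: fractional results are rounded down.
    max_adj = math.floor(10**9 * (freq_div_ratio - 1)) - 1
    return {
        "tmr_add": tmr_add,
        "tmr_fiper": math.floor(tmr_fiper),
        "max_adj": max_adj,
    }


if __name__ == "__main__":
    # Assumed reference clock of 166.666666 MHz, 10 ns period, prescaler 100,
    # and a 1 Hz (PPS) output on fiper1.
    params = ptp_clock_params(166_666_666, 10, 100, 1)
    print(hex(params["tmr_add"]), hex(params["tmr_fiper"]))
```

With the assumed 166.666666 MHz reference clock, tmr_add and tmr_fiper1
come out as 0x999999A4 and 0x3B9AC9F6, and a 10 kHz output gives
0x00018696 for tmr_fiper2. max_adj depends on the exact oscillator value
and on rounding, so it need not coincide with a value derived from a
rounded ratio.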
|
||||
|
||||
Reference clock source is determined by the value, which is held
in CKSEL bits in TMR_CTRL register. "fsl,cksel" property keeps the
value, which will be directly written in those bits, that is why,
according to the reference manual, the following clock sources can be used:
|
||||
|
||||
<0> - external high precision timer reference clock (TSEC_TMR_CLK
|
||||
input is used for this purpose);
|
||||
<1> - eTSEC system clock;
|
||||
<2> - eTSEC1 transmit clock;
|
||||
<3> - RTC clock input.
|
||||
|
||||
When this attribute is not used, eTSEC system clock will serve as
|
||||
IEEE 1588 timer reference clock.
|
||||
|
||||
Example:
|
||||
|
||||
ptp_clock@24e00 {
|
||||
compatible = "fsl,etsec-ptp";
|
||||
reg = <0x24E00 0xB0>;
|
||||
interrupts = <12 0x8 13 0x8>;
|
||||
interrupt-parent = < &ipic >;
|
||||
fsl,cksel = <1>;
|
||||
fsl,tclk-period = <10>;
|
||||
fsl,tmr-prsc = <100>;
|
||||
fsl,tmr-add = <0x999999A4>;
|
||||
fsl,tmr-fiper1 = <0x3B9AC9F6>;
|
||||
fsl,tmr-fiper2 = <0x00018696>;
|
||||
fsl,max-adj = <659999998>;
|
||||
};
|
||||
Refer to Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
|
||||
|
|
|
@ -11,6 +11,7 @@ Required properties on all platforms:
|
|||
- "amlogic,meson8b-dwmac"
|
||||
- "amlogic,meson8m2-dwmac"
|
||||
- "amlogic,meson-gxbb-dwmac"
|
||||
- "amlogic,meson-axg-dwmac"
|
||||
Additionally "snps,dwmac" and any applicable more
|
||||
detailed version number described in net/stmmac.txt
|
||||
should be used.
|
||||
|
|
|
@ -0,0 +1,54 @@
|
|||
Microchip LAN78xx Gigabit Ethernet controller
|
||||
|
||||
The LAN78XX devices are usually configured by programming their OTP or with
|
||||
an external EEPROM, but some platforms (e.g. Raspberry Pi 3 B+) have neither.
|
||||
The Device Tree properties, if present, override the OTP and EEPROM.
|
||||
|
||||
Required properties:
|
||||
- compatible: Should be one of "usb424,7800", "usb424,7801" or "usb424,7850".
|
||||
|
||||
Optional properties:
|
||||
- local-mac-address: see ethernet.txt
|
||||
- mac-address: see ethernet.txt
|
||||
|
||||
Optional properties of the embedded PHY:
|
||||
- microchip,led-modes: a 0..4 element vector, with each element configuring
|
||||
the operating mode of an LED. Omitted LEDs are turned off. Allowed values
|
||||
are defined in "include/dt-bindings/net/microchip-lan78xx.h".
|
||||
|
||||
Example:
|
||||
|
||||
/* Based on the configuration for a Raspberry Pi 3 B+ */
|
||||
&usb {
|
||||
usb-port@1 {
|
||||
compatible = "usb424,2514";
|
||||
reg = <1>;
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
|
||||
usb-port@1 {
|
||||
compatible = "usb424,2514";
|
||||
reg = <1>;
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
|
||||
ethernet: ethernet@1 {
|
||||
compatible = "usb424,7800";
|
||||
reg = <1>;
|
||||
local-mac-address = [ 00 11 22 33 44 55 ];
|
||||
|
||||
mdio {
|
||||
#address-cells = <0x1>;
|
||||
#size-cells = <0x0>;
|
||||
eth_phy: ethernet-phy@1 {
|
||||
reg = <1>;
|
||||
microchip,led-modes = <
|
||||
LAN78XX_LINK_1000_ACTIVITY
|
||||
LAN78XX_LINK_10_100_ACTIVITY
|
||||
>;
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
|
@ -0,0 +1,26 @@
|
|||
Microsemi MII Management Controller (MIIM) / MDIO
|
||||
=================================================
|
||||
|
||||
Properties:
|
||||
- compatible: must be "mscc,ocelot-miim"
|
||||
- reg: The base address of the MDIO bus controller register bank. Optionally, a
|
||||
second register bank can be defined if there is an associated reset register
|
||||
for internal PHYs
|
||||
- #address-cells: Must be <1>.
|
||||
- #size-cells: Must be <0>. MDIO addresses have no size component.
|
||||
- interrupts: interrupt specifier (refer to the interrupt binding)
|
||||
|
||||
Typically an MDIO bus might have several children.
|
||||
|
||||
Example:
|
||||
mdio@107009c {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
compatible = "mscc,ocelot-miim";
|
||||
reg = <0x107009c 0x36>, <0x10700f0 0x8>;
|
||||
interrupts = <14>;
|
||||
|
||||
phy0: ethernet-phy@0 {
|
||||
reg = <0>;
|
||||
};
|
||||
};
|
|
@ -0,0 +1,82 @@
|
|||
Microsemi Ocelot network Switch
|
||||
===============================
|
||||
|
||||
The Microsemi Ocelot network switch can be found on Microsemi SoCs (VSC7513,
|
||||
VSC7514)
|
||||
|
||||
Required properties:
|
||||
- compatible: Should be "mscc,vsc7514-switch"
|
||||
- reg: Must contain an (offset, length) pair of the register set for each
|
||||
entry in reg-names.
|
||||
- reg-names: Must include the following entries:
|
||||
- "sys"
|
||||
- "rew"
|
||||
- "qs"
|
||||
- "hsio"
|
||||
- "qsys"
|
||||
- "ana"
|
||||
- "portX" with X from 0 to the number of last port index available on that
|
||||
switch
|
||||
- interrupts: Should contain the switch interrupts for frame extraction and
|
||||
frame injection
|
||||
- interrupt-names: should contain the interrupt names: "xtr", "inj"
|
||||
- ethernet-ports: A container for child nodes representing switch ports.
|
||||
|
||||
The ethernet-ports container has the following properties
|
||||
|
||||
Required properties:
|
||||
|
||||
- #address-cells: Must be 1
|
||||
- #size-cells: Must be 0
|
||||
|
||||
Each port node must have the following mandatory properties:
|
||||
- reg: Describes the port address in the switch
|
||||
|
||||
Port nodes may also contain the following optional standardised
|
||||
properties, described in binding documents:
|
||||
|
||||
- phy-handle: Phandle to a PHY on an MDIO bus. See
|
||||
Documentation/devicetree/bindings/net/ethernet.txt for details.
|
||||
|
||||
Example:
|
||||
|
||||
switch@1010000 {
|
||||
compatible = "mscc,vsc7514-switch";
|
||||
reg = <0x1010000 0x10000>,
|
||||
<0x1030000 0x10000>,
|
||||
<0x1080000 0x100>,
|
||||
<0x10d0000 0x10000>,
|
||||
<0x11e0000 0x100>,
|
||||
<0x11f0000 0x100>,
|
||||
<0x1200000 0x100>,
|
||||
<0x1210000 0x100>,
|
||||
<0x1220000 0x100>,
|
||||
<0x1230000 0x100>,
|
||||
<0x1240000 0x100>,
|
||||
<0x1250000 0x100>,
|
||||
<0x1260000 0x100>,
|
||||
<0x1270000 0x100>,
|
||||
<0x1280000 0x100>,
|
||||
<0x1800000 0x80000>,
|
||||
<0x1880000 0x10000>;
|
||||
reg-names = "sys", "rew", "qs", "hsio", "port0",
|
||||
"port1", "port2", "port3", "port4", "port5",
|
||||
"port6", "port7", "port8", "port9", "port10",
|
||||
"qsys", "ana";
|
||||
interrupts = <21 22>;
|
||||
interrupt-names = "xtr", "inj";
|
||||
|
||||
ethernet-ports {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
|
||||
port0: port@0 {
|
||||
reg = <0>;
|
||||
phy-handle = <&phy0>;
|
||||
};
|
||||
port1: port@1 {
|
||||
reg = <1>;
|
||||
phy-handle = <&phy1>;
|
||||
};
|
||||
};
|
||||
};
|
|
@ -0,0 +1,30 @@
|
|||
Qualcomm Bluetooth Chips
|
||||
------------------------
|
||||
|
||||
This documents the binding structure and common properties for serial
|
||||
attached Qualcomm devices.
|
||||
|
||||
Serial attached Qualcomm devices shall be a child node of the host UART
|
||||
device the slave device is attached to.
|
||||
|
||||
Required properties:
|
||||
- compatible: should contain one of the following:
|
||||
* "qcom,qca6174-bt"
|
||||
|
||||
Optional properties:
|
||||
- enable-gpios: gpio specifier used to enable chip
|
||||
- clocks: clock provided to the controller (SUSCLK_32KHZ)
|
||||
|
||||
Example:
|
||||
|
||||
serial@7570000 {
|
||||
label = "BT-UART";
|
||||
status = "okay";
|
||||
|
||||
bluetooth {
|
||||
compatible = "qcom,qca6174-bt";
|
||||
|
||||
enable-gpios = <&pm8994_gpios 19 GPIO_ACTIVE_HIGH>;
|
||||
clocks = <&divclk4>;
|
||||
};
|
||||
};
|
|
@ -7,11 +7,11 @@ Required properties:
|
|||
"sff,sfp" for SFP modules
|
||||
"sff,sff" for soldered down SFF modules
|
||||
|
||||
Optional Properties:
|
||||
|
||||
- i2c-bus : phandle of an I2C bus controller for the SFP two wire serial
|
||||
interface
|
||||
|
||||
Optional Properties:
|
||||
|
||||
- mod-def0-gpios : GPIO phandle and a specifier of the MOD-DEF0 (AKA Mod_ABS)
|
||||
module presence input gpio signal, active (module absent) high. Must
|
||||
not be present for SFF modules
|
||||
|
|
|
@ -14,6 +14,7 @@ Required properties:
|
|||
"renesas,ether-r8a7791" if the device is a part of R8A7791 SoC.
|
||||
"renesas,ether-r8a7793" if the device is a part of R8A7793 SoC.
|
||||
"renesas,ether-r8a7794" if the device is a part of R8A7794 SoC.
|
||||
"renesas,gether-r8a77980" if the device is a part of R8A77980 SoC.
|
||||
"renesas,ether-r7s72100" if the device is a part of R7S72100 SoC.
|
||||
"renesas,rcar-gen1-ether" for a generic R-Car Gen1 device.
|
||||
"renesas,rcar-gen2-ether" for a generic R-Car Gen2 or RZ/G1
|
||||
|
|
|
@ -13,13 +13,25 @@ Required properties:
|
|||
- reg: Address where registers are mapped and size of region.
|
||||
- interrupts: Should contain the MAC interrupt.
|
||||
- phy-mode: See ethernet.txt in the same directory. Allow to choose
|
||||
"rgmii", "rmii", or "mii" according to the PHY.
|
||||
"rgmii", "rmii", "mii", or "internal" according to the PHY.
|
||||
The acceptable mode is SoC-dependent.
|
||||
- phy-handle: Should point to the external phy device.
|
||||
See ethernet.txt file in the same directory.
|
||||
- clocks: A phandle to the clock for the MAC.
|
||||
For Pro4 SoC, that is "socionext,uniphier-pro4-ave4",
|
||||
another MAC clock, GIO bus clock and PHY clock are also required.
|
||||
- clock-names: Should contain
|
||||
- "ether", "ether-gb", "gio", "ether-phy" for Pro4 SoC
|
||||
- "ether" for others
|
||||
- resets: A phandle to the reset control for the MAC. For Pro4 SoC,
|
||||
GIO bus reset is also required.
|
||||
- reset-names: Should contain
|
||||
- "ether", "gio" for Pro4 SoC
|
||||
- "ether" for others
|
||||
- socionext,syscon-phy-mode: A phandle to syscon with one argument
|
||||
that configures phy mode. The argument is the ID of MAC instance.
|
||||
|
||||
Optional properties:
|
||||
- resets: A phandle to the reset control for the MAC.
|
||||
- local-mac-address: See ethernet.txt in the same directory.
|
||||
|
||||
Required subnode:
|
||||
|
@ -34,8 +46,11 @@ Example:
|
|||
interrupts = <0 66 4>;
|
||||
phy-mode = "rgmii";
|
||||
phy-handle = <&ethphy>;
|
||||
clock-names = "ether";
|
||||
clocks = <&sys_clk 6>;
|
||||
reset-names = "ether";
|
||||
resets = <&sys_rst 6>;
|
||||
socionext,syscon-phy-mode = <&soc_glue 0>;
|
||||
local-mac-address = [00 00 00 00 00 00];
|
||||
|
||||
mdio {
|
||||
|
|
|
@ -6,14 +6,28 @@ Please see stmmac.txt for the other unchanged properties.
|
|||
The device node has following properties.
|
||||
|
||||
Required properties:
|
||||
- compatible: Should be "st,stm32-dwmac" to select glue, and
|
||||
- compatible: For MCU family should be "st,stm32-dwmac" to select glue, and
|
||||
"snps,dwmac-3.50a" to select IP version.
|
||||
For MPU family should be "st,stm32mp1-dwmac" to select
|
||||
glue, and "snps,dwmac-4.20a" to select IP version.
|
||||
- clocks: Must contain a phandle for each entry in clock-names.
|
||||
- clock-names: Should be "stmmaceth" for the host clock.
|
||||
Should be "mac-clk-tx" for the MAC TX clock.
|
||||
Should be "mac-clk-rx" for the MAC RX clock.
|
||||
For MPU family, also add "ethstp" for the power mode clock and
|
||||
"syscfg-clk" for SYSCFG clock.
|
||||
- interrupt-names: Should contain a list of interrupt names corresponding to
|
||||
the interrupts in the interrupts property, if available.
|
||||
Should be "macirq" for the main MAC IRQ
|
||||
Should be "eth_wake_irq" for the interrupt which wakes up the system
|
||||
- st,syscon : Should be phandle/offset pair. The phandle to the syscon node which
|
||||
encompases the glue register, and the offset of the control register.
|
||||
encompasses the glue register, and the offset of the control register.
|
||||
|
||||
Optional properties:
|
||||
- clock-names: For MPU family "mac-clk-ck" for PHY without quartz
|
||||
- st,int-phyclk (boolean) : valid only where the PHY does not have a quartz and needs to be clocked
|
||||
by RCC
|
||||
|
||||
Example:
|
||||
|
||||
ethernet@40028000 {
|
||||
|
|
|
@ -4,6 +4,7 @@ Required properties:
|
|||
- compatible: Should be one of the following:
|
||||
* "qcom,ath10k"
|
||||
* "qcom,ipq4019-wifi"
|
||||
* "qcom,wcn3990-wifi"
|
||||
|
||||
PCI based devices use compatible string "qcom,ath10k" and take calibration
|
||||
data along with board specific data via "qcom,ath10k-calibration-data".
|
||||
|
@ -18,8 +19,12 @@ In general, entry "qcom,ath10k-pre-calibration-data" and
|
|||
"qcom,ath10k-calibration-data" conflict with each other and only one
|
||||
can be provided per device.
|
||||
|
||||
SNOC based devices (i.e. wcn3990) use compatible string "qcom,wcn3990-wifi".
|
||||
|
||||
Optional properties:
|
||||
- reg: Address and length of the register set for the device.
|
||||
- reg-names: Must include the list of following reg names,
|
||||
"membase"
|
||||
- resets: Must contain an entry for each entry in reset-names.
|
||||
See ../reset/reset.txt for details.
|
||||
- reset-names: Must include the list of following reset names,
|
||||
|
@ -49,6 +54,8 @@ Optional properties:
|
|||
hw versions.
|
||||
- qcom,ath10k-pre-calibration-data : pre calibration data as an array,
|
||||
the length can vary between hw versions.
|
||||
- <supply-name>-supply: handle to the regulator device tree node
|
||||
optional "supply-name" is "vdd-0.8-cx-mx".
|
||||
|
||||
Example (to supply the calibration data alone):
|
||||
|
||||
|
@ -119,3 +126,27 @@ wifi0: wifi@a000000 {
|
|||
qcom,msi_base = <0x40>;
|
||||
qcom,ath10k-pre-calibration-data = [ 01 02 03 ... ];
|
||||
};
|
||||
|
||||
Example (to supply wcn3990 SoC wifi block details):
|
||||
|
||||
wifi@18000000 {
|
||||
compatible = "qcom,wcn3990-wifi";
|
||||
reg = <0x18800000 0x800000>;
|
||||
reg-names = "membase";
|
||||
clocks = <&clock_gcc clk_aggre2_noc_clk>;
|
||||
clock-names = "smmu_aggre2_noc_clk";
|
||||
interrupts =
|
||||
<0 130 0 /* CE0 */ >,
|
||||
<0 131 0 /* CE1 */ >,
|
||||
<0 132 0 /* CE2 */ >,
|
||||
<0 133 0 /* CE3 */ >,
|
||||
<0 134 0 /* CE4 */ >,
|
||||
<0 135 0 /* CE5 */ >,
|
||||
<0 136 0 /* CE6 */ >,
|
||||
<0 137 0 /* CE7 */ >,
|
||||
<0 138 0 /* CE8 */ >,
|
||||
<0 139 0 /* CE9 */ >,
|
||||
<0 140 0 /* CE10 */ >,
|
||||
<0 141 0 /* CE11 */ >;
|
||||
vdd-0.8-cx-mx-supply = <&pm8998_l5>;
|
||||
};
|
||||
|
|
|
@ -0,0 +1,69 @@
|
|||
* Freescale QorIQ 1588 timer based PTP clock
|
||||
|
||||
General Properties:
|
||||
|
||||
- compatible Should be "fsl,etsec-ptp"
|
||||
- reg Offset and length of the register set for the device
|
||||
- interrupts There should be at least two interrupts. Some devices
|
||||
have as many as four PTP related interrupts.
|
||||
|
||||
Clock Properties:
|
||||
|
||||
- fsl,cksel Timer reference clock source.
|
||||
- fsl,tclk-period Timer reference clock period in nanoseconds.
|
||||
- fsl,tmr-prsc Prescaler, divides the output clock.
|
||||
- fsl,tmr-add Frequency compensation value.
|
||||
- fsl,tmr-fiper1 Fixed interval period pulse generator.
|
||||
- fsl,tmr-fiper2 Fixed interval period pulse generator.
|
||||
- fsl,max-adj Maximum frequency adjustment in parts per billion.
|
||||
|
||||
These properties set the operational parameters for the PTP
|
||||
clock. You must choose these carefully for the clock to work right.
|
||||
Here is how to figure good values:
|
||||
|
||||
TimerOsc = selected reference clock MHz
|
||||
tclk_period = desired clock period nanoseconds
|
||||
NominalFreq = 1000 / tclk_period MHz
|
||||
FreqDivRatio = TimerOsc / NominalFreq (must be greater than 1.0)
|
||||
tmr_add = ceil(2^32 / FreqDivRatio)
|
||||
OutputClock = NominalFreq / tmr_prsc MHz
|
||||
PulseWidth = 1 / OutputClock microseconds
|
||||
FiperFreq1 = desired frequency in Hz
|
||||
FiperDiv1 = 1000000 * OutputClock / FiperFreq1
|
||||
tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period
|
||||
max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1
|
||||
|
||||
The calculation for tmr_fiper2 is the same as for tmr_fiper1. The
|
||||
driver expects that tmr_fiper1 will be correctly set to produce a 1
|
||||
Pulse Per Second (PPS) signal, since this will be offered to the PPS
|
||||
subsystem to synchronize the Linux clock.
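The formulas above can be sketched in integer arithmetic as follows. This is
only an illustration, not the driver's code: the function names are
hypothetical, frequencies are taken in Hz rather than MHz, and it assumes
that tclk_period divides 1000000000 evenly and that the reference clock is
faster than the nominal frequency. With an assumed reference clock of
166666666 Hz, a tclk_period of 10 ns and a prescaler of 100, it yields the
tmr-add and fiper values used in the example node below.

```c
#include <stdint.h>

/* Integer division rounding up, used to implement ceil(). */
uint64_t div_round_up(uint64_t num, uint64_t den)
{
	return (num + den - 1) / den;
}

/*
 * tmr_add = ceil(2^32 / FreqDivRatio), where
 * FreqDivRatio = TimerOsc / NominalFreq and
 * NominalFreq  = 1000000000 / tclk_period (in Hz).
 * timer_osc_hz is the selected reference clock (an assumed input).
 */
uint32_t ptp_tmr_add(uint64_t timer_osc_hz, uint32_t tclk_period_ns)
{
	uint64_t nominal_hz = 1000000000ULL / tclk_period_ns;

	return (uint32_t)div_round_up((1ULL << 32) * nominal_hz, timer_osc_hz);
}

/*
 * tmr_fiper = tmr_prsc * tclk_period * FiperDiv - tclk_period, where
 * FiperDiv = OutputClock / FiperFreq and OutputClock = NominalFreq / tmr_prsc.
 */
uint32_t ptp_tmr_fiper(uint32_t tmr_prsc, uint32_t tclk_period_ns,
		       uint32_t fiper_freq_hz)
{
	uint64_t nominal_hz = 1000000000ULL / tclk_period_ns;
	uint64_t output_hz = nominal_hz / tmr_prsc;
	uint64_t fiper_div = output_hz / fiper_freq_hz;

	return (uint32_t)((uint64_t)tmr_prsc * tclk_period_ns * fiper_div -
			  tclk_period_ns);
}
```

For instance, ptp_tmr_fiper(100, 10, 1) corresponds to a 1 PPS signal on
fiper1 and ptp_tmr_fiper(100, 10, 10000) to a 10 kHz signal on fiper2.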
|
||||
|
||||
Reference clock source is determined by the value, which is held
|
||||
in CKSEL bits in TMR_CTRL register. "fsl,cksel" property keeps the
|
||||
value, which will be directly written in those bits, that is why,
|
||||
according to the reference manual, the following clock sources can be used:
|
||||
|
||||
<0> - external high precision timer reference clock (TSEC_TMR_CLK
|
||||
input is used for this purpose);
|
||||
<1> - eTSEC system clock;
|
||||
<2> - eTSEC1 transmit clock;
|
||||
<3> - RTC clock input.
|
||||
|
||||
When this attribute is not used, eTSEC system clock will serve as
|
||||
IEEE 1588 timer reference clock.
|
||||
|
||||
Example:
|
||||
|
||||
ptp_clock@24e00 {
|
||||
compatible = "fsl,etsec-ptp";
|
||||
reg = <0x24E00 0xB0>;
|
||||
interrupts = <12 0x8 13 0x8>;
|
||||
interrupt-parent = < &ipic >;
|
||||
fsl,cksel = <1>;
|
||||
fsl,tclk-period = <10>;
|
||||
fsl,tmr-prsc = <100>;
|
||||
fsl,tmr-add = <0x999999A4>;
|
||||
fsl,tmr-fiper1 = <0x3B9AC9F6>;
|
||||
fsl,tmr-fiper2 = <0x00018696>;
|
||||
fsl,max-adj = <659999998>;
|
||||
};
|
|
@ -17,7 +17,8 @@ pool management.
|
|||
|
||||
|
||||
Required properties:
|
||||
- compatible : Must be "ti,keystone-navigator-qmss";
|
||||
- compatible : Must be "ti,keystone-navigator-qmss".
|
||||
: Must be "ti,66ak2g-navss-qm" for QMSS on K2G SoC.
|
||||
- clocks : phandle to the reference clock for this device.
|
||||
- queue-range : <start number> total range of queue numbers for the device.
|
||||
- linkram0 : <address size> for internal link ram, where size is the total
|
||||
|
@ -39,6 +40,12 @@ Required properties:
|
|||
- Descriptor memory setup region.
|
||||
- Queue Management/Queue Proxy region for queue Push.
|
||||
- Queue Management/Queue Proxy region for queue Pop.
|
||||
|
||||
For QMSS on K2G SoC, the following QM reg indexes are used, in this order:
|
||||
- Queue Peek region.
|
||||
- Queue configuration region.
|
||||
- Queue Management/Queue Proxy region for queue Push/Pop.
|
||||
|
||||
- queue-pools : child node classifying the queue ranges into pools.
|
||||
Queue ranges are grouped into 3 types of pools:
|
||||
- qpend : pool of qpend(interruptible) queues
|
||||
|
|
|
@ -5,6 +5,7 @@ Written 1996 by Gero Kuhlmann <gero@gkminix.han.de>
|
|||
Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
|
||||
Updated 2006 by Nico Schottelius <nico-kernel-nfsroot@schottelius.org>
|
||||
Updated 2006 by Horms <horms@verge.net.au>
|
||||
Updated 2018 by Chris Novakovic <chris@chrisn.me.uk>
|
||||
|
||||
|
||||
|
||||
|
@ -79,7 +80,7 @@ nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
|
|||
|
||||
|
||||
ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
|
||||
<dns0-ip>:<dns1-ip>
|
||||
<dns0-ip>:<dns1-ip>:<ntp0-ip>
|
||||
|
||||
This parameter tells the kernel how to configure IP addresses of devices
|
||||
and also how to set up the IP routing table. It was originally called
|
||||
|
@ -110,6 +111,9 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
|
|||
will not be triggered if it is missing and NFS root is not
|
||||
in operation.
|
||||
|
||||
Value is exported to /proc/net/pnp with the prefix "bootserver "
|
||||
(see below).
|
||||
|
||||
Default: Determined using autoconfiguration.
|
||||
The address of the autoconfiguration server is used.
|
||||
|
||||
|
@ -123,10 +127,13 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
|
|||
|
||||
Default: Determined using autoconfiguration.
|
||||
|
||||
<hostname> Name of the client. May be supplied by autoconfiguration,
|
||||
but its absence will not trigger autoconfiguration.
|
||||
If specified and DHCP is used, the user provided hostname will
|
||||
be carried in the DHCP request to hopefully update DNS record.
|
||||
<hostname> Name of the client. If a '.' character is present, anything
|
||||
before the first '.' is used as the client's hostname, and anything
|
||||
after it is used as its NIS domain name. May be supplied by
|
||||
autoconfiguration, but its absence will not trigger autoconfiguration.
|
||||
If specified and DHCP is used, the user-provided hostname (and NIS
|
||||
domain name, if present) will be carried in the DHCP request; this
|
||||
may cause a DNS record to be created or updated for the client.
|
||||
|
||||
Default: Client IP address is used in ASCII notation.
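The hostname/NIS domain split described for <hostname> can be sketched as
below. This is a user-space illustration only, not the kernel's parsing
code; the function name and buffer handling are assumptions, and both output
buffers must be at least one byte long.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Split an "ip=" <hostname> field at the first '.' character:
 * the part before it is the hostname, the part after it is the
 * NIS domain name. Without a '.', the NIS domain is left empty.
 */
void split_hostname(const char *field, char *host, size_t host_len,
		    char *nis_domain, size_t domain_len)
{
	const char *dot = strchr(field, '.');
	size_t n = dot ? (size_t)(dot - field) : strlen(field);

	/* snprintf() truncates safely and always NUL-terminates. */
	snprintf(host, host_len, "%.*s", (int)n, field);
	snprintf(nis_domain, domain_len, "%s", dot ? dot + 1 : "");
}
```

Given the hypothetical field "client.example.org", this yields the hostname
"client" and the NIS domain name "example.org".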
|
||||
|
||||
|
@ -162,12 +169,55 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
|
|||
|
||||
Default: any
|
||||
|
||||
<dns0-ip> IP address of first nameserver.
|
||||
Value gets exported by /proc/net/pnp which is often linked
|
||||
on embedded systems by /etc/resolv.conf.
|
||||
<dns0-ip> IP address of primary nameserver.
|
||||
Value is exported to /proc/net/pnp with the prefix "nameserver "
|
||||
(see below).
|
||||
|
||||
<dns1-ip> IP address of second nameserver.
|
||||
Same as above.
|
||||
Default: None if not using autoconfiguration; determined
|
||||
automatically if using autoconfiguration.
|
||||
|
||||
<dns1-ip> IP address of secondary nameserver.
|
||||
See <dns0-ip>.
|
||||
|
||||
<ntp0-ip> IP address of a Network Time Protocol (NTP) server.
|
||||
Value is exported to /proc/net/ipconfig/ntp_servers, but is
|
||||
otherwise unused (see below).
|
||||
|
||||
Default: None if not using autoconfiguration; determined
|
||||
automatically if using autoconfiguration.
|
||||
|
||||
After configuration (whether manual or automatic) is complete, two files
|
||||
are created in the following format; lines are omitted if their respective
|
||||
value is empty following configuration:
|
||||
|
||||
- /proc/net/pnp:
|
||||
|
||||
#PROTO: <DHCP|BOOTP|RARP|MANUAL> (depending on configuration method)
|
||||
domain <dns-domain> (if autoconfigured, the DNS domain)
|
||||
nameserver <dns0-ip> (primary name server IP)
|
||||
nameserver <dns1-ip> (secondary name server IP)
|
||||
nameserver <dns2-ip> (tertiary name server IP)
|
||||
bootserver <server-ip> (NFS server IP)
|
||||
|
||||
- /proc/net/ipconfig/ntp_servers:
|
||||
|
||||
<ntp0-ip> (NTP server IP)
|
||||
<ntp1-ip> (NTP server IP)
|
||||
<ntp2-ip> (NTP server IP)
|
||||
|
||||
<dns-domain> and <dns2-ip> (in /proc/net/pnp) and <ntp1-ip> and <ntp2-ip>
|
||||
(in /proc/net/ipconfig/ntp_servers) are requested during autoconfiguration;
|
||||
they cannot be specified as part of the "ip=" kernel command line parameter.
|
||||
|
||||
Because the "domain" and "nameserver" options are recognised by DNS
|
||||
resolvers, /etc/resolv.conf is often linked to /proc/net/pnp on systems
|
||||
that use an NFS root filesystem.
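A user space tool that reads /proc/net/pnp directly could pick out values by
their prefix, roughly as sketched here. The helper name is hypothetical, and
the caller is assumed to have read one line and stripped its trailing
newline already.

```c
#include <stddef.h>
#include <string.h>

/*
 * If "line" starts with "key" followed by a single space (for example
 * "nameserver " or "bootserver "), return a pointer to the value that
 * follows; otherwise return NULL.
 */
const char *pnp_value(const char *line, const char *key)
{
	size_t klen = strlen(key);

	if (strncmp(line, key, klen) != 0 || line[klen] != ' ')
		return NULL;
	return line + klen + 1;
}
```

Lines such as "#PROTO: DHCP" match none of the keys and are simply skipped
by a caller that only looks for "domain", "nameserver" or "bootserver".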
|
||||
|
||||
Note that the kernel will not synchronise the system time with any NTP
|
||||
servers it discovers; this is the responsibility of a user space process
|
||||
(e.g. an initrd/initramfs script that passes the IP addresses listed in
|
||||
/proc/net/ipconfig/ntp_servers to an NTP client before mounting the real
|
||||
root filesystem if it is on NFS).
|
||||
|
||||
|
||||
nfsrootdebug
|
||||
|
|
|
@ -24,10 +24,10 @@ enum lowpan_lltypes.
|
|||
|
||||
To evaluate the private data you can usually do:
|
||||
|
||||
static inline sturct lowpan_priv_foobar *
|
||||
static inline struct lowpan_priv_foobar *
|
||||
lowpan_foobar_priv(struct net_device *dev)
|
||||
{
|
||||
return (sturct lowpan_priv_foobar *)lowpan_priv(dev)->priv;
|
||||
return (struct lowpan_priv_foobar *)lowpan_priv(dev)->priv;
|
||||
}
|
||||
|
||||
switch (dev->type) {
|
||||
|
|
|
@ -0,0 +1,312 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
======
|
||||
AF_XDP
|
||||
======
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
AF_XDP is an address family that is optimized for high performance
|
||||
packet processing.
|
||||
|
||||
This document assumes that the reader is familiar with BPF and XDP. If
|
||||
not, the Cilium project has an excellent reference guide at
|
||||
http://cilium.readthedocs.io/en/latest/bpf/.
|
||||
|
||||
Using the XDP_REDIRECT action from an XDP program, the program can
|
||||
redirect ingress frames to other XDP enabled netdevs, using the
|
||||
bpf_redirect_map() function. AF_XDP sockets enable the possibility for
|
||||
XDP programs to redirect frames to a memory buffer in a user-space
|
||||
application.
|
||||
|
||||
An AF_XDP socket (XSK) is created with the normal socket()
|
||||
syscall. Associated with each XSK are two rings: the RX ring and the
|
||||
TX ring. A socket can receive packets on the RX ring and it can send
|
||||
packets on the TX ring. These rings are registered and sized with the
|
||||
setsockopts XDP_RX_RING and XDP_TX_RING, respectively. It is mandatory
|
||||
to have at least one of these rings for each socket. An RX or TX
|
||||
descriptor ring points to a data buffer in a memory area called a
|
||||
UMEM. RX and TX can share the same UMEM so that a packet does not have
|
||||
to be copied between RX and TX. Moreover, if a packet needs to be kept
|
||||
for a while due to a possible retransmit, the descriptor that points
|
||||
to that packet can be changed to point to another and reused right
|
||||
away. This again avoids copying data.
|
||||
|
||||
The UMEM consists of a number of equally sized chunks. A descriptor in
|
||||
one of the rings references a frame by referencing its addr. The addr
|
||||
is simply an offset within the entire UMEM region. The user space
|
||||
allocates memory for this UMEM using whatever means it feels is most
|
||||
appropriate (malloc, mmap, huge pages, etc). This memory area is then
|
||||
registered with the kernel using the new setsockopt XDP_UMEM_REG. The
|
||||
UMEM also has two rings: the FILL ring and the COMPLETION ring. The
|
||||
fill ring is used by the application to send down addr for the kernel
|
||||
to fill in with RX packet data. References to these frames will then
|
||||
appear in the RX ring once each packet has been received. The
|
||||
completion ring, on the other hand, contains frame addr that the
|
||||
kernel has transmitted completely and can now be used again by user
|
||||
space, for either TX or RX. Thus, the frame addrs appearing in the
|
||||
completion ring are addrs that were previously transmitted using the
|
||||
TX ring. In summary, the RX and FILL rings are used for the RX path
|
||||
and the TX and COMPLETION rings are used for the TX path.
|
||||
|
||||
The socket is then finally bound with a bind() call to a device and a
|
||||
specific queue id on that device, and it is not until bind is
|
||||
completed that traffic starts to flow.
|
||||
|
||||
The UMEM can be shared between processes, if desired. If a process
|
||||
wants to do this, it simply skips the registration of the UMEM and its
|
||||
corresponding two rings, sets the XDP_SHARED_UMEM flag in the bind
|
||||
call and submits the XSK of the process it would like to share UMEM
|
||||
with as well as its own newly created XSK socket. The new process will
|
||||
then receive frame addr references in its own RX ring that point to
|
||||
this shared UMEM. Note that since the ring structures are
|
||||
single-consumer / single-producer (for performance reasons), the new
|
||||
process has to create its own socket with associated RX and TX rings,
|
||||
since it cannot share this with the other process. This is also the
|
||||
reason that there is only one set of FILL and COMPLETION rings per
|
||||
UMEM. It is the responsibility of a single process to handle the UMEM.
|
||||
|
||||
How are packets then distributed from an XDP program to the XSKs? There
|
||||
is a BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in full). The
|
||||
user-space application can place an XSK at an arbitrary place in this
|
||||
map. The XDP program can then redirect a packet to a specific index in
|
||||
this map and at this point XDP validates that the XSK in that map was
|
||||
indeed bound to that device and ring number. If not, the packet is
|
||||
dropped. If the map is empty at that index, the packet is also
|
||||
dropped. This also means that it is currently mandatory to have an XDP
|
||||
program loaded (and one XSK in the XSKMAP) to be able to get any
|
||||
traffic to user space through the XSK.
|
||||
|
||||
AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If the
|
||||
driver does not have support for XDP, or XDP_SKB is explicitly chosen
|
||||
when loading the XDP program, XDP_SKB mode is employed that uses SKBs
|
||||
together with the generic XDP support and copies out the data to user
|
||||
space. This is a fallback mode that works for any network device. On the other
|
||||
hand, if the driver has support for XDP, it will be used by the AF_XDP
|
||||
code to provide better performance, but there is still a copy of the
|
||||
data into user space.
|
||||
|
||||
Concepts
|
||||
========
|
||||
|
||||
In order to use an AF_XDP socket, a number of associated objects need
|
||||
to be set up.
|
||||
|
||||
Jonathan Corbet has also written an excellent article on LWN,
|
||||
"Accelerating networking with AF_XDP". It can be found at
|
||||
https://lwn.net/Articles/750845/.
|
||||
|
||||
UMEM
|
||||
----
|
||||
|
||||
UMEM is a region of virtually contiguous memory, divided into
|
||||
equal-sized frames. A UMEM is associated with a netdev and a specific
|
||||
queue id of that netdev. It is created and configured (chunk size,
|
||||
headroom, start address and size) by using the XDP_UMEM_REG setsockopt
|
||||
system call. A UMEM is bound to a netdev and queue id, via the bind()
|
||||
system call.
|
||||
|
||||
An AF_XDP socket is linked to a single UMEM, but one UMEM can have
|
||||
multiple AF_XDP sockets. To share a UMEM created via one socket A,
|
||||
the next socket B can do this by setting the XDP_SHARED_UMEM flag in
|
||||
struct sockaddr_xdp member sxdp_flags, and passing the file descriptor
|
||||
of A to struct sockaddr_xdp member sxdp_shared_umem_fd.
|
||||
|
||||
The UMEM has two single-producer/single-consumer rings, that are used
|
||||
to transfer ownership of UMEM frames between the kernel and the
|
||||
user-space application.
|
||||
|
||||
Rings
|
||||
-----
|
||||
|
||||
There are four different kinds of rings: Fill, Completion, RX and
|
||||
TX. All rings are single-producer/single-consumer, so the user-space
|
||||
application needs explicit synchronization if multiple
|
||||
processes/threads are reading/writing to them.
|
||||
|
||||
The UMEM uses two rings: Fill and Completion. Each socket associated
|
||||
with the UMEM must have an RX queue, TX queue or both. Say, that there
|
||||
is a setup with four sockets (all doing TX and RX). Then there will be
|
||||
one Fill ring, one Completion ring, four TX rings and four RX rings.
|
||||
|
||||
The rings are head(producer)/tail(consumer) based rings. A producer
|
||||
writes the data ring at the index pointed out by struct xdp_ring
|
||||
producer member, and then increases the producer index. A consumer reads
|
||||
the data ring at the index pointed out by struct xdp_ring consumer
|
||||
member, and then increases the consumer index.
|
||||
|
||||
The rings are configured and created via the _RING setsockopt system
|
||||
calls and mmapped to user-space using the appropriate offset to mmap()
|
||||
(XDP_PGOFF_RX_RING, XDP_PGOFF_TX_RING, XDP_UMEM_PGOFF_FILL_RING and
|
||||
XDP_UMEM_PGOFF_COMPLETION_RING).
|
||||
|
||||
The size of each ring must be a power of two.
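The power-of-two requirement is what lets a free-running 32-bit producer or
consumer counter be turned into a slot index with a simple mask. A minimal
sketch, using hypothetical helper names that are not part of the AF_XDP API:

```c
#include <stdbool.h>
#include <stdint.h>

/* A ring size is usable only if it is a non-zero power of two. */
bool ring_size_valid(uint32_t size)
{
	return size != 0 && (size & (size - 1)) == 0;
}

/*
 * Map a free-running counter to a slot in a ring of "size" entries.
 * Because size is a power of two, the mask also stays correct when the
 * 32-bit counter wraps around.
 */
uint32_t ring_slot(uint32_t counter, uint32_t size)
{
	return counter & (size - 1);
}
```

With a ring of 2048 entries, for example, counter value 2049 maps to slot 1.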
|
||||
|
||||
UMEM Fill Ring
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
The Fill ring is used to transfer ownership of UMEM frames from
|
||||
user-space to kernel-space. The UMEM addrs are passed in the ring. As
|
||||
an example, if the UMEM is 64k and each chunk is 4k, then the UMEM has
|
||||
16 chunks and can pass addrs between 0 and 64k.
|
||||
|
||||
Frames passed to the kernel are used for the ingress path (RX rings).
|
||||
|
||||
The user application produces UMEM addrs to this ring. Note that the
|
||||
kernel will mask the incoming addr. E.g. for a chunk size of 2k, the
|
||||
log2(2048) LSB of the addr will be masked off, meaning that 2048, 2050
|
||||
and 3000 refer to the same chunk.
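The chunk arithmetic above can be illustrated with the following sketch. The
helper names are hypothetical and the chunk size is assumed to be a power of
two; the kernel's own masking code may differ in detail.

```c
#include <stdint.h>

/* Clear the low log2(chunk_size) bits, giving the start of the chunk. */
uint64_t umem_chunk_base(uint64_t addr, uint64_t chunk_size)
{
	return addr & ~(chunk_size - 1);
}

/* Number of equally sized chunks in a UMEM of umem_size bytes. */
uint64_t umem_chunk_count(uint64_t umem_size, uint64_t chunk_size)
{
	return umem_size / chunk_size;
}
```

For a chunk size of 2048, the addrs 2048, 2050 and 3000 all map to chunk base
2048, and a 64k UMEM with 4k chunks has 16 chunks.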
|
||||
|
||||
|
||||
UMEM Completion Ring
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The Completion Ring is used to transfer ownership of UMEM frames from
|
||||
kernel-space to user-space. Just like the Fill ring, UMEM indices are
|
||||
used.
|
||||
|
||||
Frames passed from the kernel to user-space are frames that have been
|
||||
sent (TX ring) and can be used by user-space again.
|
||||
|
||||
The user application consumes UMEM addrs from this ring.
|
||||
|
||||
|
||||
RX Ring
|
||||
~~~~~~~
|
||||
|
||||
The RX ring is the receiving side of a socket. Each entry in the ring
|
||||
is a struct xdp_desc descriptor. The descriptor contains the UMEM offset
|
||||
(addr) and the length of the data (len).
|
||||
|
||||
If no frames have been passed to kernel via the Fill ring, no
|
||||
descriptors will (or can) appear on the RX ring.
|
||||
|
||||
The user application consumes struct xdp_desc descriptors from this
|
||||
ring.
|
||||
|
||||
TX Ring
|
||||
~~~~~~~
|
||||
|
||||
The TX ring is used to send frames. The struct xdp_desc descriptor is
|
||||
filled (index, length and offset) and passed into the ring.
|
||||
|
||||
To start the transfer a sendmsg() system call is required. This might
|
||||
be relaxed in the future.
|
||||
|
||||
The user application produces struct xdp_desc descriptors to this
|
||||
ring.
|
||||
|
||||
XSKMAP / BPF_MAP_TYPE_XSKMAP
|
||||
----------------------------
|
||||
|
||||
On the XDP side there is a BPF map type BPF_MAP_TYPE_XSKMAP (XSKMAP) that
|
||||
is used in conjunction with bpf_redirect_map() to pass the ingress
|
||||
frame to a socket.
|
||||
|
||||
The user application inserts the socket into the map, via the bpf()
|
||||
system call.
|
||||
|
||||
Note that if an XDP program tries to redirect to a socket that does
|
||||
not match the queue configuration and netdev, the frame will be
|
||||
dropped. E.g. an AF_XDP socket is bound to netdev eth0 and
|
||||
queue 17. Only the XDP program executing for eth0 and queue 17 will
|
||||
successfully pass data to the socket. Please refer to the sample
|
||||
application (in samples/bpf/) for an example.
|
||||
|
||||
Usage
|
||||
=====
|
||||
|
||||
In order to use AF_XDP sockets, two parts are needed: the
|
||||
user-space application and the XDP program. For a complete setup and
|
||||
usage example, please refer to the sample application. The user-space
|
||||
side is xdpsock_user.c and the XDP side xdpsock_kern.c.
|
||||
|
||||
Naive ring dequeue and enqueue could look like this::
|
||||
|
||||
// struct xdp_rxtx_ring {
|
||||
// __u32 *producer;
|
||||
// __u32 *consumer;
|
||||
// struct xdp_desc *desc;
|
||||
// };
|
||||
|
||||
// struct xdp_umem_ring {
|
||||
// __u32 *producer;
|
||||
// __u32 *consumer;
|
||||
// __u64 *desc;
|
||||
// };
|
||||
|
||||
// typedef struct xdp_rxtx_ring RING;
|
||||
// typedef struct xdp_umem_ring RING;
|
||||
|
||||
// typedef struct xdp_desc RING_TYPE;
|
||||
// typedef __u64 RING_TYPE;
|
||||
|
||||
int dequeue_one(RING *ring, RING_TYPE *item)
|
||||
{
|
||||
__u32 entries = *ring->producer - *ring->consumer;
|
||||
|
||||
if (entries == 0)
|
||||
return -1;
|
||||
|
||||
// read-barrier!
|
||||
|
||||
*item = ring->desc[*ring->consumer & (RING_SIZE - 1)];
|
||||
(*ring->consumer)++;
|
||||
return 0;
|
||||
}
|
||||
|
||||
int enqueue_one(RING *ring, const RING_TYPE *item)
|
||||
{
|
||||
__u32 free_entries = RING_SIZE - (*ring->producer - *ring->consumer);
|
||||
|
||||
if (free_entries == 0)
|
||||
return -1;
|
||||
|
||||
ring->desc[*ring->producer & (RING_SIZE - 1)] = *item;
|
||||
|
||||
// write-barrier!
|
||||
|
||||
(*ring->producer)++;
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
For a more optimized version, please refer to the sample application.
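As a self-contained illustration of where the read and write barriers in the
naive example belong, here is a sketch of a single-producer/single-consumer
ring that uses C11 atomics. It operates on an ordinary in-memory structure,
not on the mmapped kernel rings; struct demo_ring, the function names and the
ring size of 4 are all assumptions made for this example.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 4u	/* hypothetical; must be a power of two */

struct demo_ring {
	_Atomic uint32_t producer;	/* free-running, written by producer */
	_Atomic uint32_t consumer;	/* free-running, written by consumer */
	uint64_t desc[DEMO_RING_SIZE];
};

int demo_enqueue(struct demo_ring *r, uint64_t item)
{
	uint32_t prod = atomic_load_explicit(&r->producer, memory_order_relaxed);
	uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_acquire);

	if (prod - cons == DEMO_RING_SIZE)
		return -1;	/* ring is full */

	r->desc[prod & (DEMO_RING_SIZE - 1)] = item;

	/* Release: the slot contents are visible before the new index. */
	atomic_store_explicit(&r->producer, prod + 1, memory_order_release);
	return 0;
}

int demo_dequeue(struct demo_ring *r, uint64_t *item)
{
	uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_relaxed);
	/* Acquire: pairs with the release store in demo_enqueue(). */
	uint32_t prod = atomic_load_explicit(&r->producer, memory_order_acquire);

	if (prod == cons)
		return -1;	/* ring is empty */

	*item = r->desc[cons & (DEMO_RING_SIZE - 1)];

	/* Release: the slot is handed back only after it has been read. */
	atomic_store_explicit(&r->consumer, cons + 1, memory_order_release);
	return 0;
}
```

The acquire/release pairs play the role of the "read-barrier" and
"write-barrier" comments in the naive version: the consumer never reads a
slot before the producer has finished writing it, and the producer never
reuses a slot the consumer is still reading.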
|
||||
|
||||
Sample application
|
||||
==================
|
||||
|
||||
There is an xdpsock benchmarking/test application included that
|
||||
demonstrates how to use AF_XDP sockets with both private and shared
|
||||
UMEMs. Say that you would like your UDP traffic from port 4242 to end
|
||||
up in queue 16, on which we will enable AF_XDP. Here, we use ethtool
|
||||
for this::
|
||||
|
||||
ethtool -N p3p2 rx-flow-hash udp4 fn
|
||||
ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
|
||||
action 16
|
||||
|
||||
Running the rxdrop benchmark in XDP_DRV mode can then be done
|
||||
using::
|
||||
|
||||
samples/bpf/xdpsock -i p3p2 -q 16 -r -N
|
||||
|
||||
For XDP_SKB mode, use the switch "-S" instead of "-N" and all options
|
||||
can be displayed with "-h", as usual.
|
||||
|
||||
Credits
|
||||
=======
|
||||
|
||||
- Björn Töpel (AF_XDP core)
|
||||
- Magnus Karlsson (AF_XDP core)
|
||||
- Alexander Duyck
|
||||
- Alexei Starovoitov
|
||||
- Daniel Borkmann
|
||||
- Jesper Dangaard Brouer
|
||||
- John Fastabend
|
||||
- Jonathan Corbet (LWN coverage)
|
||||
- Michael S. Tsirkin
|
||||
- Qi Z Zhang
|
||||
- Willem de Bruijn
|
||||
|
|
@ -140,7 +140,7 @@ bonding module at load time, or are specified via sysfs.
|
|||
|
||||
Module options may be given as command line arguments to the
|
||||
insmod or modprobe command, but are usually specified in either the
|
||||
/etc/modrobe.d/*.conf configuration files, or in a distro-specific
|
||||
/etc/modprobe.d/*.conf configuration files, or in a distro-specific
|
||||
configuration file (some of which are detailed in the next section).
|
||||
|
||||
Details on bonding support for sysfs are provided in the
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
Linux* Base Driver for the Intel(R) PRO/100 Family of Adapters
|
||||
==============================================================
|
||||
|
||||
March 15, 2011
|
||||
June 1, 2018
|
||||
|
||||
Contents
|
||||
========
|
||||
|
@ -36,16 +36,9 @@ Channel Bonding documentation can be found in the Linux kernel source:
|
|||
Identifying Your Adapter
|
||||
========================
|
||||
|
||||
For more information on how to identify your adapter, go to the Adapter &
|
||||
Driver ID Guide at:
|
||||
|
||||
http://support.intel.com/support/network/adapter/pro100/21397.htm
|
||||
|
||||
For the latest Intel network drivers for Linux, refer to the following
|
||||
website. In the search field, enter your adapter name or type, or use the
|
||||
networking link on the left to search for your adapter:
|
||||
|
||||
http://downloadfinder.intel.com/scripts-df/support_intel.asp
|
||||
For information on how to identify your adapter, and for the latest Intel
|
||||
network drivers, refer to the Intel Support website:
|
||||
http://www.intel.com/support
|
||||
|
||||
Driver Configuration Parameters
|
||||
===============================
|
||||
|
@ -57,22 +50,26 @@ Rx Descriptors: Number of receive descriptors. A receive descriptor is a data
|
|||
structure that describes a receive buffer and its attributes to the network
|
||||
controller. The data in the descriptor is used by the controller to write
|
||||
data from the controller to host memory. In the 3.x.x driver the valid range
|
||||
for this parameter is 64-256. The default value is 64. This parameter can be
|
||||
changed using the command:
|
||||
for this parameter is 64-256. The default value is 256. This parameter can be
|
||||
changed using the command::
|
||||
|
||||
ethtool -G eth? rx n, where n is the number of desired rx descriptors.
|
||||
ethtool -G eth? rx n
|
||||
|
||||
Where n is the number of desired Rx descriptors.
|
||||
|
||||
Tx Descriptors: Number of transmit descriptors. A transmit descriptor is a data
|
||||
structure that describes a transmit buffer and its attributes to the network
|
||||
controller. The data in the descriptor is used by the controller to read
|
||||
data from the host memory to the controller. In the 3.x.x driver the valid
|
||||
range for this parameter is 64-256. The default value is 64. This parameter
|
||||
can be changed using the command:
|
||||
range for this parameter is 64-256. The default value is 128. This parameter
|
||||
can be changed using the command::
|
||||
|
||||
ethtool -G eth? tx n, where n is the number of desired tx descriptors.
|
||||
ethtool -G eth? tx n
|
||||
|
||||
Where n is the number of desired Tx descriptors.
|
||||
|
||||
Speed/Duplex: The driver auto-negotiates the link speed and duplex settings by
|
||||
default. The ethtool utility can be used as follows to force speed/duplex.
|
||||
default. The ethtool utility can be used as follows to force speed/duplex.::
|
||||
|
||||
ethtool -s eth? autoneg off speed {10|100} duplex {full|half}
|
||||
|
||||
|
@ -81,7 +78,7 @@ Speed/Duplex: The driver auto-negotiates the link speed and duplex settings by
|
|||
|
||||
Event Log Message Level: The driver uses the message level flag to log events
|
||||
to syslog. The message level can be set at driver load time. It can also be
|
||||
set using the command:
|
||||
set using the command::
|
||||
|
||||
ethtool -s eth? msglvl n
|
||||
|
||||
|
@ -112,9 +109,9 @@ Additional Configurations
|
|||
---------------------
|
||||
In order to see link messages and other Intel driver information on your
|
||||
console, you must set the dmesg level up to six. This can be done by
|
||||
entering the following on the command line before loading the e100 driver:
|
||||
entering the following on the command line before loading the e100 driver::
|
||||
|
||||
dmesg -n 8
|
||||
dmesg -n 6
|
||||
|
||||
If you wish to see all messages issued by the driver, including debug
|
||||
messages, set the dmesg level to eight.
|
||||
|
@ -146,7 +143,8 @@ Additional Configurations
|
|||
|
||||
NAPI (Rx polling mode) is supported in the e100 driver.
|
||||
|
||||
See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
|
||||
See https://wiki.linuxfoundation.org/networking/napi for more information
|
||||
on NAPI.
|
||||
|
||||
Multiple Interfaces on Same Ethernet Broadcast Network
|
||||
------------------------------------------------------
|
||||
|
@ -160,7 +158,7 @@ Additional Configurations
|
|||
If you have multiple interfaces in a server, either turn on ARP
|
||||
filtering by
|
||||
|
||||
(1) entering: echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
|
||||
(1) entering:: echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
|
||||
(this only works if your kernel's version is higher than 2.4.5), or
|
||||
|
||||
(2) installing the interfaces in separate broadcast domains (either
|
||||
|
@ -169,15 +167,11 @@ Additional Configurations
|
|||
|
||||
Support
|
||||
=======
|
||||
|
||||
For general information, go to the Intel support website at:
|
||||
http://www.intel.com/support/
|
||||
|
||||
http://support.intel.com
|
||||
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
|
||||
http://sourceforge.net/projects/e1000
|
||||
|
||||
If an issue is identified with the released source code on the supported
|
||||
kernel with a supported adapter, email the specific information related to the
|
||||
issue to e1000-devel@lists.sourceforge.net.
|
||||
or the Intel Wired Networking project hosted by Sourceforge at:
|
||||
http://sourceforge.net/projects/e1000
|
||||
If an issue is identified with the released source code on a supported kernel
|
||||
with a supported adapter, email the specific information related to the issue
|
||||
to e1000-devel@lists.sf.net.
|
|
@ -154,7 +154,7 @@ NOTE: When e1000 is loaded with default settings and multiple adapters
|
|||
are in use simultaneously, the CPU utilization may increase non-
|
||||
linearly. In order to limit the CPU utilization without impacting
|
||||
the overall throughput, we recommend that you load the driver as
|
||||
follows:
|
||||
follows::
|
||||
|
||||
modprobe e1000 InterruptThrottleRate=3000,3000,3000
|
||||
|
||||
|
@ -167,8 +167,8 @@ NOTE: When e1000 is loaded with default settings and multiple adapters
|
|||
|
||||
RxDescriptors
|
||||
-------------
|
||||
Valid Range: 80-256 for 82542 and 82543-based adapters
|
||||
80-4096 for all other supported adapters
|
||||
Valid Range: 48-256 for 82542 and 82543-based adapters
|
||||
48-4096 for all other supported adapters
|
||||
Default Value: 256
|
||||
|
||||
This value specifies the number of receive buffer descriptors allocated
|
||||
|
@ -230,8 +230,8 @@ speed. Duplex should also be set when Speed is set to either 10 or 100.
|
|||
|
||||
TxDescriptors
|
||||
-------------
|
||||
Valid Range: 80-256 for 82542 and 82543-based adapters
|
||||
80-4096 for all other supported adapters
|
||||
Valid Range: 48-256 for 82542 and 82543-based adapters
|
||||
48-4096 for all other supported adapters
|
||||
Default Value: 256
|
||||
|
||||
This value is the number of transmit descriptors allocated by the driver.
|
||||
|
@ -242,41 +242,10 @@ NOTE: Depending on the available system resources, the request for a
|
|||
higher number of transmit descriptors may be denied. In this case,
|
||||
use a lower number.
|
||||
|
||||
TxDescriptorStep
|
||||
----------------
|
||||
Valid Range: 1 (use every Tx Descriptor)
|
||||
4 (use every 4th Tx Descriptor)
|
||||
|
||||
Default Value: 1 (use every Tx Descriptor)
|
||||
|
||||
On certain non-Intel architectures, it has been observed that intense TX
|
||||
traffic bursts of short packets may result in an improper descriptor
|
||||
writeback. If this occurs, the driver will report a "TX Timeout" and reset
|
||||
the adapter, after which the transmit flow will restart, though data may
|
||||
have stalled for as much as 10 seconds before it resumes.
|
||||
|
||||
The improper writeback does not occur on the first descriptor in a system
|
||||
memory cache-line, which is typically 32 bytes, or 4 descriptors long.
|
||||
|
||||
Setting TxDescriptorStep to a value of 4 will ensure that all TX descriptors
|
||||
are aligned to the start of a system memory cache line, and so this problem
|
||||
will not occur.
|
||||
|
||||
NOTES: Setting TxDescriptorStep to 4 effectively reduces the number of
|
||||
TxDescriptors available for transmits to 1/4 of the normal allocation.
|
||||
This has a possible negative performance impact, which may be
|
||||
compensated for by allocating more descriptors using the TxDescriptors
|
||||
module parameter.
|
||||
|
||||
There are other conditions which may result in "TX Timeout", which will
|
||||
not be resolved by the use of the TxDescriptorStep parameter. As the
|
||||
issue addressed by this parameter has never been observed on Intel
|
||||
Architecture platforms, it should not be used on Intel platforms.
|
||||
|
||||
TxIntDelay
|
||||
----------
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 64
|
||||
Default Value: 8
|
||||
|
||||
This value delays the generation of transmit interrupts in units of
|
||||
1.024 microseconds. Transmit interrupt reduction can improve CPU
|
||||
|
@ -288,7 +257,7 @@ TxAbsIntDelay
|
|||
-------------
|
||||
(This parameter is supported only on 82540, 82545 and later adapters.)
|
||||
Valid Range: 0-65535 (0=off)
|
||||
Default Value: 64
|
||||
Default Value: 32
|
||||
|
||||
This value, in units of 1.024 microseconds, limits the delay in which a
|
||||
transmit interrupt is generated. Useful only if TxIntDelay is non-zero,
|
||||
|
@ -310,7 +279,7 @@ Copybreak
|
|||
---------
|
||||
Valid Range: 0-xxxxxxx (0=off)
|
||||
Default Value: 256
|
||||
Usage: insmod e1000.ko copybreak=128
|
||||
Usage: modprobe e1000 copybreak=128
|
||||
|
||||
Driver copies all packets below or equaling this size to a fresh RX
|
||||
buffer before handing it up the stack.
|
||||
|
@ -328,14 +297,6 @@ Default Value: 0 (disabled)
|
|||
Allows PHY to turn off in lower power states. The user can turn off
|
||||
this parameter in supported chipsets.
|
||||
|
||||
KumeranLockLoss
|
||||
---------------
|
||||
Valid Range: 0-1
|
||||
Default Value: 1 (enabled)
|
||||
|
||||
This workaround skips resetting the PHY at shutdown for the initial
|
||||
silicon releases of ICH8 systems.
|
||||
|
||||
Speed and Duplex Configuration
|
||||
==============================
|
||||
|
||||
|
@ -397,12 +358,12 @@ Additional Configurations
|
|||
------------
|
||||
Jumbo Frames support is enabled by changing the MTU to a value larger than
|
||||
the default of 1500. Use the ifconfig command to increase the MTU size.
|
||||
For example:
|
||||
For example::
|
||||
|
||||
ifconfig eth<x> mtu 9000 up
|
||||
|
||||
This setting is not saved across reboots. It can be made permanent if
|
||||
you add:
|
||||
you add::
|
||||
|
||||
MTU=9000
|
||||
|
|
@ -0,0 +1,18 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
========
|
||||
FAILOVER
|
||||
========
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
The failover module provides a generic interface for paravirtual drivers
|
||||
to register a netdev and a set of ops with a failover instance. The ops
|
||||
are used as event handlers that get called to handle netdev register/
|
||||
unregister/link change/name change events on slave pci ethernet devices
|
||||
with the same mac address as the failover netdev.
|
||||
|
||||
This enables paravirtual drivers to use a VF as an accelerated low latency
|
||||
datapath. It also allows live migration of VMs with direct attached VFs by
|
||||
failing over to the paravirtual datapath when the VF is unplugged.
|
|
@ -483,6 +483,12 @@ Example output from dmesg:
|
|||
[ 3389.935851] JIT code: 00000030: 00 e8 28 94 ff e0 83 f8 01 75 07 b8 ff ff 00 00
|
||||
[ 3389.935852] JIT code: 00000040: eb 02 31 c0 c9 c3
|
||||
|
||||
When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set to 1 and
|
||||
setting any other value than that will result in failure. This is even the case for
|
||||
setting bpf_jit_enable to 2, since dumping the final JIT image into the kernel log
|
||||
is discouraged and introspection through bpftool (under tools/bpf/bpftool/) is the
|
||||
generally recommended approach instead.
|
||||
|
||||
In the kernel source tree under tools/bpf/, there's bpf_jit_disasm for
|
||||
generating disassembly out of the kernel log's hexdump:
|
||||
|
||||
|
@ -1136,6 +1142,7 @@ into a register from memory, the register's top 56 bits are known zero, while
|
|||
the low 8 are unknown - which is represented as the tnum (0x0; 0xff). If we
|
||||
then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
|
||||
0x1ff), because of potential carries.
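The arithmetic in this example can be reproduced with a small model. The sketch below is an illustration in Python, loosely modelled on the kernel's tnum helpers rather than copied from them; a tnum is represented as a plain (value, mask) tuple and the helper names are chosen for this example only.

```python
U64 = (1 << 64) - 1  # model 64-bit register arithmetic


def tnum_const(value):
    """A fully known value: no unknown bits."""
    return (value & U64, 0)


def tnum_or(a, b):
    value = a[0] | b[0]
    mask = a[1] | b[1]
    # A bit known to be 1 in either operand is known to be 1 in the result.
    return (value, mask & ~value & U64)


def tnum_add(a, b):
    sum_mask = (a[1] + b[1]) & U64
    sum_value = (a[0] + b[0]) & U64
    # sigma is the sum with every unknown bit set, sum_value with none set;
    # bits where the two differ may be changed by a carry.
    sigma = (sum_mask + sum_value) & U64
    carries = sigma ^ sum_value
    mask = carries | a[1] | b[1]
    return (sum_value & ~mask & U64, mask)


byte_load = (0x0, 0xFF)                          # low 8 bits unknown
after_or = tnum_or(byte_load, tnum_const(0x40))  # OR with 0x40
after_add = tnum_add(after_or, tnum_const(1))    # then add 1
print([hex(x) for x in after_or], [hex(x) for x in after_add])
```

The OR step turns bit 6 into a known 1, and the add step has to treat bit 8 as unknown because a carry out of the low byte is possible.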
|
||||
|
||||
Besides arithmetic, the register state can also be updated by conditional
|
||||
branches. For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
|
||||
it will have a umin_value (unsigned minimum value) of 9, whereas in the 'false'
|
||||
|
@ -1144,14 +1151,16 @@ BPF_JSGE) would instead update the signed minimum/maximum values. Information
|
|||
from the signed and unsigned bounds can be combined; for instance if a value is
|
||||
first tested < 8 and then tested s> 4, the verifier will conclude that the value
|
||||
is also > 4 and s< 8, since the bounds prevent crossing the sign boundary.
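The claim about combining signed and unsigned bounds can be checked exhaustively on a narrow register. The sketch below is only an illustration: it assumes an 8-bit width so that every value can be enumerated, whereas the verifier reasons about 64-bit values, and the helper name as_signed is made up for this example.

```python
BITS = 8  # small width so the check can be exhaustive; the verifier uses 64


def as_signed(u):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return u - (1 << BITS) if u >= 1 << (BITS - 1) else u


# Values that pass both tests: unsigned "< 8" and signed "s> 4".
survivors = [u for u in range(1 << BITS) if u < 8 and as_signed(u) > 4]

# Every surviving value also satisfies unsigned "> 4" and signed "s< 8".
implied = all(u > 4 and as_signed(u) < 8 for u in survivors)
print(survivors, implied)
```

Because the unsigned test already excludes every value with the sign bit set, the signed and unsigned orderings agree on what remains.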
|
||||
|
||||
PTR_TO_PACKETs with a variable offset part have an 'id', which is common to all
|
||||
pointers sharing that same variable offset. This is important for packet range
|
||||
checks: after adding some variable to a packet pointer, if you then copy it to
|
||||
another register and (say) add a constant 4, both registers will share the same
|
||||
'id' but one will have a fixed offset of +4. Then if it is bounds-checked and
|
||||
found to be less than a PTR_TO_PACKET_END, the other register is now known to
|
||||
have a safe range of at least 4 bytes. See 'Direct packet access', below, for
|
||||
more on PTR_TO_PACKET ranges.
|
||||
checks: after adding a variable to a packet pointer register A, if you then copy
|
||||
it to another register B and then add a constant 4 to A, both registers will
|
||||
share the same 'id' but A will have a fixed offset of +4. Then if A is
|
||||
bounds-checked and found to be less than a PTR_TO_PACKET_END, register B is
|
||||
now known to have a safe range of at least 4 bytes. See 'Direct packet access',
|
||||
below, for more on PTR_TO_PACKET ranges.
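The A/B scenario can be sketched as a toy model. This is an assumption-laden illustration, not verifier code: the PktReg class, its field names (ptr_id, fixed_off, safe_range) and the mark_checked_below_end helper are hypothetical and only mirror the behaviour described in the paragraph.

```python
from dataclasses import dataclass


@dataclass
class PktReg:
    ptr_id: int          # shared by pointers with the same variable offset
    fixed_off: int       # constant offset added on top of the variable part
    safe_range: int = 0  # bytes known to be readable past the variable base


def mark_checked_below_end(regs, checked):
    """Model the 'checked < PTR_TO_PACKET_END' branch being taken."""
    for reg in regs.values():
        if reg.ptr_id == checked.ptr_id:
            # Everything up to the checked pointer's fixed offset is in bounds.
            reg.safe_range = max(reg.safe_range, checked.fixed_off)


regs = {"A": PktReg(ptr_id=1, fixed_off=0)}
regs["B"] = PktReg(ptr_id=regs["A"].ptr_id, fixed_off=regs["A"].fixed_off)  # B = A
regs["A"].fixed_off += 4                                                    # A += 4
mark_checked_below_end(regs, regs["A"])
print(regs["B"].safe_range)
```

Only registers sharing the 'id' are updated, which is why the copy has to be made after the variable part was added.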
|
||||
|
||||
The 'id' field is also used on PTR_TO_MAP_VALUE_OR_NULL, common to all copies of
|
||||
the pointer returned from a map lookup. This means that when one copy is
|
||||
checked and found to be non-NULL, all copies can become PTR_TO_MAP_VALUEs.
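A minimal sketch of this 'id' propagation follows. It is a toy model under stated assumptions: registers are plain dictionaries, and mark_non_null is a hypothetical helper name, not a kernel function.

```python
def mark_non_null(regs, checked_name):
    """Model the branch where regs[checked_name] was found to be non-NULL."""
    checked_id = regs[checked_name]["id"]
    for reg in regs.values():
        if reg["type"] == "PTR_TO_MAP_VALUE_OR_NULL" and reg["id"] == checked_id:
            reg["type"] = "PTR_TO_MAP_VALUE"


regs = {
    "r0": {"type": "PTR_TO_MAP_VALUE_OR_NULL", "id": 1},  # map lookup result
    "r6": {"type": "PTR_TO_MAP_VALUE_OR_NULL", "id": 1},  # copy of r0
    "r7": {"type": "PTR_TO_MAP_VALUE_OR_NULL", "id": 2},  # a different lookup
}
mark_non_null(regs, "r0")
print({name: reg["type"] for name, reg in regs.items()})
```

The register from the unrelated lookup keeps its OR_NULL type because its 'id' differs.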
|
||||
|
|
|
@ -67,7 +67,7 @@ Don't be confused by terminology: The GTP User Plane goes through
|
|||
kernel accelerated path, while the GTP Control Plane goes to
|
||||
Userspace :)
|
||||
|
||||
The official homepge of the module is at
|
||||
The official homepage of the module is at
|
||||
https://osmocom.org/projects/linux-kernel-gtp-u/wiki
|
||||
|
||||
== Userspace Programs with Linux Kernel GTP-U support ==
|
||||
|
@ -120,7 +120,7 @@ If yo have questions regarding how to use the Kernel GTP module from
|
|||
your own software, or want to contribute to the code, please use the
|
||||
osmocom-net-grps mailing list for related discussion. The list can be
|
||||
reached at osmocom-net-gprs@lists.osmocom.org and the mailman
|
||||
interface for managign your subscription is at
|
||||
interface for managing your subscription is at
|
||||
https://lists.osmocom.org/mailman/listinfo/osmocom-net-gprs
|
||||
|
||||
== Issue Tracker ==
|
||||
|
|
|
@ -121,7 +121,7 @@ three options to deal with this:
|
|||
|
||||
- checksum neutral mapping
|
||||
When an address is translated the difference can be offset
|
||||
elsewhere in a part of the packet that is covered by the
|
||||
elsewhere in a part of the packet that is covered by
|
||||
the checksum. The low order sixteen bits of the identifier
|
||||
are used. This method is preferred since it doesn't require
|
||||
parsing a packet beyond the IP header and in most cases the
|
||||
|
|
|
@ -6,9 +6,12 @@ Contents:
|
|||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
af_xdp
|
||||
batman-adv
|
||||
can
|
||||
dpaa2/index
|
||||
e100
|
||||
e1000
|
||||
kapi
|
||||
z8530book
|
||||
msg_zerocopy
|
||||
|
|
|
@ -26,7 +26,7 @@ ip_no_pmtu_disc - INTEGER
|
|||
discarded. Outgoing frames are handled the same as in mode 1,
|
||||
implicitly setting IP_PMTUDISC_DONT on every created socket.
|
||||
|
||||
Mode 3 is a hardend pmtu discover mode. The kernel will only
|
||||
Mode 3 is a hardened pmtu discover mode. The kernel will only
|
||||
accept fragmentation-needed errors if the underlying protocol
|
||||
can verify them besides a plain socket lookup. Current
|
||||
protocols for which pmtu events will be honored are TCP, SCTP
|
||||
|
@ -449,8 +449,10 @@ tcp_recovery - INTEGER
|
|||
features.
|
||||
|
||||
RACK: 0x1 enables the RACK loss detection for fast detection of lost
|
||||
retransmissions and tail drops.
|
||||
retransmissions and tail drops. It also subsumes and disables
|
||||
RFC6675 recovery for SACK connections.
|
||||
RACK: 0x2 makes RACK's reordering window static (min_rtt/4).
|
||||
RACK: 0x4 disables RACK's DUPACK threshold heuristic
|
||||
|
||||
Default: 0x1
|
||||
|
||||
|
@ -523,6 +525,19 @@ tcp_rmem - vector of 3 INTEGERs: min, default, max
|
|||
tcp_sack - BOOLEAN
|
||||
Enable select acknowledgments (SACKS).
|
||||
|
||||
tcp_comp_sack_delay_ns - LONG INTEGER
|
||||
TCP tries to reduce the number of SACKs sent, using a timer
|
||||
based on 5% of SRTT, capped by this sysctl, in nanoseconds.
|
||||
The default is 1ms, based on TSO autosizing period.
|
||||
|
||||
Default : 1,000,000 ns (1 ms)
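As a rough illustration of the cap described for tcp_comp_sack_delay_ns, the sketch below computes the timer value from an SRTT. It is a simplified model of the documented rule (5% of SRTT, capped by the sysctl), not the kernel's actual computation, and comp_sack_delay_ns is a name made up for this example.

```python
def comp_sack_delay_ns(srtt_ns, tcp_comp_sack_delay_ns=1_000_000):
    """Delay before a compressed SACK is sent: 5% of SRTT, capped by the sysctl."""
    return min(srtt_ns // 20, tcp_comp_sack_delay_ns)


short_path = comp_sack_delay_ns(10_000_000)   # 10 ms SRTT
long_path = comp_sack_delay_ns(100_000_000)   # 100 ms SRTT
print(short_path, long_path)
```

With the default cap, any path with an SRTT above 20 ms ends up using the full 1 ms delay.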
|
||||
|
||||
tcp_comp_sack_nr - INTEGER
|
||||
Max number of SACKs that can be compressed.
|
||||
Using 0 disables SACK compression.
|
||||
|
||||
Default : 44
|
||||
|
||||
tcp_slow_start_after_idle - BOOLEAN
|
||||
If set, provide RFC2861 behavior and time out the congestion
|
||||
window after an idle period. An idle period is defined at
|
||||
|
@ -652,11 +667,15 @@ tcp_tso_win_divisor - INTEGER
|
|||
building larger TSO frames.
|
||||
Default: 3
|
||||
|
||||
tcp_tw_reuse - BOOLEAN
|
||||
Allow to reuse TIME-WAIT sockets for new connections when it is
|
||||
safe from protocol viewpoint. Default value is 0.
|
||||
tcp_tw_reuse - INTEGER
|
||||
Enable reuse of TIME-WAIT sockets for new connections when it is
|
||||
safe from protocol viewpoint.
|
||||
0 - disable
|
||||
1 - global enable
|
||||
2 - enable for loopback traffic only
|
||||
It should not be changed without advice/request of technical
|
||||
experts.
|
||||
Default: 2
|
||||
|
||||
tcp_window_scaling - BOOLEAN
|
||||
Enable window scaling as defined in RFC1323.
|
||||
|
@ -1428,6 +1447,19 @@ ip6frag_low_thresh - INTEGER
|
|||
ip6frag_time - INTEGER
|
||||
Time in seconds to keep an IPv6 fragment in memory.
|
||||
|
||||
IPv6 Segment Routing:
|
||||
|
||||
seg6_flowlabel - INTEGER
|
||||
Controls the behaviour of computing the flowlabel of outer
|
||||
IPv6 header in case of SR T.encaps
|
||||
|
||||
-1 set flowlabel to zero.
|
||||
0 copy flowlabel from Inner packet in case of Inner IPv6
|
||||
(Set flowlabel to 0 in case IPv4/L2)
|
||||
1 Compute the flowlabel using seg6_make_flowlabel()
|
||||
|
||||
Default is 0.
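The three seg6_flowlabel modes can be summarised as a small decision function. This is a sketch under assumptions: outer_flowlabel is a hypothetical name, and the make_flowlabel argument is a stand-in callable for seg6_make_flowlabel(), whose real inputs are not modelled here.

```python
FLOWLABEL_MASK = 0xFFFFF  # the IPv6 flow label is a 20-bit field


def outer_flowlabel(mode, inner_is_ipv6, inner_flowlabel, make_flowlabel):
    """Pick the flow label of the outer IPv6 header for an encapsulated packet."""
    if mode == -1:
        return 0
    if mode == 0:
        # Copy from an inner IPv6 packet; IPv4 and L2 payloads have no label.
        return inner_flowlabel & FLOWLABEL_MASK if inner_is_ipv6 else 0
    if mode == 1:
        return make_flowlabel() & FLOWLABEL_MASK
    raise ValueError("seg6_flowlabel must be -1, 0 or 1")


def computed():
    return 0xABCDE  # placeholder for a computed label


print(hex(outer_flowlabel(0, True, 0x12345, computed)))
```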
|
||||
|
||||
conf/default/*:
|
||||
Change the interface-specific default settings.
|
||||
|
||||
|
|
|
@ -25,8 +25,8 @@ Quote from RFC3173:
|
|||
is implementation dependent.
|
||||
|
||||
Current IPComp implementation is indeed by the book, while as in practice
|
||||
when sending non-compressed packet to the peer(whether or not packet len
|
||||
is smaller than the threshold or the compressed len is large than original
|
||||
when sending non-compressed packet to the peer (whether or not packet len
|
||||
is smaller than the threshold or the compressed len is larger than original
|
||||
packet len), the packet is dropped when checking the policy as this packet
|
||||
matches the selector but not coming from any XFRM layer, i.e., with no
|
||||
security path. Such naked packet will not eventually make it to upper layer.
|
||||
|
|
|
@ -73,11 +73,11 @@ mode to make conn-tracking work.
|
|||
This is the default option. To configure the IPvlan port in this mode,
|
||||
user can choose to either add this option on the command-line or don't specify
|
||||
anything. This is the traditional mode where slaves can cross-talk among
|
||||
themseleves apart from talking through the master device.
|
||||
themselves apart from talking through the master device.
|
||||
|
||||
5.2 private:
|
||||
If this option is added to the command-line, the port is set in private
|
||||
mode. i.e. port wont allow cross communication between slaves.
|
||||
mode. i.e. port won't allow cross communication between slaves.
|
||||
|
||||
5.3 vepa:
|
||||
If this is added to the command-line, the port is set in VEPA mode.
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
Kernel Connection Mulitplexor
|
||||
Kernel Connection Multiplexor
|
||||
-----------------------------
|
||||
|
||||
Kernel Connection Multiplexor (KCM) is a mechanism that provides a message based
|
||||
|
@ -31,7 +31,7 @@ KCM implements an NxM multiplexor in the kernel as diagrammed below:
|
|||
KCM sockets
|
||||
-----------
|
||||
|
||||
The KCM sockets provide the user interface to the muliplexor. All the KCM sockets
|
||||
The KCM sockets provide the user interface to the multiplexor. All the KCM sockets
|
||||
bound to a multiplexor are considered to have equivalent function, and I/O
|
||||
operations in different sockets may be done in parallel without the need for
|
||||
synchronization between threads in userspace.
|
||||
|
@ -199,7 +199,7 @@ while. Example use:
|
|||
BPF programs for message delineation
|
||||
------------------------------------
|
||||
|
||||
BPF programs can be compiled using the BPF LLVM backend. For exmple,
|
||||
BPF programs can be compiled using the BPF LLVM backend. For example,
|
||||
the BPF program for parsing Thrift is:
|
||||
|
||||
#include "bpf.h" /* for __sk_buff */
|
||||
|
@ -222,7 +222,7 @@ messages. The kernel provides necessary assurances that messages are sent
|
|||
and received atomically. This relieves much of the burden applications have
|
||||
in mapping a message based protocol onto the TCP stream. KCM also makes
|
||||
application layer messages a unit of work in the kernel for the purposes of
|
||||
steerng and scheduling, which in turn allows a simpler networking model in
|
||||
steering and scheduling, which in turn allows a simpler networking model in
|
||||
multithreaded applications.
|
||||
|
||||
Configurations
|
||||
|
@ -272,7 +272,7 @@ on the socket thus waking up the application thread. When the application
|
|||
sees the error (which may just be a disconnect) it should unattach the
|
||||
socket from KCM and then close it. It is assumed that once an error is
|
||||
posted on the TCP socket the data stream is unrecoverable (i.e. an error
|
||||
may have occurred in the middle of receiving a messssge).
|
||||
may have occurred in the middle of receiving a message).
|
||||
|
||||
TCP connection monitoring
|
||||
-------------------------
|
||||
|
|
|
@ -0,0 +1,116 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
============
|
||||
NET_FAILOVER
|
||||
============
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
The net_failover driver provides an automated failover mechanism via APIs
|
||||
to create and destroy a failover master netdev and manages a primary and
|
||||
standby slave netdevs that get registered via the generic failover
|
||||
infrastructure.
|
||||
|
||||
The failover netdev acts as a master device and controls 2 slave devices. The
|
||||
original paravirtual interface is registered as 'standby' slave netdev and
|
||||
a passthru/vf device with the same MAC gets registered as 'primary' slave
|
||||
netdev. Both 'standby' and 'failover' netdevs are associated with the same
|
||||
'pci' device. The user accesses the network interface via 'failover' netdev.
|
||||
The 'failover' netdev chooses 'primary' netdev as default for transmits when
|
||||
it is available with link up and running.
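The transmit choice described above can be sketched as follows. This is a simplified model, not driver code: netdevs are represented as dictionaries, pick_xmit_dev is a hypothetical name, and falling back to the 'standby' netdev whenever the 'primary' is not usable is an assumption drawn from the failover behaviour this document describes.

```python
def pick_xmit_dev(primary, standby):
    """Return the slave netdev the failover master would transmit on."""
    if primary is not None and primary["link_up"] and primary["running"]:
        return primary
    # Assumed fallback: use the paravirtual 'standby' slave otherwise.
    return standby


standby = {"name": "ens10nsby", "link_up": True, "running": True}
vf = {"name": "ens11", "link_up": True, "running": True}

with_vf = pick_xmit_dev(vf, standby)
after_unplug = pick_xmit_dev(None, standby)
print(with_vf["name"], after_unplug["name"])
```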
|
||||
|
||||
This can be used by paravirtual drivers to enable an alternate low latency
|
||||
datapath. It also enables hypervisor controlled live migration of a VM with
|
||||
direct attached VF by failing over to the paravirtual datapath when the VF
|
||||
is unplugged.
|
||||
|
||||
virtio-net accelerated datapath: STANDBY mode
|
||||
=============================================
|
||||
|
||||
net_failover enables hypervisor controlled accelerated datapath to virtio-net
|
||||
enabled VMs in a transparent manner with no/minimal guest userspace changes.
|
||||
|
||||
To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY
|
||||
feature on the virtio-net interface and assign the same MAC address to both
|
||||
virtio-net and VF interfaces.
|
||||
|
||||
Here is an example XML snippet that shows such configuration.
|
||||
|
||||
<interface type='network'>
|
||||
<mac address='52:54:00:00:12:53'/>
|
||||
<source network='enp66s0f0_br'/>
|
||||
<target dev='tap01'/>
|
||||
<model type='virtio'/>
|
||||
<driver name='vhost' queues='4'/>
|
||||
<link state='down'/>
|
||||
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
|
||||
</interface>
|
||||
<interface type='hostdev' managed='yes'>
|
||||
<mac address='52:54:00:00:12:53'/>
|
||||
<source>
|
||||
<address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
|
||||
</source>
|
||||
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
|
||||
</interface>
|
||||
|
||||
Booting a VM with the above configuration will result in the following 3
|
||||
netdevs created in the VM.
|
||||
|
||||
4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
|
||||
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
|
||||
inet 192.168.12.53/24 brd 192.168.12.255 scope global dynamic ens10
|
||||
valid_lft 42482sec preferred_lft 42482sec
|
||||
inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
|
||||
valid_lft forever preferred_lft forever
|
||||
5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000
|
||||
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
|
||||
7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
|
||||
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
|
||||
|
||||
ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave
|
||||
'standby' and 'primary' netdevs respectively.
|
||||
|
||||
Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
|
||||
==================================================================
|
||||
|
||||
net_failover also enables hypervisor controlled live migration to be supported
|
||||
with VMs that have direct attached SR-IOV VF devices by automatic failover to
|
||||
the paravirtual datapath when the VF is unplugged.
|
||||
|
||||
Here is a sample script that shows the steps to initiate live migration on
|
||||
the source hypervisor.
|
||||
|
||||
# cat vf_xml
|
||||
<interface type='hostdev' managed='yes'>
|
||||
<mac address='52:54:00:00:12:53'/>
|
||||
<source>
|
||||
<address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
|
||||
</source>
|
||||
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
|
||||
</interface>
|
||||
|
||||
# Source Hypervisor
|
||||
#!/bin/bash
|
||||
|
||||
DOMAIN=fedora27-tap01
|
||||
PF=enp66s0f0
|
||||
VF_NUM=5
|
||||
TAP_IF=tap01
|
||||
VF_XML=
|
||||
|
||||
MAC=52:54:00:00:12:53
|
||||
ZERO_MAC=00:00:00:00:00:00
|
||||
|
||||
virsh domif-setlink $DOMAIN $TAP_IF up
|
||||
bridge fdb del $MAC dev $PF master
|
||||
virsh detach-device $DOMAIN $VF_XML
|
||||
ip link set $PF vf $VF_NUM mac $ZERO_MAC
|
||||
|
||||
virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system
|
||||
|
||||
# Destination Hypervisor
|
||||
#!/bin/bash
|
||||
|
||||
virsh attach-device $DOMAIN $VF_XML
|
||||
virsh domif-setlink $DOMAIN $TAP_IF down
|
|
@ -179,6 +179,15 @@ A: No. See above answer. In short, if you think it really belongs in
|
|||
dash marker line as described in Documentation/process/submitting-patches.rst to
|
||||
temporarily embed that information into the patch that you send.
|
||||
|
||||
Q: Are all networking bug fixes backported to all stable releases?
|
||||
|
||||
A: Due to capacity, Dave could only take care of the backports for the last
|
||||
2 stable releases. For earlier stable releases, each stable branch maintainer
|
||||
is supposed to take care of them. If you find any patch is missing from an
|
||||
earlier stable branch, please notify stable@vger.kernel.org with either a
|
||||
commit ID or a formal patch backported, and CC Dave and other relevant
|
||||
networking developers.
|
||||
|
||||
Q: Someone said that the comment style and coding convention is different
|
||||
for the networking content. Is this true?
|
||||
|
||||
|
|
|
@ -113,6 +113,13 @@ whatever headers there might be.
|
|||
NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
|
||||
set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
|
||||
|
||||
* Transmit UDP segmentation offload
|
||||
|
||||
NETIF_F_GSO_UDP_GSO_L4 accepts a single UDP header with a payload that exceeds
|
||||
gso_size. On segmentation, it segments the payload on gso_size boundaries and
|
||||
replicates the network and UDP headers (fixing up the last one if less than
|
||||
gso_size).
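The segmentation rule can be illustrated with a short sketch. It models only the payload split and the per-segment UDP length; the function name udp_gso_segment and the (udp_length, chunk) representation are assumptions for this example, and checksums and network headers are left out.

```python
UDP_HEADER_LEN = 8  # bytes; a UDP header has a fixed size


def udp_gso_segment(payload, gso_size):
    """Split one large UDP payload into (udp_length, chunk) pairs."""
    if gso_size <= 0:
        raise ValueError("gso_size must be positive")
    segments = []
    for start in range(0, len(payload), gso_size):
        chunk = payload[start:start + gso_size]
        # Each segment carries its own copy of the headers; the UDP length
        # is recomputed, which only differs for a short final segment.
        segments.append((UDP_HEADER_LEN + len(chunk), chunk))
    return segments


segments = udp_gso_segment(bytes(3000), gso_size=1400)
print([udp_len for udp_len, _ in segments])
```

A 3000-byte payload with a gso_size of 1400 yields two full segments and one shorter trailing segment.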
|
||||
|
||||
* Transmit DMA from high memory
|
||||
|
||||
On platforms where this is relevant, NETIF_F_HIGHDMA signals that
|
||||
|
|
|
@ -156,7 +156,7 @@ nf_conntrack_timestamp - BOOLEAN
|
|||
nf_conntrack_udp_timeout - INTEGER (seconds)
|
||||
default 30
|
||||
|
||||
nf_conntrack_udp_timeout_stream2 - INTEGER (seconds)
|
||||
nf_conntrack_udp_timeout_stream - INTEGER (seconds)
|
||||
default 180
|
||||
|
||||
This extended timeout will be used in case there is an UDP stream
|
||||
|
|
|
@ -45,6 +45,7 @@ through bpf(2) and passing a verifier in the kernel, a JIT will then
|
|||
translate these BPF proglets into native CPU instructions. There are
|
||||
two flavors of JITs, the newer eBPF JIT currently supported on:
|
||||
- x86_64
|
||||
- x86_32
|
||||
- arm64
|
||||
- arm32
|
||||
- ppc64
|
||||
|
|
70
MAINTAINERS
|
@ -2732,13 +2732,13 @@ L: netdev@vger.kernel.org
|
|||
L: linux-kernel@vger.kernel.org
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
|
||||
Q: https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
|
||||
S: Supported
|
||||
F: arch/x86/net/bpf_jit*
|
||||
F: Documentation/networking/filter.txt
|
||||
F: Documentation/bpf/
|
||||
F: include/linux/bpf*
|
||||
F: include/linux/filter.h
|
||||
F: include/trace/events/bpf.h
|
||||
F: include/trace/events/xdp.h
|
||||
F: include/uapi/linux/bpf*
|
||||
F: include/uapi/linux/filter.h
|
||||
|
@ -2751,6 +2751,7 @@ F: net/sched/act_bpf.c
|
|||
F: net/sched/cls_bpf.c
|
||||
F: samples/bpf/
|
||||
F: tools/bpf/
|
||||
F: tools/lib/bpf/
|
||||
F: tools/testing/selftests/bpf/
|
||||
|
||||
BROADCOM B44 10/100 ETHERNET DRIVER
|
||||
|
@ -5352,7 +5353,6 @@ F: include/linux/*mdio*.h
|
|||
F: include/linux/of_net.h
|
||||
F: include/linux/phy.h
|
||||
F: include/linux/phy_fixed.h
|
||||
F: include/linux/platform_data/mdio-gpio.h
|
||||
F: include/linux/platform_data/mdio-bcm-unimac.h
|
||||
F: include/trace/events/mdio.h
|
||||
F: include/uapi/linux/mdio.h
|
||||
|
@ -5453,6 +5453,14 @@ M: Josh Poimboeuf <jpoimboe@redhat.com>
|
|||
S: Maintained
|
||||
F: scripts/faddr2line
|
||||
|
||||
FAILOVER MODULE
|
||||
M: Sridhar Samudrala <sridhar.samudrala@intel.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Supported
|
||||
F: net/core/failover.c
|
||||
F: include/net/failover.h
|
||||
F: Documentation/networking/failover.rst
|
||||
|
||||
FANOTIFY
|
||||
M: Jan Kara <jack@suse.cz>
|
||||
R: Amir Goldstein <amir73il@gmail.com>
|
||||
|
@ -5663,7 +5671,6 @@ M: Claudiu Manoil <claudiu.manoil@nxp.com>
|
|||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/net/ethernet/freescale/gianfar*
|
||||
X: drivers/net/ethernet/freescale/gianfar_ptp.c
|
||||
F: Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
|
||||
|
||||
FREESCALE GPMI NAND DRIVER
|
||||
|
@ -5710,6 +5717,14 @@ S: Maintained
|
|||
F: drivers/net/ethernet/freescale/fman
|
||||
F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt
|
||||
|
||||
FREESCALE QORIQ PTP CLOCK DRIVER
|
||||
M: Yangbo Lu <yangbo.lu@nxp.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/ptp/ptp_qoriq.c
|
||||
F: include/linux/fsl/ptp_qoriq.h
|
||||
F: Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
|
||||
|
||||
FREESCALE QUAD SPI DRIVER
|
||||
M: Han Xu <han.xu@nxp.com>
|
||||
L: linux-mtd@lists.infradead.org
|
||||
|
@ -7123,8 +7138,8 @@ Q: http://patchwork.ozlabs.org/project/intel-wired-lan/list/
|
|||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue.git
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git
|
||||
S: Supported
|
||||
F: Documentation/networking/e100.txt
|
||||
F: Documentation/networking/e1000.txt
|
||||
F: Documentation/networking/e100.rst
|
||||
F: Documentation/networking/e1000.rst
|
||||
F: Documentation/networking/e1000e.txt
|
||||
F: Documentation/networking/igb.txt
|
||||
F: Documentation/networking/igbvf.txt
|
||||
|
@ -8521,6 +8536,7 @@ M: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
|
|||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/net/dsa/mv88e6xxx/
|
||||
F: linux/platform_data/mv88e6xxx.h
|
||||
F: Documentation/devicetree/bindings/net/dsa/marvell.txt
|
||||
|
||||
MARVELL ARMADA DRM SUPPORT
|
||||
|
@ -9074,12 +9090,14 @@ W: http://www.mellanox.com
|
|||
Q: http://patchwork.ozlabs.org/project/netdev/list/
|
||||
F: drivers/net/ethernet/mellanox/mlx5/core/en_*
|
||||
|
||||
MELLANOX ETHERNET INNOVA DRIVER
|
||||
MELLANOX ETHERNET INNOVA DRIVERS
|
||||
R: Boris Pismenny <borisp@mellanox.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Supported
|
||||
W: http://www.mellanox.com
|
||||
Q: http://patchwork.ozlabs.org/project/netdev/list/
|
||||
F: drivers/net/ethernet/mellanox/mlx5/core/en_accel/*
|
||||
F: drivers/net/ethernet/mellanox/mlx5/core/accel/*
|
||||
F: drivers/net/ethernet/mellanox/mlx5/core/fpga/*
|
||||
F: include/linux/mlx5/mlx5_ifc_fpga.h
|
||||
|
||||
|
@ -9339,6 +9357,12 @@ F: include/linux/cciss*.h
|
|||
F: include/uapi/linux/cciss*.h
|
||||
F: Documentation/scsi/smartpqi.txt
|
||||
|
||||
MICROSEMI ETHERNET SWITCH DRIVER
|
||||
M: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Supported
|
||||
F: drivers/net/ethernet/mscc/
|
||||
|
||||
MICROSOFT SURFACE PRO 3 BUTTON DRIVER
|
||||
M: Chen Yu <yu.c.chen@intel.com>
|
||||
L: platform-driver-x86@vger.kernel.org
|
||||
|
@ -9688,6 +9712,14 @@ S: Maintained
|
|||
F: Documentation/hwmon/nct6775
|
||||
F: drivers/hwmon/nct6775.c
|
||||
|
||||
NET_FAILOVER MODULE
|
||||
M: Sridhar Samudrala <sridhar.samudrala@intel.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Supported
|
||||
F: driver/net/net_failover.c
|
||||
F: include/net/net_failover.h
|
||||
F: Documentation/networking/net_failover.rst
|
||||
|
||||
NETEFFECT IWARP RNIC DRIVER (IW_NES)
|
||||
M: Faisal Latif <faisal.latif@intel.com>
|
||||
L: linux-rdma@vger.kernel.org
|
||||
|
@ -9881,7 +9913,21 @@ F: net/ipv6/calipso.c
|
|||
F: net/netfilter/xt_CONNSECMARK.c
|
||||
F: net/netfilter/xt_SECMARK.c
|
||||
|
||||
NETWORKING [TCP]
|
||||
M: Eric Dumazet <edumazet@google.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: net/ipv4/tcp*.c
|
||||
F: net/ipv4/syncookies.c
|
||||
F: net/ipv6/tcp*.c
|
||||
F: net/ipv6/syncookies.c
|
||||
F: include/uapi/linux/tcp.h
|
||||
F: include/net/tcp.h
|
||||
F: include/linux/tcp.h
|
||||
F: include/trace/events/tcp.h
|
||||
|
||||
NETWORKING [TLS]
|
||||
M: Boris Pismenny <borisp@mellanox.com>
|
||||
M: Aviad Yehezkel <aviadye@mellanox.com>
|
||||
M: Dave Watson <davejwatson@fb.com>
|
||||
L: netdev@vger.kernel.org
|
||||
|
@ -11447,7 +11493,6 @@ S: Maintained
|
|||
W: http://linuxptp.sourceforge.net/
|
||||
F: Documentation/ABI/testing/sysfs-ptp
|
||||
F: Documentation/ptp/*
|
||||
F: drivers/net/ethernet/freescale/gianfar_ptp.c
|
||||
F: drivers/net/phy/dp83640*
|
||||
F: drivers/ptp/*
|
||||
F: include/linux/ptp_cl*
|
||||
|
@ -13458,6 +13503,7 @@ F: drivers/media/usb/stk1160/
|
|||
STMMAC ETHERNET DRIVER
|
||||
M: Giuseppe Cavallaro <peppe.cavallaro@st.com>
|
||||
M: Alexandre Torgue <alexandre.torgue@st.com>
|
||||
M: Jose Abreu <joabreu@synopsys.com>
|
||||
L: netdev@vger.kernel.org
|
||||
W: http://www.stlinux.com
|
||||
S: Supported
|
||||
|
@ -14683,7 +14729,9 @@ M: Woojung Huh <woojung.huh@microchip.com>
|
|||
M: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: Documentation/devicetree/bindings/net/microchip,lan78xx.txt
|
||||
F: drivers/net/usb/lan78xx.*
|
||||
F: include/dt-bindings/net/microchip-lan78xx.h
|
||||
|
||||
USB MASS STORAGE DRIVER
|
||||
M: Alan Stern <stern@rowland.harvard.edu>
|
||||
|
@ -15472,6 +15520,14 @@ T: git git://linuxtv.org/media_tree.git
|
|||
S: Maintained
|
||||
F: drivers/media/tuners/tuner-xc2028.*
|
||||
|
||||
XDP SOCKETS (AF_XDP)
|
||||
M: Björn Töpel <bjorn.topel@intel.com>
|
||||
M: Magnus Karlsson <magnus.karlsson@intel.com>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: kernel/bpf/xskmap.c
|
||||
F: net/xdp/
|
||||
|
||||
XEN BLOCK SUBSYSTEM
|
||||
M: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
||||
M: Roger Pau Monné <roger.pau@citrix.com>
|
||||
|
|
5
Makefile
|
@ -509,6 +509,11 @@ ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLA
|
|||
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
|
||||
endif
|
||||
|
||||
ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/cc-can-link.sh $(CC)), y)
|
||||
CC_CAN_LINK := y
|
||||
export CC_CAN_LINK
|
||||
endif
|
||||
|
||||
ifeq ($(config-targets),1)
|
||||
# ===========================================================================
|
||||
# *config targets only - make sure prerequisites are updated, and descend
|
||||
|
|
|
@ -84,7 +84,7 @@
|
|||
*
|
||||
* 1. First argument is passed using the arm 32bit registers and rest of the
|
||||
* arguments are passed on stack scratch space.
|
||||
* 2. First callee-saved arugument is mapped to arm 32 bit registers and rest
|
||||
* 2. First callee-saved argument is mapped to arm 32 bit registers and rest
|
||||
* arguments are mapped to scratch space on stack.
|
||||
* 3. We need two 64 bit temp registers to do complex operations on eBPF
|
||||
* registers.
|
||||
|
@ -234,18 +234,11 @@ static void jit_fill_hole(void *area, unsigned int size)
|
|||
#define SCRATCH_SIZE 80
|
||||
|
||||
/* total stack size used in JITed code */
|
||||
#define _STACK_SIZE \
|
||||
(ctx->prog->aux->stack_depth + \
|
||||
+ SCRATCH_SIZE + \
|
||||
+ 4 /* extra for skb_copy_bits buffer */)
|
||||
|
||||
#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
|
||||
#define _STACK_SIZE (ctx->prog->aux->stack_depth + SCRATCH_SIZE)
|
||||
#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
|
||||
|
||||
/* Get the offset of eBPF REGISTERs stored on scratch space. */
|
||||
#define STACK_VAR(off) (STACK_SIZE-off-4)
|
||||
|
||||
/* Offset of skb_copy_bits buffer */
|
||||
#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)
|
||||
#define STACK_VAR(off) (STACK_SIZE - off)
|
||||
|
||||
#if __LINUX_ARM_ARCH__ < 7
|
||||
|
||||
|
@ -708,7 +701,7 @@ static inline void emit_a32_arsh_r64(const u8 dst[], const u8 src[], bool dstk,
|
|||
}
|
||||
|
||||
/* dst = dst >> src */
|
||||
static inline void emit_a32_lsr_r64(const u8 dst[], const u8 src[], bool dstk,
|
||||
static inline void emit_a32_rsh_r64(const u8 dst[], const u8 src[], bool dstk,
|
||||
bool sstk, struct jit_ctx *ctx) {
|
||||
const u8 *tmp = bpf2a32[TMP_REG_1];
|
||||
const u8 *tmp2 = bpf2a32[TMP_REG_2];
|
||||
|
@ -724,7 +717,7 @@ static inline void emit_a32_lsr_r64(const u8 dst[], const u8 src[], bool dstk,
|
|||
emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
|
||||
}
|
||||
|
||||
/* Do LSH operation */
|
||||
/* Do RSH operation */
|
||||
emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
|
||||
emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
|
||||
emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
|
||||
|
@ -774,7 +767,7 @@ static inline void emit_a32_lsh_i64(const u8 dst[], bool dstk,
|
|||
}
|
||||
|
||||
/* dst = dst >> val */
|
||||
static inline void emit_a32_lsr_i64(const u8 dst[], bool dstk,
|
||||
static inline void emit_a32_rsh_i64(const u8 dst[], bool dstk,
|
||||
const u32 val, struct jit_ctx *ctx) {
|
||||
const u8 *tmp = bpf2a32[TMP_REG_1];
|
||||
const u8 *tmp2 = bpf2a32[TMP_REG_2];
|
||||
|
@ -1199,8 +1192,8 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
|||
s32 jmp_offset;
|
||||
|
||||
#define check_imm(bits, imm) do { \
|
||||
if ((((imm) > 0) && ((imm) >> (bits))) || \
|
||||
(((imm) < 0) && (~(imm) >> (bits)))) { \
|
||||
if ((imm) >= (1 << ((bits) - 1)) || \
|
||||
(imm) < -(1 << ((bits) - 1))) { \
|
||||
pr_info("[%2d] imm=%d(0x%x) out of range\n", \
|
||||
i, imm, imm); \
|
||||
return -EINVAL; \
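
Note on the check_imm() hunk above: the two forms of the range test can be compared in isolation. The sketch below is a standalone illustration, not kernel code. It assumes the first pair of condition lines in the hunk is the removed form and the second pair is the added form, as in the other hunks of this diff. The function names `imm_out_of_range_old` and `imm_out_of_range_new` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Removed form: true when imm is treated as out of range. */
static bool imm_out_of_range_old(int32_t imm, int bits)
{
	assert(bits > 0 && bits < 31);
	/* Only non-negative values are ever shifted here. */
	return (imm > 0 && (imm >> bits) != 0) ||
	       (imm < 0 && (~imm >> bits) != 0);
}

/* Added form: imm must fit a signed field that is `bits` wide. */
static bool imm_out_of_range_new(int32_t imm, int bits)
{
	assert(bits > 0 && bits < 31);
	return imm >= (1 << (bits - 1)) || imm < -(1 << (bits - 1));
}
```

With `bits` set to 24, the value `1 << 23` passes the removed test but is rejected by the added one, since a signed 24-bit field tops out at `(1 << 23) - 1`. The removed form effectively allowed one extra bit of magnitude on both sides.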
|
||||
|
@ -1330,7 +1323,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
|||
case BPF_ALU64 | BPF_RSH | BPF_K:
|
||||
if (unlikely(imm > 63))
|
||||
return -EINVAL;
|
||||
emit_a32_lsr_i64(dst, dstk, imm, ctx);
|
||||
emit_a32_rsh_i64(dst, dstk, imm, ctx);
|
||||
break;
|
||||
/* dst = dst << src */
|
||||
case BPF_ALU64 | BPF_LSH | BPF_X:
|
||||
|
@ -1338,7 +1331,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
|||
break;
|
||||
/* dst = dst >> src */
|
||||
case BPF_ALU64 | BPF_RSH | BPF_X:
|
||||
emit_a32_lsr_r64(dst, src, dstk, sstk, ctx);
|
||||
emit_a32_rsh_r64(dst, src, dstk, sstk, ctx);
|
||||
break;
|
||||
/* dst = dst >> src (signed) */
|
||||
case BPF_ALU64 | BPF_ARSH | BPF_X:
|
||||
|
@ -1452,83 +1445,6 @@ exit:
|
|||
emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
|
||||
emit_ldx_r(dst, rn, dstk, off, ctx, BPF_SIZE(code));
|
||||
break;
|
||||
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
|
||||
case BPF_LD | BPF_ABS | BPF_W:
|
||||
case BPF_LD | BPF_ABS | BPF_H:
|
||||
case BPF_LD | BPF_ABS | BPF_B:
|
||||
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
|
||||
case BPF_LD | BPF_IND | BPF_W:
|
||||
case BPF_LD | BPF_IND | BPF_H:
|
||||
case BPF_LD | BPF_IND | BPF_B:
|
||||
{
|
||||
const u8 r4 = bpf2a32[BPF_REG_6][1]; /* r4 = ptr to sk_buff */
|
||||
const u8 r0 = bpf2a32[BPF_REG_0][1]; /*r0: struct sk_buff *skb*/
|
||||
/* rtn value */
|
||||
const u8 r1 = bpf2a32[BPF_REG_0][0]; /* r1: int k */
|
||||
const u8 r2 = bpf2a32[BPF_REG_1][1]; /* r2: unsigned int size */
|
||||
const u8 r3 = bpf2a32[BPF_REG_1][0]; /* r3: void *buffer */
|
||||
const u8 r6 = bpf2a32[TMP_REG_1][1]; /* r6: void *(*func)(..) */
|
||||
int size;
|
||||
|
||||
/* Setting up first argument */
|
||||
emit(ARM_MOV_R(r0, r4), ctx);
|
||||
|
||||
/* Setting up second argument */
|
||||
emit_a32_mov_i(r1, imm, false, ctx);
|
||||
if (BPF_MODE(code) == BPF_IND)
|
||||
emit_a32_alu_r(r1, src_lo, false, sstk, ctx,
|
||||
false, false, BPF_ADD);
|
||||
|
||||
/* Setting up third argument */
|
||||
switch (BPF_SIZE(code)) {
|
||||
case BPF_W:
|
||||
size = 4;
|
||||
break;
|
||||
case BPF_H:
|
||||
size = 2;
|
||||
break;
|
||||
case BPF_B:
|
||||
size = 1;
|
||||
break;
|
||||
default:
|
||||
return -EINVAL;
|
||||
}
|
||||
emit_a32_mov_i(r2, size, false, ctx);
|
||||
|
||||
/* Setting up fourth argument */
|
||||
emit(ARM_ADD_I(r3, ARM_SP, imm8m(SKB_BUFFER)), ctx);
|
||||
|
||||
/* Setting up function pointer to call */
|
||||
emit_a32_mov_i(r6, (unsigned int)bpf_load_pointer, false, ctx);
|
||||
emit_blx_r(r6, ctx);
|
||||
|
||||
emit(ARM_EOR_R(r1, r1, r1), ctx);
|
||||
/* Check if return address is NULL or not.
|
||||
* if NULL then jump to epilogue
|
||||
* else continue to load the value from retn address
|
||||
*/
|
||||
emit(ARM_CMP_I(r0, 0), ctx);
|
||||
jmp_offset = epilogue_offset(ctx);
|
||||
check_imm24(jmp_offset);
|
||||
_emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
|
||||
|
||||
/* Load value from the address */
|
||||
switch (BPF_SIZE(code)) {
|
||||
case BPF_W:
|
||||
emit(ARM_LDR_I(r0, r0, 0), ctx);
|
||||
emit_rev32(r0, r0, ctx);
|
||||
break;
|
||||
case BPF_H:
|
||||
emit(ARM_LDRH_I(r0, r0, 0), ctx);
|
||||
emit_rev16(r0, r0, ctx);
|
||||
break;
|
||||
case BPF_B:
|
||||
emit(ARM_LDRB_I(r0, r0, 0), ctx);
|
||||
/* No need to reverse */
|
||||
break;
|
||||
}
|
||||
break;
|
||||
}
|
||||
/* ST: *(size *)(dst + off) = imm */
|
||||
case BPF_ST | BPF_MEM | BPF_W:
|
||||
case BPF_ST | BPF_MEM | BPF_H:
|
||||
|
|
|
@ -36,4 +36,30 @@
|
|||
drive-strength = <2>; /* 2 MA */
|
||||
};
|
||||
};
|
||||
|
||||
blsp1_uart1_default: blsp1_uart1_default {
|
||||
mux {
|
||||
pins = "gpio41", "gpio42", "gpio43", "gpio44";
|
||||
function = "blsp_uart2";
|
||||
};
|
||||
|
||||
config {
|
||||
pins = "gpio41", "gpio42", "gpio43", "gpio44";
|
||||
drive-strength = <16>;
|
||||
bias-disable;
|
||||
};
|
||||
};
|
||||
|
||||
blsp1_uart1_sleep: blsp1_uart1_sleep {
|
||||
mux {
|
||||
pins = "gpio41", "gpio42", "gpio43", "gpio44";
|
||||
function = "gpio";
|
||||
};
|
||||
|
||||
config {
|
||||
pins = "gpio41", "gpio42", "gpio43", "gpio44";
|
||||
drive-strength = <2>;
|
||||
bias-disable;
|
||||
};
|
||||
};
|
||||
};
|
||||
|
|
|
@ -14,6 +14,28 @@
|
|||
};
|
||||
};
|
||||
|
||||
bt_en_gpios: bt_en_gpios {
|
||||
pinconf {
|
||||
pins = "gpio19";
|
||||
function = PMIC_GPIO_FUNC_NORMAL;
|
||||
output-low;
|
||||
power-source = <PM8994_GPIO_S4>; // 1.8V
|
||||
qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
|
||||
bias-pull-down;
|
||||
};
|
||||
};
|
||||
|
||||
wlan_en_gpios: wlan_en_gpios {
|
||||
pinconf {
|
||||
pins = "gpio8";
|
||||
function = PMIC_GPIO_FUNC_NORMAL;
|
||||
output-low;
|
||||
power-source = <PM8994_GPIO_S4>; // 1.8V
|
||||
qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
|
||||
bias-pull-down;
|
||||
};
|
||||
};
|
||||
|
||||
volume_up_gpio: pm8996_gpio2 {
|
||||
pinconf {
|
||||
pins = "gpio2";
|
||||
|
@ -26,6 +48,16 @@
|
|||
};
|
||||
};
|
||||
|
||||
divclk4_pin_a: divclk4 {
|
||||
pinconf {
|
||||
pins = "gpio18";
|
||||
function = PMIC_GPIO_FUNC_FUNC2;
|
||||
|
||||
bias-disable;
|
||||
power-source = <PM8994_GPIO_S4>;
|
||||
};
|
||||
};
|
||||
|
||||
usb3_vbus_det_gpio: pm8996_gpio22 {
|
||||
pinconf {
|
||||
pins = "gpio22";
|
||||
|
|
|
@ -23,6 +23,7 @@
|
|||
aliases {
|
||||
serial0 = &blsp2_uart1;
|
||||
serial1 = &blsp2_uart2;
|
||||
serial2 = &blsp1_uart1;
|
||||
i2c0 = &blsp1_i2c2;
|
||||
i2c1 = &blsp2_i2c1;
|
||||
i2c2 = &blsp2_i2c0;
|
||||
|
@ -34,7 +35,36 @@
|
|||
stdout-path = "serial0:115200n8";
|
||||
};
|
||||
|
||||
clocks {
|
||||
divclk4: divclk4 {
|
||||
compatible = "fixed-clock";
|
||||
#clock-cells = <0>;
|
||||
clock-frequency = <32768>;
|
||||
clock-output-names = "divclk4";
|
||||
|
||||
pinctrl-names = "default";
|
||||
pinctrl-0 = <&divclk4_pin_a>;
|
||||
};
|
||||
};
|
||||
|
||||
soc {
|
||||
serial@7570000 {
|
||||
label = "BT-UART";
|
||||
status = "okay";
|
||||
pinctrl-names = "default", "sleep";
|
||||
pinctrl-0 = <&blsp1_uart1_default>;
|
||||
pinctrl-1 = <&blsp1_uart1_sleep>;
|
||||
|
||||
bluetooth {
|
||||
compatible = "qcom,qca6174-bt";
|
||||
|
||||
/* bt_disable_n gpio */
|
||||
enable-gpios = <&pm8994_gpios 19 GPIO_ACTIVE_HIGH>;
|
||||
|
||||
clocks = <&divclk4>;
|
||||
};
|
||||
};
|
||||
|
||||
serial@75b0000 {
|
||||
label = "LS-UART1";
|
||||
status = "okay";
|
||||
|
@ -139,9 +169,40 @@
|
|||
pinctrl-0 = <&usb2_vbus_det_gpio>;
|
||||
};
|
||||
|
||||
bt_en: bt-en-1-8v {
|
||||
pinctrl-names = "default";
|
||||
pinctrl-0 = <&bt_en_gpios>;
|
||||
compatible = "regulator-fixed";
|
||||
regulator-name = "bt-en-regulator";
|
||||
regulator-min-microvolt = <1800000>;
|
||||
regulator-max-microvolt = <1800000>;
|
||||
|
||||
/* WLAN card specific delay */
|
||||
startup-delay-us = <70000>;
|
||||
enable-active-high;
|
||||
};
|
||||
|
||||
wlan_en: wlan-en-1-8v {
|
||||
pinctrl-names = "default";
|
||||
pinctrl-0 = <&wlan_en_gpios>;
|
||||
compatible = "regulator-fixed";
|
||||
regulator-name = "wlan-en-regulator";
|
||||
regulator-min-microvolt = <1800000>;
|
||||
regulator-max-microvolt = <1800000>;
|
||||
|
||||
gpio = <&pm8994_gpios 8 0>;
|
||||
|
||||
/* WLAN card specific delay */
|
||||
startup-delay-us = <70000>;
|
||||
enable-active-high;
|
||||
};
|
||||
|
||||
agnoc@0 {
|
||||
qcom,pcie@600000 {
|
||||
status = "okay";
|
||||
perst-gpio = <&msmgpio 35 GPIO_ACTIVE_LOW>;
|
||||
vddpe-supply = <&wlan_en>;
|
||||
vddpe1-supply = <&bt_en>;
|
||||
};
|
||||
|
||||
qcom,pcie@608000 {
|
||||
|
|
|
@ -419,6 +419,16 @@
|
|||
#clock-cells = <1>;
|
||||
};
|
||||
|
||||
blsp1_uart1: serial@7570000 {
|
||||
compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
|
||||
reg = <0x07570000 0x1000>;
|
||||
interrupts = <GIC_SPI 108 IRQ_TYPE_LEVEL_HIGH>;
|
||||
clocks = <&gcc GCC_BLSP1_UART2_APPS_CLK>,
|
||||
<&gcc GCC_BLSP1_AHB_CLK>;
|
||||
clock-names = "core", "iface";
|
||||
status = "disabled";
|
||||
};
|
||||
|
||||
blsp1_spi0: spi@7575000 {
|
||||
compatible = "qcom,spi-qup-v2.2.1";
|
||||
reg = <0x07575000 0x600>;
|
||||
|
|
|
@ -21,7 +21,6 @@
|
|||
#include <linux/bpf.h>
|
||||
#include <linux/filter.h>
|
||||
#include <linux/printk.h>
|
||||
#include <linux/skbuff.h>
|
||||
#include <linux/slab.h>
|
||||
|
||||
#include <asm/byteorder.h>
|
||||
|
@ -80,37 +79,6 @@ static inline void emit(const u32 insn, struct jit_ctx *ctx)
|
|||
ctx->idx++;
|
||||
}
|
||||
|
||||
static inline void emit_a64_mov_i64(const int reg, const u64 val,
|
||||
struct jit_ctx *ctx)
|
||||
{
|
||||
u64 tmp = val;
|
||||
int shift = 0;
|
||||
|
||||
emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
|
||||
tmp >>= 16;
|
||||
shift += 16;
|
||||
while (tmp) {
|
||||
if (tmp & 0xffff)
|
||||
emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
|
||||
tmp >>= 16;
|
||||
shift += 16;
|
||||
}
|
||||
}
|
||||
|
||||
static inline void emit_addr_mov_i64(const int reg, const u64 val,
|
||||
struct jit_ctx *ctx)
|
||||
{
|
||||
u64 tmp = val;
|
||||
int shift = 0;
|
||||
|
||||
emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
|
||||
for (;shift < 48;) {
|
||||
tmp >>= 16;
|
||||
shift += 16;
|
||||
emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
|
||||
}
|
||||
}
|
||||
|
||||
static inline void emit_a64_mov_i(const int is64, const int reg,
|
||||
const s32 val, struct jit_ctx *ctx)
|
||||
{
|
||||
|
@ -122,7 +90,8 @@ static inline void emit_a64_mov_i(const int is64, const int reg,
|
|||
emit(A64_MOVN(is64, reg, (u16)~lo, 0), ctx);
|
||||
} else {
|
||||
emit(A64_MOVN(is64, reg, (u16)~hi, 16), ctx);
|
||||
emit(A64_MOVK(is64, reg, lo, 0), ctx);
|
||||
if (lo != 0xffff)
|
||||
emit(A64_MOVK(is64, reg, lo, 0), ctx);
|
||||
}
|
||||
} else {
|
||||
emit(A64_MOVZ(is64, reg, lo, 0), ctx);
|
||||
|
@ -131,6 +100,59 @@ static inline void emit_a64_mov_i(const int is64, const int reg,
|
|||
}
|
||||
}
|
||||
|
||||
static int i64_i16_blocks(const u64 val, bool inverse)
|
||||
{
|
||||
return (((val >> 0) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
|
||||
(((val >> 16) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
|
||||
(((val >> 32) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
|
||||
(((val >> 48) & 0xffff) != (inverse ? 0xffff : 0x0000));
|
||||
}
|
||||
|
||||
static inline void emit_a64_mov_i64(const int reg, const u64 val,
|
||||
struct jit_ctx *ctx)
|
||||
{
|
||||
u64 nrm_tmp = val, rev_tmp = ~val;
|
||||
bool inverse;
|
||||
int shift;
|
||||
|
||||
if (!(nrm_tmp >> 32))
|
||||
return emit_a64_mov_i(0, reg, (u32)val, ctx);
|
||||
|
||||
inverse = i64_i16_blocks(nrm_tmp, true) < i64_i16_blocks(nrm_tmp, false);
|
||||
shift = max(round_down((inverse ? (fls64(rev_tmp) - 1) :
|
||||
(fls64(nrm_tmp) - 1)), 16), 0);
|
||||
if (inverse)
|
||||
emit(A64_MOVN(1, reg, (rev_tmp >> shift) & 0xffff, shift), ctx);
|
||||
else
|
||||
emit(A64_MOVZ(1, reg, (nrm_tmp >> shift) & 0xffff, shift), ctx);
|
||||
shift -= 16;
|
||||
while (shift >= 0) {
|
||||
if (((nrm_tmp >> shift) & 0xffff) != (inverse ? 0xffff : 0x0000))
|
||||
emit(A64_MOVK(1, reg, (nrm_tmp >> shift) & 0xffff, shift), ctx);
|
||||
shift -= 16;
|
||||
}
|
||||
}
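
Note on the emit_a64_mov_i64() hunk above: the choice between a MOVZ-based and a MOVN-based sequence comes down to counting 16-bit chunks. The sketch below is a userspace model under stated assumptions, not the JIT itself. It covers only the wide path (upper 32 bits non-zero); values that fit in 32 bits are handed to emit_a64_mov_i() in the hunk and are not modelled. `wide_mov_insn_count` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count 16-bit chunks that differ from the fill pattern (0x0000 or 0xffff). */
static int i64_i16_blocks(uint64_t val, bool inverse)
{
	const uint64_t fill = inverse ? 0xffff : 0x0000;
	int n = 0;

	for (int shift = 0; shift < 64; shift += 16)
		n += ((val >> shift) & 0xffff) != fill;
	return n;
}

/* Instructions needed on the wide path: one MOVZ/MOVN plus MOVKs. */
static int wide_mov_insn_count(uint64_t val)
{
	assert((val >> 32) != 0);	/* narrow values take another path */

	int zero_based = i64_i16_blocks(val, false);
	int ones_based = i64_i16_blocks(val, true);
	int best = ones_based < zero_based ? ones_based : zero_based;

	/* All-ones has no differing chunk but still needs the MOVN. */
	return best > 0 ? best : 1;
}
```

For example, `0xffffffffffff1234` has one chunk that differs from `0xffff`, so a single MOVN is enough, whereas a MOVZ-based sequence would need four instructions.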
|
||||
|
||||
/*
|
||||
* This is an unoptimized 64 immediate emission used for BPF to BPF call
|
||||
* addresses. It will always do a full 64 bit decomposition as otherwise
|
||||
* more complexity in the last extra pass is required since we previously
|
||||
* reserved 4 instructions for the address.
|
||||
*/
|
||||
static inline void emit_addr_mov_i64(const int reg, const u64 val,
|
||||
struct jit_ctx *ctx)
|
||||
{
|
||||
u64 tmp = val;
|
||||
int shift = 0;
|
||||
|
||||
emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
|
||||
for (;shift < 48;) {
|
||||
tmp >>= 16;
|
||||
shift += 16;
|
||||
emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
|
||||
}
|
||||
}
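
Note on emit_addr_mov_i64() above: the comment says the address load always uses a full decomposition so that the instruction count stays fixed. A minimal userspace sketch of that split is given below, assuming the usual semantics of MOVZ (clears the rest of the register) and MOVK (keeps it). The names `addr_mov_chunks`, `addr_mov_rebuild`, and `addr_mov_check` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_MOV_INSNS 4	/* one MOVZ plus three MOVKs, always */

/* Split val into four 16-bit immediates, lowest chunk first. */
static void addr_mov_chunks(uint64_t val, uint16_t imm16[ADDR_MOV_INSNS])
{
	for (int i = 0; i < ADDR_MOV_INSNS; i++)
		imm16[i] = (uint16_t)(val >> (16 * i));
}

/* Rebuild the register value the way MOVZ then MOVK would. */
static uint64_t addr_mov_rebuild(const uint16_t imm16[ADDR_MOV_INSNS])
{
	uint64_t reg = imm16[0];	/* MOVZ: other bits become zero */

	for (int i = 1; i < ADDR_MOV_INSNS; i++)
		reg |= (uint64_t)imm16[i] << (16 * i);	/* MOVK: insert chunk */
	return reg;
}

/* Round-trip check: the four chunks must reproduce val exactly. */
static void addr_mov_check(uint64_t val)
{
	uint16_t imm16[ADDR_MOV_INSNS];

	addr_mov_chunks(val, imm16);
	assert(addr_mov_rebuild(imm16) == val);
}
```

Zero chunks are emitted as well, so the sequence length does not depend on the address. That fixed length is what lets a later pass overwrite the address in place.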
|
||||
|
||||
static inline int bpf2a64_offset(int bpf_to, int bpf_from,
|
||||
const struct jit_ctx *ctx)
|
||||
{
|
||||
|
@ -163,7 +185,7 @@ static inline int epilogue_offset(const struct jit_ctx *ctx)
|
|||
/* Tail call offset to jump into */
|
||||
#define PROLOGUE_OFFSET 7
|
||||
|
||||
static int build_prologue(struct jit_ctx *ctx)
|
||||
static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
|
||||
{
|
||||
const struct bpf_prog *prog = ctx->prog;
|
||||
const u8 r6 = bpf2a64[BPF_REG_6];
|
||||
|
@ -188,7 +210,7 @@ static int build_prologue(struct jit_ctx *ctx)
|
|||
* | ... | BPF prog stack
|
||||
* | |
|
||||
* +-----+ <= (BPF_FP - prog->aux->stack_depth)
|
||||
* |RSVD | JIT scratchpad
|
||||
* |RSVD | padding
|
||||
* current A64_SP => +-----+ <= (BPF_FP - ctx->stack_size)
|
||||
* | |
|
||||
* | ... | Function call stack
|
||||
|
@ -210,19 +232,19 @@ static int build_prologue(struct jit_ctx *ctx)
|
|||
/* Set up BPF prog stack base register */
|
||||
emit(A64_MOV(1, fp, A64_SP), ctx);
|
||||
|
||||
/* Initialize tail_call_cnt */
|
||||
emit(A64_MOVZ(1, tcc, 0, 0), ctx);
|
||||
if (!ebpf_from_cbpf) {
|
||||
/* Initialize tail_call_cnt */
|
||||
emit(A64_MOVZ(1, tcc, 0, 0), ctx);
|
||||
|
||||
cur_offset = ctx->idx - idx0;
|
||||
if (cur_offset != PROLOGUE_OFFSET) {
|
||||
pr_err_once("PROLOGUE_OFFSET = %d, expected %d!\n",
|
||||
cur_offset, PROLOGUE_OFFSET);
|
||||
return -1;
|
||||
cur_offset = ctx->idx - idx0;
|
||||
if (cur_offset != PROLOGUE_OFFSET) {
|
||||
pr_err_once("PROLOGUE_OFFSET = %d, expected %d!\n",
|
||||
cur_offset, PROLOGUE_OFFSET);
|
||||
return -1;
|
||||
}
|
||||
}
|
||||
|
||||
/* 4 byte extra for skb_copy_bits buffer */
|
||||
ctx->stack_size = prog->aux->stack_depth + 4;
|
||||
ctx->stack_size = STACK_ALIGN(ctx->stack_size);
|
||||
ctx->stack_size = STACK_ALIGN(prog->aux->stack_depth);
|
||||
|
||||
/* Set up function call stack */
|
||||
emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
|
||||
|
@ -723,71 +745,6 @@ emit_cond_jmp:
|
|||
emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
|
||||
break;
|
||||
|
||||
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
|
||||
case BPF_LD | BPF_ABS | BPF_W:
|
||||
case BPF_LD | BPF_ABS | BPF_H:
|
||||
case BPF_LD | BPF_ABS | BPF_B:
|
||||
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
|
||||
case BPF_LD | BPF_IND | BPF_W:
|
||||
case BPF_LD | BPF_IND | BPF_H:
|
||||
case BPF_LD | BPF_IND | BPF_B:
|
||||
{
|
||||
const u8 r0 = bpf2a64[BPF_REG_0]; /* r0 = return value */
|
||||
const u8 r6 = bpf2a64[BPF_REG_6]; /* r6 = pointer to sk_buff */
|
||||
const u8 fp = bpf2a64[BPF_REG_FP];
|
||||
const u8 r1 = bpf2a64[BPF_REG_1]; /* r1: struct sk_buff *skb */
|
||||
const u8 r2 = bpf2a64[BPF_REG_2]; /* r2: int k */
|
||||
const u8 r3 = bpf2a64[BPF_REG_3]; /* r3: unsigned int size */
|
||||
const u8 r4 = bpf2a64[BPF_REG_4]; /* r4: void *buffer */
|
||||
const u8 r5 = bpf2a64[BPF_REG_5]; /* r5: void *(*func)(...) */
|
||||
int size;
|
||||
|
||||
emit(A64_MOV(1, r1, r6), ctx);
|
||||
emit_a64_mov_i(0, r2, imm, ctx);
|
||||
if (BPF_MODE(code) == BPF_IND)
|
||||
emit(A64_ADD(0, r2, r2, src), ctx);
|
||||
switch (BPF_SIZE(code)) {
|
||||
case BPF_W:
|
||||
size = 4;
|
||||
break;
|
||||
case BPF_H:
|
||||
size = 2;
|
||||
break;
|
||||
case BPF_B:
|
||||
size = 1;
|
||||
break;
|
||||
default:
|
||||
return -EINVAL;
|
||||
}
|
||||
emit_a64_mov_i64(r3, size, ctx);
|
||||
emit(A64_SUB_I(1, r4, fp, ctx->stack_size), ctx);
|
||||
emit_a64_mov_i64(r5, (unsigned long)bpf_load_pointer, ctx);
|
||||
emit(A64_BLR(r5), ctx);
|
||||
emit(A64_MOV(1, r0, A64_R(0)), ctx);
|
||||
|
||||
jmp_offset = epilogue_offset(ctx);
|
||||
check_imm19(jmp_offset);
|
||||
emit(A64_CBZ(1, r0, jmp_offset), ctx);
|
||||
emit(A64_MOV(1, r5, r0), ctx);
|
||||
switch (BPF_SIZE(code)) {
|
||||
case BPF_W:
|
||||
emit(A64_LDR32(r0, r5, A64_ZR), ctx);
|
||||
#ifndef CONFIG_CPU_BIG_ENDIAN
|
||||
emit(A64_REV32(0, r0, r0), ctx);
|
||||
#endif
|
||||
break;
|
||||
case BPF_H:
|
||||
emit(A64_LDRH(r0, r5, A64_ZR), ctx);
|
||||
#ifndef CONFIG_CPU_BIG_ENDIAN
|
||||
emit(A64_REV16(0, r0, r0), ctx);
|
||||
#endif
|
||||
break;
|
||||
case BPF_B:
|
||||
emit(A64_LDRB(r0, r5, A64_ZR), ctx);
|
||||
break;
|
||||
}
|
||||
break;
|
||||
}
|
||||
default:
|
||||
pr_err_once("unknown opcode %02x\n", code);
|
||||
return -EINVAL;
|
||||
|
@ -851,6 +808,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
|
|||
struct bpf_prog *tmp, *orig_prog = prog;
|
||||
struct bpf_binary_header *header;
|
||||
struct arm64_jit_data *jit_data;
|
||||
bool was_classic = bpf_prog_was_classic(prog);
|
||||
bool tmp_blinded = false;
|
||||
bool extra_pass = false;
|
||||
struct jit_ctx ctx;
|
||||
|
@ -905,7 +863,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
|
|||
goto out_off;
|
||||
}
|
||||
|
||||
if (build_prologue(&ctx)) {
|
||||
if (build_prologue(&ctx, was_classic)) {
|
||||
prog = orig_prog;
|
||||
goto out_off;
|
||||
}
|
||||
|
@ -928,7 +886,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
|
|||
skip_init_ctx:
|
||||
ctx.idx = 0;
|
||||
|
||||
build_prologue(&ctx);
|
||||
build_prologue(&ctx, was_classic);
|
||||
|
||||
if (build_body(&ctx)) {
|
||||
bpf_jit_binary_free(header);
|
||||
|
|
|
@ -95,7 +95,6 @@ enum reg_val_type {
|
|||
* struct jit_ctx - JIT context
|
||||
* @skf: The sk_filter
|
||||
* @stack_size: eBPF stack size
|
||||
* @tmp_offset: eBPF $sp offset to 8-byte temporary memory
|
||||
* @idx: Instruction index
|
||||
* @flags: JIT flags
|
||||
* @offsets: Instruction offsets
|
||||
|
@ -105,7 +104,6 @@ enum reg_val_type {
|
|||
struct jit_ctx {
|
||||
const struct bpf_prog *skf;
|
||||
int stack_size;
|
||||
int tmp_offset;
|
||||
u32 idx;
|
||||
u32 flags;
|
||||
u32 *offsets;
|
||||
|
@ -293,7 +291,6 @@ static int gen_int_prologue(struct jit_ctx *ctx)
|
|||
locals_size = (ctx->flags & EBPF_SEEN_FP) ? MAX_BPF_STACK : 0;
|
||||
|
||||
stack_adjust += locals_size;
|
||||
ctx->tmp_offset = locals_size;
|
||||
|
||||
ctx->stack_size = stack_adjust;
|
||||
|
||||
|
@ -399,7 +396,6 @@ static void gen_imm_to_reg(const struct bpf_insn *insn, int reg,
|
|||
emit_instr(ctx, lui, reg, upper >> 16);
|
||||
emit_instr(ctx, addiu, reg, reg, lower);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
static int gen_imm_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
|
||||
|
@ -547,28 +543,6 @@ static int gen_imm_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
|
|||
return 0;
|
||||
}
|
||||
|
||||
static void * __must_check
|
||||
ool_skb_header_pointer(const struct sk_buff *skb, int offset,
|
||||
int len, void *buffer)
|
||||
{
|
||||
return skb_header_pointer(skb, offset, len, buffer);
|
||||
}
|
||||
|
||||
static int size_to_len(const struct bpf_insn *insn)
|
||||
{
|
||||
switch (BPF_SIZE(insn->code)) {
|
||||
case BPF_B:
|
||||
return 1;
|
||||
case BPF_H:
|
||||
return 2;
|
||||
case BPF_W:
|
||||
return 4;
|
||||
case BPF_DW:
|
||||
return 8;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void emit_const_to_reg(struct jit_ctx *ctx, int dst, u64 value)
|
||||
{
|
||||
if (value >= 0xffffffffffff8000ull || value < 0x8000ull) {
|
||||
|
@ -1267,110 +1241,6 @@ jeq_common:
|
|||
return -EINVAL;
|
||||
break;
|
||||
|
||||
case BPF_LD | BPF_B | BPF_ABS:
|
||||
case BPF_LD | BPF_H | BPF_ABS:
|
||||
case BPF_LD | BPF_W | BPF_ABS:
|
||||
case BPF_LD | BPF_DW | BPF_ABS:
|
||||
ctx->flags |= EBPF_SAVE_RA;
|
||||
|
||||
gen_imm_to_reg(insn, MIPS_R_A1, ctx);
|
||||
emit_instr(ctx, addiu, MIPS_R_A2, MIPS_R_ZERO, size_to_len(insn));
|
||||
|
||||
if (insn->imm < 0) {
|
||||
emit_const_to_reg(ctx, MIPS_R_T9, (u64)bpf_internal_load_pointer_neg_helper);
|
||||
} else {
|
||||
emit_const_to_reg(ctx, MIPS_R_T9, (u64)ool_skb_header_pointer);
|
||||
emit_instr(ctx, daddiu, MIPS_R_A3, MIPS_R_SP, ctx->tmp_offset);
|
||||
}
|
||||
goto ld_skb_common;
|
||||
|
||||
case BPF_LD | BPF_B | BPF_IND:
|
||||
case BPF_LD | BPF_H | BPF_IND:
|
||||
case BPF_LD | BPF_W | BPF_IND:
|
||||
case BPF_LD | BPF_DW | BPF_IND:
|
||||
ctx->flags |= EBPF_SAVE_RA;
|
||||
src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
|
||||
if (src < 0)
|
||||
return src;
|
||||
ts = get_reg_val_type(ctx, this_idx, insn->src_reg);
|
||||
if (ts == REG_32BIT_ZERO_EX) {
|
||||
/* sign extend */
|
||||
emit_instr(ctx, sll, MIPS_R_A1, src, 0);
|
||||
src = MIPS_R_A1;
|
||||
}
|
||||
if (insn->imm >= S16_MIN && insn->imm <= S16_MAX) {
|
||||
emit_instr(ctx, daddiu, MIPS_R_A1, src, insn->imm);
|
||||
} else {
|
||||
gen_imm_to_reg(insn, MIPS_R_AT, ctx);
|
||||
emit_instr(ctx, daddu, MIPS_R_A1, MIPS_R_AT, src);
|
||||
}
|
||||
/* truncate to 32-bit int */
|
||||
emit_instr(ctx, sll, MIPS_R_A1, MIPS_R_A1, 0);
|
||||
emit_instr(ctx, daddiu, MIPS_R_A3, MIPS_R_SP, ctx->tmp_offset);
|
||||
emit_instr(ctx, slt, MIPS_R_AT, MIPS_R_A1, MIPS_R_ZERO);
|
||||
|
||||
emit_const_to_reg(ctx, MIPS_R_T8, (u64)bpf_internal_load_pointer_neg_helper);
|
||||
emit_const_to_reg(ctx, MIPS_R_T9, (u64)ool_skb_header_pointer);
|
||||
emit_instr(ctx, addiu, MIPS_R_A2, MIPS_R_ZERO, size_to_len(insn));
|
||||
emit_instr(ctx, movn, MIPS_R_T9, MIPS_R_T8, MIPS_R_AT);
|
||||
|
||||
ld_skb_common:
|
||||
emit_instr(ctx, jalr, MIPS_R_RA, MIPS_R_T9);
|
||||
/* delay slot move */
|
||||
emit_instr(ctx, daddu, MIPS_R_A0, MIPS_R_S0, MIPS_R_ZERO);
|
||||
|
||||
/* Check the error value */
|
||||
b_off = b_imm(exit_idx, ctx);
|
||||
if (is_bad_offset(b_off)) {
|
||||
target = j_target(ctx, exit_idx);
|
||||
if (target == (unsigned int)-1)
|
||||
return -E2BIG;
|
||||
|
||||
if (!(ctx->offsets[this_idx] & OFFSETS_B_CONV)) {
|
||||
ctx->offsets[this_idx] |= OFFSETS_B_CONV;
|
||||
ctx->long_b_conversion = 1;
|
||||
}
|
||||
emit_instr(ctx, bne, MIPS_R_V0, MIPS_R_ZERO, 4 * 3);
|
||||
emit_instr(ctx, nop);
|
||||
emit_instr(ctx, j, target);
|
||||
emit_instr(ctx, nop);
|
||||
} else {
|
||||
emit_instr(ctx, beq, MIPS_R_V0, MIPS_R_ZERO, b_off);
|
||||
emit_instr(ctx, nop);
|
||||
}
|
||||
|
||||
#ifdef __BIG_ENDIAN
|
||||
need_swap = false;
|
||||
#else
|
||||
need_swap = true;
|
||||
#endif
|
||||
dst = MIPS_R_V0;
|
||||
switch (BPF_SIZE(insn->code)) {
|
||||
case BPF_B:
|
||||
emit_instr(ctx, lbu, dst, 0, MIPS_R_V0);
|
||||
break;
|
||||
case BPF_H:
|
||||
emit_instr(ctx, lhu, dst, 0, MIPS_R_V0);
|
||||
if (need_swap)
|
||||
emit_instr(ctx, wsbh, dst, dst);
|
||||
break;
|
||||
case BPF_W:
|
||||
emit_instr(ctx, lw, dst, 0, MIPS_R_V0);
|
||||
if (need_swap) {
|
||||
emit_instr(ctx, wsbh, dst, dst);
|
||||
emit_instr(ctx, rotr, dst, dst, 16);
|
||||
}
|
||||
break;
|
||||
case BPF_DW:
|
||||
emit_instr(ctx, ld, dst, 0, MIPS_R_V0);
|
||||
if (need_swap) {
|
||||
emit_instr(ctx, dsbh, dst, dst);
|
||||
emit_instr(ctx, dshd, dst, dst);
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
break;
|
||||
case BPF_ALU | BPF_END | BPF_FROM_BE:
|
||||
case BPF_ALU | BPF_END | BPF_FROM_LE:
|
||||
dst = ebpf_to_mips_reg(ctx, insn, dst_reg);
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
# Arch-specific network modules
|
||||
#
|
||||
ifeq ($(CONFIG_PPC64),y)
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm64.o bpf_jit_comp64.o
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp64.o
|
||||
else
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm.o bpf_jit_comp.o
|
||||
endif
|
||||
|
|
|
@ -20,7 +20,7 @@
|
|||
* with our redzone usage.
|
||||
*
|
||||
* [ prev sp ] <-------------
|
||||
* [ nv gpr save area ] 8*8 |
|
||||
* [ nv gpr save area ] 6*8 |
|
||||
* [ tail_call_cnt ] 8 |
|
||||
* [ local_tmp_var ] 8 |
|
||||
* fp (r31) --> [ ebpf stack space ] upto 512 |
|
||||
|
@ -28,8 +28,8 @@
|
|||
* sp (r1) ---> [ stack pointer ] --------------
|
||||
*/
|
||||
|
||||
/* for gpr non volatile registers BPG_REG_6 to 10, plus skb cache registers */
|
||||
#define BPF_PPC_STACK_SAVE (8*8)
|
||||
/* for gpr non volatile registers BPG_REG_6 to 10 */
|
||||
#define BPF_PPC_STACK_SAVE (6*8)
|
||||
/* for bpf JIT code internal usage */
|
||||
#define BPF_PPC_STACK_LOCALS 16
|
||||
/* stack frame excluding BPF stack, ensure this is quadword aligned */
|
||||
|
@@ -39,10 +39,8 @@
|
|||
#ifndef __ASSEMBLY__
|
||||
|
||||
/* BPF register usage */
|
||||
#define SKB_HLEN_REG (MAX_BPF_JIT_REG + 0)
|
||||
#define SKB_DATA_REG (MAX_BPF_JIT_REG + 1)
|
||||
#define TMP_REG_1 (MAX_BPF_JIT_REG + 2)
|
||||
#define TMP_REG_2 (MAX_BPF_JIT_REG + 3)
|
||||
#define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
|
||||
#define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
|
||||
|
||||
/* BPF to ppc register mappings */
|
||||
static const int b2p[] = {
|
||||
|
@@ -63,40 +61,23 @@ static const int b2p[] = {
|
|||
[BPF_REG_FP] = 31,
|
||||
/* eBPF jit internal registers */
|
||||
[BPF_REG_AX] = 2,
|
||||
[SKB_HLEN_REG] = 25,
|
||||
[SKB_DATA_REG] = 26,
|
||||
[TMP_REG_1] = 9,
|
||||
[TMP_REG_2] = 10
|
||||
};
|
||||
|
||||
/* PPC NVR range -- update this if we ever use NVRs below r24 */
|
||||
#define BPF_PPC_NVR_MIN 24
|
||||
|
||||
/* Assembly helpers */
|
||||
#define DECLARE_LOAD_FUNC(func) u64 func(u64 r3, u64 r4); \
|
||||
u64 func##_negative_offset(u64 r3, u64 r4); \
|
||||
u64 func##_positive_offset(u64 r3, u64 r4);
|
||||
|
||||
DECLARE_LOAD_FUNC(sk_load_word);
|
||||
DECLARE_LOAD_FUNC(sk_load_half);
|
||||
DECLARE_LOAD_FUNC(sk_load_byte);
|
||||
|
||||
#define CHOOSE_LOAD_FUNC(imm, func) \
|
||||
(imm < 0 ? \
|
||||
(imm >= SKF_LL_OFF ? func##_negative_offset : func) : \
|
||||
func##_positive_offset)
|
||||
/* PPC NVR range -- update this if we ever use NVRs below r27 */
|
||||
#define BPF_PPC_NVR_MIN 27
|
||||
|
||||
#define SEEN_FUNC 0x1000 /* might call external helpers */
|
||||
#define SEEN_STACK 0x2000 /* uses BPF stack */
|
||||
#define SEEN_SKB 0x4000 /* uses sk_buff */
|
||||
#define SEEN_TAILCALL 0x8000 /* uses tail calls */
|
||||
#define SEEN_TAILCALL 0x4000 /* uses tail calls */
|
||||
|
||||
struct codegen_context {
|
||||
/*
|
||||
* This is used to track register usage as well
|
||||
* as calls to external helpers.
|
||||
* - register usage is tracked with corresponding
|
||||
* bits (r3-r10 and r25-r31)
|
||||
* bits (r3-r10 and r27-r31)
|
||||
* - rest of the bits can be used to track other
|
||||
* things -- for now, we use bits 16 to 23
|
||||
* encoded in SEEN_* macros above
|
||||
|
|
|
@@ -1,180 +0,0 @@
|
|||
/*
|
||||
* bpf_jit_asm64.S: Packet/header access helper functions
|
||||
* for PPC64 BPF compiler.
|
||||
*
|
||||
* Copyright 2016, Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
|
||||
* IBM Corporation
|
||||
*
|
||||
* Based on bpf_jit_asm.S by Matt Evans
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU General Public License
|
||||
* as published by the Free Software Foundation; version 2
|
||||
* of the License.
|
||||
*/
|
||||
|
||||
#include <asm/ppc_asm.h>
|
||||
#include <asm/ptrace.h>
|
||||
#include "bpf_jit64.h"
|
||||
|
||||
/*
|
||||
* All of these routines are called directly from generated code,
|
||||
* with the below register usage:
|
||||
* r27 skb pointer (ctx)
|
||||
* r25 skb header length
|
||||
* r26 skb->data pointer
|
||||
* r4 offset
|
||||
*
|
||||
* Result is passed back in:
|
||||
* r8 data read in host endian format (accumulator)
|
||||
*
|
||||
* r9 is used as a temporary register
|
||||
*/
|
||||
|
||||
#define r_skb r27
|
||||
#define r_hlen r25
|
||||
#define r_data r26
|
||||
#define r_off r4
|
||||
#define r_val r8
|
||||
#define r_tmp r9
|
||||
|
||||
_GLOBAL_TOC(sk_load_word)
|
||||
cmpdi r_off, 0
|
||||
blt bpf_slow_path_word_neg
|
||||
b sk_load_word_positive_offset
|
||||
|
||||
_GLOBAL_TOC(sk_load_word_positive_offset)
|
||||
/* Are we accessing past headlen? */
|
||||
subi r_tmp, r_hlen, 4
|
||||
cmpd r_tmp, r_off
|
||||
blt bpf_slow_path_word
|
||||
/* Nope, just hitting the header. cr0 here is eq or gt! */
|
||||
LWZX_BE r_val, r_data, r_off
|
||||
blr /* Return success, cr0 != LT */
|
||||
|
||||
_GLOBAL_TOC(sk_load_half)
|
||||
cmpdi r_off, 0
|
||||
blt bpf_slow_path_half_neg
|
||||
b sk_load_half_positive_offset
|
||||
|
||||
_GLOBAL_TOC(sk_load_half_positive_offset)
|
||||
subi r_tmp, r_hlen, 2
|
||||
cmpd r_tmp, r_off
|
||||
blt bpf_slow_path_half
|
||||
LHZX_BE r_val, r_data, r_off
|
||||
blr
|
||||
|
||||
_GLOBAL_TOC(sk_load_byte)
|
||||
cmpdi r_off, 0
|
||||
blt bpf_slow_path_byte_neg
|
||||
b sk_load_byte_positive_offset
|
||||
|
||||
_GLOBAL_TOC(sk_load_byte_positive_offset)
|
||||
cmpd r_hlen, r_off
|
||||
ble bpf_slow_path_byte
|
||||
lbzx r_val, r_data, r_off
|
||||
blr
|
||||
|
||||
/*
|
||||
* Call out to skb_copy_bits:
|
||||
* Allocate a new stack frame here to remain ABI-compliant in
|
||||
* stashing LR.
|
||||
*/
|
||||
#define bpf_slow_path_common(SIZE) \
|
||||
mflr r0; \
|
||||
std r0, PPC_LR_STKOFF(r1); \
|
||||
stdu r1, -(STACK_FRAME_MIN_SIZE + BPF_PPC_STACK_LOCALS)(r1); \
|
||||
mr r3, r_skb; \
|
||||
/* r4 = r_off as passed */ \
|
||||
addi r5, r1, STACK_FRAME_MIN_SIZE; \
|
||||
li r6, SIZE; \
|
||||
bl skb_copy_bits; \
|
||||
nop; \
|
||||
/* save r5 */ \
|
||||
addi r5, r1, STACK_FRAME_MIN_SIZE; \
|
||||
/* r3 = 0 on success */ \
|
||||
addi r1, r1, STACK_FRAME_MIN_SIZE + BPF_PPC_STACK_LOCALS; \
|
||||
ld r0, PPC_LR_STKOFF(r1); \
|
||||
mtlr r0; \
|
||||
cmpdi r3, 0; \
|
||||
blt bpf_error; /* cr0 = LT */
|
||||
|
||||
bpf_slow_path_word:
|
||||
bpf_slow_path_common(4)
|
||||
/* Data value is on stack, and cr0 != LT */
|
||||
LWZX_BE r_val, 0, r5
|
||||
blr
|
||||
|
||||
bpf_slow_path_half:
|
||||
bpf_slow_path_common(2)
|
||||
LHZX_BE r_val, 0, r5
|
||||
blr
|
||||
|
||||
bpf_slow_path_byte:
|
||||
bpf_slow_path_common(1)
|
||||
lbzx r_val, 0, r5
|
||||
blr
|
||||
|
||||
/*
|
||||
* Call out to bpf_internal_load_pointer_neg_helper
|
||||
*/
|
||||
#define sk_negative_common(SIZE) \
|
||||
mflr r0; \
|
||||
std r0, PPC_LR_STKOFF(r1); \
|
||||
stdu r1, -STACK_FRAME_MIN_SIZE(r1); \
|
||||
mr r3, r_skb; \
|
||||
/* r4 = r_off, as passed */ \
|
||||
li r5, SIZE; \
|
||||
bl bpf_internal_load_pointer_neg_helper; \
|
||||
nop; \
|
||||
addi r1, r1, STACK_FRAME_MIN_SIZE; \
|
||||
ld r0, PPC_LR_STKOFF(r1); \
|
||||
mtlr r0; \
|
||||
/* R3 != 0 on success */ \
|
||||
cmpldi r3, 0; \
|
||||
beq bpf_error_slow; /* cr0 = EQ */
|
||||
|
||||
bpf_slow_path_word_neg:
|
||||
lis r_tmp, -32 /* SKF_LL_OFF */
|
||||
cmpd r_off, r_tmp /* addr < SKF_* */
|
||||
blt bpf_error /* cr0 = LT */
|
||||
b sk_load_word_negative_offset
|
||||
|
||||
_GLOBAL_TOC(sk_load_word_negative_offset)
|
||||
sk_negative_common(4)
|
||||
LWZX_BE r_val, 0, r3
|
||||
blr
|
||||
|
||||
bpf_slow_path_half_neg:
|
||||
lis r_tmp, -32 /* SKF_LL_OFF */
|
||||
cmpd r_off, r_tmp /* addr < SKF_* */
|
||||
blt bpf_error /* cr0 = LT */
|
||||
b sk_load_half_negative_offset
|
||||
|
||||
_GLOBAL_TOC(sk_load_half_negative_offset)
|
||||
sk_negative_common(2)
|
||||
LHZX_BE r_val, 0, r3
|
||||
blr
|
||||
|
||||
bpf_slow_path_byte_neg:
|
||||
lis r_tmp, -32 /* SKF_LL_OFF */
|
||||
cmpd r_off, r_tmp /* addr < SKF_* */
|
||||
blt bpf_error /* cr0 = LT */
|
||||
b sk_load_byte_negative_offset
|
||||
|
||||
_GLOBAL_TOC(sk_load_byte_negative_offset)
|
||||
sk_negative_common(1)
|
||||
lbzx r_val, 0, r3
|
||||
blr
|
||||
|
||||
bpf_error_slow:
|
||||
/* fabricate a cr0 = lt */
|
||||
li r_tmp, -1
|
||||
cmpdi r_tmp, 0
|
||||
bpf_error:
|
||||
/*
|
||||
* Entered with cr0 = lt
|
||||
* Generated code will 'blt epilogue', returning 0.
|
||||
*/
|
||||
li r_val, 0
|
||||
blr
|
|
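Note on the removed helpers above: each `sk_load_*_positive_offset` routine first checks whether the access stays within the linear header (`hlen - size < off` branches to the slow path) and only then loads the value in big-endian order. A minimal C sketch of the word-sized fast path follows. The function name and return convention are hypothetical; in the real assembly a negative offset takes a separate slow path, which is collapsed here into the same failure return.

```c
#include <stdint.h>

/*
 * Hypothetical model of the fast path in sk_load_word_positive_offset.
 * Returns 0 and stores the big-endian word at data[off..off+3] on
 * success, or -1 when the caller would have to fall back to a slow
 * path (offset negative, or access reaching past the header length).
 */
static int model_load_word_fast(const uint8_t *data, int64_t hlen,
				int64_t off, uint32_t *val)
{
	/* Mirrors: subi r_tmp, r_hlen, 4; cmpd r_tmp, r_off; blt slow */
	if (off < 0 || hlen - 4 < off)
		return -1;

	/* Mirrors LWZX_BE: assemble the word most significant byte first. */
	*val = ((uint32_t)data[off] << 24) |
	       ((uint32_t)data[off + 1] << 16) |
	       ((uint32_t)data[off + 2] << 8) |
	       (uint32_t)data[off + 3];
	return 0;
}
```

With this series the verifier rewrites such loads into native BPF, so the per-architecture assembly copies of this check are no longer needed.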
@@ -59,7 +59,7 @@ static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
|
|||
* [ prev sp ] <-------------
|
||||
* [ ... ] |
|
||||
* sp (r1) ---> [ stack pointer ] --------------
|
||||
* [ nv gpr save area ] 8*8
|
||||
* [ nv gpr save area ] 6*8
|
||||
* [ tail_call_cnt ] 8
|
||||
* [ local_tmp_var ] 8
|
||||
* [ unused red zone ] 208 bytes protected
|
||||
|
@@ -88,21 +88,6 @@ static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
|
|||
BUG();
|
||||
}
|
||||
|
||||
static void bpf_jit_emit_skb_loads(u32 *image, struct codegen_context *ctx)
|
||||
{
|
||||
/*
|
||||
* Load skb->len and skb->data_len
|
||||
* r3 points to skb
|
||||
*/
|
||||
PPC_LWZ(b2p[SKB_HLEN_REG], 3, offsetof(struct sk_buff, len));
|
||||
PPC_LWZ(b2p[TMP_REG_1], 3, offsetof(struct sk_buff, data_len));
|
||||
/* header_len = len - data_len */
|
||||
PPC_SUB(b2p[SKB_HLEN_REG], b2p[SKB_HLEN_REG], b2p[TMP_REG_1]);
|
||||
|
||||
/* skb->data pointer */
|
||||
PPC_BPF_LL(b2p[SKB_DATA_REG], 3, offsetof(struct sk_buff, data));
|
||||
}
|
||||
|
||||
static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
|
||||
{
|
||||
int i;
|
||||
|
@@ -145,18 +130,6 @@ static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
|
|||
if (bpf_is_seen_register(ctx, i))
|
||||
PPC_BPF_STL(b2p[i], 1, bpf_jit_stack_offsetof(ctx, b2p[i]));
|
||||
|
||||
/*
|
||||
* Save additional non-volatile regs if we cache skb
|
||||
* Also, setup skb data
|
||||
*/
|
||||
if (ctx->seen & SEEN_SKB) {
|
||||
PPC_BPF_STL(b2p[SKB_HLEN_REG], 1,
|
||||
bpf_jit_stack_offsetof(ctx, b2p[SKB_HLEN_REG]));
|
||||
PPC_BPF_STL(b2p[SKB_DATA_REG], 1,
|
||||
bpf_jit_stack_offsetof(ctx, b2p[SKB_DATA_REG]));
|
||||
bpf_jit_emit_skb_loads(image, ctx);
|
||||
}
|
||||
|
||||
/* Setup frame pointer to point to the bpf stack area */
|
||||
if (bpf_is_seen_register(ctx, BPF_REG_FP))
|
||||
PPC_ADDI(b2p[BPF_REG_FP], 1,
|
||||
|
@@ -172,14 +145,6 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
|
|||
if (bpf_is_seen_register(ctx, i))
|
||||
PPC_BPF_LL(b2p[i], 1, bpf_jit_stack_offsetof(ctx, b2p[i]));
|
||||
|
||||
/* Restore non-volatile registers used for skb cache */
|
||||
if (ctx->seen & SEEN_SKB) {
|
||||
PPC_BPF_LL(b2p[SKB_HLEN_REG], 1,
|
||||
bpf_jit_stack_offsetof(ctx, b2p[SKB_HLEN_REG]));
|
||||
PPC_BPF_LL(b2p[SKB_DATA_REG], 1,
|
||||
bpf_jit_stack_offsetof(ctx, b2p[SKB_DATA_REG]));
|
||||
}
|
||||
|
||||
/* Tear down our stack frame */
|
||||
if (bpf_has_stack_frame(ctx)) {
|
||||
PPC_ADDI(1, 1, BPF_PPC_STACKFRAME + ctx->stack_size);
|
||||
|
@@ -202,25 +167,37 @@ static void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
|
|||
|
||||
static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, u64 func)
|
||||
{
|
||||
unsigned int i, ctx_idx = ctx->idx;
|
||||
|
||||
/* Load function address into r12 */
|
||||
PPC_LI64(12, func);
|
||||
|
||||
/* For bpf-to-bpf function calls, the callee's address is unknown
|
||||
* until the last extra pass. As seen above, we use PPC_LI64() to
|
||||
* load the callee's address, but this may optimize the number of
|
||||
* instructions required based on the nature of the address.
|
||||
*
|
||||
* Since we don't want the number of instructions emitted to change,
|
||||
* we pad the optimized PPC_LI64() call with NOPs to guarantee that
|
||||
* we always have a five-instruction sequence, which is the maximum
|
||||
* that PPC_LI64() can emit.
|
||||
*/
|
||||
for (i = ctx->idx - ctx_idx; i < 5; i++)
|
||||
PPC_NOP();
|
||||
|
||||
#ifdef PPC64_ELF_ABI_v1
|
||||
/* func points to the function descriptor */
|
||||
PPC_LI64(b2p[TMP_REG_2], func);
|
||||
/* Load actual entry point from function descriptor */
|
||||
PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0);
|
||||
/* ... and move it to LR */
|
||||
PPC_MTLR(b2p[TMP_REG_1]);
|
||||
/*
|
||||
* Load TOC from function descriptor at offset 8.
|
||||
* We can clobber r2 since we get called through a
|
||||
* function pointer (so caller will save/restore r2)
|
||||
* and since we don't use a TOC ourselves.
|
||||
*/
|
||||
PPC_BPF_LL(2, b2p[TMP_REG_2], 8);
|
||||
#else
|
||||
/* We can clobber r12 */
|
||||
PPC_FUNC_ADDR(12, func);
|
||||
PPC_MTLR(12);
|
||||
PPC_BPF_LL(2, 12, 8);
|
||||
/* Load actual entry point from function descriptor */
|
||||
PPC_BPF_LL(12, 12, 0);
|
||||
#endif
|
||||
|
||||
PPC_MTLR(12);
|
||||
PPC_BLRL();
|
||||
}
|
||||
|
||||
|
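Note on the padding loop in the hunk above: because the callee address of a bpf-to-bpf call is only known in the extra pass, the address load is padded with NOPs so that it always occupies five instruction slots. The sketch below models that idea outside the kernel. `model_li64_len()` is a hypothetical stand-in for the variable length of `PPC_LI64()` (the real macro's rules differ), and `PLACEHOLDER_INSN` is not a real instruction; only the NOP encoding (`ori 0,0,0`) is the actual PowerPC one.

```c
#include <stdint.h>

#define NOP_INSN		0x60000000u	/* PowerPC nop: ori 0,0,0 */
#define PLACEHOLDER_INSN	0x00000000u	/* stand-in for a load-immediate word */
#define LI64_MAX_INSNS		5		/* longest sequence the diff mentions */

struct emit_ctx {
	uint32_t image[32];
	unsigned int idx;
};

static void emit(struct emit_ctx *ctx, uint32_t insn)
{
	ctx->image[ctx->idx++] = insn;
}

/* Hypothetical length model: small constants need fewer instructions. */
static unsigned int model_li64_len(uint64_t imm)
{
	if (imm < 0x8000)
		return 1;
	if (imm < 0x80000000ULL)
		return 2;
	return LI64_MAX_INSNS;
}

/* Emit the address load, then pad with NOPs up to a fixed length. */
static void emit_load_addr_padded(struct emit_ctx *ctx, uint64_t addr)
{
	unsigned int i, start = ctx->idx;

	for (i = 0; i < model_li64_len(addr); i++)
		emit(ctx, PLACEHOLDER_INSN);

	/* Same shape as: for (i = ctx->idx - ctx_idx; i < 5; i++) PPC_NOP(); */
	for (i = ctx->idx - start; i < LI64_MAX_INSNS; i++)
		emit(ctx, NOP_INSN);
}
```

Keeping the sequence at a fixed length means the instruction offsets computed in earlier passes remain valid when the real address is patched in later.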
@@ -291,7 +268,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
|
|||
/* Assemble the body code between the prologue & epilogue */
|
||||
static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
|
||||
struct codegen_context *ctx,
|
||||
u32 *addrs)
|
||||
u32 *addrs, bool extra_pass)
|
||||
{
|
||||
const struct bpf_insn *insn = fp->insnsi;
|
||||
int flen = fp->len;
|
||||
|
@@ -747,29 +724,30 @@ emit_clear:
|
|||
break;
|
||||
|
||||
/*
|
||||
* Call kernel helper
|
||||
* Call kernel helper or bpf function
|
||||
*/
|
||||
case BPF_JMP | BPF_CALL:
|
||||
ctx->seen |= SEEN_FUNC;
|
||||
func = (u8 *) __bpf_call_base + imm;
|
||||
|
||||
/* Save skb pointer if we need to re-cache skb data */
|
||||
if ((ctx->seen & SEEN_SKB) &&
|
||||
bpf_helper_changes_pkt_data(func))
|
||||
PPC_BPF_STL(3, 1, bpf_jit_stack_local(ctx));
|
||||
/* bpf function call */
|
||||
if (insn[i].src_reg == BPF_PSEUDO_CALL)
|
||||
if (!extra_pass)
|
||||
func = NULL;
|
||||
else if (fp->aux->func && off < fp->aux->func_cnt)
|
||||
/* use the subprog id from the off
|
||||
* field to look up the callee address
|
||||
*/
|
||||
func = (u8 *) fp->aux->func[off]->bpf_func;
|
||||
else
|
||||
return -EINVAL;
|
||||
/* kernel helper call */
|
||||
else
|
||||
func = (u8 *) __bpf_call_base + imm;
|
||||
|
||||
bpf_jit_emit_func_call(image, ctx, (u64)func);
|
||||
|
||||
/* move return value from r3 to BPF_REG_0 */
|
||||
PPC_MR(b2p[BPF_REG_0], 3);
|
||||
|
||||
/* refresh skb cache */
|
||||
if ((ctx->seen & SEEN_SKB) &&
|
||||
bpf_helper_changes_pkt_data(func)) {
|
||||
/* reload skb pointer to r3 */
|
||||
PPC_BPF_LL(3, 1, bpf_jit_stack_local(ctx));
|
||||
bpf_jit_emit_skb_loads(image, ctx);
|
||||
}
|
||||
break;
|
||||
|
||||
/*
|
||||
|
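Note on the `BPF_JMP | BPF_CALL` case in the hunk above: the new code picks the call target in three ways. A bpf-to-bpf call gets a NULL placeholder before the extra pass, the callee's JITed address (indexed by `off`) during the extra pass, and a helper call resolves relative to `__bpf_call_base`. The sketch below restates that decision with explicit braces. All names are hypothetical stand-ins for the kernel structures, and the `off >= 0` check is an addition of this sketch, not part of the diff.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PSEUDO_CALL 1	/* stands in for BPF_PSEUDO_CALL */

struct call_env {
	const uintptr_t *subprog_addrs;	/* models fp->aux->func[i]->bpf_func */
	unsigned int subprog_cnt;	/* models fp->aux->func_cnt */
	uintptr_t helper_base;		/* models __bpf_call_base */
};

/* Returns 0 and sets *target, or -EINVAL for an unresolvable subprog id. */
static int model_resolve_call(const struct call_env *env, int src_reg,
			      int16_t off, int32_t imm, int extra_pass,
			      uintptr_t *target)
{
	if (src_reg == MODEL_PSEUDO_CALL) {
		if (!extra_pass) {
			/* Callee not JITed yet: emit a placeholder address. */
			*target = 0;
			return 0;
		}
		if (env->subprog_addrs && off >= 0 &&
		    (unsigned int)off < env->subprog_cnt) {
			*target = env->subprog_addrs[off];
			return 0;
		}
		return -EINVAL;
	}

	/* Kernel helper call: imm is an offset from the helper base. */
	*target = env->helper_base + (uintptr_t)(intptr_t)imm;
	return 0;
}
```

The placeholder case is what makes the fixed-length address load necessary: the instruction count must not change once the real address replaces it.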
@@ -886,65 +864,6 @@ cond_branch:
|
|||
PPC_BCC(true_cond, addrs[i + 1 + off]);
|
||||
break;
|
||||
|
||||
/*
|
||||
* Loads from packet header/data
|
||||
* Assume 32-bit input value in imm and X (src_reg)
|
||||
*/
|
||||
|
||||
/* Absolute loads */
|
||||
case BPF_LD | BPF_W | BPF_ABS:
|
||||
func = (u8 *)CHOOSE_LOAD_FUNC(imm, sk_load_word);
|
||||
goto common_load_abs;
|
||||
case BPF_LD | BPF_H | BPF_ABS:
|
||||
func = (u8 *)CHOOSE_LOAD_FUNC(imm, sk_load_half);
|
||||
goto common_load_abs;
|
||||
case BPF_LD | BPF_B | BPF_ABS:
|
||||
func = (u8 *)CHOOSE_LOAD_FUNC(imm, sk_load_byte);
|
||||
common_load_abs:
|
||||
/*
|
||||
* Load from [imm]
|
||||
* Load into r4, which can just be passed onto
|
||||
* skb load helpers as the second parameter
|
||||
*/
|
||||
PPC_LI32(4, imm);
|
||||
goto common_load;
|
||||
|
||||
/* Indirect loads */
|
||||
case BPF_LD | BPF_W | BPF_IND:
|
||||
func = (u8 *)sk_load_word;
|
||||
goto common_load_ind;
|
||||
case BPF_LD | BPF_H | BPF_IND:
|
||||
func = (u8 *)sk_load_half;
|
||||
goto common_load_ind;
|
||||
case BPF_LD | BPF_B | BPF_IND:
|
||||
func = (u8 *)sk_load_byte;
|
||||
common_load_ind:
|
||||
/*
|
||||
* Load from [src_reg + imm]
|
||||
* Treat src_reg as a 32-bit value
|
||||
*/
|
||||
PPC_EXTSW(4, src_reg);
|
||||
if (imm) {
|
||||
if (imm >= -32768 && imm < 32768)
|
||||
PPC_ADDI(4, 4, IMM_L(imm));
|
||||
else {
|
||||
PPC_LI32(b2p[TMP_REG_1], imm);
|
||||
PPC_ADD(4, 4, b2p[TMP_REG_1]);
|
||||
}
|
||||
}
|
||||
|
||||
common_load:
|
||||
ctx->seen |= SEEN_SKB;
|
||||
ctx->seen |= SEEN_FUNC;
|
||||
bpf_jit_emit_func_call(image, ctx, (u64)func);
|
||||
|
||||
/*
|
||||
* Helper returns 'lt' condition on error, and an
|
||||
* appropriate return value in BPF_REG_0
|
||||
*/
|
||||
PPC_BCC(COND_LT, exit_addr);
|
||||
break;
|
||||
|
||||
/*
|
||||
* Tail call
|
||||
*/
|
||||
|
@@ -971,6 +890,14 @@ common_load:
|
|||
return 0;
|
||||
}
|
||||
|
||||
struct powerpc64_jit_data {
|
||||
struct bpf_binary_header *header;
|
||||
u32 *addrs;
|
||||
u8 *image;
|
||||
u32 proglen;
|
||||
struct codegen_context ctx;
|
||||
};
|
||||
|
||||
struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
||||
{
|
||||
u32 proglen;
|
||||
|
@@ -978,6 +905,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
|||
u8 *image = NULL;
|
||||
u32 *code_base;
|
||||
u32 *addrs;
|
||||
struct powerpc64_jit_data *jit_data;
|
||||
struct codegen_context cgctx;
|
||||
int pass;
|
||||
int flen;
|
||||
|
@@ -985,6 +913,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
|||
struct bpf_prog *org_fp = fp;
|
||||
struct bpf_prog *tmp_fp;
|
||||
bool bpf_blinded = false;
|
||||
bool extra_pass = false;
|
||||
|
||||
if (!fp->jit_requested)
|
||||
return org_fp;
|
||||
|
@@ -998,11 +927,32 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
|||
fp = tmp_fp;
|
||||
}
|
||||
|
||||
jit_data = fp->aux->jit_data;
|
||||
if (!jit_data) {
|
||||
jit_data = kzalloc(sizeof(*jit_data), GFP_KERNEL);
|
||||
if (!jit_data) {
|
||||
fp = org_fp;
|
||||
goto out;
|
||||
}
|
||||
fp->aux->jit_data = jit_data;
|
||||
}
|
||||
|
||||
flen = fp->len;
|
||||
addrs = jit_data->addrs;
|
||||
if (addrs) {
|
||||
cgctx = jit_data->ctx;
|
||||
image = jit_data->image;
|
||||
bpf_hdr = jit_data->header;
|
||||
proglen = jit_data->proglen;
|
||||
alloclen = proglen + FUNCTION_DESCR_SIZE;
|
||||
extra_pass = true;
|
||||
goto skip_init_ctx;
|
||||
}
|
||||
|
||||
addrs = kzalloc((flen+1) * sizeof(*addrs), GFP_KERNEL);
|
||||
if (addrs == NULL) {
|
||||
fp = org_fp;
|
||||
goto out;
|
||||
goto out_addrs;
|
||||
}
|
||||
|
||||
memset(&cgctx, 0, sizeof(struct codegen_context));
|
||||
|
@@ -1011,10 +961,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
|||
cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
|
||||
|
||||
/* Scouting faux-generate pass 0 */
|
||||
if (bpf_jit_build_body(fp, 0, &cgctx, addrs)) {
|
||||
if (bpf_jit_build_body(fp, 0, &cgctx, addrs, false)) {
|
||||
/* We hit something illegal or unsupported. */
|
||||
fp = org_fp;
|
||||
goto out;
|
||||
goto out_addrs;
|
||||
}
|
||||
|
||||
/*
|
||||
|
@@ -1032,9 +982,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
|||
bpf_jit_fill_ill_insns);
|
||||
if (!bpf_hdr) {
|
||||
fp = org_fp;
|
||||
goto out;
|
||||
goto out_addrs;
|
||||
}
|
||||
|
||||
skip_init_ctx:
|
||||
code_base = (u32 *)(image + FUNCTION_DESCR_SIZE);
|
||||
|
||||
/* Code generation passes 1-2 */
|
||||
|
@@ -1042,7 +993,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
|||
/* Now build the prologue, body code & epilogue for real. */
|
||||
cgctx.idx = 0;
|
||||
bpf_jit_build_prologue(code_base, &cgctx);
|
||||
bpf_jit_build_body(fp, code_base, &cgctx, addrs);
|
||||
bpf_jit_build_body(fp, code_base, &cgctx, addrs, extra_pass);
|
||||
bpf_jit_build_epilogue(code_base, &cgctx);
|
||||
|
||||
if (bpf_jit_enable > 1)
|
||||
|
@@ -1068,10 +1019,20 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
|
|||
fp->jited_len = alloclen;
|
||||
|
||||
bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + (bpf_hdr->pages * PAGE_SIZE));
|
||||
if (!fp->is_func || extra_pass) {
|
||||
out_addrs:
|
||||
kfree(addrs);
|
||||
kfree(jit_data);
|
||||
fp->aux->jit_data = NULL;
|
||||
} else {
|
||||
jit_data->addrs = addrs;
|
||||
jit_data->ctx = cgctx;
|
||||
jit_data->proglen = proglen;
|
||||
jit_data->image = image;
|
||||
jit_data->header = bpf_hdr;
|
||||
}
|
||||
|
||||
out:
|
||||
kfree(addrs);
|
||||
|
||||
if (bpf_blinded)
|
||||
bpf_jit_prog_release_other(fp, fp == org_fp ? tmp_fp : org_fp);
|
||||
|
||||
|
|
|
@@ -2,5 +2,5 @@
|
|||
#
|
||||
# Arch-specific network modules
|
||||
#
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o
|
||||
obj-$(CONFIG_HAVE_PNETID) += pnet.o
|
||||
|
|
|
@@ -1,120 +0,0 @@
|
|||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
/*
|
||||
* BPF Jit compiler for s390, helper functions.
|
||||
*
|
||||
* Copyright IBM Corp. 2012,2015
|
||||
*
|
||||
* Author(s): Martin Schwidefsky <schwidefsky@de.ibm.com>
|
||||
* Michael Holzheu <holzheu@linux.vnet.ibm.com>
|
||||
*/
|
||||
|
||||
#include <linux/linkage.h>
|
||||
#include <asm/nospec-insn.h>
|
||||
#include "bpf_jit.h"
|
||||
|
||||
/*
|
||||
* Calling convention:
|
||||
* registers %r7-%r10, %r11,%r13, and %r15 are call saved
|
||||
*
|
||||
* Input (64 bit):
|
||||
* %r3 (%b2) = offset into skb data
|
||||
* %r6 (%b5) = return address
|
||||
* %r7 (%b6) = skb pointer
|
||||
* %r12 = skb data pointer
|
||||
*
|
||||
* Output:
|
||||
* %r14= %b0 = return value (read skb value)
|
||||
*
|
||||
* Work registers: %r2,%r4,%r5,%r14
|
||||
*
|
||||
* skb_copy_bits takes 4 parameters:
|
||||
* %r2 = skb pointer
|
||||
* %r3 = offset into skb data
|
||||
* %r4 = pointer to temp buffer
|
||||
* %r5 = length to copy
|
||||
* Return value in %r2: 0 = ok
|
||||
*
|
||||
* bpf_internal_load_pointer_neg_helper takes 3 parameters:
|
||||
* %r2 = skb pointer
|
||||
* %r3 = offset into data
|
||||
* %r4 = length to copy
|
||||
* Return value in %r2: Pointer to data
|
||||
*/
|
||||
|
||||
#define SKF_MAX_NEG_OFF -0x200000 /* SKF_LL_OFF from filter.h */
|
||||
|
||||
/*
|
||||
* Load SIZE bytes from SKB
|
||||
*/
|
||||
#define sk_load_common(NAME, SIZE, LOAD) \
|
||||
ENTRY(sk_load_##NAME); \
|
||||
ltgr %r3,%r3; /* Is offset negative? */ \
|
||||
jl sk_load_##NAME##_slow_neg; \
|
||||
ENTRY(sk_load_##NAME##_pos); \
|
||||
aghi %r3,SIZE; /* Offset + SIZE */ \
|
||||
clg %r3,STK_OFF_HLEN(%r15); /* Offset + SIZE > hlen? */ \
|
||||
jh sk_load_##NAME##_slow; \
|
||||
LOAD %r14,-SIZE(%r3,%r12); /* Get data from skb */ \
|
||||
B_EX OFF_OK,%r6; /* Return */ \
|
||||
\
|
||||
sk_load_##NAME##_slow:; \
|
||||
lgr %r2,%r7; /* Arg1 = skb pointer */ \
|
||||
aghi %r3,-SIZE; /* Arg2 = offset */ \
|
||||
la %r4,STK_OFF_TMP(%r15); /* Arg3 = temp buffer */ \
|
||||
lghi %r5,SIZE; /* Arg4 = size */ \
|
||||
brasl %r14,skb_copy_bits; /* Get data from skb */ \
|
||||
LOAD %r14,STK_OFF_TMP(%r15); /* Load from temp buffer */ \
|
||||
ltgr %r2,%r2; /* Set cc to (%r2 != 0) */ \
|
||||
BR_EX %r6; /* Return */
|
||||
|
||||
sk_load_common(word, 4, llgf) /* r14 = *(u32 *) (skb->data+offset) */
|
||||
sk_load_common(half, 2, llgh) /* r14 = *(u16 *) (skb->data+offset) */
|
||||
|
||||
GEN_BR_THUNK %r6
|
||||
GEN_B_THUNK OFF_OK,%r6
|
||||
|
||||
/*
|
||||
* Load 1 byte from SKB (optimized version)
|
||||
*/
|
||||
/* r14 = *(u8 *) (skb->data+offset) */
|
||||
ENTRY(sk_load_byte)
|
||||
ltgr %r3,%r3 # Is offset negative?
|
||||
jl sk_load_byte_slow_neg
|
||||
ENTRY(sk_load_byte_pos)
|
||||
clg %r3,STK_OFF_HLEN(%r15) # Offset >= hlen?
|
||||
jnl sk_load_byte_slow
|
||||
llgc %r14,0(%r3,%r12) # Get byte from skb
|
||||
B_EX OFF_OK,%r6 # Return OK
|
||||
|
||||
sk_load_byte_slow:
|
||||
lgr %r2,%r7 # Arg1 = skb pointer
|
||||
# Arg2 = offset
|
||||
la %r4,STK_OFF_TMP(%r15) # Arg3 = pointer to temp buffer
|
||||
lghi %r5,1 # Arg4 = size (1 byte)
|
||||
brasl %r14,skb_copy_bits # Get data from skb
|
||||
llgc %r14,STK_OFF_TMP(%r15) # Load result from temp buffer
|
||||
ltgr %r2,%r2 # Set cc to (%r2 != 0)
|
||||
BR_EX %r6 # Return cc
|
||||
|
||||
#define sk_negative_common(NAME, SIZE, LOAD) \
|
||||
sk_load_##NAME##_slow_neg:; \
|
||||
cgfi %r3,SKF_MAX_NEG_OFF; \
|
||||
jl bpf_error; \
|
||||
lgr %r2,%r7; /* Arg1 = skb pointer */ \
|
||||
/* Arg2 = offset */ \
|
||||
lghi %r4,SIZE; /* Arg3 = size */ \
|
||||
brasl %r14,bpf_internal_load_pointer_neg_helper; \
|
||||
ltgr %r2,%r2; \
|
||||
jz bpf_error; \
|
||||
LOAD %r14,0(%r2); /* Get data from pointer */ \
|
||||
xr %r3,%r3; /* Set cc to zero */ \
|
||||
BR_EX %r6; /* Return cc */
|
||||
|
||||
sk_negative_common(word, 4, llgf)
|
||||
sk_negative_common(half, 2, llgh)
|
||||
sk_negative_common(byte, 1, llgc)
|
||||
|
||||
bpf_error:
|
||||
# force a return 0 from jit handler
|
||||
ltgr %r15,%r15 # Set condition code
|
||||
BR_EX %r6
|
|
@@ -16,9 +16,6 @@
|
|||
#include <linux/filter.h>
|
||||
#include <linux/types.h>
|
||||
|
||||
extern u8 sk_load_word_pos[], sk_load_half_pos[], sk_load_byte_pos[];
|
||||
extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
|
||||
|
||||
#endif /* __ASSEMBLY__ */
|
||||
|
||||
/*
|
||||
|
@@ -36,15 +33,6 @@ extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
|
|||
* | | |
|
||||
* | BPF stack | |
|
||||
* | | |
|
||||
* +---------------+ |
|
||||
* | 8 byte skbp | |
|
||||
* R15+176 -> +---------------+ |
|
||||
* | 8 byte hlen | |
|
||||
* R15+168 -> +---------------+ |
|
||||
* | 4 byte align | |
|
||||
* +---------------+ |
|
||||
* | 4 byte temp | |
|
||||
* | for bpf_jit.S | |
|
||||
* R15+160 -> +---------------+ |
|
||||
* | new backchain | |
|
||||
* R15+152 -> +---------------+ |
|
||||
|
@@ -57,17 +45,11 @@ extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
|
|||
* The stack size used by the BPF program ("BPF stack" above) is passed
|
||||
* via "aux->stack_depth".
|
||||
*/
|
||||
#define STK_SPACE_ADD (8 + 8 + 4 + 4 + 160)
|
||||
#define STK_SPACE_ADD (160)
|
||||
#define STK_160_UNUSED (160 - 12 * 8)
|
||||
#define STK_OFF (STK_SPACE_ADD - STK_160_UNUSED)
|
||||
#define STK_OFF_TMP 160 /* Offset of tmp buffer on stack */
|
||||
#define STK_OFF_HLEN 168 /* Offset of SKB header length on stack */
|
||||
#define STK_OFF_SKBP 176 /* Offset of SKB pointer on stack */
|
||||
|
||||
#define STK_OFF_R6 (160 - 11 * 8) /* Offset of r6 on stack */
|
||||
#define STK_OFF_TCCNT (160 - 12 * 8) /* Offset of tail_call_cnt on stack */
|
||||
|
||||
/* Offset to skip condition code check */
|
||||
#define OFF_OK 4
|
||||
|
||||
#endif /* __ARCH_S390_NET_BPF_JIT_H */
|
||||
|
|
|
@@ -51,23 +51,21 @@ struct bpf_jit {
|
|||
|
||||
#define BPF_SIZE_MAX 0xffff /* Max size for program (16 bit branches) */
|
||||
|
||||
#define SEEN_SKB 1 /* skb access */
|
||||
#define SEEN_MEM 2 /* use mem[] for temporary storage */
|
||||
#define SEEN_RET0 4 /* ret0_ip points to a valid return 0 */
|
||||
#define SEEN_LITERAL 8 /* code uses literals */
|
||||
#define SEEN_FUNC 16 /* calls C functions */
|
||||
#define SEEN_TAIL_CALL 32 /* code uses tail calls */
|
||||
#define SEEN_REG_AX 64 /* code uses constant blinding */
|
||||
#define SEEN_STACK (SEEN_FUNC | SEEN_MEM | SEEN_SKB)
|
||||
#define SEEN_MEM (1 << 0) /* use mem[] for temporary storage */
|
||||
#define SEEN_RET0 (1 << 1) /* ret0_ip points to a valid return 0 */
|
||||
#define SEEN_LITERAL (1 << 2) /* code uses literals */
|
||||
#define SEEN_FUNC (1 << 3) /* calls C functions */
|
||||
#define SEEN_TAIL_CALL (1 << 4) /* code uses tail calls */
|
||||
#define SEEN_REG_AX (1 << 5) /* code uses constant blinding */
|
||||
#define SEEN_STACK (SEEN_FUNC | SEEN_MEM)
|
||||
|
||||
/*
|
||||
* s390 registers
|
||||
*/
|
||||
#define REG_W0 (MAX_BPF_JIT_REG + 0) /* Work register 1 (even) */
|
||||
#define REG_W1 (MAX_BPF_JIT_REG + 1) /* Work register 2 (odd) */
|
||||
#define REG_SKB_DATA (MAX_BPF_JIT_REG + 2) /* SKB data register */
|
||||
#define REG_L (MAX_BPF_JIT_REG + 3) /* Literal pool register */
|
||||
#define REG_15 (MAX_BPF_JIT_REG + 4) /* Register 15 */
|
||||
#define REG_L (MAX_BPF_JIT_REG + 2) /* Literal pool register */
|
||||
#define REG_15 (MAX_BPF_JIT_REG + 3) /* Register 15 */
|
||||
#define REG_0 REG_W0 /* Register 0 */
|
||||
#define REG_1 REG_W1 /* Register 1 */
|
||||
#define REG_2 BPF_REG_1 /* Register 2 */
|
||||
|
@@ -92,10 +90,8 @@ static const int reg2hex[] = {
|
|||
[BPF_REG_9] = 10,
|
||||
/* BPF stack pointer */
|
||||
[BPF_REG_FP] = 13,
|
||||
/* Register for blinding (shared with REG_SKB_DATA) */
|
||||
/* Register for blinding */
|
||||
[BPF_REG_AX] = 12,
|
||||
/* SKB data pointer */
|
||||
[REG_SKB_DATA] = 12,
|
||||
/* Work registers for s390x backend */
|
||||
[REG_W0] = 0,
|
||||
[REG_W1] = 1,
|
||||
|
@@ -401,27 +397,6 @@ static void save_restore_regs(struct bpf_jit *jit, int op, u32 stack_depth)
|
|||
} while (re <= 15);
|
||||
}
|
||||
|
||||
/*
|
||||
* For SKB access %b1 contains the SKB pointer. For "bpf_jit.S"
|
||||
* we store the SKB header length on the stack and the SKB data
|
||||
* pointer in REG_SKB_DATA if BPF_REG_AX is not used.
|
||||
*/
|
||||
static void emit_load_skb_data_hlen(struct bpf_jit *jit)
|
||||
{
|
||||
/* Header length: llgf %w1,<len>(%b1) */
|
||||
EMIT6_DISP_LH(0xe3000000, 0x0016, REG_W1, REG_0, BPF_REG_1,
|
||||
offsetof(struct sk_buff, len));
|
||||
/* s %w1,<data_len>(%b1) */
|
||||
EMIT4_DISP(0x5b000000, REG_W1, BPF_REG_1,
|
||||
offsetof(struct sk_buff, data_len));
|
||||
/* stg %w1,ST_OFF_HLEN(%r0,%r15) */
|
||||
EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0, REG_15, STK_OFF_HLEN);
|
||||
if (!(jit->seen & SEEN_REG_AX))
|
||||
/* lg %skb_data,data_off(%b1) */
|
||||
EMIT6_DISP_LH(0xe3000000, 0x0004, REG_SKB_DATA, REG_0,
|
||||
BPF_REG_1, offsetof(struct sk_buff, data));
|
||||
}
|
||||
|
||||
/*
|
||||
* Emit function prologue
|
||||
*
|
||||
|
@@ -462,12 +437,6 @@ static void bpf_jit_prologue(struct bpf_jit *jit, u32 stack_depth)
|
|||
EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0,
|
||||
REG_15, 152);
|
||||
}
|
||||
if (jit->seen & SEEN_SKB) {
|
||||
emit_load_skb_data_hlen(jit);
|
||||
/* stg %b1,ST_OFF_SKBP(%r0,%r15) */
|
||||
EMIT6_DISP_LH(0xe3000000, 0x0024, BPF_REG_1, REG_0, REG_15,
|
||||
STK_OFF_SKBP);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
|
@@ -537,12 +506,12 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
|
|||
{
|
||||
struct bpf_insn *insn = &fp->insnsi[i];
|
||||
int jmp_off, last, insn_count = 1;
|
||||
unsigned int func_addr, mask;
|
||||
u32 dst_reg = insn->dst_reg;
|
||||
u32 src_reg = insn->src_reg;
|
||||
u32 *addrs = jit->addrs;
|
||||
s32 imm = insn->imm;
|
||||
s16 off = insn->off;
|
||||
unsigned int mask;
|
||||
|
||||
if (dst_reg == BPF_REG_AX || src_reg == BPF_REG_AX)
|
||||
jit->seen |= SEEN_REG_AX;
|
||||
|
@@ -1029,13 +998,6 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
|
|||
}
|
||||
/* lgr %b0,%r2: load return value into %b0 */
|
||||
EMIT4(0xb9040000, BPF_REG_0, REG_2);
|
||||
if ((jit->seen & SEEN_SKB) &&
|
||||
bpf_helper_changes_pkt_data((void *)func)) {
|
||||
/* lg %b1,ST_OFF_SKBP(%r15) */
|
||||
EMIT6_DISP_LH(0xe3000000, 0x0004, BPF_REG_1, REG_0,
|
||||
REG_15, STK_OFF_SKBP);
|
||||
emit_load_skb_data_hlen(jit);
|
||||
}
|
||||
break;
|
||||
}
|
||||
case BPF_JMP | BPF_TAIL_CALL:
|
||||
|
@@ -1235,73 +1197,6 @@ branch_oc:
|
|||
jmp_off = addrs[i + off + 1] - (addrs[i + 1] - 4);
|
||||
EMIT4_PCREL(0xa7040000 | mask << 8, jmp_off);
|
||||
break;
|
||||
/*
|
||||
* BPF_LD
|
||||
*/
|
||||
case BPF_LD | BPF_ABS | BPF_B: /* b0 = *(u8 *) (skb->data+imm) */
|
||||
case BPF_LD | BPF_IND | BPF_B: /* b0 = *(u8 *) (skb->data+imm+src) */
|
||||
if ((BPF_MODE(insn->code) == BPF_ABS) && (imm >= 0))
|
||||
func_addr = __pa(sk_load_byte_pos);
|
||||
else
|
||||
func_addr = __pa(sk_load_byte);
|
||||
goto call_fn;
|
||||
case BPF_LD | BPF_ABS | BPF_H: /* b0 = *(u16 *) (skb->data+imm) */
|
||||
case BPF_LD | BPF_IND | BPF_H: /* b0 = *(u16 *) (skb->data+imm+src) */
|
||||
if ((BPF_MODE(insn->code) == BPF_ABS) && (imm >= 0))
|
||||
func_addr = __pa(sk_load_half_pos);
|
||||
else
|
||||
func_addr = __pa(sk_load_half);
|
||||
goto call_fn;
|
||||
case BPF_LD | BPF_ABS | BPF_W: /* b0 = *(u32 *) (skb->data+imm) */
|
||||
case BPF_LD | BPF_IND | BPF_W: /* b0 = *(u32 *) (skb->data+imm+src) */
|
||||
if ((BPF_MODE(insn->code) == BPF_ABS) && (imm >= 0))
|
||||
func_addr = __pa(sk_load_word_pos);
|
||||
else
|
||||
func_addr = __pa(sk_load_word);
|
||||
goto call_fn;
|
||||
call_fn:
|
||||
jit->seen |= SEEN_SKB | SEEN_RET0 | SEEN_FUNC;
|
||||
REG_SET_SEEN(REG_14); /* Return address of possible func call */
|
||||
|
||||
/*
|
||||
* Implicit input:
|
||||
* BPF_REG_6 (R7) : skb pointer
|
||||
* REG_SKB_DATA (R12): skb data pointer (if no BPF_REG_AX)
|
||||
*
|
||||
* Calculated input:
|
||||
* BPF_REG_2 (R3) : offset of byte(s) to fetch in skb
|
||||
* BPF_REG_5 (R6) : return address
|
||||
*
|
||||
* Output:
|
||||
* BPF_REG_0 (R14): data read from skb
|
||||
*
|
||||
* Scratch registers (BPF_REG_1-5)
|
||||
*/
|
||||
|
||||
/* Call function: llilf %w1,func_addr */
|
||||
EMIT6_IMM(0xc00f0000, REG_W1, func_addr);
|
||||
|
||||
/* Offset: lgfi %b2,imm */
|
||||
EMIT6_IMM(0xc0010000, BPF_REG_2, imm);
|
||||
if (BPF_MODE(insn->code) == BPF_IND)
|
||||
/* agfr %b2,%src (%src is s32 here) */
|
||||
EMIT4(0xb9180000, BPF_REG_2, src_reg);
|
||||
|
||||
/* Reload REG_SKB_DATA if BPF_REG_AX is used */
|
||||
if (jit->seen & SEEN_REG_AX)
|
||||
/* lg %skb_data,data_off(%b6) */
|
||||
EMIT6_DISP_LH(0xe3000000, 0x0004, REG_SKB_DATA, REG_0,
|
||||
BPF_REG_6, offsetof(struct sk_buff, data));
|
||||
/* basr %b5,%w1 (%b5 is call saved) */
|
||||
EMIT2(0x0d00, BPF_REG_5, REG_W1);
|
||||
|
||||
/*
|
||||
* Note: For fast access we jump directly after the
|
||||
* jnz instruction from bpf_jit.S
|
||||
*/
|
||||
/* jnz <ret0> */
|
||||
EMIT4_PCREL(0xa7740000, jit->ret0_ip - jit->prg);
|
||||
break;
|
||||
default: /* too complex, give up */
|
||||
pr_err("Unknown opcode %02x\n", insn->code);
|
||||
return -1;
|
||||
|
|
|
@@ -1,4 +1,7 @@
|
|||
#
|
||||
# Arch-specific network modules
|
||||
#
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm_$(BITS).o bpf_jit_comp_$(BITS).o
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp_$(BITS).o
|
||||
ifeq ($(BITS),32)
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm_32.o
|
||||
endif
|
||||
|
|
|
@@ -33,35 +33,6 @@
|
|||
#define I5 0x1d
|
||||
#define FP 0x1e
|
||||
#define I7 0x1f
|
||||
|
||||
#define r_SKB L0
|
||||
#define r_HEADLEN L4
|
||||
#define r_SKB_DATA L5
|
||||
#define r_TMP G1
|
||||
#define r_TMP2 G3
|
||||
|
||||
/* assembly code in arch/sparc/net/bpf_jit_asm_64.S */
|
||||
extern u32 bpf_jit_load_word[];
|
||||
extern u32 bpf_jit_load_half[];
|
||||
extern u32 bpf_jit_load_byte[];
|
||||
extern u32 bpf_jit_load_byte_msh[];
|
||||
extern u32 bpf_jit_load_word_positive_offset[];
|
||||
extern u32 bpf_jit_load_half_positive_offset[];
|
||||
extern u32 bpf_jit_load_byte_positive_offset[];
|
||||
extern u32 bpf_jit_load_byte_msh_positive_offset[];
|
||||
extern u32 bpf_jit_load_word_negative_offset[];
|
||||
extern u32 bpf_jit_load_half_negative_offset[];
|
||||
extern u32 bpf_jit_load_byte_negative_offset[];
|
||||
extern u32 bpf_jit_load_byte_msh_negative_offset[];
|
||||
|
||||
#else
|
||||
#define r_RESULT %o0
|
||||
#define r_SKB %o0
|
||||
#define r_OFF %o1
|
||||
#define r_HEADLEN %l4
|
||||
#define r_SKB_DATA %l5
|
||||
#define r_TMP %g1
|
||||
#define r_TMP2 %g3
|
||||
#endif
|
||||
|
||||
#endif /* _BPF_JIT_H */
|
||||
|
|
|
@@ -1,162 +0,0 @@
|
|||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#include <asm/ptrace.h>
|
||||
|
||||
#include "bpf_jit_64.h"
|
||||
|
||||
#define SAVE_SZ 176
|
||||
#define SCRATCH_OFF STACK_BIAS + 128
|
||||
#define BE_PTR(label) be,pn %xcc, label
|
||||
#define SIGN_EXTEND(reg) sra reg, 0, reg
|
||||
|
||||
#define SKF_MAX_NEG_OFF (-0x200000) /* SKF_LL_OFF from filter.h */
|
||||
|
||||
.text
|
||||
.globl bpf_jit_load_word
|
||||
bpf_jit_load_word:
|
||||
cmp r_OFF, 0
|
||||
bl bpf_slow_path_word_neg
|
||||
nop
|
||||
.globl bpf_jit_load_word_positive_offset
|
||||
bpf_jit_load_word_positive_offset:
|
||||
sub r_HEADLEN, r_OFF, r_TMP
|
||||
cmp r_TMP, 3
|
||||
ble bpf_slow_path_word
|
||||
add r_SKB_DATA, r_OFF, r_TMP
|
||||
andcc r_TMP, 3, %g0
|
||||
bne load_word_unaligned
|
||||
nop
|
||||
retl
|
||||
ld [r_TMP], r_RESULT
|
||||
load_word_unaligned:
|
||||
ldub [r_TMP + 0x0], r_OFF
|
||||
ldub [r_TMP + 0x1], r_TMP2
|
||||
sll r_OFF, 8, r_OFF
|
||||
or r_OFF, r_TMP2, r_OFF
|
||||
ldub [r_TMP + 0x2], r_TMP2
|
||||
sll r_OFF, 8, r_OFF
|
||||
or r_OFF, r_TMP2, r_OFF
|
||||
ldub [r_TMP + 0x3], r_TMP2
|
||||
sll r_OFF, 8, r_OFF
|
||||
retl
|
||||
or r_OFF, r_TMP2, r_RESULT
|
||||
|
||||
.globl bpf_jit_load_half
|
||||
bpf_jit_load_half:
|
||||
cmp r_OFF, 0
|
||||
bl bpf_slow_path_half_neg
|
||||
nop
|
||||
.globl bpf_jit_load_half_positive_offset
|
||||
bpf_jit_load_half_positive_offset:
|
||||
sub r_HEADLEN, r_OFF, r_TMP
|
||||
cmp r_TMP, 1
|
||||
ble bpf_slow_path_half
|
||||
add r_SKB_DATA, r_OFF, r_TMP
|
||||
andcc r_TMP, 1, %g0
|
||||
bne load_half_unaligned
|
||||
nop
|
||||
retl
|
||||
lduh [r_TMP], r_RESULT
|
||||
load_half_unaligned:
|
||||
ldub [r_TMP + 0x0], r_OFF
|
||||
ldub [r_TMP + 0x1], r_TMP2
|
||||
sll r_OFF, 8, r_OFF
|
||||
retl
|
||||
or r_OFF, r_TMP2, r_RESULT
|
||||
|
||||
.globl bpf_jit_load_byte
|
||||
bpf_jit_load_byte:
|
||||
cmp r_OFF, 0
|
||||
bl bpf_slow_path_byte_neg
|
||||
nop
|
||||
.globl bpf_jit_load_byte_positive_offset
|
||||
bpf_jit_load_byte_positive_offset:
|
||||
cmp r_OFF, r_HEADLEN
|
||||
bge bpf_slow_path_byte
|
||||
nop
|
||||
retl
|
||||
ldub [r_SKB_DATA + r_OFF], r_RESULT
|
||||
|
||||
#define bpf_slow_path_common(LEN) \
|
||||
save %sp, -SAVE_SZ, %sp; \
|
||||
mov %i0, %o0; \
|
||||
mov %i1, %o1; \
|
||||
add %fp, SCRATCH_OFF, %o2; \
|
||||
call skb_copy_bits; \
|
||||
mov (LEN), %o3; \
|
||||
cmp %o0, 0; \
|
||||
restore;
|
||||
|
||||
bpf_slow_path_word:
|
||||
bpf_slow_path_common(4)
|
||||
bl bpf_error
|
||||
ld [%sp + SCRATCH_OFF], r_RESULT
|
||||
retl
|
||||
nop
|
||||
bpf_slow_path_half:
|
||||
bpf_slow_path_common(2)
|
||||
bl bpf_error
|
||||
lduh [%sp + SCRATCH_OFF], r_RESULT
|
||||
retl
|
||||
nop
|
||||
bpf_slow_path_byte:
|
||||
bpf_slow_path_common(1)
|
||||
bl bpf_error
|
||||
ldub [%sp + SCRATCH_OFF], r_RESULT
|
||||
retl
|
||||
nop
|
||||
|
||||
#define bpf_negative_common(LEN) \
|
||||
save %sp, -SAVE_SZ, %sp; \
|
||||
mov %i0, %o0; \
|
||||
mov %i1, %o1; \
|
||||
SIGN_EXTEND(%o1); \
|
||||
call bpf_internal_load_pointer_neg_helper; \
|
||||
mov (LEN), %o2; \
|
||||
mov %o0, r_TMP; \
|
||||
cmp %o0, 0; \
|
||||
BE_PTR(bpf_error); \
|
||||
restore;
|
||||
|
||||
bpf_slow_path_word_neg:
|
||||
sethi %hi(SKF_MAX_NEG_OFF), r_TMP
|
||||
cmp r_OFF, r_TMP
|
||||
bl bpf_error
|
||||
nop
|
||||
.globl bpf_jit_load_word_negative_offset
|
||||
bpf_jit_load_word_negative_offset:
|
||||
bpf_negative_common(4)
|
||||
andcc r_TMP, 3, %g0
|
||||
bne load_word_unaligned
|
||||
nop
|
||||
retl
|
||||
ld [r_TMP], r_RESULT
|
||||
|
||||
bpf_slow_path_half_neg:
|
||||
sethi %hi(SKF_MAX_NEG_OFF), r_TMP
|
||||
cmp r_OFF, r_TMP
|
||||
bl bpf_error
|
||||
nop
|
||||
.globl bpf_jit_load_half_negative_offset
|
||||
bpf_jit_load_half_negative_offset:
|
||||
bpf_negative_common(2)
|
||||
andcc r_TMP, 1, %g0
|
||||
bne load_half_unaligned
|
||||
nop
|
||||
retl
|
||||
lduh [r_TMP], r_RESULT
|
||||
|
||||
bpf_slow_path_byte_neg:
|
||||
sethi %hi(SKF_MAX_NEG_OFF), r_TMP
|
||||
cmp r_OFF, r_TMP
|
||||
bl bpf_error
|
||||
nop
|
||||
.globl bpf_jit_load_byte_negative_offset
|
||||
bpf_jit_load_byte_negative_offset:
|
||||
bpf_negative_common(1)
|
||||
retl
|
||||
ldub [r_TMP], r_RESULT
|
||||
|
||||
bpf_error:
|
||||
/* Make the JIT program itself return zero. */
|
||||
ret
|
||||
restore %g0, %g0, %o0
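For readers who do not read SPARC assembly, the fast path of the removed `bpf_jit_load_word_positive_offset` helper can be approximated in C. This is only a sketch under stated assumptions: `struct pkt_view` and `load_word_fast` are hypothetical names standing in for the `r_SKB_DATA` and `r_HEADLEN` registers, the slow path through `skb_copy_bits` is reduced to an error return, and the word is always assembled byte by byte (the assembly did that only for unaligned addresses).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the values the JIT kept in r_SKB_DATA and r_HEADLEN. */
struct pkt_view {
	const uint8_t *data;	/* start of the linear packet data */
	int32_t headlen;	/* skb->len - skb->data_len */
};

/*
 * Fast path of a 32-bit load at a non-negative offset.
 * Returns 0 and stores the network-order word in *out, or -1 when the
 * access would leave the linear area (where the assembly branched to
 * its slow path instead).
 */
static int load_word_fast(const struct pkt_view *p, int32_t off, uint32_t *out)
{
	assert(off >= 0);		/* negative offsets took a separate path */

	if (p->headlen - off <= 3)	/* mirrors "cmp r_TMP, 3; ble bpf_slow_path_word" */
		return -1;

	const uint8_t *b = p->data + off;

	/* Byte-wise assembly, as in load_word_unaligned; valid for any alignment. */
	*out = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
	return 0;
}
```

With a 5-byte linear area, offsets 0 and 1 succeed and offset 2 is rejected, because only three bytes remain.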
|
|
@@ -48,10 +48,6 @@ static void bpf_flush_icache(void *start_, void *end_)
|
|||
}
|
||||
}
|
||||
|
||||
#define SEEN_DATAREF 1 /* might call external helpers */
|
||||
#define SEEN_XREG 2 /* ebx is used */
|
||||
#define SEEN_MEM 4 /* use mem[] for temporary storage */
|
||||
|
||||
#define S13(X) ((X) & 0x1fff)
|
||||
#define S5(X) ((X) & 0x1f)
|
||||
#define IMMED 0x00002000
|
||||
|
@@ -198,7 +194,6 @@ struct jit_ctx {
|
|||
bool tmp_1_used;
|
||||
bool tmp_2_used;
|
||||
bool tmp_3_used;
|
||||
bool saw_ld_abs_ind;
|
||||
bool saw_frame_pointer;
|
||||
bool saw_call;
|
||||
bool saw_tail_call;
|
||||
|
@@ -207,9 +202,7 @@ struct jit_ctx {
|
|||
|
||||
#define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
|
||||
#define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
|
||||
#define SKB_HLEN_REG (MAX_BPF_JIT_REG + 2)
|
||||
#define SKB_DATA_REG (MAX_BPF_JIT_REG + 3)
|
||||
#define TMP_REG_3 (MAX_BPF_JIT_REG + 4)
|
||||
#define TMP_REG_3 (MAX_BPF_JIT_REG + 2)
|
||||
|
||||
/* Map BPF registers to SPARC registers */
|
||||
static const int bpf2sparc[] = {
|
||||
|
@@ -238,9 +231,6 @@ static const int bpf2sparc[] = {
|
|||
[TMP_REG_1] = G1,
|
||||
[TMP_REG_2] = G2,
|
||||
[TMP_REG_3] = G3,
|
||||
|
||||
[SKB_HLEN_REG] = L4,
|
||||
[SKB_DATA_REG] = L5,
|
||||
};
|
||||
|
||||
static void emit(const u32 insn, struct jit_ctx *ctx)
|
||||
|
@@ -800,25 +790,6 @@ static int emit_compare_and_branch(const u8 code, const u8 dst, u8 src,
|
|||
return 0;
|
||||
}
|
||||
|
||||
static void load_skb_regs(struct jit_ctx *ctx, u8 r_skb)
|
||||
{
|
||||
const u8 r_headlen = bpf2sparc[SKB_HLEN_REG];
|
||||
const u8 r_data = bpf2sparc[SKB_DATA_REG];
|
||||
const u8 r_tmp = bpf2sparc[TMP_REG_1];
|
||||
unsigned int off;
|
||||
|
||||
off = offsetof(struct sk_buff, len);
|
||||
emit(LD32I | RS1(r_skb) | S13(off) | RD(r_headlen), ctx);
|
||||
|
||||
off = offsetof(struct sk_buff, data_len);
|
||||
emit(LD32I | RS1(r_skb) | S13(off) | RD(r_tmp), ctx);
|
||||
|
||||
emit(SUB | RS1(r_headlen) | RS2(r_tmp) | RD(r_headlen), ctx);
|
||||
|
||||
off = offsetof(struct sk_buff, data);
|
||||
emit(LDPTRI | RS1(r_skb) | S13(off) | RD(r_data), ctx);
|
||||
}
|
||||
|
||||
/* Just skip the save instruction and the ctx register move. */
|
||||
#define BPF_TAILCALL_PROLOGUE_SKIP 16
|
||||
#define BPF_TAILCALL_CNT_SP_OFF (STACK_BIAS + 128)
|
||||
|
@@ -857,9 +828,6 @@ static void build_prologue(struct jit_ctx *ctx)
|
|||
|
||||
emit_reg_move(I0, O0, ctx);
|
||||
/* If you add anything here, adjust BPF_TAILCALL_PROLOGUE_SKIP above. */
|
||||
|
||||
if (ctx->saw_ld_abs_ind)
|
||||
load_skb_regs(ctx, bpf2sparc[BPF_REG_1]);
|
||||
}
|
||||
|
||||
static void build_epilogue(struct jit_ctx *ctx)
|
||||
|
@@ -926,7 +894,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
|||
const int i = insn - ctx->prog->insnsi;
|
||||
const s16 off = insn->off;
|
||||
const s32 imm = insn->imm;
|
||||
u32 *func;
|
||||
|
||||
if (insn->src_reg == BPF_REG_FP)
|
||||
ctx->saw_frame_pointer = true;
|
||||
|
@@ -1225,16 +1192,11 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
|||
u8 *func = ((u8 *)__bpf_call_base) + imm;
|
||||
|
||||
ctx->saw_call = true;
|
||||
if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func))
|
||||
emit_reg_move(bpf2sparc[BPF_REG_1], L7, ctx);
|
||||
|
||||
emit_call((u32 *)func, ctx);
|
||||
emit_nop(ctx);
|
||||
|
||||
emit_reg_move(O0, bpf2sparc[BPF_REG_0], ctx);
|
||||
|
||||
if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func))
|
||||
load_skb_regs(ctx, L7);
|
||||
break;
|
||||
}
|
||||
|
||||
|
@@ -1412,43 +1374,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
|
|||
emit_nop(ctx);
|
||||
break;
|
||||
}
|
||||
#define CHOOSE_LOAD_FUNC(K, func) \
|
||||
((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative_offset : func) : func##_positive_offset)
|
||||
|
||||
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
|
||||
case BPF_LD | BPF_ABS | BPF_W:
|
||||
func = CHOOSE_LOAD_FUNC(imm, bpf_jit_load_word);
|
||||
goto common_load;
|
||||
case BPF_LD | BPF_ABS | BPF_H:
|
||||
func = CHOOSE_LOAD_FUNC(imm, bpf_jit_load_half);
|
||||
goto common_load;
|
||||
case BPF_LD | BPF_ABS | BPF_B:
|
||||
func = CHOOSE_LOAD_FUNC(imm, bpf_jit_load_byte);
|
||||
goto common_load;
|
||||
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
|
||||
case BPF_LD | BPF_IND | BPF_W:
|
||||
func = bpf_jit_load_word;
|
||||
goto common_load;
|
||||
case BPF_LD | BPF_IND | BPF_H:
|
||||
func = bpf_jit_load_half;
|
||||
goto common_load;
|
||||
|
||||
case BPF_LD | BPF_IND | BPF_B:
|
||||
func = bpf_jit_load_byte;
|
||||
common_load:
|
||||
ctx->saw_ld_abs_ind = true;
|
||||
|
||||
emit_reg_move(bpf2sparc[BPF_REG_6], O0, ctx);
|
||||
emit_loadimm(imm, O1, ctx);
|
||||
|
||||
if (BPF_MODE(code) == BPF_IND)
|
||||
emit_alu(ADD, src, O1, ctx);
|
||||
|
||||
emit_call(func, ctx);
|
||||
emit_alu_K(SRA, O1, 0, ctx);
|
||||
|
||||
emit_reg_move(O0, bpf2sparc[BPF_REG_0], ctx);
|
||||
break;
|
||||
|
||||
default:
|
||||
pr_err_once("unknown opcode %02x\n", code);
|
||||
|
@@ -1583,12 +1508,11 @@ skip_init_ctx:
|
|||
build_epilogue(&ctx);
|
||||
|
||||
if (bpf_jit_enable > 1)
|
||||
pr_info("Pass %d: shrink = %d, seen = [%c%c%c%c%c%c%c]\n", pass,
|
||||
pr_info("Pass %d: shrink = %d, seen = [%c%c%c%c%c%c]\n", pass,
|
||||
image_size - (ctx.idx * 4),
|
||||
ctx.tmp_1_used ? '1' : ' ',
|
||||
ctx.tmp_2_used ? '2' : ' ',
|
||||
ctx.tmp_3_used ? '3' : ' ',
|
||||
ctx.saw_ld_abs_ind ? 'L' : ' ',
|
||||
ctx.saw_frame_pointer ? 'F' : ' ',
|
||||
ctx.saw_call ? 'C' : ' ',
|
||||
ctx.saw_tail_call ? 'T' : ' ');
|
||||
|
|
|
@@ -140,7 +140,7 @@ config X86
|
|||
select HAVE_DMA_CONTIGUOUS
|
||||
select HAVE_DYNAMIC_FTRACE
|
||||
select HAVE_DYNAMIC_FTRACE_WITH_REGS
|
||||
select HAVE_EBPF_JIT if X86_64
|
||||
select HAVE_EBPF_JIT
|
||||
select HAVE_EFFICIENT_UNALIGNED_ACCESS
|
||||
select HAVE_EXIT_THREAD
|
||||
select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE
|
||||
|
|
|
@@ -308,16 +308,20 @@ do { \
|
|||
* lfence
|
||||
* jmp spec_trap
|
||||
* do_rop:
|
||||
* mov %rax,(%rsp)
|
||||
* mov %rax,(%rsp) for x86_64
|
||||
* mov %edx,(%esp) for x86_32
|
||||
* retq
|
||||
*
|
||||
* Without retpolines configured:
|
||||
*
|
||||
* jmp *%rax
|
||||
* jmp *%rax for x86_64
|
||||
* jmp *%edx for x86_32
|
||||
*/
|
||||
#ifdef CONFIG_RETPOLINE
|
||||
# define RETPOLINE_RAX_BPF_JIT_SIZE 17
|
||||
# define RETPOLINE_RAX_BPF_JIT() \
|
||||
# ifdef CONFIG_X86_64
|
||||
# define RETPOLINE_RAX_BPF_JIT_SIZE 17
|
||||
# define RETPOLINE_RAX_BPF_JIT() \
|
||||
do { \
|
||||
EMIT1_off32(0xE8, 7); /* callq do_rop */ \
|
||||
/* spec_trap: */ \
|
||||
EMIT2(0xF3, 0x90); /* pause */ \
|
||||
|
@@ -325,11 +329,30 @@ do { \
|
|||
EMIT2(0xEB, 0xF9); /* jmp spec_trap */ \
|
||||
/* do_rop: */ \
|
||||
EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */ \
|
||||
EMIT1(0xC3); /* retq */
|
||||
#else
|
||||
# define RETPOLINE_RAX_BPF_JIT_SIZE 2
|
||||
# define RETPOLINE_RAX_BPF_JIT() \
|
||||
EMIT2(0xFF, 0xE0); /* jmp *%rax */
|
||||
EMIT1(0xC3); /* retq */ \
|
||||
} while (0)
|
||||
# else /* !CONFIG_X86_64 */
|
||||
# define RETPOLINE_EDX_BPF_JIT() \
|
||||
do { \
|
||||
EMIT1_off32(0xE8, 7); /* call do_rop */ \
|
||||
/* spec_trap: */ \
|
||||
EMIT2(0xF3, 0x90); /* pause */ \
|
||||
EMIT3(0x0F, 0xAE, 0xE8); /* lfence */ \
|
||||
EMIT2(0xEB, 0xF9); /* jmp spec_trap */ \
|
||||
/* do_rop: */ \
|
||||
EMIT3(0x89, 0x14, 0x24); /* mov %edx,(%esp) */ \
|
||||
EMIT1(0xC3); /* ret */ \
|
||||
} while (0)
|
||||
# endif
|
||||
#else /* !CONFIG_RETPOLINE */
|
||||
# ifdef CONFIG_X86_64
|
||||
# define RETPOLINE_RAX_BPF_JIT_SIZE 2
|
||||
# define RETPOLINE_RAX_BPF_JIT() \
|
||||
EMIT2(0xFF, 0xE0); /* jmp *%rax */
|
||||
# else /* !CONFIG_X86_64 */
|
||||
# define RETPOLINE_EDX_BPF_JIT() \
|
||||
EMIT2(0xFF, 0xE2) /* jmp *%edx */
|
||||
# endif
|
||||
#endif
|
||||
|
||||
#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
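The byte sequences in the two retpoline macros above can be checked mechanically. The sketch below copies the emitted bytes into plain arrays and decodes the two relative branches; the helper names `call_target` and `jmp8_target` are hypothetical and exist only for this illustration. It suggests why `RETPOLINE_RAX_BPF_JIT_SIZE` is 17, and that the x86_32 variant comes out one byte shorter because its `mov` needs no REX prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes emitted by the x86_64 RETPOLINE_RAX_BPF_JIT() macro. */
static const uint8_t retpoline_rax[] = {
	0xE8, 0x07, 0x00, 0x00, 0x00,	/* callq do_rop (rel32 = 7) */
	0xF3, 0x90,			/* spec_trap: pause */
	0x0F, 0xAE, 0xE8,		/* lfence */
	0xEB, 0xF9,			/* jmp spec_trap (rel8 = -7) */
	0x48, 0x89, 0x04, 0x24,		/* do_rop: mov %rax,(%rsp) */
	0xC3,				/* retq */
};

/* Bytes emitted by the x86_32 RETPOLINE_EDX_BPF_JIT() macro. */
static const uint8_t retpoline_edx[] = {
	0xE8, 0x07, 0x00, 0x00, 0x00,	/* call do_rop (rel32 = 7) */
	0xF3, 0x90,			/* spec_trap: pause */
	0x0F, 0xAE, 0xE8,		/* lfence */
	0xEB, 0xF9,			/* jmp spec_trap (rel8 = -7) */
	0x89, 0x14, 0x24,		/* do_rop: mov %edx,(%esp) */
	0xC3,				/* ret */
};

/* Offset reached by the 5-byte "call rel32" that starts at pos. */
static size_t call_target(const uint8_t *code, size_t pos)
{
	assert(code[pos] == 0xE8);

	/* rel32 is stored little-endian in the instruction stream. */
	uint32_t raw = (uint32_t)code[pos + 1] |
		       ((uint32_t)code[pos + 2] << 8) |
		       ((uint32_t)code[pos + 3] << 16) |
		       ((uint32_t)code[pos + 4] << 24);

	return (size_t)((ptrdiff_t)pos + 5 + (int32_t)raw);
}

/* Offset reached by the 2-byte "jmp rel8" that starts at pos. */
static size_t jmp8_target(const uint8_t *code, size_t pos)
{
	assert(code[pos] == 0xEB);
	return (size_t)((ptrdiff_t)pos + 2 + (int8_t)code[pos + 1]);
}
```

In both arrays the call lands on offset 12 (`do_rop`) and the backward jump lands on offset 5 (`spec_trap`), so a speculated return spins in the pause/lfence loop.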
|
||||
|
|
|
@@ -1,6 +1,9 @@
|
|||
#
|
||||
# Arch-specific network modules
|
||||
#
|
||||
OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
|
||||
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
|
||||
ifeq ($(CONFIG_X86_32),y)
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp32.o
|
||||
else
|
||||
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o
|
||||
endif
|
||||
|
|
|
@@ -1,154 +0,0 @@
|
|||
/* bpf_jit.S : BPF JIT helper functions
|
||||
*
|
||||
* Copyright (C) 2011 Eric Dumazet (eric.dumazet@gmail.com)
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU General Public License
|
||||
* as published by the Free Software Foundation; version 2
|
||||
* of the License.
|
||||
*/
|
||||
#include <linux/linkage.h>
|
||||
#include <asm/frame.h>
|
||||
|
||||
/*
|
||||
* Calling convention :
|
||||
* rbx : skb pointer (callee saved)
|
||||
* esi : offset of byte(s) to fetch in skb (can be scratched)
|
||||
* r10 : copy of skb->data
|
||||
* r9d : hlen = skb->len - skb->data_len
|
||||
*/
|
||||
#define SKBDATA %r10
|
||||
#define SKF_MAX_NEG_OFF $(-0x200000) /* SKF_LL_OFF from filter.h */
|
||||
|
||||
#define FUNC(name) \
|
||||
.globl name; \
|
||||
.type name, @function; \
|
||||
name:
|
||||
|
||||
FUNC(sk_load_word)
|
||||
test %esi,%esi
|
||||
js bpf_slow_path_word_neg
|
||||
|
||||
FUNC(sk_load_word_positive_offset)
|
||||
mov %r9d,%eax # hlen
|
||||
sub %esi,%eax # hlen - offset
|
||||
cmp $3,%eax
|
||||
jle bpf_slow_path_word
|
||||
mov (SKBDATA,%rsi),%eax
|
||||
bswap %eax /* ntohl() */
|
||||
ret
|
||||
|
||||
FUNC(sk_load_half)
|
||||
test %esi,%esi
|
||||
js bpf_slow_path_half_neg
|
||||
|
||||
FUNC(sk_load_half_positive_offset)
|
||||
mov %r9d,%eax
|
||||
sub %esi,%eax # hlen - offset
|
||||
cmp $1,%eax
|
||||
jle bpf_slow_path_half
|
||||
movzwl (SKBDATA,%rsi),%eax
|
||||
rol $8,%ax # ntohs()
|
||||
ret
|
||||
|
||||
FUNC(sk_load_byte)
|
||||
test %esi,%esi
|
||||
js bpf_slow_path_byte_neg
|
||||
|
||||
FUNC(sk_load_byte_positive_offset)
|
||||
cmp %esi,%r9d /* if (offset >= hlen) goto bpf_slow_path_byte */
|
||||
jle bpf_slow_path_byte
|
||||
movzbl (SKBDATA,%rsi),%eax
|
||||
ret
|
||||
|
||||
/* rsi contains offset and can be scratched */
|
||||
#define bpf_slow_path_common(LEN) \
|
||||
lea 32(%rbp), %rdx;\
|
||||
FRAME_BEGIN; \
|
||||
mov %rbx, %rdi; /* arg1 == skb */ \
|
||||
push %r9; \
|
||||
push SKBDATA; \
|
||||
/* rsi already has offset */ \
|
||||
mov $LEN,%ecx; /* len */ \
|
||||
call skb_copy_bits; \
|
||||
test %eax,%eax; \
|
||||
pop SKBDATA; \
|
||||
pop %r9; \
|
||||
FRAME_END
|
||||
|
||||
|
||||
bpf_slow_path_word:
|
||||
bpf_slow_path_common(4)
|
||||
js bpf_error
|
||||
mov 32(%rbp),%eax
|
||||
bswap %eax
|
||||
ret
|
||||
|
||||
bpf_slow_path_half:
|
||||
bpf_slow_path_common(2)
|
||||
js bpf_error
|
||||
mov 32(%rbp),%ax
|
||||
rol $8,%ax
|
||||
movzwl %ax,%eax
|
||||
ret
|
||||
|
||||
bpf_slow_path_byte:
|
||||
bpf_slow_path_common(1)
|
||||
js bpf_error
|
||||
movzbl 32(%rbp),%eax
|
||||
ret
|
||||
|
||||
#define sk_negative_common(SIZE) \
|
||||
FRAME_BEGIN; \
|
||||
mov %rbx, %rdi; /* arg1 == skb */ \
|
||||
push %r9; \
|
||||
push SKBDATA; \
|
||||
/* rsi already has offset */ \
|
||||
mov $SIZE,%edx; /* size */ \
|
||||
call bpf_internal_load_pointer_neg_helper; \
|
||||
test %rax,%rax; \
|
||||
pop SKBDATA; \
|
||||
pop %r9; \
|
||||
FRAME_END; \
|
||||
jz bpf_error
|
||||
|
||||
bpf_slow_path_word_neg:
|
||||
cmp SKF_MAX_NEG_OFF, %esi /* test range */
|
||||
jl bpf_error /* offset lower -> error */
|
||||
|
||||
FUNC(sk_load_word_negative_offset)
|
||||
sk_negative_common(4)
|
||||
mov (%rax), %eax
|
||||
bswap %eax
|
||||
ret
|
||||
|
||||
bpf_slow_path_half_neg:
|
||||
cmp SKF_MAX_NEG_OFF, %esi
|
||||
jl bpf_error
|
||||
|
||||
FUNC(sk_load_half_negative_offset)
|
||||
sk_negative_common(2)
|
||||
mov (%rax),%ax
|
||||
rol $8,%ax
|
||||
movzwl %ax,%eax
|
||||
ret
|
||||
|
||||
bpf_slow_path_byte_neg:
|
||||
cmp SKF_MAX_NEG_OFF, %esi
|
||||
jl bpf_error
|
||||
|
||||
FUNC(sk_load_byte_negative_offset)
|
||||
sk_negative_common(1)
|
||||
movzbl (%rax), %eax
|
||||
ret
|
||||
|
||||
bpf_error:
|
||||
# force a return 0 from jit handler
|
||||
xor %eax,%eax
|
||||
mov (%rbp),%rbx
|
||||
mov 8(%rbp),%r13
|
||||
mov 16(%rbp),%r14
|
||||
mov 24(%rbp),%r15
|
||||
add $40, %rbp
|
||||
leaveq
|
||||
ret
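The removed helpers split offsets into three ranges: non-negative offsets used the `*_positive_offset` entry points, negative offsets down to `SKF_LL_OFF` went through `bpf_internal_load_pointer_neg_helper`, and anything lower ended in `bpf_error`. A minimal C sketch of that classification follows; the enum and function names are hypothetical, and only the `-0x200000` constant is taken from the assembly above.

```c
#include <assert.h>
#include <stdint.h>

#define SKF_LL_OFF (-0x200000)	/* value quoted as SKF_MAX_NEG_OFF in the assembly */

/* Hypothetical labels for the three entry points a JIT could pick. */
enum load_path {
	LOAD_POSITIVE,	/* func##_positive_offset */
	LOAD_NEGATIVE,	/* func##_negative_offset */
	LOAD_GENERIC,	/* plain func, which re-checks the offset at run time */
};

/* Same decision the CHOOSE_LOAD_FUNC(K, func) macro made for a constant K. */
static enum load_path choose_load_path(int32_t k)
{
	if (k >= 0)
		return LOAD_POSITIVE;
	if (k >= SKF_LL_OFF)
		return LOAD_NEGATIVE;
	return LOAD_GENERIC;
}

/*
 * Run-time range test from the *_neg slow-path labels: returns 1 when a
 * negative offset may be handed to the negative-offset helper, 0 when the
 * assembly would have jumped to bpf_error.
 */
static int negative_offset_ok(int32_t off)
{
	assert(off < 0);	/* only reached after the sign test */
	return off >= SKF_LL_OFF;
}
```

For indirect loads the offset is not known at JIT time, which is why those cases always used the generic entry point and relied on the run-time test.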
|
|
@@ -17,15 +17,6 @@
|
|||
#include <asm/set_memory.h>
|
||||
#include <asm/nospec-branch.h>
|
||||
|
||||
/*
|
||||
* Assembly code in arch/x86/net/bpf_jit.S
|
||||
*/
|
||||
extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
|
||||
extern u8 sk_load_word_positive_offset[], sk_load_half_positive_offset[];
|
||||
extern u8 sk_load_byte_positive_offset[];
|
||||
extern u8 sk_load_word_negative_offset[], sk_load_half_negative_offset[];
|
||||
extern u8 sk_load_byte_negative_offset[];
|
||||
|
||||
static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
|
||||
{
|
||||
if (len == 1)
|
||||
|
@@ -107,9 +98,6 @@ static int bpf_size_to_x86_bytes(int bpf_size)
|
|||
#define X86_JLE 0x7E
|
||||
#define X86_JG 0x7F
|
||||
|
||||
#define CHOOSE_LOAD_FUNC(K, func) \
|
||||
((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative_offset : func) : func##_positive_offset)
|
||||
|
||||
/* Pick a register outside of BPF range for JIT internal work */
|
||||
#define AUX_REG (MAX_BPF_JIT_REG + 1)
|
||||
|
||||
|
@@ -120,8 +108,8 @@ static int bpf_size_to_x86_bytes(int bpf_size)
|
|||
* register in load/store instructions, it always needs an
|
||||
* extra byte of encoding and is callee saved.
|
||||
*
|
||||
* R9 caches skb->len - skb->data_len
|
||||
* R10 caches skb->data, and used for blinding (if enabled)
|
||||
* Also x86-64 register R9 is unused. x86-64 register R10 is
|
||||
* used for blinding (if enabled).
|
||||
*/
|
||||
static const int reg2hex[] = {
|
||||
[BPF_REG_0] = 0, /* RAX */
|
||||
|
@@ -196,19 +184,15 @@ static void jit_fill_hole(void *area, unsigned int size)
|
|||
|
||||
struct jit_context {
|
||||
int cleanup_addr; /* Epilogue code offset */
|
||||
bool seen_ld_abs;
|
||||
bool seen_ax_reg;
|
||||
};
|
||||
|
||||
/* Maximum number of bytes emitted while JITing one eBPF insn */
|
||||
#define BPF_MAX_INSN_SIZE 128
|
||||
#define BPF_INSN_SAFETY 64
|
||||
|
||||
#define AUX_STACK_SPACE \
|
||||
(32 /* Space for RBX, R13, R14, R15 */ + \
|
||||
8 /* Space for skb_copy_bits() buffer */)
|
||||
#define AUX_STACK_SPACE 40 /* Space for RBX, R13, R14, R15, tailcnt */
|
||||
|
||||
#define PROLOGUE_SIZE 37
|
||||
#define PROLOGUE_SIZE 37
|
||||
|
||||
/*
|
||||
* Emit x86-64 prologue code for BPF program and check its size.
|
||||
|
@@ -232,20 +216,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf)
|
|||
/* sub rbp, AUX_STACK_SPACE */
|
||||
EMIT4(0x48, 0x83, 0xED, AUX_STACK_SPACE);
|
||||
|
||||
/* All classic BPF filters use R6(rbx) save it */
|
||||
|
||||
/* mov qword ptr [rbp+0],rbx */
|
||||
EMIT4(0x48, 0x89, 0x5D, 0);
|
||||
|
||||
/*
|
||||
* bpf_convert_filter() maps classic BPF register X to R7 and uses R8
|
||||
* as temporary, so all tcpdump filters need to spill/fill R7(R13) and
|
||||
* R8(R14). R9(R15) spill could be made conditional, but there is only
|
||||
* one 'bpf_error' return path out of helper functions inside bpf_jit.S
|
||||
* The overhead of extra spill is negligible for any filter other
|
||||
* than synthetic ones. Therefore not worth adding complexity.
|
||||
*/
|
||||
|
||||
/* mov qword ptr [rbp+8],r13 */
|
||||
EMIT4(0x4C, 0x89, 0x6D, 8);
|
||||
/* mov qword ptr [rbp+16],r14 */
|
||||
|
@@ -353,27 +325,6 @@ static void emit_bpf_tail_call(u8 **pprog)
|
|||
*pprog = prog;
|
||||
}
|
||||
|
||||
|
||||
static void emit_load_skb_data_hlen(u8 **pprog)
|
||||
{
|
||||
u8 *prog = *pprog;
|
||||
int cnt = 0;
|
||||
|
||||
/*
|
||||
* r9d = skb->len - skb->data_len (headlen)
|
||||
* r10 = skb->data
|
||||
*/
|
||||
/* mov %r9d, off32(%rdi) */
|
||||
EMIT3_off32(0x44, 0x8b, 0x8f, offsetof(struct sk_buff, len));
|
||||
|
||||
/* sub %r9d, off32(%rdi) */
|
||||
EMIT3_off32(0x44, 0x2b, 0x8f, offsetof(struct sk_buff, data_len));
|
||||
|
||||
/* mov %r10, off32(%rdi) */
|
||||
EMIT3_off32(0x4c, 0x8b, 0x97, offsetof(struct sk_buff, data));
|
||||
*pprog = prog;
|
||||
}
|
||||
|
||||
static void emit_mov_imm32(u8 **pprog, bool sign_propagate,
|
||||
u32 dst_reg, const u32 imm32)
|
||||
{
|
||||
|
@@ -462,8 +413,6 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
|
|||
{
|
||||
struct bpf_insn *insn = bpf_prog->insnsi;
|
||||
int insn_cnt = bpf_prog->len;
|
||||
bool seen_ld_abs = ctx->seen_ld_abs | (oldproglen == 0);
|
||||
bool seen_ax_reg = ctx->seen_ax_reg | (oldproglen == 0);
|
||||
bool seen_exit = false;
|
||||
u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
|
||||
int i, cnt = 0;
|
||||
|
@@ -473,9 +422,6 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
|
|||
emit_prologue(&prog, bpf_prog->aux->stack_depth,
|
||||
bpf_prog_was_classic(bpf_prog));
|
||||
|
||||
if (seen_ld_abs)
|
||||
emit_load_skb_data_hlen(&prog);
|
||||
|
||||
for (i = 0; i < insn_cnt; i++, insn++) {
|
||||
const s32 imm32 = insn->imm;
|
||||
u32 dst_reg = insn->dst_reg;
|
||||
|
@@ -483,13 +429,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
|
|||
u8 b2 = 0, b3 = 0;
|
||||
s64 jmp_offset;
|
||||
u8 jmp_cond;
|
||||
bool reload_skb_data;
|
||||
int ilen;
|
||||
u8 *func;
|
||||
|
||||
if (dst_reg == BPF_REG_AX || src_reg == BPF_REG_AX)
|
||||
ctx->seen_ax_reg = seen_ax_reg = true;
|
||||
|
||||
switch (insn->code) {
|
||||
/* ALU */
|
||||
case BPF_ALU | BPF_ADD | BPF_X:
|
||||
|
@@ -916,36 +858,12 @@ xadd: if (is_imm8(insn->off))
|
|||
case BPF_JMP | BPF_CALL:
|
||||
func = (u8 *) __bpf_call_base + imm32;
|
||||
jmp_offset = func - (image + addrs[i]);
|
||||
if (seen_ld_abs) {
|
||||
reload_skb_data = bpf_helper_changes_pkt_data(func);
|
||||
if (reload_skb_data) {
|
||||
EMIT1(0x57); /* push %rdi */
|
||||
jmp_offset += 22; /* pop, mov, sub, mov */
|
||||
} else {
|
||||
EMIT2(0x41, 0x52); /* push %r10 */
|
||||
EMIT2(0x41, 0x51); /* push %r9 */
|
||||
/*
|
||||
* We need to adjust jmp offset, since
|
||||
* pop %r9, pop %r10 take 4 bytes after call insn
|
||||
*/
|
||||
jmp_offset += 4;
|
||||
}
|
||||
}
|
||||
if (!imm32 || !is_simm32(jmp_offset)) {
|
||||
pr_err("unsupported BPF func %d addr %p image %p\n",
|
||||
imm32, func, image);
|
||||
return -EINVAL;
|
||||
}
|
||||
EMIT1_off32(0xE8, jmp_offset);
|
||||
if (seen_ld_abs) {
|
||||
if (reload_skb_data) {
|
||||
EMIT1(0x5F); /* pop %rdi */
|
||||
emit_load_skb_data_hlen(&prog);
|
||||
} else {
|
||||
EMIT2(0x41, 0x59); /* pop %r9 */
|
||||
EMIT2(0x41, 0x5A); /* pop %r10 */
|
||||
}
|
||||
}
|
||||
break;
|
||||
|
||||
case BPF_JMP | BPF_TAIL_CALL:
|
||||
|
@@ -1080,60 +998,6 @@ emit_jmp:
|
|||
}
|
||||
break;
|
||||
|
||||
case BPF_LD | BPF_IND | BPF_W:
|
||||
func = sk_load_word;
|
||||
goto common_load;
|
||||
case BPF_LD | BPF_ABS | BPF_W:
|
||||
func = CHOOSE_LOAD_FUNC(imm32, sk_load_word);
|
||||
common_load:
|
||||
ctx->seen_ld_abs = seen_ld_abs = true;
|
||||
jmp_offset = func - (image + addrs[i]);
|
||||
if (!func || !is_simm32(jmp_offset)) {
|
||||
pr_err("unsupported BPF func %d addr %p image %p\n",
|
||||
imm32, func, image);
|
||||
return -EINVAL;
|
||||
}
|
||||
if (BPF_MODE(insn->code) == BPF_ABS) {
|
||||
/* mov %esi, imm32 */
|
||||
EMIT1_off32(0xBE, imm32);
|
||||
} else {
|
||||
/* mov %rsi, src_reg */
|
||||
EMIT_mov(BPF_REG_2, src_reg);
|
||||
if (imm32) {
|
||||
if (is_imm8(imm32))
|
||||
/* add %esi, imm8 */
|
||||
EMIT3(0x83, 0xC6, imm32);
|
||||
else
|
||||
/* add %esi, imm32 */
|
||||
EMIT2_off32(0x81, 0xC6, imm32);
|
||||
}
|
||||
}
|
||||
/*
|
||||
* skb pointer is in R6 (%rbx), it will be copied into
|
||||
* %rdi if skb_copy_bits() call is necessary.
|
||||
* sk_load_* helpers also use %r10 and %r9d.
|
||||
* See bpf_jit.S
|
||||
*/
|
||||
if (seen_ax_reg)
|
||||
/* r10 = skb->data, mov %r10, off32(%rbx) */
|
||||
EMIT3_off32(0x4c, 0x8b, 0x93,
|
||||
offsetof(struct sk_buff, data));
|
||||
EMIT1_off32(0xE8, jmp_offset); /* call */
|
||||
break;
|
||||
|
||||
case BPF_LD | BPF_IND | BPF_H:
|
||||
func = sk_load_half;
|
||||
goto common_load;
|
||||
case BPF_LD | BPF_ABS | BPF_H:
|
||||
func = CHOOSE_LOAD_FUNC(imm32, sk_load_half);
|
||||
goto common_load;
|
||||
case BPF_LD | BPF_IND | BPF_B:
|
||||
func = sk_load_byte;
|
||||
goto common_load;
|
||||
case BPF_LD | BPF_ABS | BPF_B:
|
||||
func = CHOOSE_LOAD_FUNC(imm32, sk_load_byte);
|
||||
goto common_load;
|
||||
|
||||
case BPF_JMP | BPF_EXIT:
|
||||
if (seen_exit) {
|
||||
jmp_offset = ctx->cleanup_addr - addrs[i];
|
||||
|
|
File diff suppressed because it is too large
|
@@ -197,6 +197,7 @@ config BT_HCIUART_BCM
|
|||
config BT_HCIUART_QCA
|
||||
bool "Qualcomm Atheros protocol support"
|
||||
depends on BT_HCIUART
|
||||
depends on BT_HCIUART_SERDEV
|
||||
select BT_HCIUART_H4
|
||||
select BT_QCA
|
||||
help
|
||||
|
|
|
@@ -315,10 +315,12 @@ static int btbcm_read_info(struct hci_dev *hdev)
|
|||
return 0;
|
||||
}
|
||||
|
||||
static const struct {
|
||||
struct bcm_subver_table {
|
||||
u16 subver;
|
||||
const char *name;
|
||||
} bcm_uart_subver_table[] = {
|
||||
};
|
||||
|
||||
static const struct bcm_subver_table bcm_uart_subver_table[] = {
|
||||
{ 0x4103, "BCM4330B1" }, /* 002.001.003 */
|
||||
{ 0x410e, "BCM43341B0" }, /* 002.001.014 */
|
||||
{ 0x4406, "BCM4324B3" }, /* 002.004.006 */
|
||||
|
@@ -330,98 +332,7 @@ static const struct {
|
|||
{ }
|
||||
};
|
||||
|
||||
int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len)
|
||||
{
|
||||
u16 subver, rev;
|
||||
const char *hw_name = NULL;
|
||||
struct sk_buff *skb;
|
||||
struct hci_rp_read_local_version *ver;
|
||||
int i, err;
|
||||
|
||||
/* Reset */
|
||||
err = btbcm_reset(hdev);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
/* Read Local Version Info */
|
||||
skb = btbcm_read_local_version(hdev);
|
||||
if (IS_ERR(skb))
|
||||
return PTR_ERR(skb);
|
||||
|
||||
ver = (struct hci_rp_read_local_version *)skb->data;
|
||||
rev = le16_to_cpu(ver->hci_rev);
|
||||
subver = le16_to_cpu(ver->lmp_subver);
|
||||
kfree_skb(skb);
|
||||
|
||||
/* Read controller information */
|
||||
err = btbcm_read_info(hdev);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
switch ((rev & 0xf000) >> 12) {
|
||||
case 0:
|
||||
case 1:
|
||||
case 2:
|
||||
case 3:
|
||||
for (i = 0; bcm_uart_subver_table[i].name; i++) {
|
||||
if (subver == bcm_uart_subver_table[i].subver) {
|
||||
hw_name = bcm_uart_subver_table[i].name;
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
snprintf(fw_name, len, "brcm/%s.hcd", hw_name ? : "BCM");
|
||||
break;
|
||||
default:
|
||||
return 0;
|
||||
}
|
||||
|
||||
bt_dev_info(hdev, "%s (%3.3u.%3.3u.%3.3u) build %4.4u",
|
||||
hw_name ? : "BCM", (subver & 0xe000) >> 13,
|
||||
(subver & 0x1f00) >> 8, (subver & 0x00ff), rev & 0x0fff);
|
||||
|
||||
return 0;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(btbcm_initialize);
|
||||
|
||||
int btbcm_finalize(struct hci_dev *hdev)
|
||||
{
|
||||
struct sk_buff *skb;
|
||||
struct hci_rp_read_local_version *ver;
|
||||
u16 subver, rev;
|
||||
int err;
|
||||
|
||||
/* Reset */
|
||||
err = btbcm_reset(hdev);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
/* Read Local Version Info */
|
||||
skb = btbcm_read_local_version(hdev);
|
||||
if (IS_ERR(skb))
|
||||
return PTR_ERR(skb);
|
||||
|
||||
ver = (struct hci_rp_read_local_version *)skb->data;
|
||||
rev = le16_to_cpu(ver->hci_rev);
|
||||
subver = le16_to_cpu(ver->lmp_subver);
|
||||
kfree_skb(skb);
|
||||
|
||||
bt_dev_info(hdev, "BCM (%3.3u.%3.3u.%3.3u) build %4.4u",
|
||||
(subver & 0xe000) >> 13, (subver & 0x1f00) >> 8,
|
||||
(subver & 0x00ff), rev & 0x0fff);
|
||||
|
||||
btbcm_check_bdaddr(hdev);
|
||||
|
||||
set_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks);
|
||||
|
||||
return 0;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(btbcm_finalize);
|
||||
|
||||
static const struct {
|
||||
u16 subver;
|
||||
const char *name;
|
||||
} bcm_usb_subver_table[] = {
|
||||
static const struct bcm_subver_table bcm_usb_subver_table[] = {
|
||||
{ 0x210b, "BCM43142A0" }, /* 001.001.011 */
|
||||
{ 0x2112, "BCM4314A0" }, /* 001.001.018 */
|
||||
{ 0x2118, "BCM20702A0" }, /* 001.001.024 */
|
||||
|
@@ -435,14 +346,14 @@ static const struct {
|
|||
{ }
|
||||
};
|
||||
|
||||
int btbcm_setup_patchram(struct hci_dev *hdev)
|
||||
int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len,
|
||||
bool reinit)
|
||||
{
|
||||
char fw_name[64];
|
||||
const struct firmware *fw;
|
||||
u16 subver, rev, pid, vid;
|
||||
const char *hw_name = NULL;
|
||||
const char *hw_name = "BCM";
|
||||
struct sk_buff *skb;
|
||||
struct hci_rp_read_local_version *ver;
|
||||
const struct bcm_subver_table *bcm_subver_table;
|
||||
int i, err;
|
||||
|
||||
/* Reset */
|
||||
|
@@ -461,25 +372,27 @@ int btbcm_setup_patchram(struct hci_dev *hdev)
|
|||
kfree_skb(skb);
|
||||
|
||||
/* Read controller information */
|
||||
err = btbcm_read_info(hdev);
|
||||
if (err)
|
||||
return err;
|
||||
if (!reinit) {
|
||||
err = btbcm_read_info(hdev);
|
||||
if (err)
|
||||
return err;
|
||||
}
|
||||
|
||||
switch ((rev & 0xf000) >> 12) {
|
||||
case 0:
|
||||
case 3:
|
||||
for (i = 0; bcm_uart_subver_table[i].name; i++) {
|
||||
if (subver == bcm_uart_subver_table[i].subver) {
|
||||
hw_name = bcm_uart_subver_table[i].name;
|
||||
break;
|
||||
}
|
||||
/* Upper nibble of rev should be between 0 and 3? */
|
||||
if (((rev & 0xf000) >> 12) > 3)
|
||||
return 0;
|
||||
|
||||
bcm_subver_table = (hdev->bus == HCI_USB) ? bcm_usb_subver_table :
|
||||
bcm_uart_subver_table;
|
||||
|
||||
for (i = 0; bcm_subver_table[i].name; i++) {
|
||||
if (subver == bcm_subver_table[i].subver) {
|
||||
hw_name = bcm_subver_table[i].name;
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
snprintf(fw_name, sizeof(fw_name), "brcm/%s.hcd",
|
||||
hw_name ? : "BCM");
|
||||
break;
|
||||
case 1:
|
||||
case 2:
|
||||
if (hdev->bus == HCI_USB) {
|
||||
/* Read USB Product Info */
|
||||
skb = btbcm_read_usb_product(hdev);
|
||||
if (IS_ERR(skb))
|
||||
|
@@ -489,24 +402,50 @@ int btbcm_setup_patchram(struct hci_dev *hdev)
 		pid = get_unaligned_le16(skb->data + 3);
 		kfree_skb(skb);

-		for (i = 0; bcm_usb_subver_table[i].name; i++) {
-			if (subver == bcm_usb_subver_table[i].subver) {
-				hw_name = bcm_usb_subver_table[i].name;
-				break;
-			}
-		}
-
-		snprintf(fw_name, sizeof(fw_name), "brcm/%s-%4.4x-%4.4x.hcd",
-			 hw_name ? : "BCM", vid, pid);
-		break;
-	default:
-		return 0;
+		snprintf(fw_name, len, "brcm/%s-%4.4x-%4.4x.hcd",
+			 hw_name, vid, pid);
+	} else {
+		snprintf(fw_name, len, "brcm/%s.hcd", hw_name);
 	}

 	bt_dev_info(hdev, "%s (%3.3u.%3.3u.%3.3u) build %4.4u",
-		    hw_name ? : "BCM", (subver & 0xe000) >> 13,
+		    hw_name, (subver & 0xe000) >> 13,
 		    (subver & 0x1f00) >> 8, (subver & 0x00ff), rev & 0x0fff);

+	return 0;
+}
+EXPORT_SYMBOL_GPL(btbcm_initialize);
+
+int btbcm_finalize(struct hci_dev *hdev)
+{
+	char fw_name[64];
+	int err;
+
+	/* Re-initialize */
+	err = btbcm_initialize(hdev, fw_name, sizeof(fw_name), true);
+	if (err)
+		return err;
+
+	btbcm_check_bdaddr(hdev);
+
+	set_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(btbcm_finalize);
+
+int btbcm_setup_patchram(struct hci_dev *hdev)
+{
+	char fw_name[64];
+	const struct firmware *fw;
+	struct sk_buff *skb;
+	int err;
+
+	/* Initialize */
+	err = btbcm_initialize(hdev, fw_name, sizeof(fw_name), false);
+	if (err)
+		return err;
+
 	err = request_firmware(&fw, fw_name, &hdev->dev);
 	if (err < 0) {
 		bt_dev_info(hdev, "BCM: Patch %s not found", fw_name);
@@ -517,25 +456,11 @@ int btbcm_setup_patchram(struct hci_dev *hdev)

 	release_firmware(fw);

-	/* Reset */
-	err = btbcm_reset(hdev);
+	/* Re-initialize */
+	err = btbcm_initialize(hdev, fw_name, sizeof(fw_name), true);
 	if (err)
 		return err;

-	/* Read Local Version Info */
-	skb = btbcm_read_local_version(hdev);
-	if (IS_ERR(skb))
-		return PTR_ERR(skb);
-
-	ver = (struct hci_rp_read_local_version *)skb->data;
-	rev = le16_to_cpu(ver->hci_rev);
-	subver = le16_to_cpu(ver->lmp_subver);
-	kfree_skb(skb);
-
-	bt_dev_info(hdev, "%s (%3.3u.%3.3u.%3.3u) build %4.4u",
-		    hw_name ? : "BCM", (subver & 0xe000) >> 13,
-		    (subver & 0x1f00) >> 8, (subver & 0x00ff), rev & 0x0fff);
-
 	/* Read Local Name */
 	skb = btbcm_read_local_name(hdev);
 	if (IS_ERR(skb))
|
|
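The version string logged by `bt_dev_info()` in the hunks above is built from bit fields of `lmp_subver` and `hci_rev`. The following is a minimal userspace sketch of that decoding, not driver code: the names `bcm_version`, `bcm_decode_version`, and `bcm_format_version` are hypothetical and exist only for illustration, while the masks and shifts are copied from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the decoded fields; not a kernel type. */
struct bcm_version {
	unsigned int major;	/* (subver & 0xe000) >> 13 */
	unsigned int minor;	/* (subver & 0x1f00) >> 8 */
	unsigned int patch;	/* subver & 0x00ff */
	unsigned int build;	/* rev & 0x0fff */
};

/* Split lmp_subver and hci_rev with the same masks as the bt_dev_info() call. */
static void bcm_decode_version(uint16_t subver, uint16_t rev,
			       struct bcm_version *out)
{
	assert(out != NULL);

	out->major = (subver & 0xe000) >> 13;
	out->minor = (subver & 0x1f00) >> 8;
	out->patch = subver & 0x00ff;
	out->build = rev & 0x0fff;
}

/* Format the three subver fields like the "%3.3u.%3.3u.%3.3u" log field. */
static int bcm_format_version(const struct bcm_version *v, char *buf,
			      size_t len)
{
	assert(v != NULL && buf != NULL);

	return snprintf(buf, len, "%3.3u.%3.3u.%3.3u",
			v->major, v->minor, v->patch);
}
```

Decoding the table entry `0x210b` this way yields the three fields 1, 1, and 11, which agrees with the `/* 001.001.011 */` comment next to that entry in `bcm_usb_subver_table`.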
|
@@ -73,7 +73,8 @@ int btbcm_patchram(struct hci_dev *hdev, const struct firmware *fw);
 int btbcm_setup_patchram(struct hci_dev *hdev);
 int btbcm_setup_apple(struct hci_dev *hdev);

-int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len);
+int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len,
+		     bool reinit);
 int btbcm_finalize(struct hci_dev *hdev);

 #else
@@ -104,7 +105,7 @@ static inline int btbcm_setup_apple(struct hci_dev *hdev)
 }

 static inline int btbcm_initialize(struct hci_dev *hdev, char *fw_name,
-				   size_t len)
+				   size_t len, bool reinit)
 {
 	return 0;
 }
|
|
|
@@ -35,15 +35,9 @@ static ssize_t btmrvl_hscfgcmd_write(struct file *file,
 			const char __user *ubuf, size_t count, loff_t *ppos)
 {
 	struct btmrvl_private *priv = file->private_data;
-	char buf[16];
 	long result, ret;

-	memset(buf, 0, sizeof(buf));
-
-	if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
-		return -EFAULT;
-
-	ret = kstrtol(buf, 10, &result);
+	ret = kstrtol_from_user(ubuf, count, 10, &result);
 	if (ret)
 		return ret;

@@ -81,15 +75,9 @@ static ssize_t btmrvl_pscmd_write(struct file *file, const char __user *ubuf,
 						size_t count, loff_t *ppos)
 {
 	struct btmrvl_private *priv = file->private_data;
-	char buf[16];
 	long result, ret;

-	memset(buf, 0, sizeof(buf));
-
-	if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
-		return -EFAULT;
-
-	ret = kstrtol(buf, 10, &result);
+	ret = kstrtol_from_user(ubuf, count, 10, &result);
 	if (ret)
 		return ret;

@@ -127,15 +115,9 @@ static ssize_t btmrvl_hscmd_write(struct file *file, const char __user *ubuf,
 						size_t count, loff_t *ppos)
 {
 	struct btmrvl_private *priv = file->private_data;
-	char buf[16];
 	long result, ret;

-	memset(buf, 0, sizeof(buf));
-
-	if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
-		return -EFAULT;
-
-	ret = kstrtol(buf, 10, &result);
+	ret = kstrtol_from_user(ubuf, count, 10, &result);
 	if (ret)
 		return ret;

@@ -167,35 +149,6 @@ static const struct file_operations btmrvl_hscmd_fops = {
 	.llseek = default_llseek,
 };

-static ssize_t btmrvl_fwdump_write(struct file *file, const char __user *ubuf,
-						size_t count, loff_t *ppos)
-{
-	struct btmrvl_private *priv = file->private_data;
-	char buf[16];
-	bool result;
-
-	memset(buf, 0, sizeof(buf));
-
-	if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
-		return -EFAULT;
-
-	if (strtobool(buf, &result))
-		return -EINVAL;
-
-	if (!result)
-		return -EINVAL;
-
-	btmrvl_firmware_dump(priv);
-
-	return count;
-}
-
-static const struct file_operations btmrvl_fwdump_fops = {
-	.write = btmrvl_fwdump_write,
-	.open = simple_open,
-	.llseek = default_llseek,
-};
-
 void btmrvl_debugfs_init(struct hci_dev *hdev)
 {
 	struct btmrvl_private *priv = hci_get_drvdata(hdev);
@@ -226,8 +179,6 @@ void btmrvl_debugfs_init(struct hci_dev *hdev)
 			    priv, &btmrvl_hscmd_fops);
 	debugfs_create_file("hscfgcmd", 0644, dbg->config_dir,
 			    priv, &btmrvl_hscfgcmd_fops);
-	debugfs_create_file("fw_dump", 0200, dbg->config_dir,
-			    priv, &btmrvl_fwdump_fops);

 	dbg->status_dir = debugfs_create_dir("status", hdev->debugfs);
 	debugfs_create_u8("curpsmode", 0444, dbg->status_dir,
|
|
|
@@ -110,7 +110,6 @@ struct btmrvl_private {
 				u8 *payload, u16 nb);
 	int (*hw_wakeup_firmware)(struct btmrvl_private *priv);
 	int (*hw_process_int_status)(struct btmrvl_private *priv);
-	void (*firmware_dump)(struct btmrvl_private *priv);
 	spinlock_t driver_lock; /* spinlock used by driver */
 #ifdef CONFIG_DEBUG_FS
 	void *debugfs_data;
@@ -183,7 +182,6 @@ int btmrvl_send_hscfg_cmd(struct btmrvl_private *priv);
 int btmrvl_enable_ps(struct btmrvl_private *priv);
 int btmrvl_prepare_command(struct btmrvl_private *priv);
 int btmrvl_enable_hs(struct btmrvl_private *priv);
-void btmrvl_firmware_dump(struct btmrvl_private *priv);

 #ifdef CONFIG_DEBUG_FS
 void btmrvl_debugfs_init(struct hci_dev *hdev);
|
|
|
@@ -358,12 +358,6 @@ int btmrvl_prepare_command(struct btmrvl_private *priv)
 	return ret;
 }

-void btmrvl_firmware_dump(struct btmrvl_private *priv)
-{
-	if (priv->firmware_dump)
-		priv->firmware_dump(priv);
-}
-
 static int btmrvl_tx_pkt(struct btmrvl_private *priv, struct sk_buff *skb)
 {
 	int ret = 0;
|
|
|
@@ -1311,9 +1311,11 @@ rdwr_status btmrvl_sdio_rdwr_firmware(struct btmrvl_private *priv,
 }

 /* This function dump sdio register and memory data */
-static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
+static void btmrvl_sdio_coredump(struct device *dev)
 {
-	struct btmrvl_sdio_card *card = priv->btmrvl_dev.card;
+	struct sdio_func *func = dev_to_sdio_func(dev);
+	struct btmrvl_sdio_card *card;
+	struct btmrvl_private *priv;
 	int ret = 0;
 	unsigned int reg, reg_start, reg_end;
 	enum rdwr_status stat;
@@ -1321,6 +1323,9 @@ static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
 	u8 dump_num = 0, idx, i, read_reg, doneflag = 0;
 	u32 memory_size, fw_dump_len = 0;

+	card = sdio_get_drvdata(func);
+	priv = card->priv;
+
 	/* dump sdio register first */
 	btmrvl_sdio_dump_regs(priv);

@@ -1547,7 +1552,6 @@ static int btmrvl_sdio_probe(struct sdio_func *func,
 	priv->hw_host_to_card = btmrvl_sdio_host_to_card;
 	priv->hw_wakeup_firmware = btmrvl_sdio_wakeup_fw;
 	priv->hw_process_int_status = btmrvl_sdio_process_int_status;
-	priv->firmware_dump = btmrvl_sdio_dump_firmware;

 	if (btmrvl_register_hdev(priv)) {
 		BT_ERR("Register hdev failed!");
@@ -1717,6 +1721,7 @@ static struct sdio_driver bt_mrvl_sdio = {
 	.remove = btmrvl_sdio_remove,
 	.drv = {
 		.owner = THIS_MODULE,
+		.coredump = btmrvl_sdio_coredump,
 		.pm = &btmrvl_sdio_pm_ops,
 	}
 };
|
|
|
@@ -127,28 +127,41 @@ static void rome_tlv_check_data(struct rome_config *config,
 	BT_DBG("TLV Type\t\t : 0x%x", type_len & 0x000000ff);
 	BT_DBG("Length\t\t : %d bytes", length);

+	config->dnld_mode = ROME_SKIP_EVT_NONE;
+
 	switch (config->type) {
 	case TLV_TYPE_PATCH:
 		tlv_patch = (struct tlv_type_patch *)tlv->data;
-		BT_DBG("Total Length\t\t : %d bytes",
+
+		/* For Rome version 1.1 to 3.1, all segment commands
+		 * are acked by a vendor specific event (VSE).
+		 * For Rome >= 3.2, the download mode field indicates
+		 * if VSE is skipped by the controller.
+		 * In case VSE is skipped, only the last segment is acked.
+		 */
+		config->dnld_mode = tlv_patch->download_mode;
+
+		BT_DBG("Total Length : %d bytes",
 		       le32_to_cpu(tlv_patch->total_size));
-		BT_DBG("Patch Data Length\t : %d bytes",
+		BT_DBG("Patch Data Length : %d bytes",
 		       le32_to_cpu(tlv_patch->data_length));
 		BT_DBG("Signing Format Version : 0x%x",
 		       tlv_patch->format_version);
-		BT_DBG("Signature Algorithm\t : 0x%x",
+		BT_DBG("Signature Algorithm : 0x%x",
 		       tlv_patch->signature);
-		BT_DBG("Reserved\t\t : 0x%x",
-		       le16_to_cpu(tlv_patch->reserved1));
-		BT_DBG("Product ID\t\t : 0x%04x",
+		BT_DBG("Download mode : 0x%x",
+		       tlv_patch->download_mode);
+		BT_DBG("Reserved : 0x%x",
+		       tlv_patch->reserved1);
+		BT_DBG("Product ID : 0x%04x",
 		       le16_to_cpu(tlv_patch->product_id));
-		BT_DBG("Rom Build Version\t : 0x%04x",
+		BT_DBG("Rom Build Version : 0x%04x",
 		       le16_to_cpu(tlv_patch->rom_build));
-		BT_DBG("Patch Version\t\t : 0x%04x",
+		BT_DBG("Patch Version : 0x%04x",
 		       le16_to_cpu(tlv_patch->patch_version));
-		BT_DBG("Reserved\t\t : 0x%x",
+		BT_DBG("Reserved : 0x%x",
 		       le16_to_cpu(tlv_patch->reserved2));
-		BT_DBG("Patch Entry Address\t : 0x%x",
+		BT_DBG("Patch Entry Address : 0x%x",
 		       le32_to_cpu(tlv_patch->entry));
 		break;

@@ -194,8 +207,8 @@ static void rome_tlv_check_data(struct rome_config *config,
 	}
 }

-static int rome_tlv_send_segment(struct hci_dev *hdev, int idx, int seg_size,
-				 const u8 *data)
+static int rome_tlv_send_segment(struct hci_dev *hdev, int seg_size,
+				 const u8 *data, enum rome_tlv_dnld_mode mode)
 {
 	struct sk_buff *skb;
 	struct edl_event_hdr *edl;
@@ -203,12 +216,14 @@ static int rome_tlv_send_segment(struct hci_dev *hdev, int idx, int seg_size,
 	u8 cmd[MAX_SIZE_PER_TLV_SEGMENT + 2];
 	int err = 0;

-	BT_DBG("%s: Download segment #%d size %d", hdev->name, idx, seg_size);
-
 	cmd[0] = EDL_PATCH_TLV_REQ_CMD;
 	cmd[1] = seg_size;
 	memcpy(cmd + 2, data, seg_size);

+	if (mode == ROME_SKIP_EVT_VSE_CC || mode == ROME_SKIP_EVT_VSE)
+		return __hci_cmd_send(hdev, EDL_PATCH_CMD_OPCODE, seg_size + 2,
+				      cmd);
+
 	skb = __hci_cmd_sync_ev(hdev, EDL_PATCH_CMD_OPCODE, seg_size + 2, cmd,
 				HCI_VENDOR_PKT, HCI_INIT_TIMEOUT);
 	if (IS_ERR(skb)) {
@@ -245,47 +260,12 @@ out:
 	return err;
 }

-static int rome_tlv_download_request(struct hci_dev *hdev,
-				     const struct firmware *fw)
-{
-	const u8 *buffer, *data;
-	int total_segment, remain_size;
-	int ret, i;
-
-	if (!fw || !fw->data)
-		return -EINVAL;
-
-	total_segment = fw->size / MAX_SIZE_PER_TLV_SEGMENT;
-	remain_size = fw->size % MAX_SIZE_PER_TLV_SEGMENT;
-
-	BT_DBG("%s: Total segment num %d remain size %d total size %zu",
-	       hdev->name, total_segment, remain_size, fw->size);
-
-	data = fw->data;
-	for (i = 0; i < total_segment; i++) {
-		buffer = data + i * MAX_SIZE_PER_TLV_SEGMENT;
-		ret = rome_tlv_send_segment(hdev, i, MAX_SIZE_PER_TLV_SEGMENT,
-					    buffer);
-		if (ret < 0)
-			return -EIO;
-	}
-
-	if (remain_size) {
-		buffer = data + total_segment * MAX_SIZE_PER_TLV_SEGMENT;
-		ret = rome_tlv_send_segment(hdev, total_segment, remain_size,
-					    buffer);
-		if (ret < 0)
-			return -EIO;
-	}
-
-	return 0;
-}
-
 static int rome_download_firmware(struct hci_dev *hdev,
 				  struct rome_config *config)
 {
 	const struct firmware *fw;
-	int ret;
+	const u8 *segment;
+	int ret, remain, i = 0;

 	bt_dev_info(hdev, "ROME Downloading %s", config->fwname);

@@ -298,10 +278,24 @@ static int rome_download_firmware(struct hci_dev *hdev,

 	rome_tlv_check_data(config, fw);

-	ret = rome_tlv_download_request(hdev, fw);
-	if (ret) {
-		BT_ERR("%s: Failed to download file: %s (%d)", hdev->name,
-		       config->fwname, ret);
+	segment = fw->data;
+	remain = fw->size;
+	while (remain > 0) {
+		int segsize = min(MAX_SIZE_PER_TLV_SEGMENT, remain);
+
+		bt_dev_dbg(hdev, "Send segment %d, size %d", i++, segsize);
+
+		remain -= segsize;
+		/* The last segment is always acked regardless download mode */
+		if (!remain || segsize < MAX_SIZE_PER_TLV_SEGMENT)
+			config->dnld_mode = ROME_SKIP_EVT_NONE;
+
+		ret = rome_tlv_send_segment(hdev, segsize, segment,
+					    config->dnld_mode);
+		if (ret)
+			break;
+
+		segment += segsize;
 	}

 	release_firmware(fw);
|
|
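The rewritten download loop above walks the firmware image in fixed-size chunks and forces the acknowledged path for the final chunk. The userspace sketch below models only that bookkeeping, with no HCI traffic. The names `dnld_mode`, `segment_plan`, and `plan_segments` are hypothetical, and the chunk size is a parameter because `MAX_SIZE_PER_TLV_SEGMENT` is not defined in the hunks shown.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the order of enum rome_tlv_dnld_mode from the diff. */
enum dnld_mode {
	SKIP_EVT_NONE,
	SKIP_EVT_VSE,
	SKIP_EVT_CC,
	SKIP_EVT_VSE_CC
};

/* Hypothetical summary of one download; not a kernel type. */
struct segment_plan {
	size_t segments;	/* total commands sent */
	size_t acked;		/* commands sent through the waiting path */
};

/*
 * Walk an image of `size` bytes in chunks of at most `max_seg` bytes,
 * following the structure of the loop in rome_download_firmware().
 */
static struct segment_plan plan_segments(size_t size, size_t max_seg,
					 enum dnld_mode mode)
{
	struct segment_plan plan = { 0, 0 };
	size_t remain = size;

	assert(max_seg > 0);

	while (remain > 0) {
		size_t segsize = remain < max_seg ? remain : max_seg;

		remain -= segsize;

		/* The last segment is always acked, whatever the mode. */
		if (!remain || segsize < max_seg)
			mode = SKIP_EVT_NONE;

		plan.segments++;

		/* Same test as rome_tlv_send_segment(): only these two
		 * modes take the fire-and-forget path.
		 */
		if (mode != SKIP_EVT_VSE_CC && mode != SKIP_EVT_VSE)
			plan.acked++;
	}

	return plan;
}
```

With an example chunk size of 243 bytes (an assumption for illustration), a 500-byte image is sent as three commands; in either VSE-skipping mode only the last one waits for an event, while in the other modes all three do.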
|
@@ -61,6 +61,13 @@ enum qca_bardrate {
 	QCA_BAUDRATE_RESERVED
 };

+enum rome_tlv_dnld_mode {
+	ROME_SKIP_EVT_NONE,
+	ROME_SKIP_EVT_VSE,
+	ROME_SKIP_EVT_CC,
+	ROME_SKIP_EVT_VSE_CC
+};
+
 enum rome_tlv_type {
 	TLV_TYPE_PATCH = 1,
 	TLV_TYPE_NVM
@@ -70,6 +77,7 @@ struct rome_config {
 	u8 type;
 	char fwname[64];
 	uint8_t user_baud_rate;
+	enum rome_tlv_dnld_mode dnld_mode;
 };

 struct edl_event_hdr {
@@ -94,7 +102,8 @@ struct tlv_type_patch {
 	__le32 data_length;
 	__u8 format_version;
 	__u8 signature;
-	__le16 reserved1;
+	__u8 download_mode;
+	__u8 reserved1;
 	__le16 product_id;
 	__le16 rom_build;
 	__le16 patch_version;
|
|
|
@@ -65,6 +65,7 @@ static int btqcomsmd_cmd_callback(struct rpmsg_device *rpdev, void *data,
 {
 	struct btqcomsmd *btq = priv;

+	btq->hdev->stat.byte_rx += count;
 	return btqcomsmd_recv(btq->hdev, HCI_EVENT_PKT, data, count);
 }

@@ -76,12 +77,21 @@ static int btqcomsmd_send(struct hci_dev *hdev, struct sk_buff *skb)
 	switch (hci_skb_pkt_type(skb)) {
 	case HCI_ACLDATA_PKT:
 		ret = rpmsg_send(btq->acl_channel, skb->data, skb->len);
+		if (ret) {
+			hdev->stat.err_tx++;
+			break;
+		}
 		hdev->stat.acl_tx++;
 		hdev->stat.byte_tx += skb->len;
 		break;
 	case HCI_COMMAND_PKT:
 		ret = rpmsg_send(btq->cmd_channel, skb->data, skb->len);
+		if (ret) {
+			hdev->stat.err_tx++;
+			break;
+		}
 		hdev->stat.cmd_tx++;
+		hdev->stat.byte_tx += skb->len;
 		break;
 	default:
 		ret = -EILSEQ;
|
|
|
@@ -276,6 +276,8 @@ static const struct usb_device_id blacklist_table[] = {
 	{ USB_DEVICE(0x04ca, 0x3011), .driver_info = BTUSB_QCA_ROME },
 	{ USB_DEVICE(0x04ca, 0x3015), .driver_info = BTUSB_QCA_ROME },
 	{ USB_DEVICE(0x04ca, 0x3016), .driver_info = BTUSB_QCA_ROME },
+	{ USB_DEVICE(0x04ca, 0x301a), .driver_info = BTUSB_QCA_ROME },
+	{ USB_DEVICE(0x13d3, 0x3496), .driver_info = BTUSB_QCA_ROME },

 	/* Broadcom BCM2035 */
 	{ USB_DEVICE(0x0a5c, 0x2009), .driver_info = BTUSB_BCM92035 },
@@ -371,6 +373,9 @@ static const struct usb_device_id blacklist_table[] = {
 	/* Additional Realtek 8723BU Bluetooth devices */
 	{ USB_DEVICE(0x7392, 0xa611), .driver_info = BTUSB_REALTEK },

+	/* Additional Realtek 8723DE Bluetooth devices */
+	{ USB_DEVICE(0x2ff8, 0xb011), .driver_info = BTUSB_REALTEK },
+
 	/* Additional Realtek 8821AE Bluetooth devices */
 	{ USB_DEVICE(0x0b05, 0x17dc), .driver_info = BTUSB_REALTEK },
 	{ USB_DEVICE(0x13d3, 0x3414), .driver_info = BTUSB_REALTEK },
@@ -379,6 +384,7 @@ static const struct usb_device_id blacklist_table[] = {
 	{ USB_DEVICE(0x13d3, 0x3462), .driver_info = BTUSB_REALTEK },

 	/* Additional Realtek 8822BE Bluetooth devices */
+	{ USB_DEVICE(0x13d3, 0x3526), .driver_info = BTUSB_REALTEK },
 	{ USB_DEVICE(0x0b05, 0x185c), .driver_info = BTUSB_REALTEK },

 	/* Silicon Wave based devices */
@@ -406,6 +412,13 @@ static const struct dmi_system_id btusb_needs_reset_resume_table[] = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "XPS 13 9360"),
 		},
 	},
+	{
+		/* Dell Inspiron 5565 (QCA ROME device 0cf3:e009) */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron 5565"),
+		},
+	},
 	{}
 };

@@ -2497,11 +2510,9 @@ static const struct qca_device_info qca_devices_table[] = {
 	{ 0x00000302, 28, 4, 18 }, /* Rome 3.2 */
 };

-static int btusb_qca_send_vendor_req(struct hci_dev *hdev, u8 request,
+static int btusb_qca_send_vendor_req(struct usb_device *udev, u8 request,
 				     void *data, u16 size)
 {
-	struct btusb_data *btdata = hci_get_drvdata(hdev);
-	struct usb_device *udev = btdata->udev;
 	int pipe, err;
 	u8 *buf;

@@ -2516,7 +2527,7 @@ static int btusb_qca_send_vendor_req(struct hci_dev *hdev, u8 request,
 	err = usb_control_msg(udev, pipe, request, USB_TYPE_VENDOR | USB_DIR_IN,
 			      0, 0, buf, size, USB_CTRL_SET_TIMEOUT);
 	if (err < 0) {
-		bt_dev_err(hdev, "Failed to access otp area (%d)", err);
+		dev_err(&udev->dev, "Failed to access otp area (%d)", err);
 		goto done;
 	}

@@ -2666,20 +2677,38 @@ static int btusb_setup_qca_load_nvm(struct hci_dev *hdev,
 	return err;
 }

+/* identify the ROM version and check whether patches are needed */
+static bool btusb_qca_need_patch(struct usb_device *udev)
+{
+	struct qca_version ver;
+
+	if (btusb_qca_send_vendor_req(udev, QCA_GET_TARGET_VERSION, &ver,
+				      sizeof(ver)) < 0)
+		return false;
+	/* only low ROM versions need patches */
+	return !(le32_to_cpu(ver.rom_version) & ~0xffffU);
+}
+
 static int btusb_setup_qca(struct hci_dev *hdev)
 {
+	struct btusb_data *btdata = hci_get_drvdata(hdev);
+	struct usb_device *udev = btdata->udev;
 	const struct qca_device_info *info = NULL;
 	struct qca_version ver;
 	u32 ver_rom;
 	u8 status;
 	int i, err;

-	err = btusb_qca_send_vendor_req(hdev, QCA_GET_TARGET_VERSION, &ver,
+	err = btusb_qca_send_vendor_req(udev, QCA_GET_TARGET_VERSION, &ver,
 					sizeof(ver));
 	if (err < 0)
 		return err;

 	ver_rom = le32_to_cpu(ver.rom_version);
+	/* Don't care about high ROM versions */
+	if (ver_rom & ~0xffffU)
+		return 0;
+
 	for (i = 0; i < ARRAY_SIZE(qca_devices_table); i++) {
 		if (ver_rom == qca_devices_table[i].rom_version)
 			info = &qca_devices_table[i];
@@ -2689,7 +2718,7 @@ static int btusb_setup_qca(struct hci_dev *hdev)
 		return -ENODEV;
 	}

-	err = btusb_qca_send_vendor_req(hdev, QCA_CHECK_STATUS, &status,
+	err = btusb_qca_send_vendor_req(udev, QCA_CHECK_STATUS, &status,
 					sizeof(status));
 	if (err < 0)
 		return err;
@@ -2903,7 +2932,8 @@ static int btusb_probe(struct usb_interface *intf,
 		/* Old firmware would otherwise let ath3k driver load
 		 * patch and sysconfig files
 		 */
-		if (le16_to_cpu(udev->descriptor.bcdDevice) <= 0x0001)
+		if (le16_to_cpu(udev->descriptor.bcdDevice) <= 0x0001 &&
+		    !btusb_qca_need_patch(udev))
 			return -ENODEV;
 	}

@@ -3065,6 +3095,7 @@ static int btusb_probe(struct usb_interface *intf,
 	}

 	if (id->driver_info & BTUSB_ATH3012) {
+		data->setup_on_usb = btusb_setup_qca;
 		hdev->set_bdaddr = btusb_set_bdaddr_ath3012;
 		set_bit(HCI_QUIRK_SIMULTANEOUS_DISCOVERY, &hdev->quirks);
 		set_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks);
|
|
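The new `btusb_qca_need_patch()` helper above classifies a controller by masking its ROM version: a version with no bits set above the low 16 counts as "low" and therefore needs patches. A small userspace sketch of just that mask test follows; `rom_info` and `rom_needs_patch` are hypothetical names, and the value is assumed to be in CPU byte order already (the driver converts it with `le32_to_cpu()`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the version reply, already in CPU byte order. */
struct rom_info {
	uint32_t rom_version;
};

/* True when no bit above the low 16 is set, i.e. a "low" ROM version. */
static bool rom_needs_patch(const struct rom_info *info)
{
	assert(info != NULL);

	return !(info->rom_version & ~0xffffU);
}
```

Under this test the Rome 3.2 entry `0x00000302` from `qca_devices_table` counts as a low version, while any value at or above `0x00010000` does not.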
|
@@ -380,10 +380,6 @@ static int bcm_open(struct hci_uart *hu)
 	mutex_lock(&bcm_device_lock);

 	if (hu->serdev) {
-		err = serdev_device_open(hu->serdev);
-		if (err)
-			goto err_free;
-
 		bcm->dev = serdev_device_get_drvdata(hu->serdev);
 		goto out;
 	}
@@ -420,13 +416,10 @@ out:
 	return 0;

 err_unset_hu:
-	if (hu->serdev)
-		serdev_device_close(hu->serdev);
 #ifdef CONFIG_PM
-	else
+	if (!hu->serdev)
 		bcm->dev->hu = NULL;
 #endif
-err_free:
 	mutex_unlock(&bcm_device_lock);
 	hu->priv = NULL;
 	kfree(bcm);
@@ -445,7 +438,6 @@ static int bcm_close(struct hci_uart *hu)
 	mutex_lock(&bcm_device_lock);

 	if (hu->serdev) {
-		serdev_device_close(hu->serdev);
 		bdev = serdev_device_get_drvdata(hu->serdev);
 	} else if (bcm_device_exists(bcm->dev)) {
 		bdev = bcm->dev;
@@ -501,7 +493,7 @@ static int bcm_setup(struct hci_uart *hu)
 	hu->hdev->set_diag = bcm_set_diag;
 	hu->hdev->set_bdaddr = btbcm_set_bdaddr;

-	err = btbcm_initialize(hu->hdev, fw_name, sizeof(fw_name));
+	err = btbcm_initialize(hu->hdev, fw_name, sizeof(fw_name), false);
 	if (err)
 		return err;

@@ -794,19 +786,21 @@ static const struct acpi_gpio_mapping acpi_bcm_int_first_gpios[] = {
 	{ },
 };

-#ifdef CONFIG_ACPI
-/* IRQ polarity of some chipsets are not defined correctly in ACPI table. */
-static const struct dmi_system_id bcm_active_low_irq_dmi_table[] = {
-	{ /* Handle ThinkPad 8 tablets with BCM2E55 chipset ACPI ID */
-		.ident = "Lenovo ThinkPad 8",
+/* Some firmware reports an IRQ which does not work (wrong pin in fw table?) */
+static const struct dmi_system_id bcm_broken_irq_dmi_table[] = {
+	{
+		.ident = "Meegopad T08",
 		.matches = {
-			DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LENOVO"),
-			DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "ThinkPad 8"),
+			DMI_EXACT_MATCH(DMI_BOARD_VENDOR,
+					"To be filled by OEM."),
+			DMI_EXACT_MATCH(DMI_BOARD_NAME, "T3 MRD"),
+			DMI_EXACT_MATCH(DMI_BOARD_VERSION, "V1.1"),
 		},
 	},
 	{ }
 };

+#ifdef CONFIG_ACPI
 static int bcm_resource(struct acpi_resource *ares, void *data)
 {
 	struct bcm_device *dev = data;
@@ -904,6 +898,8 @@ static int bcm_gpio_set_shutdown(struct bcm_device *dev, bool powered)

 static int bcm_get_resources(struct bcm_device *dev)
 {
+	const struct dmi_system_id *dmi_id;
+
 	dev->name = dev_name(dev->dev);

 	if (x86_apple_machine && !bcm_apple_get_resources(dev))
@@ -936,6 +932,13 @@ static int bcm_get_resources(struct bcm_device *dev)
 		dev->irq = gpiod_to_irq(gpio);
 	}

+	dmi_id = dmi_first_match(bcm_broken_irq_dmi_table);
+	if (dmi_id) {
+		dev_info(dev->dev, "%s: Has a broken IRQ config, disabling IRQ support / runtime-pm\n",
+			 dmi_id->ident);
+		dev->irq = 0;
+	}
+
 	dev_dbg(dev->dev, "BCM irq: %d\n", dev->irq);
 	return 0;
 }
@@ -944,7 +947,6 @@ static int bcm_get_resources(struct bcm_device *dev)
 static int bcm_acpi_probe(struct bcm_device *dev)
 {
 	LIST_HEAD(resources);
-	const struct dmi_system_id *dmi_id;
 	const struct acpi_gpio_mapping *gpio_mapping = acpi_bcm_int_last_gpios;
 	struct resource_entry *entry;
 	int ret;
@@ -991,13 +993,6 @@ static int bcm_acpi_probe(struct bcm_device *dev)
 		dev->irq_active_low = irq_polarity;
 		dev_warn(dev->dev, "Overwriting IRQ polarity to active %s by module-param\n",
 			 dev->irq_active_low ? "low" : "high");
-	} else {
-		dmi_id = dmi_first_match(bcm_active_low_irq_dmi_table);
-		if (dmi_id) {
-			dev_warn(dev->dev, "%s: Overwriting IRQ polarity to active low",
-				 dmi_id->ident);
-			dev->irq_active_low = true;
-		}
 	}

 	return 0;
|
|
|
@@ -195,7 +195,7 @@ restart:
 	clear_bit(HCI_UART_SENDING, &hu->tx_state);
 }

-static void hci_uart_init_work(struct work_struct *work)
+void hci_uart_init_work(struct work_struct *work)
 {
 	struct hci_uart *hu = container_of(work, struct hci_uart, init_ready);
 	int err;
@@ -229,15 +229,6 @@ int hci_uart_init_ready(struct hci_uart *hu)
 }

 /* ------- Interface to HCI layer ------ */
-/* Initialize device */
-static int hci_uart_open(struct hci_dev *hdev)
-{
-	BT_DBG("%s %p", hdev->name, hdev);
-
-	/* Nothing to do for UART driver */
-	return 0;
-}
-
 /* Reset device */
 static int hci_uart_flush(struct hci_dev *hdev)
 {
@@ -264,6 +255,17 @@ static int hci_uart_flush(struct hci_dev *hdev)
 	return 0;
 }

+/* Initialize device */
+static int hci_uart_open(struct hci_dev *hdev)
+{
+	BT_DBG("%s %p", hdev->name, hdev);
+
+	/* Undo clearing this from hci_uart_close() */
+	hdev->flush = hci_uart_flush;
+
+	return 0;
+}
+
 /* Close device */
 static int hci_uart_close(struct hci_dev *hdev)
 {
@@ -447,6 +449,8 @@ static int hci_uart_setup(struct hci_dev *hdev)
 		btbcm_check_bdaddr(hdev);
 		break;
 #endif
+	default:
+		break;
 	}

 done:
|
|
|
@@ -141,7 +141,6 @@ static int ll_open(struct hci_uart *hu)

 	if (hu->serdev) {
 		struct ll_device *lldev = serdev_device_get_drvdata(hu->serdev);
-		serdev_device_open(hu->serdev);
 		if (!IS_ERR(lldev->ext_clk))
 			clk_prepare_enable(lldev->ext_clk);
 	}
@@ -179,8 +178,6 @@ static int ll_close(struct hci_uart *hu)
 		gpiod_set_value_cansleep(lldev->enable_gpio, 0);

 		clk_disable_unprepare(lldev->ext_clk);
-
-		serdev_device_close(hu->serdev);
 	}

 	hu->priv = NULL;
|
|
|
@@ -477,8 +477,6 @@ static int nokia_open(struct hci_uart *hu)

 	dev_dbg(dev, "protocol open");

-	serdev_device_open(hu->serdev);
-
 	pm_runtime_enable(dev);

 	return 0;
@@ -513,7 +511,6 @@ static int nokia_close(struct hci_uart *hu)
 	gpiod_set_value(btdev->wakeup_bt, 0);

 	pm_runtime_disable(&btdev->serdev->dev);
-	serdev_device_close(btdev->serdev);

 	return 0;
 }
|
|
|
@@ -29,7 +29,12 @@
  */

 #include <linux/kernel.h>
+#include <linux/clk.h>
 #include <linux/debugfs.h>
+#include <linux/gpio/consumer.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/serdev.h>

 #include <net/bluetooth/bluetooth.h>
 #include <net/bluetooth/hci_core.h>
@@ -50,6 +55,9 @@
 #define IBS_TX_IDLE_TIMEOUT_MS 2000
 #define BAUDRATE_SETTLE_TIMEOUT_MS 300

+/* susclk rate */
+#define SUSCLK_RATE_32KHZ 32768
+
 /* HCI_IBS transmit side sleep protocol states */
 enum tx_ibs_states {
 	HCI_IBS_TX_ASLEEP,
@@ -111,6 +119,12 @@ struct qca_data {
 	u64 votes_off;
 };

+struct qca_serdev {
+	struct hci_uart serdev_hu;
+	struct gpio_desc *bt_en;
+	struct clk *susclk;
+};
+
 static void __serial_clock_on(struct tty_struct *tty)
 {
 	/* TODO: Some chipset requires to enable UART clock on client
@@ -386,6 +400,7 @@ static void hci_ibs_wake_retrans_timeout(struct timer_list *t)
 /* Initialize protocol */
 static int qca_open(struct hci_uart *hu)
 {
+	struct qca_serdev *qcadev;
 	struct qca_data *qca;

 	BT_DBG("hu %p qca_open", hu);
@@ -444,6 +459,13 @@ static int qca_open(struct hci_uart *hu)
 	timer_setup(&qca->tx_idle_timer, hci_ibs_tx_idle_timeout, 0);
 	qca->tx_idle_delay = IBS_TX_IDLE_TIMEOUT_MS;

+	if (hu->serdev) {
+		serdev_device_open(hu->serdev);
+
+		qcadev = serdev_device_get_drvdata(hu->serdev);
+		gpiod_set_value_cansleep(qcadev->bt_en, 1);
+	}
+
 	BT_DBG("HCI_UART_QCA open, tx_idle_delay=%u, wake_retrans=%u",
 	       qca->tx_idle_delay, qca->wake_retrans);

@@ -512,6 +534,7 @@ static int qca_flush(struct hci_uart *hu)
 /* Close protocol */
 static int qca_close(struct hci_uart *hu)
 {
+	struct qca_serdev *qcadev;
 	struct qca_data *qca = hu->priv;

 	BT_DBG("hu %p qca close", hu);
@@ -525,6 +548,13 @@ static int qca_close(struct hci_uart *hu)
 	destroy_workqueue(qca->workqueue);
 	qca->hu = NULL;

+	if (hu->serdev) {
+		serdev_device_close(hu->serdev);
+
+		qcadev = serdev_device_get_drvdata(hu->serdev);
+		gpiod_set_value_cansleep(qcadev->bt_en, 0);
+	}
+
 	kfree_skb(qca->rx_skb);

 	hu->priv = NULL;
@@ -880,11 +910,19 @@ static int qca_set_baudrate(struct hci_dev *hdev, uint8_t baudrate)
 	 */
 	set_current_state(TASK_UNINTERRUPTIBLE);
 	schedule_timeout(msecs_to_jiffies(BAUDRATE_SETTLE_TIMEOUT_MS));
-	set_current_state(TASK_INTERRUPTIBLE);
+	set_current_state(TASK_RUNNING);

 	return 0;
 }

+static inline void host_set_baudrate(struct hci_uart *hu, unsigned int speed)
+{
+	if (hu->serdev)
+		serdev_device_set_baudrate(hu->serdev, speed);
+	else
+		hci_uart_set_baudrate(hu, speed);
+}
+
 static int qca_setup(struct hci_uart *hu)
 {
 	struct hci_dev *hdev = hu->hdev;
@@ -905,7 +943,7 @@ static int qca_setup(struct hci_uart *hu)
|
|||
speed = hu->proto->init_speed;
|
||||
|
||||
if (speed)
|
||||
hci_uart_set_baudrate(hu, speed);
|
||||
host_set_baudrate(hu, speed);
|
||||
|
||||
/* Setup user speed if needed */
|
||||
speed = 0;
|
||||
|
@@ -924,7 +962,7 @@ static int qca_setup(struct hci_uart *hu)
|
|||
ret);
|
||||
return ret;
|
||||
}
|
||||
hci_uart_set_baudrate(hu, speed);
|
||||
host_set_baudrate(hu, speed);
|
||||
}
|
||||
|
||||
/* Setup patch / NVM configurations */
|
||||
|
@@ -935,6 +973,12 @@ static int qca_setup(struct hci_uart *hu)
|
|||
} else if (ret == -ENOENT) {
|
||||
/* No patch/nvm-config found, run with original fw/config */
|
||||
ret = 0;
|
||||
} else if (ret == -EAGAIN) {
|
||||
/*
|
||||
* Userspace firmware loader will return -EAGAIN in case no
|
||||
* patch/nvm-config is found, so run with original fw/config.
|
||||
*/
|
||||
ret = 0;
|
||||
}
|
||||
|
||||
/* Setup bdaddr */
|
||||
|
@@ -958,12 +1002,80 @@ static struct hci_uart_proto qca_proto = {
|
|||
.dequeue = qca_dequeue,
|
||||
};
|
||||
|
||||
static int qca_serdev_probe(struct serdev_device *serdev)
|
||||
{
|
||||
struct qca_serdev *qcadev;
|
||||
int err;
|
||||
|
||||
qcadev = devm_kzalloc(&serdev->dev, sizeof(*qcadev), GFP_KERNEL);
|
||||
if (!qcadev)
|
||||
return -ENOMEM;
|
||||
|
||||
qcadev->serdev_hu.serdev = serdev;
|
||||
serdev_device_set_drvdata(serdev, qcadev);
|
||||
|
||||
qcadev->bt_en = devm_gpiod_get(&serdev->dev, "enable",
|
||||
GPIOD_OUT_LOW);
|
||||
if (IS_ERR(qcadev->bt_en)) {
|
||||
dev_err(&serdev->dev, "failed to acquire enable gpio\n");
|
||||
return PTR_ERR(qcadev->bt_en);
|
||||
}
|
||||
|
||||
qcadev->susclk = devm_clk_get(&serdev->dev, NULL);
|
||||
if (IS_ERR(qcadev->susclk)) {
|
||||
dev_err(&serdev->dev, "failed to acquire clk\n");
|
||||
return PTR_ERR(qcadev->susclk);
|
||||
}
|
||||
|
||||
err = clk_set_rate(qcadev->susclk, SUSCLK_RATE_32KHZ);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
err = clk_prepare_enable(qcadev->susclk);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
err = hci_uart_register_device(&qcadev->serdev_hu, &qca_proto);
|
||||
if (err)
|
||||
clk_disable_unprepare(qcadev->susclk);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static void qca_serdev_remove(struct serdev_device *serdev)
|
||||
{
|
||||
struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
|
||||
|
||||
hci_uart_unregister_device(&qcadev->serdev_hu);
|
||||
|
||||
clk_disable_unprepare(qcadev->susclk);
|
||||
}
|
||||
|
||||
static const struct of_device_id qca_bluetooth_of_match[] = {
|
||||
{ .compatible = "qcom,qca6174-bt" },
|
||||
{ /* sentinel */ }
|
||||
};
|
||||
MODULE_DEVICE_TABLE(of, qca_bluetooth_of_match);
|
||||
|
||||
static struct serdev_device_driver qca_serdev_driver = {
|
||||
.probe = qca_serdev_probe,
|
||||
.remove = qca_serdev_remove,
|
||||
.driver = {
|
||||
.name = "hci_uart_qca",
|
||||
.of_match_table = qca_bluetooth_of_match,
|
||||
},
|
||||
};
|
||||
|
||||
int __init qca_init(void)
|
||||
{
|
||||
serdev_device_driver_register(&qca_serdev_driver);
|
||||
|
||||
return hci_uart_register_proto(&qca_proto);
|
||||
}
|
||||
|
||||
int __exit qca_deinit(void)
|
||||
{
|
||||
serdev_device_driver_unregister(&qca_serdev_driver);
|
||||
|
||||
return hci_uart_unregister_proto(&qca_proto);
|
||||
}
|
||||
|
|
|
@@ -101,14 +101,6 @@ static void hci_uart_write_work(struct work_struct *work)
|
|||
|
||||
/* ------- Interface to HCI layer ------ */
|
||||
|
||||
/* Initialize device */
|
||||
static int hci_uart_open(struct hci_dev *hdev)
|
||||
{
|
||||
BT_DBG("%s %p", hdev->name, hdev);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Reset device */
|
||||
static int hci_uart_flush(struct hci_dev *hdev)
|
||||
{
|
||||
|
@@ -129,6 +121,17 @@ static int hci_uart_flush(struct hci_dev *hdev)
|
|||
return 0;
|
||||
}
|
||||
|
||||
/* Initialize device */
|
||||
static int hci_uart_open(struct hci_dev *hdev)
|
||||
{
|
||||
BT_DBG("%s %p", hdev->name, hdev);
|
||||
|
||||
/* Undo clearing this from hci_uart_close() */
|
||||
hdev->flush = hci_uart_flush;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Close device */
|
||||
static int hci_uart_close(struct hci_dev *hdev)
|
||||
{
|
||||
|
@@ -204,9 +207,8 @@ static int hci_uart_setup(struct hci_dev *hdev)
|
|||
return 0;
|
||||
}
|
||||
|
||||
if (skb->len != sizeof(*ver)) {
|
||||
if (skb->len != sizeof(*ver))
|
||||
bt_dev_err(hdev, "Event length mismatch for version info");
|
||||
}
|
||||
|
||||
kfree_skb(skb);
|
||||
return 0;
|
||||
|
@@ -282,10 +284,14 @@ int hci_uart_register_device(struct hci_uart *hu,
|
|||
|
||||
serdev_device_set_client_ops(hu->serdev, &hci_serdev_client_ops);
|
||||
|
||||
err = p->open(hu);
|
||||
err = serdev_device_open(hu->serdev);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
err = p->open(hu);
|
||||
if (err)
|
||||
goto err_open;
|
||||
|
||||
hu->proto = p;
|
||||
set_bit(HCI_UART_PROTO_READY, &hu->flags);
|
||||
|
||||
|
@@ -302,6 +308,7 @@ int hci_uart_register_device(struct hci_uart *hu,
|
|||
hdev->bus = HCI_UART;
|
||||
hci_set_drvdata(hdev, hu);
|
||||
|
||||
INIT_WORK(&hu->init_ready, hci_uart_init_work);
|
||||
INIT_WORK(&hu->write_work, hci_uart_write_work);
|
||||
percpu_init_rwsem(&hu->proto_lock);
|
||||
|
||||
|
@@ -351,6 +358,8 @@ err_register:
|
|||
err_alloc:
|
||||
clear_bit(HCI_UART_PROTO_READY, &hu->flags);
|
||||
p->close(hu);
|
||||
err_open:
|
||||
serdev_device_close(hu->serdev);
|
||||
return err;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(hci_uart_register_device);
|
||||
|
@@ -365,5 +374,6 @@ void hci_uart_unregister_device(struct hci_uart *hu)
|
|||
cancel_work_sync(&hu->write_work);
|
||||
|
||||
hu->proto->close(hu);
|
||||
serdev_device_close(hu->serdev);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(hci_uart_unregister_device);
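
The `err_open` label added to `hci_uart_register_device()` follows the usual acquire-in-order, release-in-reverse pattern: once `serdev_device_open()` has succeeded, any later failure must close the device again. A minimal userspace sketch of that control flow is given here; `struct fake_port` and every function in it are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the serdev device plus its protocol state. */
struct fake_port {
	bool device_open;
	bool proto_open;
	bool fail_proto_open;	/* knob: force the protocol open to fail */
};

static int device_open(struct fake_port *p)
{
	p->device_open = true;
	return 0;
}

static void device_close(struct fake_port *p)
{
	/* Closing a device that was never opened would be a caller bug. */
	assert(p->device_open);
	p->device_open = false;
}

static int proto_open(struct fake_port *p)
{
	if (p->fail_proto_open)
		return -EIO;
	p->proto_open = true;
	return 0;
}

/* Acquire in order; on failure release only what was already acquired. */
static int register_port(struct fake_port *p)
{
	int err;

	err = device_open(p);
	if (err)
		return err;	/* nothing to undo yet */

	err = proto_open(p);
	if (err)
		goto err_open;	/* device is open, so it must be closed */

	return 0;

err_open:
	device_close(p);
	return err;
}
```

With `fail_proto_open` set, `register_port()` returns `-EIO` and leaves `device_open` false, which is the property the new label is meant to guarantee.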
|
||||
|
|
|
@@ -116,6 +116,7 @@ void hci_uart_unregister_device(struct hci_uart *hu);
|
|||
|
||||
int hci_uart_tx_wakeup(struct hci_uart *hu);
|
||||
int hci_uart_init_ready(struct hci_uart *hu);
|
||||
void hci_uart_init_work(struct work_struct *work);
|
||||
void hci_uart_set_baudrate(struct hci_uart *hu, unsigned int speed);
|
||||
void hci_uart_set_flow_control(struct hci_uart *hu, bool enable);
|
||||
void hci_uart_set_speeds(struct hci_uart *hu, unsigned int init_speed,
|
||||
|
|
|
@@ -262,6 +262,8 @@ void proc_coredump_connector(struct task_struct *task)
|
|||
ev->what = PROC_EVENT_COREDUMP;
|
||||
ev->event_data.coredump.process_pid = task->pid;
|
||||
ev->event_data.coredump.process_tgid = task->tgid;
|
||||
ev->event_data.coredump.parent_pid = task->real_parent->pid;
|
||||
ev->event_data.coredump.parent_tgid = task->real_parent->tgid;
|
||||
|
||||
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
|
||||
msg->ack = 0; /* not used */
|
||||
|
@@ -288,6 +290,8 @@ void proc_exit_connector(struct task_struct *task)
|
|||
ev->event_data.exit.process_tgid = task->tgid;
|
||||
ev->event_data.exit.exit_code = task->exit_code;
|
||||
ev->event_data.exit.exit_signal = task->exit_signal;
|
||||
ev->event_data.exit.parent_pid = task->real_parent->pid;
|
||||
ev->event_data.exit.parent_tgid = task->real_parent->tgid;
|
||||
|
||||
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
|
||||
msg->ack = 0; /* not used */
|
||||
|
|
|
@@ -270,7 +270,7 @@ EXPORT_SYMBOL_GPL(dca_remove_requester);
|
|||
* @dev - the device that wants dca service
|
||||
* @cpu - the cpuid as returned by get_cpu()
|
||||
*/
|
||||
u8 dca_common_get_tag(struct device *dev, int cpu)
|
||||
static u8 dca_common_get_tag(struct device *dev, int cpu)
|
||||
{
|
||||
struct dca_provider *dca;
|
||||
u8 tag;
|
||||
|
|
|
@@ -849,7 +849,7 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
|
|||
return 0;
|
||||
|
||||
err_cqb:
|
||||
kfree(*cqb);
|
||||
kvfree(*cqb);
|
||||
|
||||
err_db:
|
||||
mlx5_ib_db_unmap_user(to_mucontext(context), &cq->db);
|
||||
|
|
|
@@ -116,6 +116,7 @@ enum rdma_cqe_requester_status_enum {
|
|||
RDMA_CQE_REQ_STS_TRANSPORT_RETRY_CNT_ERR,
|
||||
RDMA_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR,
|
||||
RDMA_CQE_REQ_STS_XRC_VOILATION_ERR,
|
||||
RDMA_CQE_REQ_STS_SIG_ERR,
|
||||
MAX_RDMA_CQE_REQUESTER_STATUS_ENUM
|
||||
};
|
||||
|
||||
|
@@ -152,12 +153,12 @@ struct rdma_rq_sge {
|
|||
struct regpair addr;
|
||||
__le32 length;
|
||||
__le32 flags;
|
||||
#define RDMA_RQ_SGE_L_KEY_MASK 0x3FFFFFF
|
||||
#define RDMA_RQ_SGE_L_KEY_SHIFT 0
|
||||
#define RDMA_RQ_SGE_L_KEY_LO_MASK 0x3FFFFFF
|
||||
#define RDMA_RQ_SGE_L_KEY_LO_SHIFT 0
|
||||
#define RDMA_RQ_SGE_NUM_SGES_MASK 0x7
|
||||
#define RDMA_RQ_SGE_NUM_SGES_SHIFT 26
|
||||
#define RDMA_RQ_SGE_RESERVED0_MASK 0x7
|
||||
#define RDMA_RQ_SGE_RESERVED0_SHIFT 29
|
||||
#define RDMA_RQ_SGE_L_KEY_HI_MASK 0x7
|
||||
#define RDMA_RQ_SGE_L_KEY_HI_SHIFT 29
|
||||
};
|
||||
|
||||
struct rdma_srq_sge {
|
||||
|
@@ -241,18 +242,39 @@ enum rdma_dif_io_direction_flg {
|
|||
MAX_RDMA_DIF_IO_DIRECTION_FLG
|
||||
};
|
||||
|
||||
/* RDMA DIF Runt Result Structure */
|
||||
struct rdma_dif_runt_result {
|
||||
__le16 guard_tag;
|
||||
__le16 reserved[3];
|
||||
struct rdma_dif_params {
|
||||
__le32 base_ref_tag;
|
||||
__le16 app_tag;
|
||||
__le16 app_tag_mask;
|
||||
__le16 runt_crc_value;
|
||||
__le16 flags;
|
||||
#define RDMA_DIF_PARAMS_IO_DIRECTION_FLG_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_IO_DIRECTION_FLG_SHIFT 0
|
||||
#define RDMA_DIF_PARAMS_BLOCK_SIZE_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_BLOCK_SIZE_SHIFT 1
|
||||
#define RDMA_DIF_PARAMS_RUNT_VALID_FLG_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_RUNT_VALID_FLG_SHIFT 2
|
||||
#define RDMA_DIF_PARAMS_VALIDATE_CRC_GUARD_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_VALIDATE_CRC_GUARD_SHIFT 3
|
||||
#define RDMA_DIF_PARAMS_VALIDATE_REF_TAG_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_VALIDATE_REF_TAG_SHIFT 4
|
||||
#define RDMA_DIF_PARAMS_VALIDATE_APP_TAG_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_VALIDATE_APP_TAG_SHIFT 5
|
||||
#define RDMA_DIF_PARAMS_CRC_SEED_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_CRC_SEED_SHIFT 6
|
||||
#define RDMA_DIF_PARAMS_RX_REF_TAG_CONST_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_RX_REF_TAG_CONST_SHIFT 7
|
||||
#define RDMA_DIF_PARAMS_BLOCK_GUARD_TYPE_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_BLOCK_GUARD_TYPE_SHIFT 8
|
||||
#define RDMA_DIF_PARAMS_APP_ESCAPE_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_APP_ESCAPE_SHIFT 9
|
||||
#define RDMA_DIF_PARAMS_REF_ESCAPE_MASK 0x1
|
||||
#define RDMA_DIF_PARAMS_REF_ESCAPE_SHIFT 10
|
||||
#define RDMA_DIF_PARAMS_RESERVED4_MASK 0x1F
|
||||
#define RDMA_DIF_PARAMS_RESERVED4_SHIFT 11
|
||||
__le32 reserved5;
|
||||
};
|
||||
|
||||
/* Memory window type enumeration */
|
||||
enum rdma_mw_type {
|
||||
RDMA_MW_TYPE_1,
|
||||
RDMA_MW_TYPE_2A,
|
||||
MAX_RDMA_MW_TYPE
|
||||
};
|
||||
|
||||
struct rdma_sq_atomic_wqe {
|
||||
__le32 reserved1;
|
||||
|
@@ -334,17 +356,17 @@ struct rdma_sq_bind_wqe {
|
|||
#define RDMA_SQ_BIND_WQE_SE_FLG_SHIFT 3
|
||||
#define RDMA_SQ_BIND_WQE_INLINE_FLG_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_INLINE_FLG_SHIFT 4
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED0_MASK 0x7
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED0_SHIFT 5
|
||||
#define RDMA_SQ_BIND_WQE_DIF_ON_HOST_FLG_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_DIF_ON_HOST_FLG_SHIFT 5
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED0_MASK 0x3
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED0_SHIFT 6
|
||||
u8 wqe_size;
|
||||
u8 prev_wqe_size;
|
||||
u8 bind_ctrl;
|
||||
#define RDMA_SQ_BIND_WQE_ZERO_BASED_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_ZERO_BASED_SHIFT 0
|
||||
#define RDMA_SQ_BIND_WQE_MW_TYPE_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_MW_TYPE_SHIFT 1
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED1_MASK 0x3F
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED1_SHIFT 2
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED1_MASK 0x7F
|
||||
#define RDMA_SQ_BIND_WQE_RESERVED1_SHIFT 1
|
||||
u8 access_ctrl;
|
||||
#define RDMA_SQ_BIND_WQE_REMOTE_READ_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_REMOTE_READ_SHIFT 0
|
||||
|
@@ -363,6 +385,7 @@ struct rdma_sq_bind_wqe {
|
|||
__le32 length_lo;
|
||||
__le32 parent_l_key;
|
||||
__le32 reserved4;
|
||||
struct rdma_dif_params dif_params;
|
||||
};
|
||||
|
||||
/* First element (16 bytes) of bind wqe */
|
||||
|
@@ -392,10 +415,8 @@ struct rdma_sq_bind_wqe_2nd {
|
|||
u8 bind_ctrl;
|
||||
#define RDMA_SQ_BIND_WQE_2ND_ZERO_BASED_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_2ND_ZERO_BASED_SHIFT 0
|
||||
#define RDMA_SQ_BIND_WQE_2ND_MW_TYPE_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_2ND_MW_TYPE_SHIFT 1
|
||||
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_MASK 0x3F
|
||||
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_SHIFT 2
|
||||
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_MASK 0x7F
|
||||
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_SHIFT 1
|
||||
u8 access_ctrl;
|
||||
#define RDMA_SQ_BIND_WQE_2ND_REMOTE_READ_MASK 0x1
|
||||
#define RDMA_SQ_BIND_WQE_2ND_REMOTE_READ_SHIFT 0
|
||||
|
@@ -416,6 +437,11 @@ struct rdma_sq_bind_wqe_2nd {
|
|||
__le32 reserved4;
|
||||
};
|
||||
|
||||
/* Third element (16 bytes) of bind wqe */
|
||||
struct rdma_sq_bind_wqe_3rd {
|
||||
struct rdma_dif_params dif_params;
|
||||
};
|
||||
|
||||
/* Structure with only the SQ WQE common
|
||||
* fields. Size is of one SQ element (16B)
|
||||
*/
|
||||
|
@@ -486,30 +512,6 @@ struct rdma_sq_fmr_wqe {
|
|||
u8 length_hi;
|
||||
__le32 length_lo;
|
||||
struct regpair pbl_addr;
|
||||
__le32 dif_base_ref_tag;
|
||||
__le16 dif_app_tag;
|
||||
__le16 dif_app_tag_mask;
|
||||
__le16 dif_runt_crc_value;
|
||||
__le16 dif_flags;
|
||||
#define RDMA_SQ_FMR_WQE_DIF_IO_DIRECTION_FLG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_IO_DIRECTION_FLG_SHIFT 0
|
||||
#define RDMA_SQ_FMR_WQE_DIF_BLOCK_SIZE_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_BLOCK_SIZE_SHIFT 1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_RUNT_VALID_FLG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_RUNT_VALID_FLG_SHIFT 2
|
||||
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_CRC_GUARD_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_CRC_GUARD_SHIFT 3
|
||||
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_REF_TAG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_REF_TAG_SHIFT 4
|
||||
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_APP_TAG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_APP_TAG_SHIFT 5
|
||||
#define RDMA_SQ_FMR_WQE_DIF_CRC_SEED_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_CRC_SEED_SHIFT 6
|
||||
#define RDMA_SQ_FMR_WQE_DIF_RX_REF_TAG_CONST_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_DIF_RX_REF_TAG_CONST_SHIFT 7
|
||||
#define RDMA_SQ_FMR_WQE_RESERVED4_MASK 0xFF
|
||||
#define RDMA_SQ_FMR_WQE_RESERVED4_SHIFT 8
|
||||
__le32 reserved5;
|
||||
};
|
||||
|
||||
/* First element (16 bytes) of fmr wqe */
|
||||
|
@@ -566,33 +568,6 @@ struct rdma_sq_fmr_wqe_2nd {
|
|||
struct regpair pbl_addr;
|
||||
};
|
||||
|
||||
/* Third element (16 bytes) of fmr wqe */
|
||||
struct rdma_sq_fmr_wqe_3rd {
|
||||
__le32 dif_base_ref_tag;
|
||||
__le16 dif_app_tag;
|
||||
__le16 dif_app_tag_mask;
|
||||
__le16 dif_runt_crc_value;
|
||||
__le16 dif_flags;
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_IO_DIRECTION_FLG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_IO_DIRECTION_FLG_SHIFT 0
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_BLOCK_SIZE_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_BLOCK_SIZE_SHIFT 1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_RUNT_VALID_FLG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_RUNT_VALID_FLG_SHIFT 2
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_CRC_GUARD_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_CRC_GUARD_SHIFT 3
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_REF_TAG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_REF_TAG_SHIFT 4
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_APP_TAG_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_APP_TAG_SHIFT 5
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_CRC_SEED_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_CRC_SEED_SHIFT 6
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_RX_REF_TAG_CONST_MASK 0x1
|
||||
#define RDMA_SQ_FMR_WQE_3RD_DIF_RX_REF_TAG_CONST_SHIFT 7
|
||||
#define RDMA_SQ_FMR_WQE_3RD_RESERVED4_MASK 0xFF
|
||||
#define RDMA_SQ_FMR_WQE_RESERVED4_SHIFT 8
|
||||
__le32 reserved5;
|
||||
};
|
||||
|
||||
struct rdma_sq_local_inv_wqe {
|
||||
struct regpair reserved;
|
||||
|
@@ -637,8 +612,8 @@ struct rdma_sq_rdma_wqe {
|
|||
#define RDMA_SQ_RDMA_WQE_DIF_ON_HOST_FLG_SHIFT 5
|
||||
#define RDMA_SQ_RDMA_WQE_READ_INV_FLG_MASK 0x1
|
||||
#define RDMA_SQ_RDMA_WQE_READ_INV_FLG_SHIFT 6
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED0_MASK 0x1
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED0_SHIFT 7
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED1_MASK 0x1
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED1_SHIFT 7
|
||||
u8 wqe_size;
|
||||
u8 prev_wqe_size;
|
||||
struct regpair remote_va;
|
||||
|
@@ -646,13 +621,9 @@ struct rdma_sq_rdma_wqe {
|
|||
u8 dif_flags;
|
||||
#define RDMA_SQ_RDMA_WQE_DIF_BLOCK_SIZE_MASK 0x1
|
||||
#define RDMA_SQ_RDMA_WQE_DIF_BLOCK_SIZE_SHIFT 0
|
||||
#define RDMA_SQ_RDMA_WQE_DIF_FIRST_RDMA_IN_IO_FLG_MASK 0x1
|
||||
#define RDMA_SQ_RDMA_WQE_DIF_FIRST_RDMA_IN_IO_FLG_SHIFT 1
|
||||
#define RDMA_SQ_RDMA_WQE_DIF_LAST_RDMA_IN_IO_FLG_MASK 0x1
|
||||
#define RDMA_SQ_RDMA_WQE_DIF_LAST_RDMA_IN_IO_FLG_SHIFT 2
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED1_MASK 0x1F
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED1_SHIFT 3
|
||||
u8 reserved2[3];
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED2_MASK 0x7F
|
||||
#define RDMA_SQ_RDMA_WQE_RESERVED2_SHIFT 1
|
||||
u8 reserved3[3];
|
||||
};
|
||||
|
||||
/* First element (16 bytes) of rdma wqe */
|
||||
|
|
|
@@ -3276,7 +3276,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
|
|||
SET_FIELD(flags, RDMA_RQ_SGE_NUM_SGES,
|
||||
wr->num_sge);
|
||||
|
||||
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY,
|
||||
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY_LO,
|
||||
wr->sg_list[i].lkey);
|
||||
|
||||
RQ_SGE_SET(rqe, wr->sg_list[i].addr,
|
||||
|
@@ -3295,7 +3295,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
|
|||
/* First one must include the number
|
||||
* of SGE in the list
|
||||
*/
|
||||
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY, 0);
|
||||
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY_LO, 0);
|
||||
SET_FIELD(flags, RDMA_RQ_SGE_NUM_SGES, 1);
|
||||
|
||||
RQ_SGE_SET(rqe, 0, 0, flags);
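
The `rdma_rq_sge` flags word is now split into a 26-bit `L_KEY_LO` field, a 3-bit `NUM_SGES` field and a 3-bit `L_KEY_HI` field, and `qedr_post_recv()` fills the first two. The sketch below shows how such mask/shift pairs pack into one 32-bit word. The mask and shift values are copied from the header hunk; `set_field()`, `get_field()` and `pack_rq_sge_flags()` are hypothetical helpers written for illustration and are not the driver's `SET_FIELD` macro.

```c
#include <assert.h>
#include <stdint.h>

/* Mask/shift pairs as they appear in the rdma_rq_sge hunk. */
#define RDMA_RQ_SGE_L_KEY_LO_MASK   0x3FFFFFF
#define RDMA_RQ_SGE_L_KEY_LO_SHIFT  0
#define RDMA_RQ_SGE_NUM_SGES_MASK   0x7
#define RDMA_RQ_SGE_NUM_SGES_SHIFT  26
#define RDMA_RQ_SGE_L_KEY_HI_MASK   0x7
#define RDMA_RQ_SGE_L_KEY_HI_SHIFT  29

/* Hypothetical helper: clear the field, then store val truncated to the mask. */
static uint32_t set_field(uint32_t word, uint32_t mask, unsigned int shift,
			  uint32_t val)
{
	assert(shift < 32);
	word &= ~(mask << shift);
	word |= (val & mask) << shift;
	return word;
}

/* Hypothetical helper: read a field back out of the packed word. */
static uint32_t get_field(uint32_t word, uint32_t mask, unsigned int shift)
{
	assert(shift < 32);
	return (word >> shift) & mask;
}

/* Pack an lkey and an SGE count the way the post-recv path fills flags. */
static uint32_t pack_rq_sge_flags(uint32_t lkey, uint32_t num_sges)
{
	uint32_t flags = 0;

	flags = set_field(flags, RDMA_RQ_SGE_NUM_SGES_MASK,
			  RDMA_RQ_SGE_NUM_SGES_SHIFT, num_sges);
	flags = set_field(flags, RDMA_RQ_SGE_L_KEY_LO_MASK,
			  RDMA_RQ_SGE_L_KEY_LO_SHIFT, lkey);
	return flags;
}
```

Note that the three fields cover all 32 bits (26 + 3 + 3), so an lkey wider than 26 bits is truncated by this sketch rather than spilling into `NUM_SGES`.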
|
||||
|
|
|
@@ -443,17 +443,16 @@ static u8 opa_vnic_get_rc(struct __opa_veswport_info *info,
|
|||
}
|
||||
|
||||
/* opa_vnic_calc_entropy - calculate the packet entropy */
|
||||
u8 opa_vnic_calc_entropy(struct opa_vnic_adapter *adapter, struct sk_buff *skb)
|
||||
u8 opa_vnic_calc_entropy(struct sk_buff *skb)
|
||||
{
|
||||
u16 hash16;
|
||||
u32 hash = skb_get_hash(skb);
|
||||
|
||||
/*
|
||||
* Get flow based 16-bit hash and then XOR the upper and lower bytes
|
||||
* to get the entropy.
|
||||
* __skb_tx_hash limits qcount to 16 bits. Hence, get 15-bit hash.
|
||||
*/
|
||||
hash16 = __skb_tx_hash(adapter->netdev, skb, BIT(15));
|
||||
return (u8)((hash16 >> 8) ^ (hash16 & 0xff));
|
||||
/* store XOR of all bytes in lower 8 bits */
|
||||
hash ^= hash >> 8;
|
||||
hash ^= hash >> 16;
|
||||
|
||||
/* return lower 8 bits as entropy */
|
||||
return (u8)(hash & 0xFF);
|
||||
}
|
||||
|
||||
/* opa_vnic_get_def_port - get default port based on entropy */
|
||||
|
@@ -490,7 +489,7 @@ void opa_vnic_encap_skb(struct opa_vnic_adapter *adapter, struct sk_buff *skb)
|
|||
|
||||
hdr = skb_push(skb, OPA_VNIC_HDR_LEN);
|
||||
|
||||
entropy = opa_vnic_calc_entropy(adapter, skb);
|
||||
entropy = opa_vnic_calc_entropy(skb);
|
||||
def_port = opa_vnic_get_def_port(adapter, entropy);
|
||||
len = opa_vnic_wire_length(skb);
|
||||
dlid = opa_vnic_get_dlid(adapter, skb, def_port);
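
The rewritten `opa_vnic_calc_entropy()` folds the 32-bit `skb_get_hash()` value into one byte with two shift-and-XOR steps. The standalone sketch below reproduces just that arithmetic so it can be checked outside the kernel; `fold_hash_to_u8()` and `xor_bytes()` are hypothetical names, and the hash is passed in directly instead of being read from an skb.

```c
#include <assert.h>
#include <stdint.h>

/*
 * XOR-fold a 32-bit hash into 8 bits using the same two steps as the
 * new opa_vnic_calc_entropy(). With hash = b3:b2:b1:b0, the low byte
 * ends up as b0 ^ b1 ^ b2 ^ b3.
 */
static uint8_t fold_hash_to_u8(uint32_t hash)
{
	hash ^= hash >> 8;	/* low byte is now b0 ^ b1 */
	hash ^= hash >> 16;	/* low byte is now b0 ^ b1 ^ b2 ^ b3 */
	return (uint8_t)(hash & 0xFF);
}

/* Reference version: XOR the four bytes explicitly. */
static uint8_t xor_bytes(uint32_t hash)
{
	return (uint8_t)((hash ^ (hash >> 8) ^ (hash >> 16) ^ (hash >> 24)) & 0xFF);
}
```

For example, `0x12345678` folds to `0x12 ^ 0x34 ^ 0x56 ^ 0x78`, which is `0x08`.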
|
||||
|
|
|
@@ -299,7 +299,7 @@ struct opa_vnic_adapter *opa_vnic_add_netdev(struct ib_device *ibdev,
|
|||
void opa_vnic_rem_netdev(struct opa_vnic_adapter *adapter);
|
||||
void opa_vnic_encap_skb(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
|
||||
u8 opa_vnic_get_vl(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
|
||||
u8 opa_vnic_calc_entropy(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
|
||||
u8 opa_vnic_calc_entropy(struct sk_buff *skb);
|
||||
void opa_vnic_process_vema_config(struct opa_vnic_adapter *adapter);
|
||||
void opa_vnic_release_mac_tbl(struct opa_vnic_adapter *adapter);
|
||||
void opa_vnic_query_mac_tbl(struct opa_vnic_adapter *adapter,
|
||||
|
|
|
@@ -104,7 +104,7 @@ static u16 opa_vnic_select_queue(struct net_device *netdev, struct sk_buff *skb,
|
|||
|
||||
/* pass entropy and vl as metadata in skb */
|
||||
mdata = skb_push(skb, sizeof(*mdata));
|
||||
mdata->entropy = opa_vnic_calc_entropy(adapter, skb);
|
||||
mdata->entropy = opa_vnic_calc_entropy(skb);
|
||||
mdata->vl = opa_vnic_get_vl(adapter, skb);
|
||||
rc = adapter->rn_ops->ndo_select_queue(netdev, skb,
|
||||
accel_priv, fallback);
|
||||
|
|
|
@@ -25,6 +25,19 @@ config LIRC
|
|||
passes raw IR to and from userspace, which is needed for
|
||||
IR transmitting (aka "blasting") and for the lirc daemon.
|
||||
|
||||
config BPF_LIRC_MODE2
|
||||
bool "Support for eBPF programs attached to lirc devices"
|
||||
depends on BPF_SYSCALL
|
||||
depends on RC_CORE=y
|
||||
depends on LIRC
|
||||
help
|
||||
Allow attaching eBPF programs to a lirc device using the bpf(2)
|
||||
syscall command BPF_PROG_ATTACH. This is supported for raw IR
|
||||
receivers.
|
||||
|
||||
These eBPF programs can be used to decode IR into scancodes, for
|
||||
IR protocols not supported by the kernel decoders.
|
||||
|
||||
menuconfig RC_DECODERS
|
||||
bool "Remote controller decoders"
|
||||
depends on RC_CORE
|
||||
|
|
|
@@ -5,6 +5,7 @@ obj-y += keymaps/
|
|||
obj-$(CONFIG_RC_CORE) += rc-core.o
|
||||
rc-core-y := rc-main.o rc-ir-raw.o
|
||||
rc-core-$(CONFIG_LIRC) += lirc_dev.o
|
||||
rc-core-$(CONFIG_BPF_LIRC_MODE2) += bpf-lirc.o
|
||||
obj-$(CONFIG_IR_NEC_DECODER) += ir-nec-decoder.o
|
||||
obj-$(CONFIG_IR_RC5_DECODER) += ir-rc5-decoder.o
|
||||
obj-$(CONFIG_IR_RC6_DECODER) += ir-rc6-decoder.o
|
||||
|
|
|
@@ -0,0 +1,313 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
// bpf-lirc.c - handles bpf
|
||||
//
|
||||
// Copyright (C) 2018 Sean Young <sean@mess.org>
|
||||
|
||||
#include <linux/bpf.h>
|
||||
#include <linux/filter.h>
|
||||
#include <linux/bpf_lirc.h>
|
||||
#include "rc-core-priv.h"
|
||||
|
||||
/*
|
||||
* BPF interface for raw IR
|
||||
*/
|
||||
const struct bpf_prog_ops lirc_mode2_prog_ops = {
|
||||
};
|
||||
|
||||
BPF_CALL_1(bpf_rc_repeat, u32*, sample)
|
||||
{
|
||||
struct ir_raw_event_ctrl *ctrl;
|
||||
|
||||
ctrl = container_of(sample, struct ir_raw_event_ctrl, bpf_sample);
|
||||
|
||||
rc_repeat(ctrl->dev);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const struct bpf_func_proto rc_repeat_proto = {
|
||||
.func = bpf_rc_repeat,
|
||||
.gpl_only = true, /* rc_repeat is EXPORT_SYMBOL_GPL */
|
||||
.ret_type = RET_INTEGER,
|
||||
.arg1_type = ARG_PTR_TO_CTX,
|
||||
};
|
||||
|
||||
/*
|
||||
* Currently rc-core does not support 64-bit scancodes, but there are many
|
||||
* known protocols with more than 32 bits. So, define the interface as u64
|
||||
* to be future-proof.
|
||||
*/
|
||||
BPF_CALL_4(bpf_rc_keydown, u32*, sample, u32, protocol, u64, scancode,
|
||||
u32, toggle)
|
||||
{
|
||||
struct ir_raw_event_ctrl *ctrl;
|
||||
|
||||
ctrl = container_of(sample, struct ir_raw_event_ctrl, bpf_sample);
|
||||
|
||||
rc_keydown(ctrl->dev, protocol, scancode, toggle != 0);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const struct bpf_func_proto rc_keydown_proto = {
|
||||
.func = bpf_rc_keydown,
|
||||
.gpl_only = true, /* rc_keydown is EXPORT_SYMBOL_GPL */
|
||||
.ret_type = RET_INTEGER,
|
||||
.arg1_type = ARG_PTR_TO_CTX,
|
||||
.arg2_type = ARG_ANYTHING,
|
||||
.arg3_type = ARG_ANYTHING,
|
||||
.arg4_type = ARG_ANYTHING,
|
||||
};
|
||||
|
||||
static const struct bpf_func_proto *
|
||||
lirc_mode2_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
|
||||
{
|
||||
switch (func_id) {
|
||||
case BPF_FUNC_rc_repeat:
|
||||
return &rc_repeat_proto;
|
||||
case BPF_FUNC_rc_keydown:
|
||||
return &rc_keydown_proto;
|
||||
case BPF_FUNC_map_lookup_elem:
|
||||
return &bpf_map_lookup_elem_proto;
|
||||
case BPF_FUNC_map_update_elem:
|
||||
return &bpf_map_update_elem_proto;
|
||||
case BPF_FUNC_map_delete_elem:
|
||||
return &bpf_map_delete_elem_proto;
|
||||
case BPF_FUNC_ktime_get_ns:
|
||||
return &bpf_ktime_get_ns_proto;
|
||||
case BPF_FUNC_tail_call:
|
||||
return &bpf_tail_call_proto;
|
||||
case BPF_FUNC_get_prandom_u32:
|
||||
return &bpf_get_prandom_u32_proto;
|
||||
case BPF_FUNC_trace_printk:
|
||||
if (capable(CAP_SYS_ADMIN))
|
||||
return bpf_get_trace_printk_proto();
|
||||
/* fall through */
|
||||
default:
|
||||
return NULL;
|
||||
}
|
||||
}
|
||||
|
||||
static bool lirc_mode2_is_valid_access(int off, int size,
|
||||
enum bpf_access_type type,
|
||||
const struct bpf_prog *prog,
|
||||
struct bpf_insn_access_aux *info)
|
||||
{
|
||||
/* We have one field of u32 */
|
||||
return type == BPF_READ && off == 0 && size == sizeof(u32);
|
||||
}
|
||||
|
||||
const struct bpf_verifier_ops lirc_mode2_verifier_ops = {
|
||||
.get_func_proto = lirc_mode2_func_proto,
|
||||
.is_valid_access = lirc_mode2_is_valid_access
|
||||
};
|
||||
|
||||
#define BPF_MAX_PROGS 64
|
||||
|
||||
static int lirc_bpf_attach(struct rc_dev *rcdev, struct bpf_prog *prog)
|
||||
{
|
||||
struct bpf_prog_array __rcu *old_array;
|
||||
struct bpf_prog_array *new_array;
|
||||
struct ir_raw_event_ctrl *raw;
|
||||
int ret;
|
||||
|
||||
if (rcdev->driver_type != RC_DRIVER_IR_RAW)
|
||||
return -EINVAL;
|
||||
|
||||
ret = mutex_lock_interruptible(&ir_raw_handler_lock);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
raw = rcdev->raw;
|
||||
if (!raw) {
|
||||
ret = -ENODEV;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
if (raw->progs && bpf_prog_array_length(raw->progs) >= BPF_MAX_PROGS) {
|
||||
ret = -E2BIG;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
old_array = raw->progs;
|
||||
ret = bpf_prog_array_copy(old_array, NULL, prog, &new_array);
|
||||
if (ret < 0)
|
||||
goto unlock;
|
||||
|
||||
rcu_assign_pointer(raw->progs, new_array);
|
||||
bpf_prog_array_free(old_array);
|
||||
|
||||
unlock:
|
||||
mutex_unlock(&ir_raw_handler_lock);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int lirc_bpf_detach(struct rc_dev *rcdev, struct bpf_prog *prog)
|
||||
{
|
||||
struct bpf_prog_array __rcu *old_array;
|
||||
struct bpf_prog_array *new_array;
|
||||
struct ir_raw_event_ctrl *raw;
|
||||
int ret;
|
||||
|
||||
if (rcdev->driver_type != RC_DRIVER_IR_RAW)
|
||||
return -EINVAL;
|
||||
|
||||
ret = mutex_lock_interruptible(&ir_raw_handler_lock);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
raw = rcdev->raw;
|
||||
if (!raw) {
|
||||
ret = -ENODEV;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
old_array = raw->progs;
|
||||
ret = bpf_prog_array_copy(old_array, prog, NULL, &new_array);
|
||||
/*
|
||||
* Do not use bpf_prog_array_delete_safe() as we would end up
|
||||
* with a dummy entry in the array, and then we would free the
|
||||
* dummy in lirc_bpf_free()
|
||||
*/
|
||||
if (ret)
|
||||
goto unlock;
|
||||
|
||||
rcu_assign_pointer(raw->progs, new_array);
|
||||
bpf_prog_array_free(old_array);
|
||||
unlock:
|
||||
mutex_unlock(&ir_raw_handler_lock);
|
||||
return ret;
|
||||
}
|
||||
|
||||
void lirc_bpf_run(struct rc_dev *rcdev, u32 sample)
|
||||
{
|
||||
struct ir_raw_event_ctrl *raw = rcdev->raw;
|
||||
|
||||
raw->bpf_sample = sample;
|
||||
|
||||
if (raw->progs)
|
||||
BPF_PROG_RUN_ARRAY(raw->progs, &raw->bpf_sample, BPF_PROG_RUN);
|
||||
}
|
||||
|
||||
/*
|
||||
* This should be called once the rc thread has been stopped, so there can be
|
||||
* no concurrent bpf execution.
|
||||
*/
|
||||
void lirc_bpf_free(struct rc_dev *rcdev)
|
||||
{
|
||||
struct bpf_prog **progs;
|
||||
|
||||
if (!rcdev->raw->progs)
|
||||
return;
|
||||
|
||||
progs = rcu_dereference(rcdev->raw->progs)->progs;
|
||||
while (*progs)
|
||||
bpf_prog_put(*progs++);
|
||||
|
||||
bpf_prog_array_free(rcdev->raw->progs);
|
||||
}
|
||||
|
||||
int lirc_prog_attach(const union bpf_attr *attr)
|
||||
{
|
||||
struct bpf_prog *prog;
|
||||
struct rc_dev *rcdev;
|
||||
int ret;
|
||||
|
||||
if (attr->attach_flags)
|
||||
return -EINVAL;
|
||||
|
||||
prog = bpf_prog_get_type(attr->attach_bpf_fd,
|
||||
BPF_PROG_TYPE_LIRC_MODE2);
|
||||
if (IS_ERR(prog))
|
||||
return PTR_ERR(prog);
|
||||
|
||||
rcdev = rc_dev_get_from_fd(attr->target_fd);
|
||||
if (IS_ERR(rcdev)) {
|
||||
bpf_prog_put(prog);
|
||||
return PTR_ERR(rcdev);
|
||||
}
|
||||
|
||||
ret = lirc_bpf_attach(rcdev, prog);
|
||||
if (ret)
|
||||
bpf_prog_put(prog);
|
||||
|
||||
put_device(&rcdev->dev);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
int lirc_prog_detach(const union bpf_attr *attr)
|
||||
{
|
||||
struct bpf_prog *prog;
|
||||
struct rc_dev *rcdev;
|
||||
int ret;
|
||||
|
||||
if (attr->attach_flags)
|
||||
return -EINVAL;
|
||||
|
||||
prog = bpf_prog_get_type(attr->attach_bpf_fd,
|
||||
BPF_PROG_TYPE_LIRC_MODE2);
|
||||
if (IS_ERR(prog))
|
||||
return PTR_ERR(prog);
|
||||
|
||||
rcdev = rc_dev_get_from_fd(attr->target_fd);
|
||||
if (IS_ERR(rcdev)) {
|
||||
bpf_prog_put(prog);
|
||||
return PTR_ERR(rcdev);
|
||||
}
|
||||
|
||||
ret = lirc_bpf_detach(rcdev, prog);
|
||||
|
||||
bpf_prog_put(prog);
|
||||
put_device(&rcdev->dev);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
int lirc_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr)
|
||||
{
|
||||
__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
|
||||
struct bpf_prog_array __rcu *progs;
|
||||
struct rc_dev *rcdev;
|
||||
u32 cnt, flags = 0;
|
||||
int ret;
|
||||
|
||||
if (attr->query.query_flags)
|
||||
return -EINVAL;
|
||||
|
||||
rcdev = rc_dev_get_from_fd(attr->query.target_fd);
|
||||
if (IS_ERR(rcdev))
|
||||
return PTR_ERR(rcdev);
|
||||
|
||||
if (rcdev->driver_type != RC_DRIVER_IR_RAW) {
|
||||
ret = -EINVAL;
|
||||
goto put;
|
||||
}
|
||||
|
||||
ret = mutex_lock_interruptible(&ir_raw_handler_lock);
|
||||
if (ret)
|
||||
goto put;
|
||||
|
||||
progs = rcdev->raw->progs;
|
||||
cnt = progs ? bpf_prog_array_length(progs) : 0;
|
||||
|
||||
if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt))) {
|
||||
ret = -EFAULT;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags))) {
|
||||
ret = -EFAULT;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
if (attr->query.prog_cnt != 0 && prog_ids && cnt)
|
||||
ret = bpf_prog_array_copy_to_user(progs, prog_ids, cnt);
|
||||
|
||||
unlock:
|
||||
mutex_unlock(&ir_raw_handler_lock);
|
||||
put:
|
||||
put_device(&rcdev->dev);
|
||||
|
||||
return ret;
|
||||
}
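
Both `bpf_rc_repeat()` and `bpf_rc_keydown()` receive only a pointer to the `bpf_sample` field and use `container_of()` to get back to the enclosing `ir_raw_event_ctrl`. The userspace sketch below illustrates that pointer arithmetic; the `container_of` macro here is a simplified equivalent without the kernel's type checking, and `struct raw_ctrl` and `helper_dev_id()` are hypothetical miniatures invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of ir_raw_event_ctrl with only two fields. */
struct raw_ctrl {
	int dev_id;		/* stands in for the rc_dev pointer */
	uint32_t bpf_sample;	/* the single u32 a program may read */
};

/* A helper that is handed only the sample pointer recovers its owner. */
static int helper_dev_id(uint32_t *sample)
{
	struct raw_ctrl *ctrl = container_of(sample, struct raw_ctrl, bpf_sample);

	return ctrl->dev_id;
}
```

This only works because the context pointer always points at a field embedded in the larger structure, which is why `lirc_bpf_run()` stores the sample in `raw->bpf_sample` before running the programs.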
|
|
@@ -20,6 +20,7 @@
|
|||
#include <linux/module.h>
|
||||
#include <linux/mutex.h>
|
||||
#include <linux/device.h>
|
||||
#include <linux/file.h>
|
||||
#include <linux/idr.h>
|
||||
#include <linux/poll.h>
|
||||
#include <linux/sched.h>
|
||||
|
@@ -104,6 +105,12 @@ void ir_lirc_raw_event(struct rc_dev *dev, struct ir_raw_event ev)
|
|||
TO_US(ev.duration), TO_STR(ev.pulse));
|
||||
}
|
||||
|
||||
/*
|
||||
* bpf does not care about the gap generated above; that exists
|
||||
* for backwards compatibility
|
||||
*/
|
||||
lirc_bpf_run(dev, sample);
|
||||
|
||||
spin_lock_irqsave(&dev->lirc_fh_lock, flags);
|
||||
list_for_each_entry(fh, &dev->lirc_fh, list) {
|
||||
if (LIRC_IS_TIMEOUT(sample) && !fh->send_timeout_reports)
|
||||
|
@@ -816,4 +823,27 @@ void __exit lirc_dev_exit(void)
|
|||
unregister_chrdev_region(lirc_base_dev, RC_DEV_MAX);
|
||||
}
|
||||
|
||||
struct rc_dev *rc_dev_get_from_fd(int fd)
|
||||
{
|
||||
struct fd f = fdget(fd);
|
||||
struct lirc_fh *fh;
|
||||
struct rc_dev *dev;
|
||||
|
||||
if (!f.file)
|
||||
return ERR_PTR(-EBADF);
|
||||
|
||||
if (f.file->f_op != &lirc_fops) {
|
||||
fdput(f);
|
||||
return ERR_PTR(-EINVAL);
|
||||
}
|
||||
|
||||
fh = f.file->private_data;
|
||||
dev = fh->rc;
|
||||
|
||||
get_device(&dev->dev);
|
||||
fdput(f);
|
||||
|
||||
return dev;
|
||||
}
|
||||
|
||||
MODULE_ALIAS("lirc_dev");
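
The `sample` that `ir_lirc_raw_event()` hands to `lirc_bpf_run()` is a single u32 in lirc mode2 form. Assuming the layout from the lirc UAPI header (type in the top byte, a 24-bit value in microseconds below it), a decoder could split it as sketched here. The constants are local copies with shortened, hypothetical names, and the helper functions are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the mode2 layout; names shortened for this sketch. */
#define MODE2_SPACE		0x00000000u
#define MODE2_PULSE		0x01000000u
#define MODE2_TIMEOUT		0x03000000u
#define MODE2_TYPE_MASK		0xFF000000u
#define MODE2_VALUE_MASK	0x00FFFFFFu

/* Duration (or other payload) carried in the low 24 bits. */
static uint32_t mode2_value(uint32_t sample)
{
	return sample & MODE2_VALUE_MASK;
}

static bool mode2_is_pulse(uint32_t sample)
{
	return (sample & MODE2_TYPE_MASK) == MODE2_PULSE;
}

static bool mode2_is_space(uint32_t sample)
{
	return (sample & MODE2_TYPE_MASK) == MODE2_SPACE;
}

static bool mode2_is_timeout(uint32_t sample)
{
	return (sample & MODE2_TYPE_MASK) == MODE2_TIMEOUT;
}
```

An attached program would typically branch on the type and compare the value against the pulse and space lengths of the protocol it decodes.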
|
||||
|
|
Some files were not shown because too many files have changed in this diff.