Merge branch 'convert-doc-to-rst'
Jesper Dangaard Brouer says: ==================== The kernel is moving files under Documentation to use the RST (reStructuredText) format and Sphinx [1]. This patchset converts the files under Documentation/bpf/ into RST format. The Sphinx integration is left as followup work. [1] https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html This patchset have been uploaded as branch bpf_doc10 on github[2], so reviewers can see how GitHub renders this. [2] https://github.com/netoptimizer/linux/tree/bpf_doc10/Documentation/bpf ==================== Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commit is contained in:
commit
ef11d41b3a
|
@ -0,0 +1,36 @@
|
|||
=================
|
||||
BPF documentation
|
||||
=================
|
||||
|
||||
This directory contains documentation for the BPF (Berkeley Packet
|
||||
Filter) facility, with a focus on the extended BPF version (eBPF).
|
||||
|
||||
This kernel side documentation is still work in progress. The main
|
||||
textual documentation is (for historical reasons) described in
|
||||
`Documentation/networking/filter.txt`_, which describe both classical
|
||||
and extended BPF instruction-set.
|
||||
The Cilium project also maintains a `BPF and XDP Reference Guide`_
|
||||
that goes into great technical depth about the BPF Architecture.
|
||||
|
||||
The primary info for the bpf syscall is available in the `man-pages`_
|
||||
for `bpf(2)`_.
|
||||
|
||||
|
||||
|
||||
Frequently asked questions (FAQ)
|
||||
================================
|
||||
|
||||
Two sets of Questions and Answers (Q&A) are maintained.
|
||||
|
||||
* QA for common questions about BPF see: bpf_design_QA_
|
||||
|
||||
* QA for developers interacting with BPF subsystem: bpf_devel_QA_
|
||||
|
||||
|
||||
.. Links:
|
||||
.. _bpf_design_QA: bpf_design_QA.rst
|
||||
.. _bpf_devel_QA: bpf_devel_QA.rst
|
||||
.. _Documentation/networking/filter.txt: ../networking/filter.txt
|
||||
.. _man-pages: https://www.kernel.org/doc/man-pages/
|
||||
.. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
|
||||
.. _BPF and XDP Reference Guide: http://cilium.readthedocs.io/en/latest/bpf/
|
|
@ -0,0 +1,221 @@
|
|||
==============
|
||||
BPF Design Q&A
|
||||
==============
|
||||
|
||||
BPF extensibility and applicability to networking, tracing, security
|
||||
in the linux kernel and several user space implementations of BPF
|
||||
virtual machine led to a number of misunderstanding on what BPF actually is.
|
||||
This short QA is an attempt to address that and outline a direction
|
||||
of where BPF is heading long term.
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
:depth: 3
|
||||
|
||||
Questions and Answers
|
||||
=====================
|
||||
|
||||
Q: Is BPF a generic instruction set similar to x64 and arm64?
|
||||
-------------------------------------------------------------
|
||||
A: NO.
|
||||
|
||||
Q: Is BPF a generic virtual machine ?
|
||||
-------------------------------------
|
||||
A: NO.
|
||||
|
||||
BPF is generic instruction set *with* C calling convention.
|
||||
-----------------------------------------------------------
|
||||
|
||||
Q: Why C calling convention was chosen?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
A: Because BPF programs are designed to run in the linux kernel
|
||||
which is written in C, hence BPF defines instruction set compatible
|
||||
with two most used architectures x64 and arm64 (and takes into
|
||||
consideration important quirks of other architectures) and
|
||||
defines calling convention that is compatible with C calling
|
||||
convention of the linux kernel on those architectures.
|
||||
|
||||
Q: can multiple return values be supported in the future?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A: NO. BPF allows only register R0 to be used as return value.
|
||||
|
||||
Q: can more than 5 function arguments be supported in the future?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A: NO. BPF calling convention only allows registers R1-R5 to be used
|
||||
as arguments. BPF is not a standalone instruction set.
|
||||
(unlike x64 ISA that allows msft, cdecl and other conventions)
|
||||
|
||||
Q: can BPF programs access instruction pointer or return address?
|
||||
-----------------------------------------------------------------
|
||||
A: NO.
|
||||
|
||||
Q: can BPF programs access stack pointer ?
|
||||
------------------------------------------
|
||||
A: NO.
|
||||
|
||||
Only frame pointer (register R10) is accessible.
|
||||
From compiler point of view it's necessary to have stack pointer.
|
||||
For example LLVM defines register R11 as stack pointer in its
|
||||
BPF backend, but it makes sure that generated code never uses it.
|
||||
|
||||
Q: Does C-calling convention diminishes possible use cases?
|
||||
-----------------------------------------------------------
|
||||
A: YES.
|
||||
|
||||
BPF design forces addition of major functionality in the form
|
||||
of kernel helper functions and kernel objects like BPF maps with
|
||||
seamless interoperability between them. It lets kernel call into
|
||||
BPF programs and programs call kernel helpers with zero overhead.
|
||||
As all of them were native C code. That is particularly the case
|
||||
for JITed BPF programs that are indistinguishable from
|
||||
native kernel C code.
|
||||
|
||||
Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
|
||||
------------------------------------------------------------------------
|
||||
A: Soft yes.
|
||||
|
||||
At least for now until BPF core has support for
|
||||
bpf-to-bpf calls, indirect calls, loops, global variables,
|
||||
jump tables, read only sections and all other normal constructs
|
||||
that C code can produce.
|
||||
|
||||
Q: Can loops be supported in a safe way?
|
||||
----------------------------------------
|
||||
A: It's not clear yet.
|
||||
|
||||
BPF developers are trying to find a way to
|
||||
support bounded loops where the verifier can guarantee that
|
||||
the program terminates in less than 4096 instructions.
|
||||
|
||||
Instruction level questions
|
||||
---------------------------
|
||||
|
||||
Q: LD_ABS and LD_IND instructions vs C code
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Q: How come LD_ABS and LD_IND instruction are present in BPF whereas
|
||||
C code cannot express them and has to use builtin intrinsics?
|
||||
|
||||
A: This is artifact of compatibility with classic BPF. Modern
|
||||
networking code in BPF performs better without them.
|
||||
See 'direct packet access'.
|
||||
|
||||
Q: BPF instructions mapping not one-to-one to native CPU
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Q: It seems not all BPF instructions are one-to-one to native CPU.
|
||||
For example why BPF_JNE and other compare and jumps are not cpu-like?
|
||||
|
||||
A: This was necessary to avoid introducing flags into ISA which are
|
||||
impossible to make generic and efficient across CPU architectures.
|
||||
|
||||
Q: why BPF_DIV instruction doesn't map to x64 div?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A: Because if we picked one-to-one relationship to x64 it would have made
|
||||
it more complicated to support on arm64 and other archs. Also it
|
||||
needs div-by-zero runtime check.
|
||||
|
||||
Q: why there is no BPF_SDIV for signed divide operation?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A: Because it would be rarely used. llvm errors in such case and
|
||||
prints a suggestion to use unsigned divide instead
|
||||
|
||||
Q: Why BPF has implicit prologue and epilogue?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A: Because architectures like sparc have register windows and in general
|
||||
there are enough subtle differences between architectures, so naive
|
||||
store return address into stack won't work. Another reason is BPF has
|
||||
to be safe from division by zero (and legacy exception path
|
||||
of LD_ABS insn). Those instructions need to invoke epilogue and
|
||||
return implicitly.
|
||||
|
||||
Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A: Because classic BPF didn't have them and BPF authors felt that compiler
|
||||
workaround would be acceptable. Turned out that programs lose performance
|
||||
due to lack of these compare instructions and they were added.
|
||||
These two instructions is a perfect example what kind of new BPF
|
||||
instructions are acceptable and can be added in the future.
|
||||
These two already had equivalent instructions in native CPUs.
|
||||
New instructions that don't have one-to-one mapping to HW instructions
|
||||
will not be accepted.
|
||||
|
||||
Q: BPF 32-bit subregister requirements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Q: BPF 32-bit subregisters have a requirement to zero upper 32-bits of BPF
|
||||
registers which makes BPF inefficient virtual machine for 32-bit
|
||||
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
|
||||
be added to BPF in the future?
|
||||
|
||||
A: NO. The first thing to improve performance on 32-bit archs is to teach
|
||||
LLVM to generate code that uses 32-bit subregisters. Then second step
|
||||
is to teach verifier to mark operations where zero-ing upper bits
|
||||
is unnecessary. Then JITs can take advantage of those markings and
|
||||
drastically reduce size of generated code and improve performance.
|
||||
|
||||
Q: Does BPF have a stable ABI?
|
||||
------------------------------
|
||||
A: YES. BPF instructions, arguments to BPF programs, set of helper
|
||||
functions and their arguments, recognized return codes are all part
|
||||
of ABI. However when tracing programs are using bpf_probe_read() helper
|
||||
to walk kernel internal datastructures and compile with kernel
|
||||
internal headers these accesses can and will break with newer
|
||||
kernels. The union bpf_attr -> kern_version is checked at load time
|
||||
to prevent accidentally loading kprobe-based bpf programs written
|
||||
for a different kernel. Networking programs don't do kern_version check.
|
||||
|
||||
Q: How much stack space a BPF program uses?
|
||||
-------------------------------------------
|
||||
A: Currently all program types are limited to 512 bytes of stack
|
||||
space, but the verifier computes the actual amount of stack used
|
||||
and both interpreter and most JITed code consume necessary amount.
|
||||
|
||||
Q: Can BPF be offloaded to HW?
|
||||
------------------------------
|
||||
A: YES. BPF HW offload is supported by NFP driver.
|
||||
|
||||
Q: Does classic BPF interpreter still exist?
|
||||
--------------------------------------------
|
||||
A: NO. Classic BPF programs are converted into extend BPF instructions.
|
||||
|
||||
Q: Can BPF call arbitrary kernel functions?
|
||||
-------------------------------------------
|
||||
A: NO. BPF programs can only call a set of helper functions which
|
||||
is defined for every program type.
|
||||
|
||||
Q: Can BPF overwrite arbitrary kernel memory?
|
||||
---------------------------------------------
|
||||
A: NO.
|
||||
|
||||
Tracing bpf programs can *read* arbitrary memory with bpf_probe_read()
|
||||
and bpf_probe_read_str() helpers. Networking programs cannot read
|
||||
arbitrary memory, since they don't have access to these helpers.
|
||||
Programs can never read or write arbitrary memory directly.
|
||||
|
||||
Q: Can BPF overwrite arbitrary user memory?
|
||||
-------------------------------------------
|
||||
A: Sort-of.
|
||||
|
||||
Tracing BPF programs can overwrite the user memory
|
||||
of the current task with bpf_probe_write_user(). Every time such
|
||||
program is loaded the kernel will print warning message, so
|
||||
this helper is only useful for experiments and prototypes.
|
||||
Tracing BPF programs are root only.
|
||||
|
||||
Q: bpf_trace_printk() helper warning
|
||||
------------------------------------
|
||||
Q: When bpf_trace_printk() helper is used the kernel prints nasty
|
||||
warning message. Why is that?
|
||||
|
||||
A: This is done to nudge program authors into better interfaces when
|
||||
programs need to pass data to user space. Like bpf_perf_event_output()
|
||||
can be used to efficiently stream data via perf ring buffer.
|
||||
BPF maps can be used for asynchronous data sharing between kernel
|
||||
and user space. bpf_trace_printk() should only be used for debugging.
|
||||
|
||||
Q: New functionality via kernel modules?
|
||||
----------------------------------------
|
||||
Q: Can BPF functionality such as new program or map types, new
|
||||
helpers, etc be added out of kernel module code?
|
||||
|
||||
A: NO.
|
|
@ -1,156 +0,0 @@
|
|||
BPF extensibility and applicability to networking, tracing, security
|
||||
in the linux kernel and several user space implementations of BPF
|
||||
virtual machine led to a number of misunderstanding on what BPF actually is.
|
||||
This short QA is an attempt to address that and outline a direction
|
||||
of where BPF is heading long term.
|
||||
|
||||
Q: Is BPF a generic instruction set similar to x64 and arm64?
|
||||
A: NO.
|
||||
|
||||
Q: Is BPF a generic virtual machine ?
|
||||
A: NO.
|
||||
|
||||
BPF is generic instruction set _with_ C calling convention.
|
||||
|
||||
Q: Why C calling convention was chosen?
|
||||
A: Because BPF programs are designed to run in the linux kernel
|
||||
which is written in C, hence BPF defines instruction set compatible
|
||||
with two most used architectures x64 and arm64 (and takes into
|
||||
consideration important quirks of other architectures) and
|
||||
defines calling convention that is compatible with C calling
|
||||
convention of the linux kernel on those architectures.
|
||||
|
||||
Q: can multiple return values be supported in the future?
|
||||
A: NO. BPF allows only register R0 to be used as return value.
|
||||
|
||||
Q: can more than 5 function arguments be supported in the future?
|
||||
A: NO. BPF calling convention only allows registers R1-R5 to be used
|
||||
as arguments. BPF is not a standalone instruction set.
|
||||
(unlike x64 ISA that allows msft, cdecl and other conventions)
|
||||
|
||||
Q: can BPF programs access instruction pointer or return address?
|
||||
A: NO.
|
||||
|
||||
Q: can BPF programs access stack pointer ?
|
||||
A: NO. Only frame pointer (register R10) is accessible.
|
||||
From compiler point of view it's necessary to have stack pointer.
|
||||
For example LLVM defines register R11 as stack pointer in its
|
||||
BPF backend, but it makes sure that generated code never uses it.
|
||||
|
||||
Q: Does C-calling convention diminishes possible use cases?
|
||||
A: YES. BPF design forces addition of major functionality in the form
|
||||
of kernel helper functions and kernel objects like BPF maps with
|
||||
seamless interoperability between them. It lets kernel call into
|
||||
BPF programs and programs call kernel helpers with zero overhead.
|
||||
As all of them were native C code. That is particularly the case
|
||||
for JITed BPF programs that are indistinguishable from
|
||||
native kernel C code.
|
||||
|
||||
Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
|
||||
A: Soft yes. At least for now until BPF core has support for
|
||||
bpf-to-bpf calls, indirect calls, loops, global variables,
|
||||
jump tables, read only sections and all other normal constructs
|
||||
that C code can produce.
|
||||
|
||||
Q: Can loops be supported in a safe way?
|
||||
A: It's not clear yet. BPF developers are trying to find a way to
|
||||
support bounded loops where the verifier can guarantee that
|
||||
the program terminates in less than 4096 instructions.
|
||||
|
||||
Q: How come LD_ABS and LD_IND instruction are present in BPF whereas
|
||||
C code cannot express them and has to use builtin intrinsics?
|
||||
A: This is artifact of compatibility with classic BPF. Modern
|
||||
networking code in BPF performs better without them.
|
||||
See 'direct packet access'.
|
||||
|
||||
Q: It seems not all BPF instructions are one-to-one to native CPU.
|
||||
For example why BPF_JNE and other compare and jumps are not cpu-like?
|
||||
A: This was necessary to avoid introducing flags into ISA which are
|
||||
impossible to make generic and efficient across CPU architectures.
|
||||
|
||||
Q: why BPF_DIV instruction doesn't map to x64 div?
|
||||
A: Because if we picked one-to-one relationship to x64 it would have made
|
||||
it more complicated to support on arm64 and other archs. Also it
|
||||
needs div-by-zero runtime check.
|
||||
|
||||
Q: why there is no BPF_SDIV for signed divide operation?
|
||||
A: Because it would be rarely used. llvm errors in such case and
|
||||
prints a suggestion to use unsigned divide instead
|
||||
|
||||
Q: Why BPF has implicit prologue and epilogue?
|
||||
A: Because architectures like sparc have register windows and in general
|
||||
there are enough subtle differences between architectures, so naive
|
||||
store return address into stack won't work. Another reason is BPF has
|
||||
to be safe from division by zero (and legacy exception path
|
||||
of LD_ABS insn). Those instructions need to invoke epilogue and
|
||||
return implicitly.
|
||||
|
||||
Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
|
||||
A: Because classic BPF didn't have them and BPF authors felt that compiler
|
||||
workaround would be acceptable. Turned out that programs lose performance
|
||||
due to lack of these compare instructions and they were added.
|
||||
These two instructions is a perfect example what kind of new BPF
|
||||
instructions are acceptable and can be added in the future.
|
||||
These two already had equivalent instructions in native CPUs.
|
||||
New instructions that don't have one-to-one mapping to HW instructions
|
||||
will not be accepted.
|
||||
|
||||
Q: BPF 32-bit subregisters have a requirement to zero upper 32-bits of BPF
|
||||
registers which makes BPF inefficient virtual machine for 32-bit
|
||||
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
|
||||
be added to BPF in the future?
|
||||
A: NO. The first thing to improve performance on 32-bit archs is to teach
|
||||
LLVM to generate code that uses 32-bit subregisters. Then second step
|
||||
is to teach verifier to mark operations where zero-ing upper bits
|
||||
is unnecessary. Then JITs can take advantage of those markings and
|
||||
drastically reduce size of generated code and improve performance.
|
||||
|
||||
Q: Does BPF have a stable ABI?
|
||||
A: YES. BPF instructions, arguments to BPF programs, set of helper
|
||||
functions and their arguments, recognized return codes are all part
|
||||
of ABI. However when tracing programs are using bpf_probe_read() helper
|
||||
to walk kernel internal datastructures and compile with kernel
|
||||
internal headers these accesses can and will break with newer
|
||||
kernels. The union bpf_attr -> kern_version is checked at load time
|
||||
to prevent accidentally loading kprobe-based bpf programs written
|
||||
for a different kernel. Networking programs don't do kern_version check.
|
||||
|
||||
Q: How much stack space a BPF program uses?
|
||||
A: Currently all program types are limited to 512 bytes of stack
|
||||
space, but the verifier computes the actual amount of stack used
|
||||
and both interpreter and most JITed code consume necessary amount.
|
||||
|
||||
Q: Can BPF be offloaded to HW?
|
||||
A: YES. BPF HW offload is supported by NFP driver.
|
||||
|
||||
Q: Does classic BPF interpreter still exist?
|
||||
A: NO. Classic BPF programs are converted into extend BPF instructions.
|
||||
|
||||
Q: Can BPF call arbitrary kernel functions?
|
||||
A: NO. BPF programs can only call a set of helper functions which
|
||||
is defined for every program type.
|
||||
|
||||
Q: Can BPF overwrite arbitrary kernel memory?
|
||||
A: NO. Tracing bpf programs can _read_ arbitrary memory with bpf_probe_read()
|
||||
and bpf_probe_read_str() helpers. Networking programs cannot read
|
||||
arbitrary memory, since they don't have access to these helpers.
|
||||
Programs can never read or write arbitrary memory directly.
|
||||
|
||||
Q: Can BPF overwrite arbitrary user memory?
|
||||
A: Sort-of. Tracing BPF programs can overwrite the user memory
|
||||
of the current task with bpf_probe_write_user(). Every time such
|
||||
program is loaded the kernel will print warning message, so
|
||||
this helper is only useful for experiments and prototypes.
|
||||
Tracing BPF programs are root only.
|
||||
|
||||
Q: When bpf_trace_printk() helper is used the kernel prints nasty
|
||||
warning message. Why is that?
|
||||
A: This is done to nudge program authors into better interfaces when
|
||||
programs need to pass data to user space. Like bpf_perf_event_output()
|
||||
can be used to efficiently stream data via perf ring buffer.
|
||||
BPF maps can be used for asynchronous data sharing between kernel
|
||||
and user space. bpf_trace_printk() should only be used for debugging.
|
||||
|
||||
Q: Can BPF functionality such as new program or map types, new
|
||||
helpers, etc be added out of kernel module code?
|
||||
A: NO.
|
|
@ -0,0 +1,640 @@
|
|||
=================================
|
||||
HOWTO interact with BPF subsystem
|
||||
=================================
|
||||
|
||||
This document provides information for the BPF subsystem about various
|
||||
workflows related to reporting bugs, submitting patches, and queueing
|
||||
patches for stable kernels.
|
||||
|
||||
For general information about submitting patches, please refer to
|
||||
`Documentation/process/`_. This document only describes additional specifics
|
||||
related to BPF.
|
||||
|
||||
.. contents::
|
||||
:local:
|
||||
:depth: 2
|
||||
|
||||
Reporting bugs
|
||||
==============
|
||||
|
||||
Q: How do I report bugs for BPF kernel code?
|
||||
--------------------------------------------
|
||||
A: Since all BPF kernel development as well as bpftool and iproute2 BPF
|
||||
loader development happens through the netdev kernel mailing list,
|
||||
please report any found issues around BPF to the following mailing
|
||||
list:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
This may also include issues related to XDP, BPF tracing, etc.
|
||||
|
||||
Given netdev has a high volume of traffic, please also add the BPF
|
||||
maintainers to Cc (from kernel MAINTAINERS_ file):
|
||||
|
||||
* Alexei Starovoitov <ast@kernel.org>
|
||||
* Daniel Borkmann <daniel@iogearbox.net>
|
||||
|
||||
In case a buggy commit has already been identified, make sure to keep
|
||||
the actual commit authors in Cc as well for the report. They can
|
||||
typically be identified through the kernel's git tree.
|
||||
|
||||
**Please do NOT report BPF issues to bugzilla.kernel.org since it
|
||||
is a guarantee that the reported issue will be overlooked.**
|
||||
|
||||
Submitting patches
|
||||
==================
|
||||
|
||||
Q: To which mailing list do I need to submit my BPF patches?
|
||||
------------------------------------------------------------
|
||||
A: Please submit your BPF patches to the netdev kernel mailing list:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
Historically, BPF came out of networking and has always been maintained
|
||||
by the kernel networking community. Although these days BPF touches
|
||||
many other subsystems as well, the patches are still routed mainly
|
||||
through the networking community.
|
||||
|
||||
In case your patch has changes in various different subsystems (e.g.
|
||||
tracing, security, etc), make sure to Cc the related kernel mailing
|
||||
lists and maintainers from there as well, so they are able to review
|
||||
the changes and provide their Acked-by's to the patches.
|
||||
|
||||
Q: Where can I find patches currently under discussion for BPF subsystem?
|
||||
-------------------------------------------------------------------------
|
||||
A: All patches that are Cc'ed to netdev are queued for review under netdev
|
||||
patchwork project:
|
||||
|
||||
http://patchwork.ozlabs.org/project/netdev/list/
|
||||
|
||||
Those patches which target BPF, are assigned to a 'bpf' delegate for
|
||||
further processing from BPF maintainers. The current queue with
|
||||
patches under review can be found at:
|
||||
|
||||
https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
|
||||
|
||||
Once the patches have been reviewed by the BPF community as a whole
|
||||
and approved by the BPF maintainers, their status in patchwork will be
|
||||
changed to 'Accepted' and the submitter will be notified by mail. This
|
||||
means that the patches look good from a BPF perspective and have been
|
||||
applied to one of the two BPF kernel trees.
|
||||
|
||||
In case feedback from the community requires a respin of the patches,
|
||||
their status in patchwork will be set to 'Changes Requested', and purged
|
||||
from the current review queue. Likewise for cases where patches would
|
||||
get rejected or are not applicable to the BPF trees (but assigned to
|
||||
the 'bpf' delegate).
|
||||
|
||||
Q: How do the changes make their way into Linux?
|
||||
------------------------------------------------
|
||||
A: There are two BPF kernel trees (git repositories). Once patches have
|
||||
been accepted by the BPF maintainers, they will be applied to one
|
||||
of the two BPF trees:
|
||||
|
||||
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
|
||||
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
|
||||
|
||||
The bpf tree itself is for fixes only, whereas bpf-next for features,
|
||||
cleanups or other kind of improvements ("next-like" content). This is
|
||||
analogous to net and net-next trees for networking. Both bpf and
|
||||
bpf-next will only have a master branch in order to simplify against
|
||||
which branch patches should get rebased to.
|
||||
|
||||
Accumulated BPF patches in the bpf tree will regularly get pulled
|
||||
into the net kernel tree. Likewise, accumulated BPF patches accepted
|
||||
into the bpf-next tree will make their way into net-next tree. net and
|
||||
net-next are both run by David S. Miller. From there, they will go
|
||||
into the kernel mainline tree run by Linus Torvalds. To read up on the
|
||||
process of net and net-next being merged into the mainline tree, see
|
||||
the `netdev FAQ`_ under:
|
||||
|
||||
`Documentation/networking/netdev-FAQ.txt`_
|
||||
|
||||
Occasionally, to prevent merge conflicts, we might send pull requests
|
||||
to other trees (e.g. tracing) with a small subset of the patches, but
|
||||
net and net-next are always the main trees targeted for integration.
|
||||
|
||||
The pull requests will contain a high-level summary of the accumulated
|
||||
patches and can be searched on netdev kernel mailing list through the
|
||||
following subject lines (``yyyy-mm-dd`` is the date of the pull
|
||||
request)::
|
||||
|
||||
pull-request: bpf yyyy-mm-dd
|
||||
pull-request: bpf-next yyyy-mm-dd
|
||||
|
||||
Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be applied to?
|
||||
---------------------------------------------------------------------------------
|
||||
|
||||
A: The process is the very same as described in the `netdev FAQ`_, so
|
||||
please read up on it. The subject line must indicate whether the
|
||||
patch is a fix or rather "next-like" content in order to let the
|
||||
maintainers know whether it is targeted at bpf or bpf-next.
|
||||
|
||||
For fixes eventually landing in bpf -> net tree, the subject must
|
||||
look like::
|
||||
|
||||
git format-patch --subject-prefix='PATCH bpf' start..finish
|
||||
|
||||
For features/improvements/etc that should eventually land in
|
||||
bpf-next -> net-next, the subject must look like::
|
||||
|
||||
git format-patch --subject-prefix='PATCH bpf-next' start..finish
|
||||
|
||||
If unsure whether the patch or patch series should go into bpf
|
||||
or net directly, or bpf-next or net-next directly, it is not a
|
||||
problem either if the subject line says net or net-next as target.
|
||||
It is eventually up to the maintainers to do the delegation of
|
||||
the patches.
|
||||
|
||||
If it is clear that patches should go into bpf or bpf-next tree,
|
||||
please make sure to rebase the patches against those trees in
|
||||
order to reduce potential conflicts.
|
||||
|
||||
In case the patch or patch series has to be reworked and sent out
|
||||
again in a second or later revision, it is also required to add a
|
||||
version number (``v2``, ``v3``, ...) into the subject prefix::
|
||||
|
||||
git format-patch --subject-prefix='PATCH net-next v2' start..finish
|
||||
|
||||
When changes have been requested to the patch series, always send the
|
||||
whole patch series again with the feedback incorporated (never send
|
||||
individual diffs on top of the old series).
|
||||
|
||||
Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
|
||||
-----------------------------------------------------------------------
|
||||
A: It means that the patch looks good for mainline inclusion from
|
||||
a BPF point of view.
|
||||
|
||||
Be aware that this is not a final verdict that the patch will
|
||||
automatically get accepted into net or net-next trees eventually:
|
||||
|
||||
On the netdev kernel mailing list reviews can come in at any point
|
||||
in time. If discussions around a patch conclude that they cannot
|
||||
get included as-is, we will either apply a follow-up fix or drop
|
||||
them from the trees entirely. Therefore, we also reserve to rebase
|
||||
the trees when deemed necessary. After all, the purpose of the tree
|
||||
is to:
|
||||
|
||||
i) accumulate and stage BPF patches for integration into trees
|
||||
like net and net-next, and
|
||||
|
||||
ii) run extensive BPF test suite and
|
||||
workloads on the patches before they make their way any further.
|
||||
|
||||
Once the BPF pull request was accepted by David S. Miller, then
|
||||
the patches end up in net or net-next tree, respectively, and
|
||||
make their way from there further into mainline. Again, see the
|
||||
`netdev FAQ`_ for additional information e.g. on how often they are
|
||||
merged to mainline.
|
||||
|
||||
Q: How long do I need to wait for feedback on my BPF patches?
|
||||
-------------------------------------------------------------
|
||||
A: We try to keep the latency low. The usual time to feedback will
|
||||
be around 2 or 3 business days. It may vary depending on the
|
||||
complexity of changes and current patch load.
|
||||
|
||||
Q: How often do you send pull requests to major kernel trees like net or net-next?
|
||||
----------------------------------------------------------------------------------
|
||||
|
||||
A: Pull requests will be sent out rather often in order to not
|
||||
accumulate too many patches in bpf or bpf-next.
|
||||
|
||||
As a rule of thumb, expect pull requests for each tree regularly
|
||||
at the end of the week. In some cases pull requests could additionally
|
||||
come also in the middle of the week depending on the current patch
|
||||
load or urgency.
|
||||
|
||||
Q: Are patches applied to bpf-next when the merge window is open?
|
||||
-----------------------------------------------------------------
|
||||
A: For the time when the merge window is open, bpf-next will not be
|
||||
processed. This is roughly analogous to net-next patch processing,
|
||||
so feel free to read up on the `netdev FAQ`_ about further details.
|
||||
|
||||
During those two weeks of merge window, we might ask you to resend
|
||||
your patch series once bpf-next is open again. Once Linus released
|
||||
a ``v*-rc1`` after the merge window, we continue processing of bpf-next.
|
||||
|
||||
For non-subscribers to kernel mailing lists, there is also a status
|
||||
page run by David S. Miller on net-next that provides guidance:
|
||||
|
||||
http://vger.kernel.org/~davem/net-next.html
|
||||
|
||||
Q: Verifier changes and test cases
|
||||
----------------------------------
|
||||
Q: I made a BPF verifier change, do I need to add test cases for
|
||||
BPF kernel selftests_?
|
||||
|
||||
A: If the patch has changes to the behavior of the verifier, then yes,
|
||||
it is absolutely necessary to add test cases to the BPF kernel
|
||||
selftests_ suite. If they are not present and we think they are
|
||||
needed, then we might ask for them before accepting any changes.
|
||||
|
||||
In particular, test_verifier.c is tracking a high number of BPF test
|
||||
cases, including a lot of corner cases that LLVM BPF back end may
|
||||
generate out of the restricted C code. Thus, adding test cases is
|
||||
absolutely crucial to make sure future changes do not accidentally
|
||||
affect prior use-cases. Thus, treat those test cases as: verifier
|
||||
behavior that is not tracked in test_verifier.c could potentially
|
||||
be subject to change.
|
||||
|
||||
Q: samples/bpf preference vs selftests?
|
||||
---------------------------------------
|
||||
Q: When should I add code to `samples/bpf/`_ and when to BPF kernel
|
||||
selftests_ ?
|
||||
|
||||
A: In general, we prefer additions to BPF kernel selftests_ rather than
|
||||
`samples/bpf/`_. The rationale is very simple: kernel selftests are
|
||||
regularly run by various bots to test for kernel regressions.
|
||||
|
||||
The more test cases we add to BPF selftests, the better the coverage
|
||||
and the less likely it is that those could accidentally break. It is
|
||||
not that BPF kernel selftests cannot demo how a specific feature can
|
||||
be used.
|
||||
|
||||
That said, `samples/bpf/`_ may be a good place for people to get started,
|
||||
so it might be advisable that simple demos of features could go into
|
||||
`samples/bpf/`_, but advanced functional and corner-case testing rather
|
||||
into kernel selftests.
|
||||
|
||||
If your sample looks like a test case, then go for BPF kernel selftests
|
||||
instead!
|
||||
|
||||
Q: When should I add code to the bpftool?
|
||||
-----------------------------------------
|
||||
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
|
||||
a central user space tool for debugging and introspection of BPF programs
|
||||
and maps that are active in the kernel. If UAPI changes related to BPF
|
||||
enable for dumping additional information of programs or maps, then
|
||||
bpftool should be extended as well to support dumping them.
|
||||
|
||||
Q: When should I add code to iproute2's BPF loader?
|
||||
---------------------------------------------------
|
||||
A: For UAPI changes related to the XDP or tc layer (e.g. ``cls_bpf``),
|
||||
the convention is that those control-path related changes are added to
|
||||
iproute2's BPF loader as well from user space side. This is not only
|
||||
useful to have UAPI changes properly designed to be usable, but also
|
||||
to make those changes available to a wider user base of major
|
||||
downstream distributions.
|
||||
|
||||
Q: Do you accept patches as well for iproute2's BPF loader?
|
||||
-----------------------------------------------------------
|
||||
A: Patches for the iproute2's BPF loader have to be sent to:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
While those patches are not processed by the BPF kernel maintainers,
|
||||
please keep them in Cc as well, so they can be reviewed.
|
||||
|
||||
The official git repository for iproute2 is run by Stephen Hemminger
|
||||
and can be found at:
|
||||
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
|
||||
|
||||
The patches need to have a subject prefix of '``[PATCH iproute2
|
||||
master]``' or '``[PATCH iproute2 net-next]``'. '``master``' or
|
||||
'``net-next``' describes the target branch where the patch should be
|
||||
applied to. Meaning, if kernel changes went into the net-next kernel
|
||||
tree, then the related iproute2 changes need to go into the iproute2
|
||||
net-next branch, otherwise they can be targeted at master branch. The
|
||||
iproute2 net-next branch will get merged into the master branch after
|
||||
the current iproute2 version from master has been released.
|
||||
|
||||
Like BPF, the patches end up in patchwork under the netdev project and
|
||||
are delegated to 'shemminger' for further processing:
|
||||
|
||||
http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
|
||||
|
||||
Q: What is the minimum requirement before I submit my BPF patches?
|
||||
------------------------------------------------------------------
|
||||
A: When submitting patches, always take the time and properly test your
|
||||
patches *prior* to submission. Never rush them! If maintainers find
|
||||
that your patches have not been properly tested, it is a good way to
|
||||
get them grumpy. Testing patch submissions is a hard requirement!
|
||||
|
||||
Note, fixes that go to bpf tree *must* have a ``Fixes:`` tag included.
|
||||
The same applies to fixes that target bpf-next, where the affected
|
||||
commit is in net-next (or in some cases bpf-next). The ``Fixes:`` tag is
|
||||
crucial in order to identify follow-up commits and tremendously helps
|
||||
for people having to do backporting, so it is a must have!
|
||||
|
||||
We also don't accept patches with an empty commit message. Take your
|
||||
time and properly write up a high quality commit message, it is
|
||||
essential!
|
||||
|
||||
Think about it this way: other developers looking at your code a month
|
||||
from now need to understand *why* a certain change has been done that
|
||||
way, and whether there have been flaws in the analysis or assumptions
|
||||
that the original author did. Thus providing a proper rationale and
|
||||
describing the use-case for the changes is a must.
|
||||
|
||||
Patch submissions with >1 patch must have a cover letter which includes
|
||||
a high level description of the series. This high level summary will
|
||||
then be placed into the merge commit by the BPF maintainers such that
|
||||
it is also accessible from the git log for future reference.
|
||||
|
||||
Q: Features changing BPF JIT and/or LLVM
|
||||
----------------------------------------
|
||||
Q: What do I need to consider when adding a new instruction or feature
|
||||
that would require BPF JIT and/or LLVM integration as well?
|
||||
|
||||
A: We try hard to keep all BPF JITs up to date such that the same user
|
||||
experience can be guaranteed when running BPF programs on different
|
||||
architectures without having the program punt to the less efficient
|
||||
interpreter in case the in-kernel BPF JIT is enabled.
|
||||
|
||||
If you are unable to implement or test the required JIT changes for
|
||||
certain architectures, please work together with the related BPF JIT
|
||||
developers in order to get the feature implemented in a timely manner.
|
||||
Please refer to the git log (``arch/*/net/``) to locate the necessary
|
||||
people for helping out.
|
||||
|
||||
Also always make sure to add BPF test cases (e.g. test_bpf.c and
|
||||
test_verifier.c) for new instructions, so that they can receive
|
||||
broad test coverage and help run-time testing the various BPF JITs.
|
||||
|
||||
In case of new BPF instructions, once the changes have been accepted
|
||||
into the Linux kernel, please implement support into LLVM's BPF back
|
||||
end. See LLVM_ section below for further information.
|
||||
|
||||
Stable submission
|
||||
=================
|
||||
|
||||
Q: I need a specific BPF commit in stable kernels. What should I do?
|
||||
--------------------------------------------------------------------
|
||||
A: In case you need a specific fix in stable kernels, first check whether
|
||||
the commit has already been applied in the related ``linux-*.y`` branches:
|
||||
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
|
||||
|
||||
If not the case, then drop an email to the BPF maintainers with the
|
||||
netdev kernel mailing list in Cc and ask for the fix to be queued up:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
The process in general is the same as on netdev itself, see also the
|
||||
`netdev FAQ`_ document.
|
||||
|
||||
Q: Do you also backport to kernels not currently maintained as stable?
|
||||
----------------------------------------------------------------------
|
||||
A: No. If you need a specific BPF commit in kernels that are currently not
|
||||
maintained by the stable maintainers, then you are on your own.
|
||||
|
||||
The current stable and longterm stable kernels are all listed here:
|
||||
|
||||
https://www.kernel.org/
|
||||
|
||||
Q: The BPF patch I am about to submit needs to go to stable as well
|
||||
-------------------------------------------------------------------
|
||||
What should I do?
|
||||
|
||||
A: The same rules apply as with netdev patch submissions in general, see
|
||||
`netdev FAQ`_ under:
|
||||
|
||||
`Documentation/networking/netdev-FAQ.txt`_
|
||||
|
||||
Never add "``Cc: stable@vger.kernel.org``" to the patch description, but
|
||||
ask the BPF maintainers to queue the patches instead. This can be done
|
||||
with a note, for example, under the ``---`` part of the patch which does
|
||||
not go into the git log. Alternatively, this can be done as a simple
|
||||
request by mail instead.
|
||||
|
||||
Q: Queue stable patches
|
||||
-----------------------
|
||||
Q: Where do I find currently queued BPF patches that will be submitted
|
||||
to stable?
|
||||
|
||||
A: Once patches that fix critical bugs got applied into the bpf tree, they
|
||||
are queued up for stable submission under:
|
||||
|
||||
http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
|
||||
|
||||
They will be on hold there at minimum until the related commit made its
|
||||
way into the mainline kernel tree.
|
||||
|
||||
After having been under broader exposure, the queued patches will be
|
||||
submitted by the BPF maintainers to the stable maintainers.
|
||||
|
||||
Testing patches
|
||||
===============
|
||||
|
||||
Q: How to run BPF selftests
|
||||
---------------------------
|
||||
A: After you have booted into the newly compiled kernel, navigate to
|
||||
the BPF selftests_ suite in order to test BPF functionality (current
|
||||
working directory points to the root of the cloned git tree)::
|
||||
|
||||
$ cd tools/testing/selftests/bpf/
|
||||
$ make
|
||||
|
||||
To run the verifier tests::
|
||||
|
||||
$ sudo ./test_verifier
|
||||
|
||||
The verifier tests print out all the current checks being
|
||||
performed. The summary at the end of running all tests will dump
|
||||
information of test successes and failures::
|
||||
|
||||
Summary: 418 PASSED, 0 FAILED
|
||||
|
||||
In order to run through all BPF selftests, the following command is
|
||||
needed::
|
||||
|
||||
$ sudo make run_tests
|
||||
|
||||
See the kernels selftest `Documentation/dev-tools/kselftest.rst`_
|
||||
document for further documentation.
|
||||
|
||||
Q: Which BPF kernel selftests version should I run my kernel against?
|
||||
---------------------------------------------------------------------
|
||||
A: If you run a kernel ``xyz``, then always run the BPF kernel selftests
|
||||
from that kernel ``xyz`` as well. Do not expect that the BPF selftest
|
||||
from the latest mainline tree will pass all the time.
|
||||
|
||||
In particular, test_bpf.c and test_verifier.c have a large number of
|
||||
test cases and are constantly updated with new BPF test sequences, or
|
||||
existing ones are adapted to verifier changes e.g. due to verifier
|
||||
becoming smarter and being able to better track certain things.
|
||||
|
||||
LLVM
|
||||
====
|
||||
|
||||
Q: Where do I find LLVM with BPF support?
|
||||
-----------------------------------------
|
||||
A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.
|
||||
|
||||
All major distributions these days ship LLVM with BPF back end enabled,
|
||||
so for the majority of use-cases it is not required to compile LLVM by
|
||||
hand anymore, just install the distribution provided package.
|
||||
|
||||
LLVM's static compiler lists the supported targets through
|
||||
``llc --version``, make sure BPF targets are listed. Example::
|
||||
|
||||
$ llc --version
|
||||
LLVM (http://llvm.org/):
|
||||
LLVM version 6.0.0svn
|
||||
Optimized build.
|
||||
Default target: x86_64-unknown-linux-gnu
|
||||
Host CPU: skylake
|
||||
|
||||
Registered Targets:
|
||||
bpf - BPF (host endian)
|
||||
bpfeb - BPF (big endian)
|
||||
bpfel - BPF (little endian)
|
||||
x86 - 32-bit X86: Pentium-Pro and above
|
||||
x86-64 - 64-bit X86: EM64T and AMD64
|
||||
|
||||
For developers in order to utilize the latest features added to LLVM's
|
||||
BPF back end, it is advisable to run the latest LLVM releases. Support
|
||||
for new BPF kernel features such as additions to the BPF instruction
|
||||
set are often developed together.
|
||||
|
||||
All LLVM releases can be found at: http://releases.llvm.org/
|
||||
|
||||
Q: Got it, so how do I build LLVM manually anyway?
|
||||
--------------------------------------------------
|
||||
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
|
||||
that set up, proceed with building the latest LLVM and clang version
|
||||
from the git repositories::
|
||||
|
||||
$ git clone http://llvm.org/git/llvm.git
|
||||
$ cd llvm/tools
|
||||
$ git clone --depth 1 http://llvm.org/git/clang.git
|
||||
$ cd ..; mkdir build; cd build
|
||||
$ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
|
||||
-DBUILD_SHARED_LIBS=OFF \
|
||||
-DCMAKE_BUILD_TYPE=Release \
|
||||
-DLLVM_BUILD_RUNTIME=OFF
|
||||
$ make -j $(getconf _NPROCESSORS_ONLN)
|
||||
|
||||
The built binaries can then be found in the build/bin/ directory, where
|
||||
you can point the PATH variable to.
|
||||
|
||||
Q: Reporting LLVM BPF issues
|
||||
----------------------------
|
||||
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
|
||||
generation back end or about LLVM generated code that the verifier
|
||||
refuses to accept?
|
||||
|
||||
A: Yes, please do!
|
||||
|
||||
LLVM's BPF back end is a key piece of the whole BPF
|
||||
infrastructure and it ties deeply into verification of programs from the
|
||||
kernel side. Therefore, any issues on either side need to be investigated
|
||||
and fixed whenever necessary.
|
||||
|
||||
Therefore, please make sure to bring them up at netdev kernel mailing
|
||||
list and Cc BPF maintainers for LLVM and kernel bits:
|
||||
|
||||
* Yonghong Song <yhs@fb.com>
|
||||
* Alexei Starovoitov <ast@kernel.org>
|
||||
* Daniel Borkmann <daniel@iogearbox.net>
|
||||
|
||||
LLVM also has an issue tracker where BPF related bugs can be found:
|
||||
|
||||
https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
|
||||
|
||||
However, it is better to reach out through mailing lists with having
|
||||
maintainers in Cc.
|
||||
|
||||
Q: New BPF instruction for kernel and LLVM
|
||||
------------------------------------------
|
||||
Q: I have added a new BPF instruction to the kernel, how can I integrate
|
||||
it into LLVM?
|
||||
|
||||
A: LLVM has a ``-mcpu`` selector for the BPF back end in order to allow
|
||||
the selection of BPF instruction set extensions. By default the
|
||||
``generic`` processor target is used, which is the base instruction set
|
||||
(v1) of BPF.
|
||||
|
||||
LLVM has an option to select ``-mcpu=probe`` where it will probe the host
|
||||
kernel for supported BPF instruction set extensions and selects the
|
||||
optimal set automatically.
|
||||
|
||||
For cross-compilation, a specific version can be select manually as well ::
|
||||
|
||||
$ llc -march bpf -mcpu=help
|
||||
Available CPUs for this target:
|
||||
|
||||
generic - Select the generic processor.
|
||||
probe - Select the probe processor.
|
||||
v1 - Select the v1 processor.
|
||||
v2 - Select the v2 processor.
|
||||
[...]
|
||||
|
||||
Newly added BPF instructions to the Linux kernel need to follow the same
|
||||
scheme, bump the instruction set version and implement probing for the
|
||||
extensions such that ``-mcpu=probe`` users can benefit from the
|
||||
optimization transparently when upgrading their kernels.
|
||||
|
||||
If you are unable to implement support for the newly added BPF instruction
|
||||
please reach out to BPF developers for help.
|
||||
|
||||
By the way, the BPF kernel selftests run with ``-mcpu=probe`` for better
|
||||
test coverage.
|
||||
|
||||
Q: clang flag for target bpf?
|
||||
-----------------------------
|
||||
Q: In some cases clang flag ``-target bpf`` is used but in other cases the
|
||||
default clang target, which matches the underlying architecture, is used.
|
||||
What is the difference and when I should use which?
|
||||
|
||||
A: Although LLVM IR generation and optimization try to stay architecture
|
||||
independent, ``-target <arch>`` still has some impact on generated code:
|
||||
|
||||
- BPF program may recursively include header file(s) with file scope
|
||||
inline assembly codes. The default target can handle this well,
|
||||
while ``bpf`` target may fail if bpf backend assembler does not
|
||||
understand these assembly codes, which is true in most cases.
|
||||
|
||||
- When compiled without ``-g``, additional elf sections, e.g.,
|
||||
.eh_frame and .rela.eh_frame, may be present in the object file
|
||||
with default target, but not with ``bpf`` target.
|
||||
|
||||
- The default target may turn a C switch statement into a switch table
|
||||
lookup and jump operation. Since the switch table is placed
|
||||
in the global readonly section, the bpf program will fail to load.
|
||||
The bpf target does not support switch table optimization.
|
||||
The clang option ``-fno-jump-tables`` can be used to disable
|
||||
switch table generation.
|
||||
|
||||
- For clang ``-target bpf``, it is guaranteed that pointer or long /
|
||||
unsigned long types will always have a width of 64 bit, no matter
|
||||
whether underlying clang binary or default target (or kernel) is
|
||||
32 bit. However, when native clang target is used, then it will
|
||||
compile these types based on the underlying architecture's conventions,
|
||||
meaning in case of 32 bit architecture, pointer or long / unsigned
|
||||
long types e.g. in BPF context structure will have width of 32 bit
|
||||
while the BPF LLVM back end still operates in 64 bit. The native
|
||||
target is mostly needed in tracing for the case of walking ``pt_regs``
|
||||
or other kernel structures where CPU's register width matters.
|
||||
Otherwise, ``clang -target bpf`` is generally recommended.
|
||||
|
||||
You should use default target when:
|
||||
|
||||
- Your program includes a header file, e.g., ptrace.h, which eventually
|
||||
pulls in some header files containing file scope host assembly codes.
|
||||
|
||||
- You can add ``-fno-jump-tables`` to work around the switch table issue.
|
||||
|
||||
Otherwise, you can use ``bpf`` target. Additionally, you *must* use bpf target
|
||||
when:
|
||||
|
||||
- Your program uses data structures with pointer or long / unsigned long
|
||||
types that interface with BPF helpers or context data structures. Access
|
||||
into these structures is verified by the BPF verifier and may result
|
||||
in verification failures if the native architecture is not aligned with
|
||||
the BPF architecture, e.g. 64-bit. An example of this is
|
||||
BPF_PROG_TYPE_SK_MSG require ``-target bpf``
|
||||
|
||||
|
||||
.. Links
|
||||
.. _Documentation/process/: https://www.kernel.org/doc/html/latest/process/
|
||||
.. _MAINTAINERS: ../../MAINTAINERS
|
||||
.. _Documentation/networking/netdev-FAQ.txt: ../networking/netdev-FAQ.txt
|
||||
.. _netdev FAQ: ../networking/netdev-FAQ.txt
|
||||
.. _samples/bpf/: ../../samples/bpf/
|
||||
.. _selftests: ../../tools/testing/selftests/bpf/
|
||||
.. _Documentation/dev-tools/kselftest.rst:
|
||||
https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html
|
||||
|
||||
Happy BPF hacking!
|
|
@ -1,570 +0,0 @@
|
|||
This document provides information for the BPF subsystem about various
|
||||
workflows related to reporting bugs, submitting patches, and queueing
|
||||
patches for stable kernels.
|
||||
|
||||
For general information about submitting patches, please refer to
|
||||
Documentation/process/. This document only describes additional specifics
|
||||
related to BPF.
|
||||
|
||||
Reporting bugs:
|
||||
---------------
|
||||
|
||||
Q: How do I report bugs for BPF kernel code?
|
||||
|
||||
A: Since all BPF kernel development as well as bpftool and iproute2 BPF
|
||||
loader development happens through the netdev kernel mailing list,
|
||||
please report any found issues around BPF to the following mailing
|
||||
list:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
This may also include issues related to XDP, BPF tracing, etc.
|
||||
|
||||
Given netdev has a high volume of traffic, please also add the BPF
|
||||
maintainers to Cc (from kernel MAINTAINERS file):
|
||||
|
||||
Alexei Starovoitov <ast@kernel.org>
|
||||
Daniel Borkmann <daniel@iogearbox.net>
|
||||
|
||||
In case a buggy commit has already been identified, make sure to keep
|
||||
the actual commit authors in Cc as well for the report. They can
|
||||
typically be identified through the kernel's git tree.
|
||||
|
||||
Please do *not* report BPF issues to bugzilla.kernel.org since it
|
||||
is a guarantee that the reported issue will be overlooked.
|
||||
|
||||
Submitting patches:
|
||||
-------------------
|
||||
|
||||
Q: To which mailing list do I need to submit my BPF patches?
|
||||
|
||||
A: Please submit your BPF patches to the netdev kernel mailing list:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
Historically, BPF came out of networking and has always been maintained
|
||||
by the kernel networking community. Although these days BPF touches
|
||||
many other subsystems as well, the patches are still routed mainly
|
||||
through the networking community.
|
||||
|
||||
In case your patch has changes in various different subsystems (e.g.
|
||||
tracing, security, etc), make sure to Cc the related kernel mailing
|
||||
lists and maintainers from there as well, so they are able to review
|
||||
the changes and provide their Acked-by's to the patches.
|
||||
|
||||
Q: Where can I find patches currently under discussion for BPF subsystem?
|
||||
|
||||
A: All patches that are Cc'ed to netdev are queued for review under netdev
|
||||
patchwork project:
|
||||
|
||||
http://patchwork.ozlabs.org/project/netdev/list/
|
||||
|
||||
Those patches which target BPF, are assigned to a 'bpf' delegate for
|
||||
further processing from BPF maintainers. The current queue with
|
||||
patches under review can be found at:
|
||||
|
||||
https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
|
||||
|
||||
Once the patches have been reviewed by the BPF community as a whole
|
||||
and approved by the BPF maintainers, their status in patchwork will be
|
||||
changed to 'Accepted' and the submitter will be notified by mail. This
|
||||
means that the patches look good from a BPF perspective and have been
|
||||
applied to one of the two BPF kernel trees.
|
||||
|
||||
In case feedback from the community requires a respin of the patches,
|
||||
their status in patchwork will be set to 'Changes Requested', and purged
|
||||
from the current review queue. Likewise for cases where patches would
|
||||
get rejected or are not applicable to the BPF trees (but assigned to
|
||||
the 'bpf' delegate).
|
||||
|
||||
Q: How do the changes make their way into Linux?
|
||||
|
||||
A: There are two BPF kernel trees (git repositories). Once patches have
|
||||
been accepted by the BPF maintainers, they will be applied to one
|
||||
of the two BPF trees:
|
||||
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
|
||||
|
||||
The bpf tree itself is for fixes only, whereas bpf-next for features,
|
||||
cleanups or other kind of improvements ("next-like" content). This is
|
||||
analogous to net and net-next trees for networking. Both bpf and
|
||||
bpf-next will only have a master branch in order to simplify against
|
||||
which branch patches should get rebased to.
|
||||
|
||||
Accumulated BPF patches in the bpf tree will regularly get pulled
|
||||
into the net kernel tree. Likewise, accumulated BPF patches accepted
|
||||
into the bpf-next tree will make their way into net-next tree. net and
|
||||
net-next are both run by David S. Miller. From there, they will go
|
||||
into the kernel mainline tree run by Linus Torvalds. To read up on the
|
||||
process of net and net-next being merged into the mainline tree, see
|
||||
the netdev FAQ under:
|
||||
|
||||
Documentation/networking/netdev-FAQ.txt
|
||||
|
||||
Occasionally, to prevent merge conflicts, we might send pull requests
|
||||
to other trees (e.g. tracing) with a small subset of the patches, but
|
||||
net and net-next are always the main trees targeted for integration.
|
||||
|
||||
The pull requests will contain a high-level summary of the accumulated
|
||||
patches and can be searched on netdev kernel mailing list through the
|
||||
following subject lines (yyyy-mm-dd is the date of the pull request):
|
||||
|
||||
pull-request: bpf yyyy-mm-dd
|
||||
pull-request: bpf-next yyyy-mm-dd
|
||||
|
||||
Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be
|
||||
applied to?
|
||||
|
||||
A: The process is the very same as described in the netdev FAQ, so
|
||||
please read up on it. The subject line must indicate whether the
|
||||
patch is a fix or rather "next-like" content in order to let the
|
||||
maintainers know whether it is targeted at bpf or bpf-next.
|
||||
|
||||
For fixes eventually landing in bpf -> net tree, the subject must
|
||||
look like:
|
||||
|
||||
git format-patch --subject-prefix='PATCH bpf' start..finish
|
||||
|
||||
For features/improvements/etc that should eventually land in
|
||||
bpf-next -> net-next, the subject must look like:
|
||||
|
||||
git format-patch --subject-prefix='PATCH bpf-next' start..finish
|
||||
|
||||
If unsure whether the patch or patch series should go into bpf
|
||||
or net directly, or bpf-next or net-next directly, it is not a
|
||||
problem either if the subject line says net or net-next as target.
|
||||
It is eventually up to the maintainers to do the delegation of
|
||||
the patches.
|
||||
|
||||
If it is clear that patches should go into bpf or bpf-next tree,
|
||||
please make sure to rebase the patches against those trees in
|
||||
order to reduce potential conflicts.
|
||||
|
||||
In case the patch or patch series has to be reworked and sent out
|
||||
again in a second or later revision, it is also required to add a
|
||||
version number (v2, v3, ...) into the subject prefix:
|
||||
|
||||
git format-patch --subject-prefix='PATCH net-next v2' start..finish
|
||||
|
||||
When changes have been requested to the patch series, always send the
|
||||
whole patch series again with the feedback incorporated (never send
|
||||
individual diffs on top of the old series).
|
||||
|
||||
Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
|
||||
|
||||
A: It means that the patch looks good for mainline inclusion from
|
||||
a BPF point of view.
|
||||
|
||||
Be aware that this is not a final verdict that the patch will
|
||||
automatically get accepted into net or net-next trees eventually:
|
||||
|
||||
On the netdev kernel mailing list reviews can come in at any point
|
||||
in time. If discussions around a patch conclude that they cannot
|
||||
get included as-is, we will either apply a follow-up fix or drop
|
||||
them from the trees entirely. Therefore, we also reserve to rebase
|
||||
the trees when deemed necessary. After all, the purpose of the tree
|
||||
is to i) accumulate and stage BPF patches for integration into trees
|
||||
like net and net-next, and ii) run extensive BPF test suite and
|
||||
workloads on the patches before they make their way any further.
|
||||
|
||||
Once the BPF pull request was accepted by David S. Miller, then
|
||||
the patches end up in net or net-next tree, respectively, and
|
||||
make their way from there further into mainline. Again, see the
|
||||
netdev FAQ for additional information e.g. on how often they are
|
||||
merged to mainline.
|
||||
|
||||
Q: How long do I need to wait for feedback on my BPF patches?
|
||||
|
||||
A: We try to keep the latency low. The usual time to feedback will
|
||||
be around 2 or 3 business days. It may vary depending on the
|
||||
complexity of changes and current patch load.
|
||||
|
||||
Q: How often do you send pull requests to major kernel trees like
|
||||
net or net-next?
|
||||
|
||||
A: Pull requests will be sent out rather often in order to not
|
||||
accumulate too many patches in bpf or bpf-next.
|
||||
|
||||
As a rule of thumb, expect pull requests for each tree regularly
|
||||
at the end of the week. In some cases pull requests could additionally
|
||||
come also in the middle of the week depending on the current patch
|
||||
load or urgency.
|
||||
|
||||
Q: Are patches applied to bpf-next when the merge window is open?
|
||||
|
||||
A: For the time when the merge window is open, bpf-next will not be
|
||||
processed. This is roughly analogous to net-next patch processing,
|
||||
so feel free to read up on the netdev FAQ about further details.
|
||||
|
||||
During those two weeks of merge window, we might ask you to resend
|
||||
your patch series once bpf-next is open again. Once Linus released
|
||||
a v*-rc1 after the merge window, we continue processing of bpf-next.
|
||||
|
||||
For non-subscribers to kernel mailing lists, there is also a status
|
||||
page run by David S. Miller on net-next that provides guidance:
|
||||
|
||||
http://vger.kernel.org/~davem/net-next.html
|
||||
|
||||
Q: I made a BPF verifier change, do I need to add test cases for
|
||||
BPF kernel selftests?
|
||||
|
||||
A: If the patch has changes to the behavior of the verifier, then yes,
|
||||
it is absolutely necessary to add test cases to the BPF kernel
|
||||
selftests suite. If they are not present and we think they are
|
||||
needed, then we might ask for them before accepting any changes.
|
||||
|
||||
In particular, test_verifier.c is tracking a high number of BPF test
|
||||
cases, including a lot of corner cases that LLVM BPF back end may
|
||||
generate out of the restricted C code. Thus, adding test cases is
|
||||
absolutely crucial to make sure future changes do not accidentally
|
||||
affect prior use-cases. Thus, treat those test cases as: verifier
|
||||
behavior that is not tracked in test_verifier.c could potentially
|
||||
be subject to change.
|
||||
|
||||
Q: When should I add code to samples/bpf/ and when to BPF kernel
|
||||
selftests?
|
||||
|
||||
A: In general, we prefer additions to BPF kernel selftests rather than
|
||||
samples/bpf/. The rationale is very simple: kernel selftests are
|
||||
regularly run by various bots to test for kernel regressions.
|
||||
|
||||
The more test cases we add to BPF selftests, the better the coverage
|
||||
and the less likely it is that those could accidentally break. It is
|
||||
not that BPF kernel selftests cannot demo how a specific feature can
|
||||
be used.
|
||||
|
||||
That said, samples/bpf/ may be a good place for people to get started,
|
||||
so it might be advisable that simple demos of features could go into
|
||||
samples/bpf/, but advanced functional and corner-case testing rather
|
||||
into kernel selftests.
|
||||
|
||||
If your sample looks like a test case, then go for BPF kernel selftests
|
||||
instead!
|
||||
|
||||
Q: When should I add code to the bpftool?
|
||||
|
||||
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
|
||||
a central user space tool for debugging and introspection of BPF programs
|
||||
and maps that are active in the kernel. If UAPI changes related to BPF
|
||||
enable for dumping additional information of programs or maps, then
|
||||
bpftool should be extended as well to support dumping them.
|
||||
|
||||
Q: When should I add code to iproute2's BPF loader?
|
||||
|
||||
A: For UAPI changes related to the XDP or tc layer (e.g. cls_bpf), the
|
||||
convention is that those control-path related changes are added to
|
||||
iproute2's BPF loader as well from user space side. This is not only
|
||||
useful to have UAPI changes properly designed to be usable, but also
|
||||
to make those changes available to a wider user base of major
|
||||
downstream distributions.
|
||||
|
||||
Q: Do you accept patches as well for iproute2's BPF loader?
|
||||
|
||||
A: Patches for the iproute2's BPF loader have to be sent to:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
While those patches are not processed by the BPF kernel maintainers,
|
||||
please keep them in Cc as well, so they can be reviewed.
|
||||
|
||||
The official git repository for iproute2 is run by Stephen Hemminger
|
||||
and can be found at:
|
||||
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
|
||||
|
||||
The patches need to have a subject prefix of '[PATCH iproute2 master]'
|
||||
or '[PATCH iproute2 net-next]'. 'master' or 'net-next' describes the
|
||||
target branch where the patch should be applied to. Meaning, if kernel
|
||||
changes went into the net-next kernel tree, then the related iproute2
|
||||
changes need to go into the iproute2 net-next branch, otherwise they
|
||||
can be targeted at master branch. The iproute2 net-next branch will get
|
||||
merged into the master branch after the current iproute2 version from
|
||||
master has been released.
|
||||
|
||||
Like BPF, the patches end up in patchwork under the netdev project and
|
||||
are delegated to 'shemminger' for further processing:
|
||||
|
||||
http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
|
||||
|
||||
Q: What is the minimum requirement before I submit my BPF patches?
|
||||
|
||||
A: When submitting patches, always take the time and properly test your
|
||||
patches *prior* to submission. Never rush them! If maintainers find
|
||||
that your patches have not been properly tested, it is a good way to
|
||||
get them grumpy. Testing patch submissions is a hard requirement!
|
||||
|
||||
Note, fixes that go to bpf tree *must* have a Fixes: tag included. The
|
||||
same applies to fixes that target bpf-next, where the affected commit
|
||||
is in net-next (or in some cases bpf-next). The Fixes: tag is crucial
|
||||
in order to identify follow-up commits and tremendously helps for people
|
||||
having to do backporting, so it is a must have!
|
||||
|
||||
We also don't accept patches with an empty commit message. Take your
|
||||
time and properly write up a high quality commit message, it is
|
||||
essential!
|
||||
|
||||
Think about it this way: other developers looking at your code a month
|
||||
from now need to understand *why* a certain change has been done that
|
||||
way, and whether there have been flaws in the analysis or assumptions
|
||||
that the original author did. Thus providing a proper rationale and
|
||||
describing the use-case for the changes is a must.
|
||||
|
||||
Patch submissions with >1 patch must have a cover letter which includes
|
||||
a high level description of the series. This high level summary will
|
||||
then be placed into the merge commit by the BPF maintainers such that
|
||||
it is also accessible from the git log for future reference.
|
||||
|
||||
Q: What do I need to consider when adding a new instruction or feature
|
||||
that would require BPF JIT and/or LLVM integration as well?
|
||||
|
||||
A: We try hard to keep all BPF JITs up to date such that the same user
|
||||
experience can be guaranteed when running BPF programs on different
|
||||
architectures without having the program punt to the less efficient
|
||||
interpreter in case the in-kernel BPF JIT is enabled.
|
||||
|
||||
If you are unable to implement or test the required JIT changes for
|
||||
certain architectures, please work together with the related BPF JIT
|
||||
developers in order to get the feature implemented in a timely manner.
|
||||
Please refer to the git log (arch/*/net/) to locate the necessary
|
||||
people for helping out.
|
||||
|
||||
Also always make sure to add BPF test cases (e.g. test_bpf.c and
|
||||
test_verifier.c) for new instructions, so that they can receive
|
||||
broad test coverage and help run-time testing the various BPF JITs.
|
||||
|
||||
In case of new BPF instructions, once the changes have been accepted
|
||||
into the Linux kernel, please implement support into LLVM's BPF back
|
||||
end. See LLVM section below for further information.
|
||||
|
||||
Stable submission:
|
||||
------------------
|
||||
|
||||
Q: I need a specific BPF commit in stable kernels. What should I do?
|
||||
|
||||
A: In case you need a specific fix in stable kernels, first check whether
|
||||
the commit has already been applied in the related linux-*.y branches:
|
||||
|
||||
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
|
||||
|
||||
If not the case, then drop an email to the BPF maintainers with the
|
||||
netdev kernel mailing list in Cc and ask for the fix to be queued up:
|
||||
|
||||
netdev@vger.kernel.org
|
||||
|
||||
The process in general is the same as on netdev itself, see also the
|
||||
netdev FAQ document.
|
||||
|
||||
Q: Do you also backport to kernels not currently maintained as stable?
|
||||
|
||||
A: No. If you need a specific BPF commit in kernels that are currently not
|
||||
maintained by the stable maintainers, then you are on your own.
|
||||
|
||||
The current stable and longterm stable kernels are all listed here:
|
||||
|
||||
https://www.kernel.org/
|
||||
|
||||
Q: The BPF patch I am about to submit needs to go to stable as well. What
|
||||
should I do?
|
||||
|
||||
A: The same rules apply as with netdev patch submissions in general, see
|
||||
netdev FAQ under:
|
||||
|
||||
Documentation/networking/netdev-FAQ.txt
|
||||
|
||||
Never add "Cc: stable@vger.kernel.org" to the patch description, but
|
||||
ask the BPF maintainers to queue the patches instead. This can be done
|
||||
with a note, for example, under the "---" part of the patch which does
|
||||
not go into the git log. Alternatively, this can be done as a simple
|
||||
request by mail instead.
|
||||
|
||||
Q: Where do I find currently queued BPF patches that will be submitted
|
||||
to stable?
|
||||
|
||||
A: Once patches that fix critical bugs got applied into the bpf tree, they
|
||||
are queued up for stable submission under:
|
||||
|
||||
http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
|
||||
|
||||
They will be on hold there at minimum until the related commit made its
|
||||
way into the mainline kernel tree.
|
||||
|
||||
After having been under broader exposure, the queued patches will be
|
||||
submitted by the BPF maintainers to the stable maintainers.
|
||||
|
||||
Testing patches:
|
||||
----------------
|
||||
|
||||
Q: Which BPF kernel selftests version should I run my kernel against?
|
||||
|
||||
A: If you run a kernel xyz, then always run the BPF kernel selftests from
|
||||
that kernel xyz as well. Do not expect that the BPF selftest from the
|
||||
latest mainline tree will pass all the time.
|
||||
|
||||
In particular, test_bpf.c and test_verifier.c have a large number of
|
||||
test cases and are constantly updated with new BPF test sequences, or
|
||||
existing ones are adapted to verifier changes e.g. due to verifier
|
||||
becoming smarter and being able to better track certain things.
|
||||
|
||||
LLVM:
|
||||
-----
|
||||
|
||||
Q: Where do I find LLVM with BPF support?
|
||||
|
||||
A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.
|
||||
|
||||
All major distributions these days ship LLVM with BPF back end enabled,
|
||||
so for the majority of use-cases it is not required to compile LLVM by
|
||||
hand anymore, just install the distribution provided package.
|
||||
|
||||
LLVM's static compiler lists the supported targets through 'llc --version',
|
||||
make sure BPF targets are listed. Example:
|
||||
|
||||
$ llc --version
|
||||
LLVM (http://llvm.org/):
|
||||
LLVM version 6.0.0svn
|
||||
Optimized build.
|
||||
Default target: x86_64-unknown-linux-gnu
|
||||
Host CPU: skylake
|
||||
|
||||
Registered Targets:
|
||||
bpf - BPF (host endian)
|
||||
bpfeb - BPF (big endian)
|
||||
bpfel - BPF (little endian)
|
||||
x86 - 32-bit X86: Pentium-Pro and above
|
||||
x86-64 - 64-bit X86: EM64T and AMD64
|
||||
|
||||
For developers in order to utilize the latest features added to LLVM's
|
||||
BPF back end, it is advisable to run the latest LLVM releases. Support
|
||||
for new BPF kernel features such as additions to the BPF instruction
|
||||
set are often developed together.
|
||||
|
||||
All LLVM releases can be found at: http://releases.llvm.org/
|
||||
|
||||
Q: Got it, so how do I build LLVM manually anyway?
|
||||
|
||||
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
|
||||
that set up, proceed with building the latest LLVM and clang version
|
||||
from the git repositories:
|
||||
|
||||
$ git clone http://llvm.org/git/llvm.git
|
||||
$ cd llvm/tools
|
||||
$ git clone --depth 1 http://llvm.org/git/clang.git
|
||||
$ cd ..; mkdir build; cd build
|
||||
$ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
|
||||
-DBUILD_SHARED_LIBS=OFF \
|
||||
-DCMAKE_BUILD_TYPE=Release \
|
||||
-DLLVM_BUILD_RUNTIME=OFF
|
||||
$ make -j $(getconf _NPROCESSORS_ONLN)
|
||||
|
||||
The built binaries can then be found in the build/bin/ directory, where
|
||||
you can point the PATH variable to.
|
||||
|
||||
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
|
||||
generation back end or about LLVM generated code that the verifier
|
||||
refuses to accept?
|
||||
|
||||
A: Yes, please do! LLVM's BPF back end is a key piece of the whole BPF
|
||||
infrastructure and it ties deeply into verification of programs from the
|
||||
kernel side. Therefore, any issues on either side need to be investigated
|
||||
and fixed whenever necessary.
|
||||
|
||||
Therefore, please make sure to bring them up at netdev kernel mailing
|
||||
list and Cc BPF maintainers for LLVM and kernel bits:
|
||||
|
||||
Yonghong Song <yhs@fb.com>
|
||||
Alexei Starovoitov <ast@kernel.org>
|
||||
Daniel Borkmann <daniel@iogearbox.net>
|
||||
|
||||
LLVM also has an issue tracker where BPF related bugs can be found:
|
||||
|
||||
https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
|
||||
|
||||
However, it is better to reach out through mailing lists with having
|
||||
maintainers in Cc.
|
||||
|
||||
Q: I have added a new BPF instruction to the kernel, how can I integrate
|
||||
it into LLVM?
|
||||
|
||||
A: LLVM has a -mcpu selector for the BPF back end in order to allow the
|
||||
selection of BPF instruction set extensions. By default the 'generic'
|
||||
processor target is used, which is the base instruction set (v1) of BPF.
|
||||
|
||||
LLVM has an option to select -mcpu=probe where it will probe the host
|
||||
kernel for supported BPF instruction set extensions and selects the
|
||||
optimal set automatically.
|
||||
|
||||
For cross-compilation, a specific version can be select manually as well.
|
||||
|
||||
$ llc -march bpf -mcpu=help
|
||||
Available CPUs for this target:
|
||||
|
||||
generic - Select the generic processor.
|
||||
probe - Select the probe processor.
|
||||
v1 - Select the v1 processor.
|
||||
v2 - Select the v2 processor.
|
||||
[...]
|
||||
|
||||
Newly added BPF instructions to the Linux kernel need to follow the same
|
||||
scheme, bump the instruction set version and implement probing for the
|
||||
extensions such that -mcpu=probe users can benefit from the optimization
|
||||
transparently when upgrading their kernels.
|
||||
|
||||
If you are unable to implement support for the newly added BPF instruction
|
||||
please reach out to BPF developers for help.
|
||||
|
||||
By the way, the BPF kernel selftests run with -mcpu=probe for better
|
||||
test coverage.
|
||||
|
||||
Q: In some cases clang flag "-target bpf" is used but in other cases the
|
||||
default clang target, which matches the underlying architecture, is used.
|
||||
What is the difference and when I should use which?
|
||||
|
||||
A: Although LLVM IR generation and optimization try to stay architecture
|
||||
independent, "-target <arch>" still has some impact on generated code:
|
||||
|
||||
- BPF program may recursively include header file(s) with file scope
|
||||
inline assembly codes. The default target can handle this well,
|
||||
while bpf target may fail if bpf backend assembler does not
|
||||
understand these assembly codes, which is true in most cases.
|
||||
|
||||
- When compiled without -g, additional elf sections, e.g.,
|
||||
.eh_frame and .rela.eh_frame, may be present in the object file
|
||||
with default target, but not with bpf target.
|
||||
|
||||
- The default target may turn a C switch statement into a switch table
|
||||
lookup and jump operation. Since the switch table is placed
|
||||
in the global readonly section, the bpf program will fail to load.
|
||||
The bpf target does not support switch table optimization.
|
||||
The clang option "-fno-jump-tables" can be used to disable
|
||||
switch table generation.
|
||||
|
||||
- For clang -target bpf, it is guaranteed that pointer or long /
|
||||
unsigned long types will always have a width of 64 bit, no matter
|
||||
whether underlying clang binary or default target (or kernel) is
|
||||
32 bit. However, when native clang target is used, then it will
|
||||
compile these types based on the underlying architecture's conventions,
|
||||
meaning in case of 32 bit architecture, pointer or long / unsigned
|
||||
long types e.g. in BPF context structure will have width of 32 bit
|
||||
while the BPF LLVM back end still operates in 64 bit. The native
|
||||
target is mostly needed in tracing for the case of walking pt_regs
|
||||
or other kernel structures where CPU's register width matters.
|
||||
Otherwise, clang -target bpf is generally recommended.
|
||||
|
||||
You should use default target when:
|
||||
|
||||
- Your program includes a header file, e.g., ptrace.h, which eventually
|
||||
pulls in some header files containing file scope host assembly codes.
|
||||
- You can add "-fno-jump-tables" to work around the switch table issue.
|
||||
|
||||
Otherwise, you can use bpf target. Additionally, you _must_ use bpf target
|
||||
when:
|
||||
|
||||
- Your program uses data structures with pointer or long / unsigned long
|
||||
types that interface with BPF helpers or context data structures. Access
|
||||
into these structures is verified by the BPF verifier and may result
|
||||
in verification failures if the native architecture is not aligned with
|
||||
the BPF architecture, e.g. 64-bit. An example of this is
|
||||
BPF_PROG_TYPE_SK_MSG require '-target bpf'
|
||||
|
||||
Happy BPF hacking!
|
Loading…
Reference in New Issue