2018-04-19 06:55:58 +08:00
/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (c) 2018 Facebook */

#ifndef _LINUX_BTF_H
#define _LINUX_BTF_H 1

#include <linux/types.h>
2019-10-25 08:18:11 +08:00
#include <uapi/linux/btf.h>
bpf: Add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF
using the BPF Type Format (BTF). Its signature is
long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr,
u32 btf_ptr_size, u64 flags);
struct btf_ptr * specifies
- a pointer to the data to be traced
- the BTF id of the type of data pointed to
- a flags field is provided for future use; these flags
are not to be confused with the BTF_F_* flags
below that control how the btf_ptr is displayed; the
flags member of the struct btf_ptr may be used to
disambiguate types in kernel versus module BTF, etc;
the main distinction is the flags relate to the type
and information needed in identifying it; not how it
is displayed.
For example a BPF program with a struct sk_buff *skb
could do the following:
  static struct btf_ptr b = { };
  b.ptr = skb;
  b.type_id = __builtin_btf_type_id(struct sk_buff, 1);
  bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0);
Default output looks like this:
  (struct sk_buff){
   .transport_header = (__u16)65535,
   .mac_header = (__u16)65535,
   .end = (sk_buff_data_t)192,
   .head = (unsigned char *)0x000000007524fd8b,
   .data = (unsigned char *)0x000000007524fd8b,
   .truesize = (unsigned int)768,
   .users = (refcount_t){
    .refs = (atomic_t){
     .counter = (int)1,
    },
   },
  }
Flags modifying display are as follows:
- BTF_F_COMPACT: no formatting around type information
- BTF_F_NONAME: no struct/union member names/types
- BTF_F_PTR_RAW: show raw (unobfuscated) pointer values;
equivalent to %px.
- BTF_F_ZERO: show zero-valued struct/union members;
they are not displayed by default
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
2020-09-28 19:31:05 +08:00
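The display flags above are independent bits, so a caller can OR them together and a helper can reject unknown bits with a single mask test. The following is a minimal userspace sketch of that idea, not the kernel's implementation. The numeric flag values are assumed to mirror the BTF_F_* enum in uapi/linux/bpf.h, and EXAMPLE_F_ALL, example_flags_valid() and example_selfcheck() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Display flags; values are assumed to mirror uapi/linux/bpf.h. */
#define BTF_F_COMPACT	(1ULL << 0)
#define BTF_F_NONAME	(1ULL << 1)
#define BTF_F_PTR_RAW	(1ULL << 2)
#define BTF_F_ZERO	(1ULL << 3)

/* Hypothetical mask of every flag a caller is allowed to pass. */
#define EXAMPLE_F_ALL \
	(BTF_F_COMPACT | BTF_F_NONAME | BTF_F_PTR_RAW | BTF_F_ZERO)

/* Return true when flags contains only known display flags. */
static bool example_flags_valid(uint64_t flags)
{
	return (flags & ~EXAMPLE_F_ALL) == 0;
}

/* Usage: combine two flags, then try a bit outside the mask. */
static void example_selfcheck(void)
{
	assert(example_flags_valid(BTF_F_COMPACT | BTF_F_ZERO));
	assert(!example_flags_valid(1ULL << 4));
}
```

Keeping the mask in one place means a newly added display flag only has to be added to EXAMPLE_F_ALL for the check to accept it.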
#include <uapi/linux/bpf.h>
2018-04-19 06:55:58 +08:00

bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS
The patch introduces BPF_MAP_TYPE_STRUCT_OPS. The map value
is a kernel struct with its func ptr implemented in bpf prog.
This new map is the interface to register/unregister/introspect
a bpf implemented kernel struct.
The kernel struct is actually embedded inside another new struct
(or called the "value" struct in the code). For example,
"struct tcp_congestion_ops" is embedded in:
struct bpf_struct_ops_tcp_congestion_ops {
	refcount_t refcnt;
	enum bpf_struct_ops_state state;
	struct tcp_congestion_ops data; /* <-- kernel subsystem struct here */
}
The map value is "struct bpf_struct_ops_tcp_congestion_ops".
The "bpftool map dump" will then be able to show the
state ("inuse"/"tobefree") and the number of subsystem's refcnt (e.g.
number of tcp_sock in the tcp_congestion_ops case). This "value" struct
is created automatically by a macro. Having a separate "value" struct
will also make extending "struct bpf_struct_ops_XYZ" easier (e.g. adding
"void (*init)(void)" to "struct bpf_struct_ops_XYZ" to do some
initialization work before registering the struct_ops to the kernel
subsystem). The libbpf will take care of finding and populating the
"struct bpf_struct_ops_XYZ" from "struct XYZ".
Register a struct_ops to a kernel subsystem:
1. Load all needed BPF_PROG_TYPE_STRUCT_OPS prog(s)
2. Create a BPF_MAP_TYPE_STRUCT_OPS with attr->btf_vmlinux_value_type_id
set to the btf id "struct bpf_struct_ops_tcp_congestion_ops" of the
running kernel.
Instead of reusing the attr->btf_value_type_id,
btf_vmlinux_value_type_id is added such that attr->btf_fd can still be
used as the "user" btf which could store other useful sysadmin/debug
info that may be introduced in the future,
e.g. creation-date/compiler-details/map-creator...etc.
3. Create a "struct bpf_struct_ops_tcp_congestion_ops" object as described
in the running kernel btf. Populate the value of this object.
The function ptr should be populated with the prog fds.
4. Call BPF_MAP_UPDATE with the object created in (3) as
the map value. The key is always "0".
During BPF_MAP_UPDATE, the code that saves the kernel-func-ptr's
args as an array of u64 is generated. BPF_MAP_UPDATE also allows
the specific struct_ops to do some final checks in "st_ops->init_member()"
(e.g. ensure all mandatory func ptrs are implemented).
If everything looks good, it will register this kernel struct
to the kernel subsystem. The map will not allow further update
from this point.
Unregister a struct_ops from the kernel subsystem:
BPF_MAP_DELETE with key "0".
Introspect a struct_ops:
BPF_MAP_LOOKUP_ELEM with key "0". The map value returned will
have the prog _id_ populated as the func ptr.
The map value state (enum bpf_struct_ops_state) transitions through:
INIT (map created) =>
INUSE (map updated, i.e. reg) =>
TOBEFREE (map value deleted, i.e. unreg)
The kernel subsystem needs to call bpf_struct_ops_get() and
bpf_struct_ops_put() to manage the "refcnt" in the
"struct bpf_struct_ops_XYZ". This patch uses a separate refcnt
for the purpose of tracking the subsystem usage. Another approach
is to reuse the map->refcnt and then "show" (i.e. during map_lookup)
the subsystem's usage by doing map->refcnt - map->usercnt to filter out
the map-fd/pinned-map usage. However, that will also tie down the
future semantics of map->refcnt and map->usercnt.
The very first subsystem's refcnt (during reg()) holds one
count to map->refcnt. When the very last subsystem's refcnt
is gone, it will also release the map->refcnt. All bpf_prog will be
freed when the map->refcnt reaches 0 (i.e. during map_free()).
Here is what the bpftool map command output looks like:
[root@arch-fb-vm1 bpf]# bpftool map show
6: struct_ops name dctcp flags 0x0
	key 4B value 256B max_entries 1 memlock 4096B
	btf_id 6
[root@arch-fb-vm1 bpf]# bpftool map dump id 6
[{
        "value": {
            "refcnt": {
                "refs": {
                    "counter": 1
                }
            },
            "state": 1,
            "data": {
                "list": {
                    "next": 0,
                    "prev": 0
                },
                "key": 0,
                "flags": 2,
                "init": 24,
                "release": 0,
                "ssthresh": 25,
                "cong_avoid": 30,
                "set_state": 27,
                "cwnd_event": 28,
                "in_ack_event": 26,
                "undo_cwnd": 29,
                "pkts_acked": 0,
                "min_tso_segs": 0,
                "sndbuf_expand": 0,
                "cong_control": 0,
                "get_info": 0,
                "name": [98,112,102,95,100,99,116,99,112,0,0,0,0,0,0,0
                ],
                "owner": 0
            }
        }
    }
]
Misc Notes:
* bpf_struct_ops_map_sys_lookup_elem() is added for syscall lookup.
It does an in-place update on "*value" instead of returning a pointer
to syscall.c. Otherwise, it needs a separate copy of "zero" value
for the BPF_STRUCT_OPS_STATE_INIT to avoid races.
* The bpf_struct_ops_map_delete_elem() is also called without
preempt_disable() from map_delete_elem(). This is because
"->unreg()" may require a sleepable context, e.g.
"tcp_unregister_congestion_control()".
* "const" is added to some of the existing "struct btf_func_model *"
function arg to avoid a compiler warning caused by this patch.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003505.3855919-1-kafai@fb.com
2020-01-09 08:35:05 +08:00
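The INIT => INUSE => TOBEFREE life cycle described above can be modelled as a small state machine. The sketch below is a userspace illustration only. The enum values are an assumption (the dump above shows "state": 1 for a registered map), the error codes are chosen for the example, and every example_* name is hypothetical rather than a kernel symbol.

```c
#include <assert.h>
#include <errno.h>

/* The three states named in the commit message; values are assumed. */
enum example_state {
	EXAMPLE_STATE_INIT,
	EXAMPLE_STATE_INUSE,
	EXAMPLE_STATE_TOBEFREE,
};

/* Model of BPF_MAP_UPDATE: only a map still in INIT may be registered. */
static int example_update(enum example_state *state)
{
	if (*state != EXAMPLE_STATE_INIT)
		return -EBUSY;
	*state = EXAMPLE_STATE_INUSE;
	return 0;
}

/* Model of BPF_MAP_DELETE: only a registered map may be unregistered. */
static int example_delete(enum example_state *state)
{
	if (*state != EXAMPLE_STATE_INUSE)
		return -ENOENT;
	*state = EXAMPLE_STATE_TOBEFREE;
	return 0;
}

/* Usage: walk one map through its whole life cycle. */
static void example_selfcheck(void)
{
	enum example_state s = EXAMPLE_STATE_INIT;

	assert(example_update(&s) == 0);	/* register */
	assert(example_update(&s) == -EBUSY);	/* no further update */
	assert(example_delete(&s) == 0);	/* unregister */
	assert(s == EXAMPLE_STATE_TOBEFREE);
}
```

The model has no transition back to INIT, which matches the statement that the map does not allow further updates once it has been registered.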
#define BTF_TYPE_EMIT(type) ((void)(type *)0)

2018-04-19 06:55:58 +08:00
struct btf;
2018-12-16 14:13:52 +08:00
struct btf_member;
2018-04-19 06:55:58 +08:00
struct btf_type;
2018-04-19 06:56:01 +08:00
union bpf_attr;
2020-09-28 19:31:04 +08:00
struct btf_show;
2018-04-19 06:55:58 +08:00

2018-04-19 06:56:02 +08:00
extern const struct file_operations btf_fops;

bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
Remove a permeating assumption throughout the BPF verifier of vmlinux BTF. Instead,
wherever BTF type IDs are involved, also track the instance of struct btf that
goes along with the type ID. This allows gradually adding support for kernel
module BTFs and using/tracking module types across BPF helper calls and
registers.
This patch also renames the btf_id() function to btf_obj_id() to minimize naming
clash with using btf_id to denote BTF *type* ID, rather than BTF *object*'s ID.
Also, although btf_vmlinux can't get destructed and thus doesn't need
refcounting, module BTFs need that, so apply BTF refcounting universally when
a BPF program is using BTF-powered attachment (tp_btf, fentry/fexit, etc). This
makes for simpler clean up code.
Now that BTF type ID is not enough to uniquely identify a BTF type, extend BPF
trampoline key to include BTF object ID. To differentiate that from target
program BPF ID, set 31st bit of type ID. BTF type IDs (at least currently) are
not allowed to take full 32 bits, so there is no danger of confusing that bit
with a valid BTF type ID.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-10-andrii@kernel.org
2020-12-04 04:46:29 +08:00
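The trampoline key scheme described above can be sketched with plain integers. This is an illustration under stated assumptions, not the kernel's code: the commit message only says that bit 31 of the type ID is set when the key carries a BTF object ID, so placing the program ID or BTF object ID in the upper 32 bits is an assumption here, and the example_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 of the low word marks "upper word is a BTF object ID". */
#define EXAMPLE_BTF_OBJ_FLAG	0x80000000u

/* Key when attaching to another BPF program (assumed layout). */
static uint64_t example_key_for_prog(uint32_t prog_id, uint32_t btf_type_id)
{
	return ((uint64_t)prog_id << 32) | btf_type_id;
}

/* Key when attaching to a kernel or module function (assumed layout). */
static uint64_t example_key_for_btf(uint32_t btf_obj_id, uint32_t btf_type_id)
{
	return ((uint64_t)btf_obj_id << 32) | EXAMPLE_BTF_OBJ_FLAG | btf_type_id;
}

/* Usage: the same numeric IDs produce two distinct keys. */
static void example_selfcheck(void)
{
	assert(example_key_for_prog(7, 42) != example_key_for_btf(7, 42));
}
```

Because valid BTF type IDs never use bit 31, the flag cannot collide with a real type ID, which is the property the commit message relies on.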
void btf_get(struct btf *btf);
2018-04-19 06:56:01 +08:00
void btf_put(struct btf *btf);
int btf_new_fd(const union bpf_attr *attr);
struct btf *btf_get_by_fd(int fd);
2018-04-19 06:56:02 +08:00
int btf_get_info_by_fd(const struct btf *btf,
		       const union bpf_attr *attr,
		       union bpf_attr __user *uattr);
2018-04-19 06:55:58 +08:00
/* Figure out the size of a type_id. If type_id is a modifier
 * (e.g. const), it will be resolved to find out the type with size.
 *
 * For example:
 * In describing "const void *", type_id is "const" and "const"
 * refers to "void *". The return type will be "void *".
 *
 * If type_id is a simple "int", then return type will be "int".
 *
 * @btf: struct btf object
 * @type_id: Find out the size of type_id. The type_id of the return
 *           type is set to *type_id.
 * @ret_size: It can be NULL. If not NULL, the size of the return
 *            type is set to *ret_size.
 * Return: The btf_type (resolved to another type with size info if needed).
 *         NULL is returned if type_id itself does not have size info
 *         (e.g. void) or it cannot be resolved to another type that
 *         has size info.
 *         *type_id and *ret_size will not be changed in the
 *         NULL return case.
 */
const struct btf_type *btf_type_id_size(const struct btf *btf,
					u32 *type_id,
					u32 *ret_size);
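The contract documented for btf_type_id_size() can be illustrated with a toy type table. The sketch below is a simplified userspace model, not the kernel implementation: struct toy_type, enum toy_kind and toy_type_id_size() are hypothetical, and the loop that follows modifiers is only one plausible way to get the documented behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy kinds; a stand-in for BTF kinds, not the real encoding. */
enum toy_kind { TOY_VOID, TOY_INT, TOY_PTR, TOY_CONST, TOY_TYPEDEF };

struct toy_type {
	enum toy_kind kind;
	uint32_t size;		/* meaningful for TOY_INT and TOY_PTR */
	uint32_t ref_id;	/* referenced type for modifiers */
};

/* Follow modifiers from *type_id until a type with a size is found.
 * On success, update *type_id (and *ret_size if non-NULL) and return
 * the sized type. On failure, return NULL and leave both untouched.
 */
static const struct toy_type *toy_type_id_size(const struct toy_type *types,
					       uint32_t nr_types,
					       uint32_t *type_id,
					       uint32_t *ret_size)
{
	uint32_t id = *type_id;
	uint32_t depth;

	/* The depth limit guards against cycles in a malformed table. */
	for (depth = 0; depth < 32 && id < nr_types; depth++) {
		const struct toy_type *t = &types[id];

		switch (t->kind) {
		case TOY_INT:
		case TOY_PTR:
			*type_id = id;
			if (ret_size)
				*ret_size = t->size;
			return t;
		case TOY_CONST:
		case TOY_TYPEDEF:
			id = t->ref_id;
			break;
		default:
			return NULL;	/* e.g. void has no size */
		}
	}
	return NULL;
}

/* Usage: "const" (id 3) refers to a pointer (id 2) of size 8. */
static void toy_selfcheck(void)
{
	const struct toy_type types[] = {
		{ TOY_VOID, 0, 0 }, { TOY_INT, 4, 0 },
		{ TOY_PTR, 8, 0 }, { TOY_CONST, 0, 2 },
	};
	uint32_t id = 3, size = 0;

	assert(toy_type_id_size(types, 4, &id, &size) == &types[2]);
	assert(id == 2 && size == 8);
}
```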
2020-09-28 19:31:04 +08:00

/*
 * Options to control show behaviour.
 *	- BTF_SHOW_COMPACT: no formatting around type information
 *	- BTF_SHOW_NONAME: no struct/union member names/types
 *	- BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values;
 *	  equivalent to %px.
 *	- BTF_SHOW_ZERO: show zero-valued struct/union members; they
 *	  are not displayed by default
 *	- BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read
 *	  data before displaying it.
 */
2020-09-28 19:31:05 +08:00
#define BTF_SHOW_COMPACT	BTF_F_COMPACT
#define BTF_SHOW_NONAME		BTF_F_NONAME
#define BTF_SHOW_PTR_RAW	BTF_F_PTR_RAW
#define BTF_SHOW_ZERO		BTF_F_ZERO
2020-09-28 19:31:04 +08:00
#define BTF_SHOW_UNSAFE		(1ULL << 4)

2018-04-19 06:56:00 +08:00
void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
		       struct seq_file *m);
2020-09-28 19:31:09 +08:00
int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj,
			    struct seq_file *m, u64 flags);
2020-09-28 19:31:04 +08:00

/*
 * Copy len bytes of string representation of obj of BTF type_id into buf.
 *
 * @btf: struct btf object
 * @type_id: type id of type obj points to
 * @obj: pointer to typed data
 * @buf: buffer to write to
 * @len: maximum length to write to buf
 * @flags: show options (see above)
 *
 * Return: length that would have been/was copied as per snprintf, or
 *	   negative error.
 */
int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj,
			   char *buf, int len, u64 flags);

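The "as per snprintf" return convention documented for btf_type_snprintf_show() means the return value is the full length of the formatted text even when the buffer is too small. The sketch below shows that convention using the standard C snprintf() only. example_show_int() is a hypothetical helper, and the "(int)%d" format merely imitates the look of the show output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Format value into buf (at most len bytes including the NUL).
 * Returns the length the full text needs; *truncated tells the
 * caller whether buf was too small to hold all of it.
 */
static int example_show_int(char *buf, int len, int value, bool *truncated)
{
	int n = snprintf(buf, (size_t)len, "(int)%d", value);

	*truncated = n >= len;
	return n;
}

/* Usage: a 4-byte buffer keeps 3 characters plus the terminator. */
static void example_selfcheck(void)
{
	char small[4];
	bool cut;

	assert(example_show_int(small, sizeof(small), 12345, &cut) == 10);
	assert(cut);
	assert(strcmp(small, "(in") == 0);
}
```

A caller can therefore detect truncation by comparing the return value with the buffer length, and retry with a larger buffer if it wants the complete text.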
2018-05-05 05:49:51 +08:00
int btf_get_fd_by_id(u32 id);
2020-12-04 04:46:29 +08:00
u32 btf_obj_id(const struct btf *btf);
2020-12-04 04:46:30 +08:00
bool btf_is_kernel(const struct btf *btf);
2018-12-16 14:13:52 +08:00
bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
			   const struct btf_member *m,
			   u32 expected_offset, u32 expected_size);
2019-02-01 07:40:04 +08:00
int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t);
2019-04-10 05:20:10 +08:00
bool btf_type_is_void(const struct btf_type *t);
2020-01-09 08:35:03 +08:00
s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
					       u32 id, u32 *res_id);
const struct btf_type *btf_type_resolve_ptr(const struct btf *btf,
					    u32 id, u32 *res_id);
const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf,
						 u32 id, u32 *res_id);
2020-01-09 08:35:05 +08:00
const struct btf_type *
btf_resolve_size(const struct btf *btf, const struct btf_type *type,
2020-08-26 03:21:13 +08:00
		 u32 *type_size);
2020-01-09 08:35:03 +08:00

#define for_each_member(i, struct_type, member)			\
	for (i = 0, member = btf_type_member(struct_type);	\
	     i < btf_type_vlen(struct_type);			\
	     i++, member++)
2018-11-21 06:08:20 +08:00

2020-09-30 07:50:47 +08:00
#define for_each_vsi(i, datasec_type, member)			\
	for (i = 0, member = btf_type_var_secinfo(datasec_type);	\
	     i < btf_type_vlen(datasec_type);			\
	     i++, member++)

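Both iteration macros above depend on the element array being stored directly after the type header, so that (t + 1) points at the first element. The sketch below reproduces that layout in userspace. The example_* structs are simplified stand-ins for struct btf_type and struct btf_member, with field meanings assumed for illustration, and struct example_blob exists only to place the members right behind the header.

```c
#include <assert.h>
#include <stdint.h>

struct example_type {
	uint32_t name_off;
	uint32_t info;		/* low 16 bits: number of trailing members */
	uint32_t size;
};

struct example_member {
	uint32_t name_off;
	uint32_t type;
	uint32_t offset;
};

#define example_vlen(t)		((uint16_t)((t)->info & 0xffff))
#define example_first_member(t)	((const struct example_member *)((t) + 1))

#define example_for_each_member(i, struct_type, member)			\
	for (i = 0, member = example_first_member(struct_type);	\
	     i < example_vlen(struct_type);				\
	     i++, member++)

/* A type header immediately followed by its two members. */
struct example_blob {
	struct example_type hdr;
	struct example_member members[2];
};

/* Sum the offsets of all members that trail the header. */
static uint32_t example_sum_offsets(const struct example_type *t)
{
	const struct example_member *m;
	uint32_t i, sum = 0;

	example_for_each_member(i, t, m)
		sum += m->offset;
	return sum;
}
```

The loop trusts the count in the header, so the data must really contain that many elements behind it. In the kernel, that is something BTF validation has to establish before such macros are used.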
2019-10-25 08:18:11 +08:00
static inline bool btf_type_is_ptr(const struct btf_type *t)
{
	return BTF_INFO_KIND(t->info) == BTF_KIND_PTR;
}

static inline bool btf_type_is_int(const struct btf_type *t)
{
	return BTF_INFO_KIND(t->info) == BTF_KIND_INT;
}

2020-06-25 06:20:39 +08:00
static inline bool btf_type_is_small_int(const struct btf_type *t)
{
	return btf_type_is_int(t) && t->size <= sizeof(u64);
}

2019-10-25 08:18:11 +08:00
static inline bool btf_type_is_enum(const struct btf_type *t)
{
	return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;
}

static inline bool btf_type_is_typedef(const struct btf_type *t)
{
	return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF;
}

static inline bool btf_type_is_func(const struct btf_type *t)
{
	return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC;
}

static inline bool btf_type_is_func_proto(const struct btf_type *t)
{
	return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
}

2020-09-30 07:50:44 +08:00
static inline bool btf_type_is_var(const struct btf_type *t)
{
	return BTF_INFO_KIND(t->info) == BTF_KIND_VAR;
}

/* union is only a special case of struct:
 * all its offsetof(member) == 0
 */
static inline bool btf_type_is_struct(const struct btf_type *t)
{
	u8 kind = BTF_INFO_KIND(t->info);

	return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
}

2020-01-09 08:35:03 +08:00
static inline u16 btf_type_vlen(const struct btf_type *t)
{
	return BTF_INFO_VLEN(t->info);
}

2020-01-21 08:53:46 +08:00
static inline u16 btf_func_linkage(const struct btf_type *t)
{
	return BTF_INFO_VLEN(t->info);
}

2020-01-09 08:35:03 +08:00
static inline bool btf_type_kflag(const struct btf_type *t)
{
	return BTF_INFO_KFLAG(t->info);
}

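All of the helpers above decode fields packed into the 32-bit info word. The sketch below shows that packing in userspace. The bit positions (vlen in bits 0-15, kind in bits 24-27, kind_flag in bit 31) and the kind numbers for struct and union are assumed to match uapi/linux/btf.h as it stood when this header was written, and the EX_* and ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of the info word. */
#define EX_INFO_KIND(info)	(((info) >> 24) & 0x0f)
#define EX_INFO_VLEN(info)	((info) & 0xffff)
#define EX_INFO_KFLAG(info)	((info) >> 31)

#define EX_KIND_INT	1
#define EX_KIND_STRUCT	4
#define EX_KIND_UNION	5

/* Pack kind, vlen and kind_flag into one info word. */
static uint32_t ex_make_info(uint32_t kind, uint32_t vlen, bool kflag)
{
	return ((uint32_t)kflag << 31) | ((kind & 0x0f) << 24) | (vlen & 0xffff);
}

/* Same test as btf_type_is_struct(): a union counts as a struct. */
static bool ex_is_struct(uint32_t info)
{
	uint32_t kind = EX_INFO_KIND(info);

	return kind == EX_KIND_STRUCT || kind == EX_KIND_UNION;
}

/* Usage: a union with three members and kind_flag set. */
static void ex_selfcheck(void)
{
	uint32_t info = ex_make_info(EX_KIND_UNION, 3, true);

	assert(EX_INFO_KIND(info) == EX_KIND_UNION);
	assert(EX_INFO_VLEN(info) == 3);
	assert(EX_INFO_KFLAG(info) == 1);
	assert(ex_is_struct(info));
}
```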
2020-01-09 08:35:05 +08:00
static inline u32 btf_member_bit_offset(const struct btf_type *struct_type,
					const struct btf_member *member)
{
	return btf_type_kflag(struct_type) ? BTF_MEMBER_BIT_OFFSET(member->offset)
					   : member->offset;
}

2020-01-09 08:35:03 +08:00
static inline u32 btf_member_bitfield_size(const struct btf_type *struct_type,
					   const struct btf_member *member)
{
	return btf_type_kflag(struct_type) ? BTF_MEMBER_BITFIELD_SIZE(member->offset)
					   : 0;
}

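The two member helpers above read the same offset field in two different ways, depending on the kind_flag of the enclosing struct. The sketch below works through one value in userspace. The split (upper 8 bits for the bitfield size, lower 24 bits for the bit offset) is assumed to match the BTF_MEMBER_* macros in uapi/linux/btf.h, and the EX_* and ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding of member->offset when kind_flag is set. */
#define EX_MEMBER_BITFIELD_SIZE(val)	((val) >> 24)
#define EX_MEMBER_BIT_OFFSET(val)	((val) & 0xffffff)

/* Bit offset of a member; the raw value is used when kflag is clear. */
static uint32_t ex_bit_offset(bool kflag, uint32_t offset)
{
	return kflag ? EX_MEMBER_BIT_OFFSET(offset) : offset;
}

/* Bitfield size of a member; reported as 0 when kflag is clear. */
static uint32_t ex_bitfield_size(bool kflag, uint32_t offset)
{
	return kflag ? EX_MEMBER_BITFIELD_SIZE(offset) : 0;
}

/* Usage: a 3-bit bitfield that starts at bit 70 of its struct. */
static void ex_selfcheck(void)
{
	uint32_t encoded = (3u << 24) | 70u;

	assert(ex_bit_offset(true, encoded) == 70);
	assert(ex_bitfield_size(true, encoded) == 3);
}
```

With kind_flag clear the whole field is a plain bit offset, so the same helpers serve both encodings without the caller having to branch.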
static inline const struct btf_member *btf_type_member(const struct btf_type *t)
{
	return (const struct btf_member *)(t + 1);
}

2020-09-30 07:50:47 +08:00
static inline const struct btf_var_secinfo *btf_type_var_secinfo(
		const struct btf_type *t)
{
	return (const struct btf_var_secinfo *)(t + 1);
}

2018-11-21 06:08:20 +08:00
#ifdef CONFIG_BPF_SYSCALL
2020-12-04 04:46:29 +08:00
struct bpf_prog;

bpf: Introduce bpf_func_info
This patch adds an interface to load a program with the following
additional information:
. prog_btf_fd
. func_info, func_info_rec_size and func_info_cnt
where func_info will provide function range and type_id
corresponding to each function.
The func_info_rec_size is introduced in the UAPI to specify
struct bpf_func_info size passed from user space. This
intends to make bpf_func_info structure growable in the future.
If the kernel gets a different bpf_func_info size from userspace,
it will try to handle user request with part of bpf_func_info
it can understand. In this patch, kernel can understand
struct bpf_func_info {
	__u32 insn_offset;
	__u32 type_id;
};
If user passed a bpf func_info record size of 16 bytes, the
kernel can still handle part of records with the above definition.
If verifier agrees with function range provided by the user,
the bpf_prog ksym for each function will use the func name
provided in the type_id, which is supposed to provide better
encoding as it is not limited by 16 bytes program name
limitation and this is better for bpf program which contains
multiple subprograms.
The bpf_prog_info interface is also extended to
return btf_id, func_info, func_info_rec_size and func_info_cnt
to userspace, so userspace can print out the function prototype
for each xlated function. The insn_offset in the returned
func_info corresponds to the insn offset for xlated functions.
With other jit related fields in bpf_prog_info, userspace can also
print out function prototypes for each jited function.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 07:29:11 +08:00
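The func_info_rec_size idea described above, reading only the fields the kernel understands out of records that may be larger, can be sketched as follows. This is a simplified userspace model: the example_* names are hypothetical, and the real kernel may apply further checks to the bytes it does not understand, which this sketch simply ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The part of each record this reader understands. */
struct example_func_info {
	uint32_t insn_offset;
	uint32_t type_id;
};

/* Copy the known prefix of one record that is rec_size bytes long.
 * Returns 0 on success, or -1 if the record is too small to contain
 * the known fields.
 */
static int example_read_record(const void *src, uint32_t rec_size,
			       struct example_func_info *out)
{
	if (rec_size < sizeof(*out))
		return -1;
	memcpy(out, src, sizeof(*out));
	return 0;
}

/* Read record idx from an array whose stride is rec_size bytes. */
static int example_read_nth(const void *records, uint32_t rec_size,
			    uint32_t idx, struct example_func_info *out)
{
	const unsigned char *p = records;

	return example_read_record(p + (size_t)idx * rec_size, rec_size, out);
}
```

Stepping through the array by rec_size rather than by sizeof(struct example_func_info) is what lets a 16-byte record from a newer userspace be read by code that only knows the first 8 bytes.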
const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
const char *btf_name_by_offset(const struct btf *btf, u32 offset);
2019-10-16 11:24:57 +08:00
struct btf *btf_parse_vmlinux(void);
2019-11-15 02:57:17 +08:00
struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog);
2018-11-21 06:08:20 +08:00
#else
static inline const struct btf_type *btf_type_by_id(const struct btf *btf,
						    u32 type_id)
{
	return NULL;
}
static inline const char *btf_name_by_offset(const struct btf *btf,
					     u32 offset)
{
	return NULL;
}
#endif
2018-04-19 06:55:58 +08:00

#endif