OpenCloudOS-Kernel/net/core
Daniel Borkmann de8f3a83b0 bpf: add meta pointer for direct access
This work enables generic transfer of metadata from XDP into skb. The
basic idea is that we can make use of the fact that the resulting skb
must be linear and already comes with a larger headroom for supporting
bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
for adjusting a new pointer called xdp->data_meta. Thus, the packet has
a flexible and programmable room for meta data, followed by the actual
packet data. struct xdp_buff is therefore laid out that we first point
to data_hard_start, then data_meta directly prepended to data followed
by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
account whether we have meta data already prepended and if so, memmove()s
this along with the given offset provided there's enough room.

xdp->data_meta is optional and programs are not required to use it. The
rationale is that when we process the packet in XDP (e.g. as DoS filter),
we can push further meta data along with it for the XDP_PASS case, and
give the guarantee that a clsact ingress BPF program on the same device
can pick this up for further post-processing. Since we work with skb
there, we can also set skb->mark, skb->priority or other skb meta data
out of BPF, thus having this scratch space generic and programmable
allows for more flexibility than defining a direct 1:1 transfer of
potentially new XDP members into skb (it's also more efficient as we
don't need to initialize/handle each of such new members). The facility
also works together with GRO aggregation. The scratch space at the head
of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
yet supporting xdp->data_meta can simply be set up with xdp->data_meta
as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
such that the subsequent match against xdp->data for later access is
guaranteed to fail.

The verifier treats xdp->data_meta/xdp->data the same way as we treat
xdp->data/xdp->data_end pointer comparisons. The requirement for doing
the compare against xdp->data is that it hasn't been modified from it's
original address we got from ctx access. It may have a range marking
already from prior successful xdp->data/xdp->data_end pointer comparisons
though.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
..
Makefile net: core: Make the FIB notification chain generic 2017-08-03 15:35:59 -07:00
datagram.c datagram: Remove redundant unlikely() 2017-09-26 09:54:06 -07:00
dev.c bpf: add meta pointer for direct access 2017-09-26 13:36:44 -07:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c net: check dev->addr_len for dev_set_mac_address() 2017-07-29 11:25:05 -07:00
devlink.c devlink: Add IPv6 header for dpipe 2017-08-31 14:42:19 -07:00
drop_monitor.c drop_monitor: use setup_timer 2017-03-12 23:47:16 -07:00
dst.c net: check type when freeing metadata dst 2017-08-21 10:57:38 -07:00
dst_cache.c net: dst_cache_per_cpu_dst_set() can be static 2016-03-18 17:45:08 -04:00
ethtool.c net: ethtool: Add back transceiver type 2017-09-21 15:20:40 -07:00
fib_notifier.c net: Add module reference to FIB notifiers 2017-09-01 20:33:42 -07:00
fib_rules.c rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
filter.c bpf: add meta pointer for direct access 2017-09-26 13:36:44 -07:00
flow_dissector.c flow_dissector: Add limit for number of headers to dissect 2017-09-05 11:40:08 -07:00
gen_estimator.c net_sched: gen_estimator: fix scaling error in bytes/packets samples 2017-09-13 13:30:53 -07:00
gen_stats.c net_sched: gen_estimator: complete rewrite of rate estimators 2016-12-05 15:21:59 -05:00
gro_cells.c net: Generic XDP 2017-04-25 13:33:49 -04:00
hwbm.c net: hwbm: Fix unbalanced spinlock in error case 2016-05-25 12:35:09 -07:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwt_bpf.c bpf: rename bpf_compute_data_end into bpf_compute_data_pointers 2017-09-26 13:36:44 -07:00
lwtunnel.c ipv6: sr: define core operations for seg6local lightweight tunnel 2017-08-07 14:16:22 -07:00
neighbour.c neigh: make strucrt neigh_table::entry_size unsigned int 2017-09-25 20:36:17 -07:00
net-procfs.c net-procfs: Use vsnprintf extension %phN 2017-06-04 19:52:58 -04:00
net-sysfs.c net: style cleanups 2017-08-18 22:38:47 -07:00
net-sysfs.h
net-traces.c bridge: add tracepoint in br_fdb_update 2017-08-31 11:42:41 -07:00
net_namespace.c net: call newid/getid without rtnl mutex held 2017-08-09 16:57:38 -07:00
netclassid_cgroup.c cgroup: add @flags to css_task_iter_start() and implement CSS_TASK_ITER_PROCS 2017-07-21 11:14:51 -04:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c netpoll: Fix device name check in netpoll_setup() 2017-07-26 17:01:43 -07:00
netprio_cgroup.c net: break include loop netdevice.h, dsa.h, devlink.h 2017-03-28 22:46:04 -07:00
pktgen.c net: convert sk_buff.users from atomic_t to refcount_t 2017-07-01 07:39:07 -07:00
ptp_classifier.c ptp: Change ptp_class to a proper bitmask 2015-11-03 11:08:22 -05:00
request_sock.c ipv4: Namespaceify tcp_max_syn_backlog knob 2016-12-29 11:38:31 -05:00
rtnetlink.c rtnelink: Move link dump consistency check out of the loop 2017-08-13 19:43:57 -07:00
scm.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> 2017-03-02 08:42:29 +01:00
secure_seq.c tcp: Namespaceify sysctl_tcp_timestamps 2017-06-08 10:53:29 -04:00
skbuff.c bpf: add meta pointer for direct access 2017-09-26 13:36:44 -07:00
sock.c neigh: increase queue_len_bytes to match wmem_default 2017-08-29 16:10:50 -07:00
sock_diag.c netlink: extended ACK reporting 2017-04-13 13:58:20 -04:00
sock_reuseport.c soreuseport: use "unsigned int" in __reuseport_alloc() 2017-04-03 19:06:38 -07:00
stream.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
sysctl_net_core.c net: move somaxconn init from sysctl code 2017-05-25 13:12:17 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
utils.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00