OpenCloudOS-Kernel/tools/sched_ext/ravg.bpf.h


rue/scx/sched_ext: Add scx_rusty, a rust userspace hybrid scheduler

Upstream: no

Rusty is a multi-domain BPF / userspace hybrid scheduler where the BPF part does simple round robin in each domain and the userspace part calculates the load factor of each domain and tells the BPF part how to load balance the domains.

This scheduler demonstrates dividing scheduling logic between BPF and userspace and using rust to build the userspace part. An earlier variant of this scheduler was used to balance across six domains, each representing a chiplet in a six-chiplet AMD processor, and could match the performance of a production setup using CFS.

See the --help message for more details.

v5:
* Renamed to scx_rusty and improved build scripts.
* Load metrics are now tracked in BPF using the running average implementation in tools/sched_ext/ravg[_impl].bpf.h and ravg.read.rs.h. Before, the userspace part was iterating all tasks to calculate load metrics and make LB decisions. Now, high level LB decisions are made by simply reading per-domain load averages, and picking migration target tasks only accesses the load metrics for a fixed number of recently active tasks in the pushing domains. This greatly reduces CPU overhead and makes rusty a lot more scalable.

v4:
* tools/sched_ext/atropos renamed to tools/sched_ext/scx_atropos for consistency.
* LoadBalancer sometimes couldn't converge on a balanced state due to restrictions it put on each balancing operation. Fixed.
* Topology information refactored into struct Topology and Tuner is added. Tuner runs in shorter cycles (100ms) than LoadBalancer and dynamically adjusts scheduling behaviors, currently based on the per-domain utilization states.
* ->select_cpu() has been revamped. Combined with other improvements, this allows atropos to outperform CFS in various sub-saturation scenarios when tested with fio over dm-crypt.
* Many minor code cleanups and improvements.

v3:
* The userspace code is substantially restructured and rewritten. The binary is renamed to scx_atropos and can now figure out the domain topology automatically based on L3 cache configuration. The LB logic, which was rather broken in the previous postings, is revamped and should behave better.
* Updated to support weighted vtime scheduling (can be turned off with --fifo-sched). Added a couple of options (--slice_us, --kthreads-local) to modify scheduling behaviors.
* Converted to use BPF inline iterators.

v2:
* Updated to use generic BPF cpumask helpers.

Signed-off-by: Dan Schatzberg <dschatzberg@meta.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Hongbo Li <herberthbli@tencent.com>
2023-11-15 03:19:50 +08:00
#ifndef __SCX_RAVG_BPF_H__
#define __SCX_RAVG_BPF_H__

/*
 * Running average helpers to be used in BPF progs. Assumes vmlinux.h has
 * already been included.
 */
enum ravg_consts {
	RAVG_VAL_BITS		= 44,		/* input values are 44bit */
	RAVG_FRAC_BITS		= 20,		/* 1048576 is 1.0 */
};

/*
 * Running avg mechanism. Accumulates values between 0 and RAVG_MAX_VAL in
 * arbitrary time intervals. The accumulated values are halved every half_life
 * with each period starting when the current time % half_life is 0. Zeroing is
 * enough for initialization.
 *
 * See ravg_accumulate() and ravg_read() for more details.
 */
struct ravg_data {
	/* current value */
	u64			val;

	/*
	 * The timestamp of @val. The latest completed seq #:
	 *
	 *   (val_at / half_life) - 1
	 */
	u64			val_at;

	/* running avg as of the latest completed seq */
	u64			old;

	/*
	 * Accumulated value of the current period. Input values are
	 * RAVG_VAL_BITS (44) bits wide and durations are normalized against
	 * the half-life to RAVG_FRAC_BITS (20) bits, so it should fit in a u64.
	 */
	u64			cur;
};

#endif /* __SCX_RAVG_BPF_H__ */
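
The comments in this header describe three pieces of arithmetic: the 20-bit fixed-point encoding of 1.0, the latest completed seq # derived from val_at, and the halving of accumulated values once per half_life. The following is a minimal userspace sketch of that arithmetic only. It is not the code behind ravg_accumulate() or ravg_read(), which this header does not show, and every demo_* helper is a hypothetical name introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the vmlinux.h type */

enum ravg_consts {
	RAVG_VAL_BITS	= 44,	/* input values are 44bit */
	RAVG_FRAC_BITS	= 20,	/* 1048576 is 1.0 */
};

/* Hypothetical helper: the fixed-point encoding of 1.0. */
static u64 demo_fixed_one(void)
{
	return 1ULL << RAVG_FRAC_BITS;
}

/*
 * Hypothetical helper: latest completed seq # for a timestamp, following
 * the formula in the val_at comment. Assumes at least one full period has
 * elapsed so that the subtraction cannot underflow.
 */
static u64 demo_latest_completed_seq(u64 val_at, u64 half_life)
{
	assert(half_life > 0 && val_at >= half_life);
	return val_at / half_life - 1;
}

/*
 * Hypothetical helper: halve @v once for each elapsed period. Shifting a
 * u64 by 64 or more is undefined in C, so such inputs decay to zero here.
 */
static u64 demo_decay(u64 v, u64 periods)
{
	return periods < 64 ? v >> periods : 0;
}
```

With a half_life of 1000 and a val_at of 3500, demo_latest_completed_seq() returns 2, since period 3 is still in progress. Decaying 1.0 (1048576) over three periods with demo_decay() gives 131072, which is 0.125 in the same fixed-point encoding.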