Upstream: no
Rusty is a multi-domain BPF / userspace hybrid scheduler where the BPF part
does simple round robin in each domain and the userspace part calculates the
load factor of each domain and tells the BPF part how to load balance the
domains.
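As a rough illustration of the userspace side described above, the
following is a minimal sketch of how per-domain loads could be turned into
a balancing plan. It is not the code in this patch: Transfer and
plan_transfers() are hypothetical names, and the greedy pairing of
overloaded and underloaded domains is an assumed strategy chosen for
brevity.

```rust
/// Hypothetical description of one balancing step; not a type from scx_rusty.
#[derive(Debug, Clone, PartialEq)]
struct Transfer {
    from: usize, // index of the overloaded (pushing) domain
    to: usize,   // index of the underloaded (pulling) domain
    amount: f64, // load to move, in the same unit as the input loads
}

/// Plan transfers that bring every domain toward the average load.
/// Assumption: `loads[i]` is the already-computed load of domain `i`.
fn plan_transfers(loads: &[f64]) -> Vec<Transfer> {
    if loads.is_empty() {
        return Vec::new();
    }
    let avg = loads.iter().sum::<f64>() / loads.len() as f64;

    // Split domains into pushers (surplus over avg) and pullers (deficit).
    let mut push: Vec<(usize, f64)> = Vec::new();
    let mut pull: Vec<(usize, f64)> = Vec::new();
    for (i, &load) in loads.iter().enumerate() {
        if load > avg {
            push.push((i, load - avg));
        } else if load < avg {
            pull.push((i, avg - load));
        }
    }

    // Greedily pair pushers with pullers until one side is exhausted.
    let mut plan = Vec::new();
    let (mut pi, mut qi) = (0, 0);
    while pi < push.len() && qi < pull.len() {
        let amount = push[pi].1.min(pull[qi].1);
        plan.push(Transfer { from: push[pi].0, to: pull[qi].0, amount });
        push[pi].1 -= amount;
        pull[qi].1 -= amount;
        // At least one side reaches exactly zero, so the loop always advances.
        if push[pi].1 <= f64::EPSILON {
            pi += 1;
        }
        if pull[qi].1 <= f64::EPSILON {
            qi += 1;
        }
    }
    plan
}
```

In this sketch, plan_transfers(&[3.0, 1.0]) yields a single transfer of 1.0
from domain 0 to domain 1. How such a plan would be handed to the BPF part
is outside what the sketch covers.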
This scheduler demonstrates dividing scheduling logic between BPF and
userspace and using Rust to build the userspace part. An earlier variant of
this scheduler was used to balance across six domains, each representing a
chiplet in a six-chiplet AMD processor, and could match the performance of
a production setup using CFS.
See the --help message for more details.
v5: * Renamed to scx_rusty and improved build scripts.
* Load metrics are now tracked in BPF using the running average
implementation in tools/sched_ext/ravg[_impl].bpf.h and
ravg.read.rs.h. Before, the userspace part was iterating all tasks to
calculate load metrics and make LB decisions. Now, high-level LB
decisions are made by simply reading per-domain load averages, and
picking migration target tasks only accesses the load metrics for a
fixed number of recently active tasks in the pushing domains. This
greatly reduces CPU overhead and makes rusty a lot more scalable.
v4: * tools/sched_ext/atropos renamed to tools/sched_ext/scx_atropos for
consistency.
* LoadBalancer sometimes couldn't converge on a balanced state due to
restrictions it put on each balancing operation. Fixed.
* Topology information refactored into struct Topology and Tuner is
added. Tuner runs in shorter cycles (100ms) than LoadBalancer and
dynamically adjusts scheduling behaviors, currently based on
per-domain utilization states.
* ->select_cpu() has been revamped. Combined with other improvements,
this allows atropos to outperform CFS in various sub-saturation
scenarios when tested with fio over dm-crypt.
* Many minor code cleanups and improvements.
v3: * The userspace code is substantially restructured and rewritten. The
binary is renamed to scx_atropos and can now figure out the domain
topology automatically based on L3 cache configuration. The LB logic,
which was rather broken in the previous postings, has been revamped
and should behave better.
* Updated to support weighted vtime scheduling (can be turned off with
--fifo-sched). Added a couple of options (--slice_us, --kthreads-local)
to modify scheduling behaviors.
* Converted to use BPF inline iterators.
v2: * Updated to use generic BPF cpumask helpers.
Signed-off-by: Dan Schatzberg <dschatzberg@meta.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Hongbo Li <herberthbli@tencent.com>