vllm/tests/worker
Cade Daniel 8437bae6ef
[Speculative decoding 3/9] Worker which speculates, scores, and applies rejection sampling (#3103)
2024-03-08 23:32:46 -08:00
__init__.py [Speculative decoding 2/9] Multi-step worker for draft model (#2424) 2024-01-21 16:31:47 -08:00
test_model_runner.py Remove hardcoded `device="cuda"` to support more devices (#2503) 2024-02-01 15:46:39 -08:00