Commit Graph

18 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Antoni Baum | fb96c1e98c | Asynchronous tokenization (#2879) | 2024-03-15 23:37:01 +00:00 |
| Simon Mo | 81653d9688 | [Hotfix] [Debug] test_openai_server.py::test_guided_regex_completion (#3383) | 2024-03-13 17:02:21 -07:00 |
| Cade Daniel | 8437bae6ef | [Speculative decoding 3/9] Worker which speculates, scores, and applies rejection sampling (#3103) | 2024-03-08 23:32:46 -08:00 |
| SangBin Cho | 24aecf421a | [Tests] Add block manager and scheduler tests (#3108) | 2024-03-05 18:23:34 -08:00 |
| Woosuk Kwon | 929b4f2973 | Add LoRA support for Gemma (#3050) | 2024-02-28 13:03:28 -08:00 |
| Ronen Schaffer | 4caf7044e0 | Include tokens from prompt phase in `counter_generation_tokens` (#2802) | 2024-02-22 14:00:12 -08:00 |
| Zhuohan Li | a61f0521b8 | [Test] Add basic correctness test (#2908) | 2024-02-18 16:44:50 -08:00 |
| Simon Mo | f964493274 | [CI] Ensure documentation build is checked in CI (#2842) | 2024-02-12 22:53:07 -08:00 |
| Roger Wang | a4211a4dc3 | Serving Benchmark Refactoring (#2433) | 2024-02-12 22:53:00 -08:00 |
| Woosuk Kwon | f8ecb84c02 | Speed up Punica compilation (#2632) | 2024-01-27 17:46:56 -08:00 |
| Antoni Baum | 9b945daaf1 | [Experimental] Add multi-LoRA support (#1804); Co-authored-by: Chen Shen <scv119@gmail.com>; Co-authored-by: Shreyas Krishnaswamy <shrekris@anyscale.com>; Co-authored-by: Avnish Narayan <avnish@anyscale.com> | 2024-01-23 15:26:37 -08:00 |
| Simon Mo | 00efdc84ba | Add benchmark serving to CI (#2505) | 2024-01-19 20:20:19 -08:00 |
| shiyi.c_98 | d10f8e1d43 | [Experimental] Prefix Caching Support (#1669); Co-authored-by: DouHappy <2278958187@qq.com>; Co-authored-by: Zhuohan Li <zhuohan123@gmail.com> | 2024-01-17 16:32:10 -08:00 |
| FlorianJoncour | 14cc317ba4 | OpenAI Server refactoring (#2360) | 2024-01-16 21:33:14 -08:00 |
| Simon Mo | 8cd5a992bf | ci: retry on build failure as well (#2457) | 2024-01-16 12:51:04 -08:00 |
| Simon Mo | 947f0b23cc | CI: make sure benchmark script exit on error (#2449) | 2024-01-16 09:50:13 -08:00 |
| Simon Mo | bfc072addf | Allow buildkite to retry build on agent lost (#2446) | 2024-01-15 15:43:15 -08:00 |
| Simon Mo | 6e01e8c1c8 | [CI] Add Buildkite (#2355) | 2024-01-14 12:37:58 -08:00 |