Michael Goin
5f6d10c14c
[CI/Build] Enforce style for C++ and CUDA code with `clang-format` ( #4722 )
2024-05-22 07:18:41 +00:00
SangBin Cho
2e9a2227ec
[Lora] Support long context lora ( #4787 )
...
Currently we need to call the rotary embedding kernel once per LoRA, which makes it hard to serve multiple long-context LoRAs. This adds a batched rotary embedding kernel and pipes it through.
It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor.
Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
2024-05-18 16:05:23 +09:00
SangBin Cho
fb087af52e
[mypy][7/N] Cover all directories ( #4555 )
2024-05-02 10:47:41 -07:00
SangBin Cho
cf8cac8c70
[mypy][6/N] Fix all the core subdirectory typing ( #4450 )
...
Co-authored-by: Cade Daniel <edacih@gmail.com>
2024-05-02 03:01:00 +00:00
SangBin Cho
df29793dc7
[mypy][5/N] Support all typing on model executor ( #4427 )
2024-04-28 19:01:26 -07:00
SangBin Cho
b5b4a398a7
[Mypy] Typing lora folder ( #4337 )
2024-04-25 19:13:50 +00:00
SangBin Cho
0ae11f78ab
[Mypy] Part 3 fix typing for nested directories for most of directory ( #4161 )
2024-04-22 21:32:44 -07:00
SangBin Cho
533d2a1f39
[Typing] Mypy typing part 2 ( #4043 )
...
Co-authored-by: SangBin Cho <sangcho@sangcho-LT93GQWG9C.local>
2024-04-17 17:28:43 -07:00
SangBin Cho
09473ee41c
[mypy] Add mypy type annotation part 1 ( #4006 )
2024-04-12 14:35:50 -07:00
SangBin Cho
01bfb22b41
[CI] Try introducing isort. ( #3495 )
2024-03-25 07:59:47 -07:00
Zhuohan Li
4c922709b6
Add distributed model executor abstraction ( #3191 )
2024-03-11 11:03:45 -07:00
Massimiliano Pronesti
93dc5a2870
chore(vllm): codespell for spell checking ( #2820 )
2024-02-21 18:56:01 -08:00
Simon Mo
1e4277d2d1
lint: format all python file instead of just source code ( #2567 )
2024-01-23 15:53:06 -08:00
Simon Mo
5ffc0d13a2
Migrate linter from `pylint` to `ruff` ( #1665 )
2023-11-20 11:58:01 -08:00
Cade Daniel
e575df33b1
[Small] Formatter only checks lints in changed files ( #1528 )
2023-10-31 15:39:38 -07:00
Zhuohan Li
ba0bfd40e2
TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic ( #1181 )
2023-10-02 15:36:09 -07:00
Zhuohan Li
d6fa1be3a8
[Quality] Add code formatter and linter ( #326 )
2023-07-03 11:31:55 -07:00