26 Commits

Author SHA1 Message Date
Harry Mellor 5950f555a1
[Doc] Group examples into categories (#11782)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-08 09:20:12 +08:00
Rafael Vasquez 32aa2059ad
[Docs] Convert rST to MyST (Markdown) (#11145)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2024-12-23 22:35:38 +00:00
Russell Bryant 3be5b26a76
[CI/Build] Add shell script linting using shellcheck (#7925)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-11-07 18:17:29 +00:00
Russell Bryant e0dbdb013d
[CI/Build] Add linting for github actions workflows (#7876)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-10-07 21:18:10 +00:00
Tyler Michael Smith 2e7fe7e79f
[Build/CI] Set FETCHCONTENT_BASE_DIR to one location for better caching (#8930)
2024-09-29 03:13:01 +00:00
Daniele 2467b642dd
[CI/Build] fix setuptools-scm usage (#8771)
2024-09-24 12:38:12 -07:00
Daniele ee5f34b1c2
[CI/Build] use setuptools-scm to set __version__ (#4738)
Co-authored-by: youkaichao <youkaichao@126.com>
2024-09-23 09:44:26 -07:00
Luka Govedič 71c60491f2
[Kernel] Build flash-attn from source (#8245)
2024-09-20 23:27:10 -07:00
Lucas Wilkinson 5288c06aa0
[Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174)
2024-08-20 07:09:33 -06:00
Michael Goin 855866caa9
[Kernel] Add tuned triton configs for ExpertsInt8 (#7601)
2024-08-16 11:37:01 -07:00
Michael Goin 111fc6e7ec
[Misc] Add generated git commit hash as `vllm.__commit__` (#6386)
2024-07-12 22:52:15 +00:00
Harry Mellor 3d925165f2
Add example scripts to documentation (#4225)
Co-authored-by: Harry Mellor <hmellor@oxts.com>
2024-04-22 16:36:54 +00:00
Adrian Abeyta 2ff767b513
Enable scaled FP8 (e4m3fn) KV cache on ROCm (AMD GPU) (#3290)
Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Co-authored-by: HaiShaw <hixiao@gmail.com>
Co-authored-by: AdrianAbeyta <Adrian.Abeyta@amd.com>
Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com>
Co-authored-by: root <root@gt-pla-u18-08.pla.dcgpu>
Co-authored-by: mawong-amd <156021403+mawong-amd@users.noreply.github.com>
Co-authored-by: ttbachyinsda <ttbachyinsda@outlook.com>
Co-authored-by: guofangze <guofangze@kuaishou.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: jacobthebanana <50071502+jacobthebanana@users.noreply.github.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-04-03 14:15:55 -07:00
Woosuk Kwon 1cb0cc2975
[FIX] Make `flash_attn` optional (#3269)
2024-03-08 10:52:20 -08:00
Woosuk Kwon 2daf23ab0c
Separate attention backends (#3005)
2024-03-07 01:45:50 -08:00
Harry Mellor 2709c0009a
Support OpenAI API server in `benchmark_serving.py` (#2172)
2024-01-18 20:34:08 -08:00
TJian 6ccc0bfffb
Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Amir Balwel <amoooori04@gmail.com>
Co-authored-by: root <kuanfu.liu@akirakan.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kuanfu <kuanfu.liu@embeddedllm.com>
Co-authored-by: miloice <17350011+kliuae@users.noreply.github.com>
2023-12-07 23:16:52 -08:00
Woosuk Kwon e3e79e9e8a
Implement AWQ quantization support for LLaMA (#1032)
Co-authored-by: Robert Irvine <robert@seamlessml.com>
Co-authored-by: root <rirv938@gmail.com>
Co-authored-by: Casper <casperbh.96@gmail.com>
Co-authored-by: julian-q <julianhquevedo@gmail.com>
2023-09-16 00:03:37 -07:00
Zhuohan Li 2cf1a333b6
[Doc] Documentation for distributed inference (#261)
2023-06-26 11:34:23 -07:00
Zhuohan Li a255885f83
Add logo and polish readme (#156)
2023-06-19 16:31:13 +08:00
Woosuk Kwon 376725ce74
[PyPI] Packaging for PyPI distribution (#140)
2023-06-05 20:03:14 -07:00
Woosuk Kwon 19d2899439
Add initial sphinx docs (#120)
2023-05-22 17:02:44 -07:00
Zhuohan Li 4858f3bb45
Add an option to launch cacheflow without ray (#51)
2023-04-30 15:42:17 +08:00
Woosuk Kwon 84eee24e20
Collect system stats in scheduler & Add scripts for experiments (#30)
2023-04-12 15:03:49 -07:00
Woosuk Kwon 3b41f16596
Add gitignore
2023-02-16 07:47:21 +00:00
Woosuk Kwon 0a11a2e5ca
Add gitignore
2023-02-09 11:28:12 +00:00