Commit Graph

24 Commits

Author SHA1 Message Date
Woosuk Kwon c8d70e2437
Pin Ray version to 2.40.0 (#13490)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-02-18 12:50:31 -08:00
Woosuk Kwon 9915912f7f
[V1][PP] Fix & Pin Ray version in requirements-cuda.txt (#13436)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-02-17 21:58:06 -08:00
youkaichao cf797aa856
[core] port pynvml into vllm codebase (#12963)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-09 15:00:00 +08:00
Alphi d93bf4da85
[Model] Refactoring of MiniCPM-V and add MiniCPM-o-2.6 support for vLLM (#12069)
Signed-off-by: hzh <hezhihui_thu@163.com>
Signed-off-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Signed-off-by: shaochangxu.scx <shaochangxu.scx@antgroup.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Akshat Tripathi <akshat@krai.ai>
Signed-off-by: Oleg Mosalov <oleg@krai.ai>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>
Signed-off-by: Chenguang Li <757486878@qq.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Shanshan Shen <467638484@qq.com>
Signed-off-by: elijah <f1renze.142857@gmail.com>
Signed-off-by: Yikun <yikunkero@gmail.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Co-authored-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Co-authored-by: shaochangxu <85155497+shaochangxu@users.noreply.github.com>
Co-authored-by: shaochangxu.scx <shaochangxu.scx@antgroup.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: sixgod <evethwillbeok@outlook.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Akshat Tripathi <Akshat.tripathi6568@gmail.com>
Co-authored-by: Oleg Mosalov <oleg@krai.ai>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Avshalom Manevich <12231371+avshalomman@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Yangcheng Li <liyangcheng.lyc@alibaba-inc.com>
Co-authored-by: Siyuan Li <94890248+liaoyanqing666@users.noreply.github.com>
Co-authored-by: Concurrensee <yida.wu@amd.com>
Co-authored-by: Chenguang Li <757486878@qq.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Alex Brooks <alex.brooks@ibm.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
Co-authored-by: elijah <30852919+e1ijah1@users.noreply.github.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Steve Luo <36296769+SunflowerAries@users.noreply.github.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: TJian <tunjian1996@gmail.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Elfie Guo <164945471+elfiegg@users.noreply.github.com>
Co-authored-by: Rui Qiao <161574667+ruisearch42@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-01-29 09:24:59 +00:00
Jiaxin Shan 47a0b615b4
Add ray[default] to wget to run distributed inference out of box (#11265)
Signed-off-by: Jiaxin Shan <seedjeffwan@gmail.com>
2024-12-20 13:54:55 -08:00
youkaichao 7801f56ed7
[ci][gh200] dockerfile clean up (#11351)
Signed-off-by: drikster80 <ed.sealing@gmail.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: drikster80 <ed.sealing@gmail.com>
Co-authored-by: cenzhiyao <2523403608@qq.com>
2024-12-19 18:13:06 -08:00
cennn b3b1526f03
WIP: [CI/Build] simplify Dockerfile build for ARM64 / GH200 (#11212)
Signed-off-by: drikster80 <ed.sealing@gmail.com>
Co-authored-by: drikster80 <ed.sealing@gmail.com>
2024-12-16 09:20:49 +00:00
bnellnm d93478b399
[Bugfix] Upgrade to pytorch 2.5.1 (#10001)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
2024-11-04 15:11:28 -08:00
Jee Jee Li 6aa6020f9b
[Misc] Specify minimum pynvml version (#9827)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2024-10-29 23:05:43 -07:00
bnellnm 3cb07a36a2
[Misc] Upgrade to pytorch 2.5 (#9588)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-10-27 09:44:24 +00:00
Luka Govedič 71c60491f2
[Kernel] Build flash-attn from source (#8245) 2024-09-20 23:27:10 -07:00
tomeras91 386087970a
[CI/Build] build on empty device for better dev experience (#4773) 2024-08-11 13:09:44 -07:00
Woosuk Kwon 805a8a75f2
[Misc] Support attention logits soft-capping with flash-attn (#7022) 2024-08-01 13:14:37 -07:00
Sage Moore 7e0861bd0b
[CI/Build] Update PyTorch to 2.4.0 (#6951)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-08-01 11:11:24 -07:00
Cody Yu aa48e502fb
[MISC] Upgrade dependency to PyTorch 2.3.1 (#5327) 2024-07-12 12:04:26 -07:00
Isotr0py edd5fe5fa2
[Bugfix] Add phi3v resize for dynamic shape and fix torchvision requirement (#5772) 2024-06-24 12:11:53 +08:00
Antoni Baum 0ab278ca31
[Core] Remove unnecessary copies in flash attn backend (#5138) 2024-06-03 09:39:31 -07:00
youkaichao 5bd3c65072
[Core][Optimization] remove vllm-nccl (#5091) 2024-05-29 05:13:52 +00:00
Woosuk Kwon b57e6c5949
[Kernel] Add flash-attn back (#4907) 2024-05-19 18:11:30 -07:00
Woosuk Kwon 89579a201f
[Misc] Use vllm-flash-attn instead of flash-attn (#4686) 2024-05-08 13:15:34 -07:00
Michael Goin d627a3d837
[Misc] Upgrade to `torch==2.3.0` (#4454) 2024-04-29 20:05:47 -04:00
youkaichao e4bf860a54
[CI][Build] change pynvml to nvidia-ml-py (#4302) 2024-04-23 18:33:12 -07:00
Roy 8db1bf32f8
[Misc] Upgrade triton to 2.2.0 (#4061) 2024-04-14 17:43:54 -07:00
Woosuk Kwon cfaf49a167
[Misc] Define common requirements (#3841) 2024-04-05 00:39:17 -07:00