49 Commits

Author SHA1 Message Date
Kevin H. Luu 666ad0aa16
[ci] Cleanup & refactor Dockerfile to pass different Python versions and sccache bucket via build args (#7705)
Signed-off-by: kevin <kevin@anyscale.com>
2024-08-22 20:10:55 +00:00
Peng Guanwen f710fb5265
[Core] Use flashinfer sampling kernel when available (#7137)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-08-19 03:24:03 +00:00
Lily Liu ec2affa8ae
[Kernel] Flashinfer correctness fix for v0.1.3 (#7319) 2024-08-12 07:59:17 +00:00
Rui Qiao 05308891e2
[Core] Pipeline parallel with Ray ADAG (#6837)
Support pipeline-parallelism with Ray accelerated DAG.

Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
2024-08-02 13:55:40 -07:00
Sage Moore 7e0861bd0b
[CI/Build] Update PyTorch to 2.4.0 (#6951)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-08-01 11:11:24 -07:00
Jee Jee Li 7ecee34321
[Kernel][RFC] Refactor the punica kernel based on Triton (#5036) 2024-07-31 17:12:24 -07:00
youkaichao 5a96ee52a3
[ci][build] add back vim in docker (#6661) 2024-07-22 16:26:29 -07:00
Kevin H. Luu 69d5ae38dc
[ci] Use different sccache bucket for CUDA 11.8 wheel build (#6656)
Signed-off-by: kevin <kevin@anyscale.com>
2024-07-22 14:20:41 -07:00
youkaichao e81522e879
[build] add ib in image for out-of-the-box infiniband support (#6599)
[build] add ib so that multi-node support with infiniband can be supported out-of-the-box (#6599)
2024-07-19 17:16:57 -07:00
Tyler Michael Smith 1689219ebf
[CI/Build] Build on Ubuntu 20.04 instead of 22.04 (#6517) 2024-07-18 17:29:25 -07:00
Pernekhan Utemuratov a63a4c6341
[Misc] Use 0.0.9 version for flashinfer (#6447)
Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>
2024-07-15 10:10:26 -07:00
Robert Shaw a754dc2cb9
[CI/Build] Cross python wheel (#6394) 2024-07-14 18:54:46 -07:00
youkaichao ccd3c04571
[ci][build] fix commit id (#6420)
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2024-07-14 22:16:21 +08:00
Simon Mo 4f0e0ea131
Add FlashInfer to default Dockerfile (#6172) 2024-07-08 13:38:03 -07:00
Simon Mo bc96d5c330
Move release wheel env var to Dockerfile instead (#6163) 2024-07-05 17:19:53 -07:00
Mor Zusman 9d6a8daa87
[Model] Jamba support (#4115)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: Erez Schwartz <erezs@ai21.com>
Co-authored-by: Mor Zusman <morz@ai21.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
Co-authored-by: Tomer Asida <tomera@ai21.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
2024-07-02 23:11:29 +00:00
zhyncs f1e72cc19a
[BugFix] exclude version 1.15.0 for modelscope (#5668) 2024-06-21 13:15:48 -06:00
Kevin H. Luu 19091efc44
[ci] Setup Release pipeline and build release wheels with cache (#5610)
Signed-off-by: kevin <kevin@anyscale.com>
2024-06-18 11:00:36 -07:00
Antoni Baum a8fda4f661
Seperate dev requirements into lint and test (#5474) 2024-06-13 11:22:41 -07:00
Kevin H. Luu 916d219d62
[ci] Use sccache to build images (#5419)
Signed-off-by: kevin <kevin@anyscale.com>
2024-06-12 17:58:12 -07:00
youkaichao 4fbcb0f27e
[Doc][Build] update after removing vllm-nccl (#5103)
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2024-05-29 23:51:18 +00:00
Woosuk Kwon 89579a201f
[Misc] Use vllm-flash-attn instead of flash-attn (#4686) 2024-05-08 13:15:34 -07:00
Simon Mo 021b1a2ab7
[CI] check size of the wheels (#4319) 2024-05-04 20:44:36 +00:00
Prashant Gupta b31a1fb63c
[Doc] add visualization for multi-stage dockerfile (#4456)
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-04-30 17:41:59 +00:00
Michael Goin d627a3d837
[Misc] Upgrade to `torch==2.3.0` (#4454) 2024-04-29 20:05:47 -04:00
Woosuk Kwon cfaf49a167
[Misc] Define common requirements (#3841) 2024-04-05 00:39:17 -07:00
youkaichao d03d64fd2e
[CI/Build] refactor dockerfile & fix pip cache
[CI/Build] fix pip cache with vllm_nccl & refactor dockerfile to build wheels (#3859)
2024-04-04 21:53:16 -07:00
youkaichao ca81ff5196
[Core] manage nccl via a pypi package & upgrade to pt 2.2.1 (#3805) 2024-04-04 10:26:19 -07:00
yhu422 d8658c8cc1
Usage Stats Collection (#2852) 2024-03-28 22:16:12 -07:00
Simon Mo 7bc94a0fdd
add ccache to docker build image (#3704) 2024-03-28 22:14:24 -07:00
youkaichao 8f44facddd
[Core] remove cupy dependency (#3625) 2024-03-27 00:33:26 -07:00
ifsheldon c614cfee58
Update dockerfile with ModelScope support (#3429) 2024-03-19 10:54:59 -07:00
bnellnm 9fdf3de346
Cmake based build system (#2830) 2024-03-18 15:38:33 -07:00
Thomas Parnell 06ec486794
Install `flash_attn` in Docker image (#3396) 2024-03-14 10:55:54 -07:00
Ronan McGovern e221910e77
add hf_transfer to requirements.txt (#3031) 2024-03-12 23:33:43 -07:00
Nikola Borisov 87069ccf68
Fix docker python version (#2845) 2024-02-14 10:17:57 -08:00
Simon Mo b9e96b17de
fix python 3.8 syntax (#2716) 2024-02-01 14:00:58 -08:00
Philipp Moritz d0d93b92b1
Add unit test for Mixtral MoE layer (#2677) 2024-01-31 14:34:17 -08:00
Philipp Moritz 390b495ff3
Don't build punica kernels by default (#2605) 2024-01-26 15:19:19 -08:00
Simon Mo 6e01e8c1c8
[CI] Add Buildkite (#2355) 2024-01-14 12:37:58 -08:00
Alexandre Payot 937e7b7d7c
Build docker image with shared objects from "build" step (#2237) 2024-01-04 09:35:18 -08:00
Antoni Baum 21d93c140d
Optimize Mixtral with expert parallelism (#2090) 2023-12-13 23:55:07 -08:00
Simon Mo 3fefe271ec
Update Dockerfile to build Megablocks (#2042) 2023-12-12 17:34:17 -08:00
Simon Mo eb17212858
Update Dockerfile to support Mixtral (#2027) 2023-12-11 11:59:08 -08:00
Simon Mo c85b80c2b6
[Docker] Add cuda arch list as build option (#1950) 2023-12-08 09:53:47 -08:00
AguirreNicolas 24f60a54f4
[Docker] Adding number of nvcc_threads during build as envar (#1893) 2023-12-07 11:00:32 -08:00
Allen f07c1ceaa5
[FIX] Fix docker build error (#1831) (#1832)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2023-11-29 23:06:50 -08:00
GhaziSyed aebfcb262a
Dockerfile: Upgrade Cuda to 12.1 (#1609) 2023-11-09 11:49:02 -08:00
Stephen Krider 9cabcb7645
Add Dockerfile (#1350) 2023-10-31 12:36:47 -07:00