fp8
[mypy] Enable type checking for test directory (#5017)
2024-06-15 04:45:31 +00:00
production_monitoring
[Misc] Add OpenTelemetry support (#4687)
2024-06-19 01:17:03 +09:00
api_client.py
[Quality] Add code formatter and linter (#326)
2023-07-03 11:31:55 -07:00
aqlm_example.py
[Misc] Fix arg names (#5524)
2024-06-14 09:47:44 -07:00
gradio_openai_chatbot_webserver.py
[CI] Try introducing isort. (#3495)
2024-03-25 07:59:47 -07:00
gradio_webserver.py
Remove deprecated parameter: concurrency_count (#2315)
2024-01-03 09:56:21 -08:00
llava_example.py
[Core] Support image processor (#4197)
2024-06-02 22:56:41 -07:00
llm_engine_example.py
[CI] Try introducing isort. (#3495)
2024-03-25 07:59:47 -07:00
logging_configuration.md
[MISC] Rework logger to enable pythonic custom logging configuration to be provided (#4273)
2024-05-01 17:34:40 -07:00
lora_with_quantization_inference.py
[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)
2024-06-01 14:51:10 -06:00
multilora_inference.py
[CI] Try introducing isort. (#3495)
2024-03-25 07:59:47 -07:00
offline_inference.py
[Quality] Add code formatter and linter (#326)
2023-07-03 11:31:55 -07:00
offline_inference_arctic.py
[Model] Snowflake arctic model implementation (#4652)
2024-05-09 22:37:14 +00:00
offline_inference_distributed.py
[mypy] Enable type checking for test directory (#5017)
2024-06-15 04:45:31 +00:00
offline_inference_embedding.py
[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)
2024-05-11 11:30:37 -07:00
offline_inference_neuron.py
[Hardware][Neuron] Refactor neuron support (#3471)
2024-03-22 01:22:17 +00:00
offline_inference_openai.md
[Frontend] Support OpenAI batch file format (#4794)
2024-05-15 19:13:36 -04:00
offline_inference_with_prefix.py
[Bugfix] Add warmup for prefix caching example (#5235)
2024-06-03 19:36:41 -07:00
openai_chat_completion_client.py
Add example scripts to documentation (#4225)
2024-04-22 16:36:54 +00:00
openai_completion_client.py
lint: format all python file instead of just source code (#2567)
2024-01-23 15:53:06 -08:00
openai_embedding_client.py
[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)
2024-05-11 11:30:37 -07:00
openai_example_batch.jsonl
[docs] Fix typo in examples filename openi -> openai (#4864)
2024-05-17 00:42:17 +09:00
phi3v_example.py
[Model] Initialize Phi-3-vision support (#4986)
2024-06-17 19:34:33 -07:00
save_sharded_state.py
[Core] Implement sharded state loader (#4690)
2024-05-15 22:11:54 -07:00
template_alpaca.jinja
Support chat template and `echo` for chat API (#1756)
2023-11-30 16:43:13 -08:00
template_baichuan.jinja
Fix Baichuan chat template (#3340)
2024-03-15 21:02:12 -07:00
template_chatglm.jinja
Add chat templates for ChatGLM (#3418)
2024-03-14 23:19:22 -07:00
template_chatglm2.jinja
Add chat templates for ChatGLM (#3418)
2024-03-14 23:19:22 -07:00
template_chatml.jinja
Support chat template and `echo` for chat API (#1756)
2023-11-30 16:43:13 -08:00
template_falcon.jinja
Add chat templates for Falcon (#3420)
2024-03-14 23:19:02 -07:00
template_falcon_180b.jinja
Add chat templates for Falcon (#3420)
2024-03-14 23:19:02 -07:00
template_inkbot.jinja
Support chat template and `echo` for chat API (#1756)
2023-11-30 16:43:13 -08:00
template_llava.jinja
[Frontend] Add OpenAI Vision API Support (#5237)
2024-06-07 11:23:32 -07:00
tensorize_vllm_model.py
[Frontend] [Core] Support for sharded tensorized models (#4990)
2024-06-12 14:13:52 -07:00