Commit Graph

130 Commits

Martin Stefcek bdaa34216a
chore: add fix for windows cudarc into the readme (#2189) 2024-05-16 14:32:50 +02:00
hardlydearly c68ed8963f
chore: fix some typos in comments (#2121)
Signed-off-by: hardlydearly <799511800@qq.com>
2024-04-28 08:34:32 +02:00
Laurent Mazare cfab6e7616
Mention phi-v3 in the readmes. (#2122) 2024-04-24 20:54:24 +02:00
Laurent Mazare 52ae332910
Use llama v3 by default + add to readme. (#2094) 2024-04-20 16:11:24 +02:00
Laurent Mazare ce6d08df94
Minor fix to the readme. (#2080)
Co-authored-by: Jane Doe <jane.doe@example.org>
2024-04-17 22:43:00 +02:00
Laurent Mazare 50e49ecc5f
Add a quantized version of recurrent-gemma. (#2054)
* Add a quantized version of recurrent-gemma.

* Share the rglru part.

* Get the quantized gemma model to work.
2024-04-13 20:07:01 +02:00
Vishal Patil 2be1a35710
Added link to the Coursera ML algorithm implementations (#1989)
* Added link to the coursera ML algo implementations

* Fixed link
2024-04-03 07:16:32 +02:00
Santiago Medina 92f81d2fcb
Add Moondream transformer implementation and example (#1970)
* moondream implementation

* add moondream example

* change config default activation

* Add assets and integrate phi mixformer with example

* Make use of kv cache and fix seq_len bug; Clean up example code

* Add README link to example

* Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig

* Delete image

* Use apply instead of forward
2024-03-31 08:54:56 +02:00
Laurent Mazare 708e422456
Qwen MoE model. (#1960)
* Qwen MoE model.

* Add the MoE model to the example.

* Fix the scaling.

* Readme updates.

* Readme tweaks.
2024-03-28 23:10:57 +01:00
Thomas Santerre 2bb9c683b9
Update README.md (#1840)
Adds the candle-einops to the readme as an external resource
2024-03-13 14:36:25 +01:00
Laurent Mazare 6530932285
Add the new models to the main readme. (#1797) 2024-03-03 16:25:14 +01:00
Laurent Mazare 64d4038e4f
Mention rwkv v6 in the readmes. (#1784) 2024-03-01 08:58:30 +01:00
Jani Monoses 979deaca07
EfficientVit (MSRA) model (#1783)
* Add EfficientVit (Microsoft Research Asia) model.

* Mention models in README
2024-03-01 08:53:52 +01:00
Laurent Mazare 4fd00b8900
Add the StarCoder2 model. (#1779)
* Add the StarCoder2 model.

* Add the example code and get things to work.

* And also tweak the readme.
2024-02-28 21:02:41 +01:00
Laurent Mazare 57267cd536
Add a flag to force running the quantized model on CPUs. (#1778)
* Add a flag to force running the quantized model on CPUs.

* Add encodec to the readme.
2024-02-28 14:58:42 +01:00
Laurent Mazare 45d5322d62
Add the Gemma models. (#1741)
* Add the Gemma models.

* Add the gemma example.

* Adapt the RmsNorm.

* Get the 2b model to work.

* 7b support.

* Use the config head dim.

* Yet another fix.

* Make the matrices contiguous.

* Also get the 7b model to work.

* And add to the readme.
2024-02-21 22:02:50 +01:00
Laurent Mazare 2d5f2a728d
Add the RWKV model (v5). (#1707)
* Start adding the RWKV model.

* More of the forward step.

* Handle rescaling.

* FeedForward.

* More work on RWKV.

* Better state tracking.

* Finish a first pass on forward.

* Fix the shape mismatches.

* Do not rescale in f32.

* Rename to rwkv-v5.

* Add the new models to the readme.
2024-02-14 10:58:32 +01:00
Laurent Mazare 1e26d539d9
Improved mamba model optimized for inference (#1694)
* Sketch the mamba model for inference.

* Complete the forward pass.

* Add the mamba example.

* Optimize the selective-scan part.

* Fix a couple shape mismatches and get inference to work.

* Tweak the readmes.

* More readme tweaks.
2024-02-11 17:04:57 +01:00
Laurent Mazare 27ffd644a9
Mention TrOCR in the readmes. (#1691) 2024-02-10 15:49:38 +01:00
Laurent Mazare a510ddec4e
Mention the new models in the readme. (#1651) 2024-02-03 15:19:57 +01:00
SebastianRueClausen a46864bd56
Fix "Minimal Mamba" link in README. (#1577) 2024-01-12 17:47:07 +01:00
Laurent Mazare 8e06bfb4fd
Mention VGG in the readme. (#1573) 2024-01-12 09:59:29 +01:00
Jeroen Vlek 3a7304cb0d
add link to gpt-from-scratch-rs (#1525) 2024-01-05 11:59:46 +01:00
Laurent Mazare 65cb90bd40
Add some mention to SOLAR-10.7B in the readme. (#1487) 2023-12-27 15:25:39 +01:00
Laurent Mazare d8b9a727fc
Support different mamba models. (#1471) 2023-12-23 10:46:02 +01:00
Laurent Mazare 1e86717bf2
Fix a couple typos (#1451)
* Mixtral quantized instruct.

* Fix a couple typos.
2023-12-17 05:20:05 -06:00
Laurent Mazare cfdf9640a3
Readme tweaks. (#1446) 2023-12-16 06:23:12 -06:00
Laurent Mazare e12cbfd73b
Update the readme to mention mixtral. (#1443) 2023-12-15 19:29:03 -06:00
Laurent Mazare 7be982f6f7
Mention phi-2 in the readme. (#1434) 2023-12-14 08:02:27 -06:00
Edwin Cheng 37bf1ed012
Stable Diffusion Turbo Support (#1395)
* Add support for SD Turbo

* Set Leading as default in euler_ancestral discrete

* Use the appropriate default values for n_steps and guidance_scale.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-03 08:37:10 +01:00
Eric Buehler f83e14f68d
Add candle-lora transformers to readme? (#1356)
* Demonstrate lora transformers in readme

* Shorten readme
2023-11-21 17:54:24 +00:00
Laurent Mazare c7e613ab5e
Update the readme. (#1354) 2023-11-21 09:38:27 +00:00
Laurent Mazare 8f63f68289
Fix the kalosm link (#1353) 2023-11-21 06:18:14 +01:00
Laurent Mazare f1e678b39c
Mention the Yi-6b/Yi-34b models in the readme. (#1321) 2023-11-11 12:39:11 +01:00
Juarez Bochi 18d30005c5
Add support to UL2 model family (#1300)
* Add support to UL2 model family

* Update docs with UL2

* Create ActivationWithOptionalGating to avoid polluting activations

* Also refactor quantized t5

* Remove useless conversion

* Revert Activation::NewGelu name change

* Remove useless return

* Apply rustfmt and clippy recommendations

* Reuse t5::ActivationWithOptionalGating in quantized version

* (cosmetic change) use a match rather than ifs + avoid early returns.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-11-09 18:55:09 +01:00
Juarez Bochi c912d24570
Update README: Move T5 to Text to Text section (#1288)
I think it makes more sense to have it there, since it's a seq2seq model with cross-attention, not an LM. There are also decoder-only T5 models that work as LMs, but that's not the standard.
2023-11-07 16:14:04 +01:00
Juarez Bochi d5c2a7b64b
Add info about MADLAD-400 in readme files (#1287) 2023-11-07 15:21:59 +01:00
Eric Buehler abc4f698c5
Add candle-sampling (#1278) 2023-11-06 12:53:29 +01:00
YiiSh a923e8b53a
Add a link to candle-ext to README.md (#1277) 2023-11-06 12:44:39 +01:00
Laurent Mazare 2a45bcf943
Put the onnx example behind a feature flag. (#1276)
* Put the onnx example behind a feature flag.

* Exclude the onnx bits from the workspace.

* README tweaks.
2023-11-06 07:45:07 +01:00
figgefigge 47f4ddb011
Added info about missing protoc (#1275)
Co-authored-by: figgefigge <fredric.1337mail.com>
2023-11-06 06:47:32 +01:00
Yuchao Zhang bfe95115c6
Update README.md (#1264) 2023-11-04 05:32:32 +01:00
ealmloff ad63f20781
add Kalosm to the list of external resources (#1257) 2023-11-03 19:16:46 +01:00
Eric Buehler 1b5063f3ca
Add vllm external resource (#1253) 2023-11-03 12:40:31 +01:00
Laurent Mazare 4c967b9184
Use the hub files for the marian example. (#1220)
* Use the hub files for the marian example.

* Use the secondary decoder.

* Add a readme.

* More readme.
2023-10-30 17:29:36 +00:00
Laurent Mazare 0ec5ebcec4
Use the hub model file when possible. (#1190)
* Use the hub model file when possible.

* And add a mention in the main readme.
2023-10-26 20:00:50 +01:00
Blanchon e37b487767
Add Blip to online demos README.md (#1184)
* Add Blip to online demos README.md

* Punctuation.

---------

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2023-10-26 11:07:01 +01:00
Laurent Mazare e7b886d56f
Add a link to the optimisers crate. (#1180) 2023-10-25 21:51:45 +01:00
Laurent Mazare df2f89b6cf
Add some KV cache to blip. (#1150)
* Add some KV cache to blip.

* Mention BLIP in the readme.
2023-10-22 09:44:48 +01:00
Laurent Mazare 31ca4897bb
Readme updates. (#1134) 2023-10-20 09:08:39 +01:00