* Add the flux autoencoder.
* Add the encoder down-blocks.
* Upsampling in the decoder.
* Sketch the flow matching model.
* More flux model.
* Add some of the positional embeddings.
* Add the rope embeddings.
* Add the sampling functions.
* Add the flux example.
* Fix the T5 bits.
* Proper T5 tokenizer.
* Clip encoder path fix.
* Get the clip embeddings.
* No configurable weights in layer norm.
* More weights related fixes.
* Yet another shape fix.
* DType fix.
* Fix a couple more shape issues.
* DType fixes.
* Fix the latent dims.
* Fix more shape issues.
* Autoencoder fixes.
* Get some generations out.
* Bugfix.
* T5 padding.
* Clippy fix.
* Add the decode only mode.
* Fix.
* More fixes.
* Finally get some generations to work.
* Add readme.
* bert attention mask
* Allow for using None as a mask.
* Revert part of the changes so that the proper default mask applies.
* Cosmetic change.
* Another cosmetic tweak.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Add Llama 3.1 rope (frequency rescaling sketched after this list)
* Clippy
* Format
* Clippy
* Add support for multiple eos tokens.
* Untagged either
* Remove either dep and fix settings.json
* Make the max positional embeddings configurable
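For reference, the Llama 3.1 rope differs from the stock rope by rescaling the inverse frequencies based on their wavelength. Below is a minimal sketch of that rescaling, assuming the published default parameters (`factor = 8.0`, `low_freq_factor = 1.0`, `high_freq_factor = 4.0`, original context of 8192); it follows the reference description, not necessarily the exact code added here:
```
/// Rescale rope inverse frequencies as in Llama 3.1. The parameter
/// values below are the published defaults, assumed here.
fn llama31_rope_frequencies(inv_freqs: &[f32]) -> Vec<f32> {
    let factor = 8.0f32;
    let low_freq_factor = 1.0f32;
    let high_freq_factor = 4.0f32;
    let original_ctx = 8192.0f32;
    let low_freq_wavelen = original_ctx / low_freq_factor;
    let high_freq_wavelen = original_ctx / high_freq_factor;
    inv_freqs
        .iter()
        .map(|&freq| {
            let wavelen = 2.0 * std::f32::consts::PI / freq;
            if wavelen < high_freq_wavelen {
                // High-frequency components are kept as-is.
                freq
            } else if wavelen > low_freq_wavelen {
                // Low-frequency components are scaled down by `factor`.
                freq / factor
            } else {
                // Smooth interpolation in between.
                let smooth = (original_ctx / wavelen - low_freq_factor)
                    / (high_freq_factor - low_freq_factor);
                (1.0 - smooth) * freq / factor + smooth * freq
            }
        })
        .collect()
}
```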
* onnx: fix pad, unsqueeze
Both implementations have off-by-one errors:
- Pad: the 'reflect' cycle for e.g. `dim==3` is `[0,1,2,1]`, which has
  length 4 (i.e. `dim*2 - 2`), not 5 (the current code uses `dim*2 - 1`).
- Unsqueeze: the axis for `Unsqueeze(-1)` on a tensor with `dim==3` should
  normalize to 3 (i.e. `dim+index+1`), not 2 (the current `dim+index`).
In addition, Pad miscalculates the starting padding. If we want to pad 2
elements at the start and the cycle of indices has length 6, we should
skip 4 elements, but currently we skip 2. A more visual representation of
what's going on is below:
```
pad_start: 2
data: [a,b,c,d]
indices: [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, ..] // zigzag between 0..4
actual: skip [ c d| c b a b]
expected: ~ skip ~ [ c b| a b c d]
```
The values between `[` and `|` are the padding; the values between
`|` and `]` should match the original data being padded.
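To make the fix concrete, here is a hedged sketch of the corrected 'reflect' index computation (a hypothetical helper, not the exact onnx op code): the cycle has length `dim*2 - 2`, and the start padding walks the cycle backwards, reproducing the `expected` row above. The Unsqueeze side needs no sketch, since a negative axis simply normalizes to `rank + index + 1`.
```
/// Source index for 'reflect' padding. `i` is the output position
/// relative to the start of the original data (negative inside the
/// start padding), `n` the size of the padded dimension.
fn reflect_index(i: i64, n: i64) -> i64 {
    // The reflect cycle is [0, 1, .., n-1, n-2, .., 1]: length 2n - 2.
    let cycle = 2 * n - 2;
    let i = ((i % cycle) + cycle) % cycle;
    if i < n { i } else { cycle - i }
}

#[test]
fn reflect_matches_expected() {
    let data = ['a', 'b', 'c', 'd'];
    let n = data.len() as i64;
    let pad_start = 2i64;
    let padded: Vec<char> = (0..pad_start + n)
        .map(|p| data[reflect_index(p - pad_start, n) as usize])
        .collect();
    // Matches the `expected` row above: [c, b | a, b, c, d].
    assert_eq!(padded, vec!['c', 'b', 'a', 'b', 'c', 'd']);
}
```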
* Fix clippy lints.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Add: DINOv2Reg4 with PlantCLEF2024 weights and example (see https://arxiv.org/abs/2309.16588 and https://zenodo.org/records/10848263)
* Remove extra files + update README to download them + remove extra lines
* minor fix (README remove extra spaces)
* minor fix (README: Fix image url)
* Modify: Add back interpolate_pos_encoding() + fix the no-interpolation case + remove extra comments + update README (the source image changed, and so did the predictions)
* Fix: Improve code readability with '$ cargo clippy' and '$ cargo fmt'
* Another clippy fix.
---------
Co-authored-by: x-VEspit <vincent.espitalier@cirad.fr>
Co-authored-by: laurent <laurent.mazare@gmail.com>
* define structs
* construct ResidualConvUnit
* forward() for ResidualConvUnit
* implement FeatureFusionBlock
* implement Scratch
* implement DPTHead
* add identity module
* implement forward for DPTHead
* add get_intermediate_layers to DinoVisionTransformer
* implement DepthAnythingV2
* some minor tweaks
* fix compile errors
* fix var builder prefixes
* setup initial example
* use fixed patch size of 37 (518 / 14)
* debugged until output
* print min and max values
* add some dynamism to the output location
* scale input image
* extract prep function
* extract output path function
* normalize image with magic mean and std
* add spectral coloring
* squeeze in the right place
* make interpolation optional
* use bail instead of panic
* omit unnecessary Shape call
* remove empty curly braces
* use bail instead of assert
* use vb and pp
* remove closures
* extract config object
* Apply rustfmt.
* Fix some clippy lints.
* More lints.
* Use the array methods.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
* Add a slice_set op.
* Add some testing.
* Add the dedicated kv-cache module.
* Derive debug and clone.
* Expose more kv-cache functions.
* Return the current data when appending.
* Use the new cache in the quantized phi3 model.
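A minimal usage sketch of the new cache, assuming the `candle_nn::kv_cache::KvCache` API these commits describe (`new` taking the concatenation dim and a max sequence length backed by `slice_set`, with `append` returning the data accumulated so far):
```
use candle::{DType, Device, Result, Tensor};
use candle_nn::kv_cache::KvCache;

fn main() -> Result<()> {
    let dev = Device::Cpu;
    // Concatenate along dim 2 (the sequence axis); room for up to 512
    // positions is pre-allocated and filled in-place via slice_set.
    let mut cache = KvCache::new(2, 512);
    // A (batch, heads, seq, head_dim) key/value chunk for one step.
    let k = Tensor::zeros((1, 8, 1, 64), DType::F32, &dev)?;
    let v = Tensor::zeros((1, 8, 1, 64), DType::F32, &dev)?;
    // Appending returns the current data, ready for attention.
    let (k_all, v_all) = cache.append(&k, &v)?;
    println!("{:?} {:?}", k_all.shape(), v_all.shape());
    Ok(())
}
```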
* Support embedding model gte-Qwen1.5-7B-instruct
This is a text embedding model based on Qwen2. The two share the same
model architecture except for the last MLP module. This commit brings in
a minimal modification of the old Qwen2 implementation to support both
models.
An example is provided, and it has been verified against the official
PyTorch implementation.
* Avoid doing the 'last-token filtering' based on the absence of an attention mask.
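For context, the embedding for such models is usually read from the hidden state of the last non-padded token. A hedged sketch of that pooling step (the helper and the mask convention are assumptions, not the exact example code):
```
use candle::{IndexOp, Result, Tensor};

/// Pick the hidden state of the last non-padded position per sequence.
/// `hidden` is (batch, seq, dim); `mask` is a (batch, seq) u32 tensor
/// holding 1 for real tokens and 0 for padding.
fn last_token_pool(hidden: &Tensor, mask: &Tensor) -> Result<Tensor> {
    // Number of real tokens per sequence; the last one is at count - 1.
    let counts = mask.sum(1)?.to_vec1::<u32>()?;
    let mut rows = Vec::with_capacity(counts.len());
    for (b, count) in counts.iter().enumerate() {
        rows.push(hidden.i((b, *count as usize - 1))?);
    }
    // Stack back into (batch, dim) embeddings.
    Tensor::stack(&rows, 0)
}
```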
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
The threshold is 0.0 by default; negative values include more points,
expanding the mask, while positive values make it more selective,
shrinking the mask.
Negative numbers start with a minus sign, which normally makes clap
treat them as flags.
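One way to let clap accept such values with the derive API; a sketch assuming clap 4's `allow_negative_numbers` setting rather than whatever this example ended up using:
```
use clap::Parser;

#[derive(Parser)]
struct Args {
    /// Mask threshold: negative values expand the mask, positive values
    /// shrink it. allow_negative_numbers stops clap from reading "-0.3"
    /// as a flag.
    #[arg(long, default_value_t = 0.0, allow_negative_numbers = true)]
    threshold: f32,
}

fn main() {
    let args = Args::parse();
    println!("threshold: {}", args.threshold);
}
```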
* Separate quantized phi-3 implementation.
* Integrate the quantized phi3 model.
* Small fixes, get the generation to work properly.
* Keep the old llama implementation around.
* Change the default.
* Quantized phi in a separate file.
* Add the quantized phi example + rework the model code.
* Improve the phi model.
* Get some generation out.
* Use the appropriate rope shape.
* Tweak the default prompt.
---------
Co-authored-by: Jane Doe <jane.doe@example.org>
* add support for l3b, new tokenizer
* add todo
* Add todo and use k_s model
* Use the official tokenizers.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
* Make use of the batching support that was already in Stable Diffusion but went unused.
Also refactor out the `save_image` function (a sketch follows this list).
* Clippy + cosmetic fixes.
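A sketch of what the extracted `save_image` helper can look like for a `(3, height, width)` u8 tensor (the exact signature in the example may differ):
```
use candle::Tensor;
use image::{ImageBuffer, Rgb};

/// Save a (3, height, width) u8 tensor as an image file.
fn save_image(img: &Tensor, path: &str) -> anyhow::Result<()> {
    let (channels, height, width) = img.dims3()?;
    anyhow::ensure!(channels == 3, "expected a (3, h, w) tensor");
    // Move channels last and flatten to raw RGB bytes.
    let pixels = img.permute((1, 2, 0))?.flatten_all()?.to_vec1::<u8>()?;
    let buffer: ImageBuffer<Rgb<u8>, Vec<u8>> =
        ImageBuffer::from_raw(width as u32, height as u32, pixels)
            .ok_or_else(|| anyhow::anyhow!("buffer size mismatch"))?;
    buffer.save(path)?;
    Ok(())
}
```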
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
* Start adding the recurrent-gemma model.
* More griffin.
* Add the example + get the weights to load from the HF version.
* More inference code.
* Rope + kv-cache on the attention side.
* Add to the inference code.
* Add more to the recurrent gemma inference.
* Get some first inference to run.
* Add the softcap on logits (sketched after this list).
* Fixes.
* Use partial rotary embeddings.
* Get inference to work.
* Add a comment.
* And add a readme.
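For reference, the logits softcap is the usual `cap * tanh(logits / cap)` squashing. A minimal candle-style sketch; the actual cap value comes from the model config:
```
use candle::{Result, Tensor};

/// Squash logits into (-cap, cap): cap * tanh(logits / cap).
fn softcap(logits: &Tensor, cap: f64) -> Result<Tensor> {
    (logits / cap)?.tanh()? * cap
}
```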
* moondream implementation
* add moondream example
* change config default activation
* Add assets and integrate phi mixformer with example
* Make use of kv cache and fix seq_len bug; Clean up example code
* Add README link to example
* Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig
* Delete image
* Use apply instead of forward
* Use latest release special token; Fix token/s accuracy; Use GeluPytorchTanh in VisionConfig v2
* Add flag to use f16
* Avoid breaking the quantized version on cuda.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
* Pass the bos token at the beginning of the tensor.
* Quantize moondream.
* Forward with image bos token.
* Clippy.
* Use q4_0 quantization.
* Add pointers for sequence and tokens; Remove seq_len conditional
* Add more cuda kernels for quantized matmul.
* Add the vec-dot bits.
* Expose the quantized matmul-vec kernels.
* Also include the quantize-q8-1 kernel.
* Glue code for the q8-1 quantization.
* mm-vec product via q8-1 quantization.
* Add a test.
* Add a mm test.
* Get the test to return some sensible results.
* Also test dmmv.
* Fix the launch params.
* Allow for tweaking the force_dmmv parameter while it's experimental.
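These kernels sit behind the existing quantized matmul entry point, so exercising them only requires a `QMatMul`. A hedged sketch (run on cpu here for portability; on a cuda device the single-row case goes through the new matmul-vec path, quantizing the activations to q8-1 on the fly):
```
use candle::quantized::{GgmlDType, QMatMul, QTensor};
use candle::{Device, Result, Tensor};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    // Quantize a (out, in) weight matrix to 4-bit blocks.
    let w = Tensor::randn(0f32, 1f32, (256, 512), &dev)?;
    let qw = QTensor::quantize(&w, GgmlDType::Q4_0)?;
    let mm = QMatMul::from_qtensor(qw)?;
    // A single activation row: the shape handled by the mm-vec kernels.
    let x = Tensor::randn(0f32, 1f32, (1, 512), &dev)?;
    let y = mm.forward(&x)?;
    println!("{:?}", y.shape()); // (1, 256)
    Ok(())
}
```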
* CLIP model implementation with example
* CLIP Implementation fixes, batch images
* CLIP model remove images from git
* CLIP model remove unnecessary use of batch_indices
* Avoid copying the data on squeeze and unsqueeze.
* Fix the quantized llama example.
* Unrelated fix for the quantized stable-lm example on cuda.
* Fix for mamba on cuda (unrelated to the PR).
* Add a --seed argument to the stable-diffusion example.
* When no seed is specified, do not set one and use the engine's default instead. This makes the CPU engine work again when no --seed is given, and bails out when a seed is provided, as the engine does not currently support seeding (a sketch of the logic follows).
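A sketch of the resulting seed handling (`maybe_set_seed` is a hypothetical helper; candle's `Device::set_seed` is the real entry point):
```
use candle::{Device, Result};

/// Seed the rng only when the user passed --seed; otherwise keep the
/// engine's default. Errors out on backends that don't support seeding.
fn maybe_set_seed(device: &Device, seed: Option<u64>) -> Result<()> {
    match seed {
        Some(seed) => device.set_seed(seed),
        None => Ok(()),
    }
}
```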
---------
Co-authored-by: niklas <niklas@appli.se>