Commit Graph

296 Commits

Laurent Mazare aa53368aeb
Better control on the optional dequantization in QMatMul (#1049)
* Cosmetic change to the quantized whisper model.

* Fix the dequantization.

* Add the dequantize-all variable.
2023-10-07 10:16:18 +01:00
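Note: the commit above makes dequantization in QMatMul optional. For context, a hedged sketch of what block dequantization involves, using a GGML Q8_0-style layout (32 signed 8-bit quants sharing one f32 scale per block); the type and function names are illustrative, not candle's actual API:

```rust
// Block dequantization sketch in the GGML Q8_0 style: each block holds
// 32 signed 8-bit quants and one f32 scale; dequantizing expands them
// back to f32 values via scale * quant.
struct BlockQ8 {
    scale: f32,
    quants: [i8; 32],
}

fn dequantize(blocks: &[BlockQ8]) -> Vec<f32> {
    let mut out = Vec::with_capacity(blocks.len() * 32);
    for b in blocks {
        for &q in &b.quants {
            out.push(b.scale * q as f32);
        }
    }
    out
}

fn main() {
    let block = BlockQ8 { scale: 0.5, quants: [2; 32] };
    let values = dequantize(&[block]);
    assert!(values.iter().all(|&v| (v - 1.0).abs() < 1e-6));
}
```

Keeping the weights quantized saves memory but needs dedicated kernels; dequantizing once up front lets the regular f32 matmul run instead, which is the trade-off the flag controls.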
Laurent Mazare d5f7267087
Add the stable-lm example. (#1046)
* Add the stable-lm example.

* Get stable-lm to generate some proper text.
2023-10-06 19:20:35 +01:00
Laurent Mazare b0442eff8a
Sketch the stable-lm model. (#1045) 2023-10-06 18:19:06 +01:00
Laurent Mazare 4631c48273
Remove some todos. (#1042) 2023-10-05 22:42:20 +01:00
Juarez Bochi f47bd9bab5
Delete invalid comment (#1038) 2023-10-05 19:28:08 +01:00
Laurent Mazare 089fc3b584
Improve the quantized whisper setup. (#1018)
* Improve the quantized whisper setup.

* Fix the config file paths.

* Use the standard matmul where possible.
2023-10-02 17:17:46 +01:00
Laurent Mazare e04c789230
Add a quantized variant of whisper (#1017)
* Add the quantized-whisper model.

* Quantize the whisper model.

* Adapt the whisper example to handle quantization.

* Add the quantized flag.

* Load the proper weights.
2023-10-02 14:59:53 +01:00
Laurent Mazare 096dee7073
Bump the version to 0.3.0. (#1014)
* Bump the version to 0.3.0.

* Changelog update.
2023-10-01 13:51:57 +01:00
Laurent Mazare deee7612da
Quantized version of mistral. (#1009)
* Quantized version of mistral.

* Integrate the quantized mistral variant.

* Use the quantized weight files.

* Tweak the quantization command.

* Fix the dtype when computing the rotary embeddings.

* Update the readme with the quantized version.

* Fix the decoding of the remaining tokens.
2023-09-30 18:25:47 +01:00
Laurent Mazare 4021272875
Use flash-attn for mistral. (#1004) 2023-09-30 12:15:10 +01:00
Laurent Mazare 6203ced495
Add negative prompts to segment-anything. (#1000) 2023-09-30 06:17:42 +01:00
Laurent Mazare d188d6a764
Fix the multiple points case for sam. (#998) 2023-09-29 22:39:43 +02:00
Laurent Mazare 53510ce427
Use a silu activation in mistral. (#991) 2023-09-29 07:06:54 +01:00
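Note: the SiLU (a.k.a. swish) activation adopted above is simply x · sigmoid(x); a minimal self-contained sketch:

```rust
// SiLU activation: silu(x) = x * sigmoid(x) = x / (1 + e^{-x}).
fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

fn main() {
    assert_eq!(silu(0.0), 0.0);
    println!("silu(1.0) = {}", silu(1.0)); // ~0.731
}
```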
Laurent Mazare 23b3576c47
Add the sliding window. (#986) 2023-09-28 17:26:33 +01:00
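Note: the sliding window added above bounds how far back each position can attend. A hedged sketch of the corresponding causal mask (the boolean layout is illustrative, not candle's exact mask code):

```rust
// Causal attention mask with a sliding window: position i may attend to
// position j only when j <= i (causality) and i - j < window (locality).
fn sliding_window_mask(len: usize, window: usize) -> Vec<Vec<bool>> {
    (0..len)
        .map(|i| (0..len).map(|j| j <= i && i - j < window).collect())
        .collect()
}

fn main() {
    let mask = sliding_window_mask(5, 3);
    // Position 4 sees positions 2, 3, 4 but no longer 0 and 1.
    assert_eq!(mask[4], vec![false, false, true, true, true]);
}
```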
Laurent Mazare 716ab2ccdc
Mistral gpu fix (#985)
* Add the mistral example.

* Use the two model files.

* Adjust the dtype.

* Tweak the weight paths.

* Remove the end of text token.

* Get the mistral model to generate some text.

* Fix when running on the gpu.

* More gpu fixes.
2023-09-28 16:38:13 +01:00
Laurent Mazare ada8851a23
Add the mistral example. (#984)
* Add the mistral example.

* Use the two model files.

* Adjust the dtype.

* Tweak the weight paths.

* Remove the end of text token.

* Get the mistral model to generate some text.
2023-09-28 16:19:18 +01:00
Laurent Mazare c05a348e36
Add the Mistral 7b model (#983)
* Start sketching the mistral 7b model.

* Add the kv cache.

* Add the decoder layer.

* Add the mistral model.

* Rotary embeddings.

* Add the attention mask.
2023-09-28 14:29:41 +01:00
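Note: the rotary embeddings mentioned above rotate each consecutive feature pair of the queries and keys by a position-dependent angle. A minimal sketch of the idea (candle's implementation differs in layout and batching; the base theta of 10000 is the usual default, assumed here):

```rust
// Rotary position embedding on a single head vector: each feature pair
// (x[2k], x[2k+1]) is rotated by the angle pos * theta^{-2k/d}.
fn apply_rope(x: &mut [f32], pos: usize, theta: f32) {
    let d = x.len();
    for k in 0..d / 2 {
        let freq = 1.0 / theta.powf(2.0 * k as f32 / d as f32);
        let angle = pos as f32 * freq;
        let (sin, cos) = angle.sin_cos();
        let (a, b) = (x[2 * k], x[2 * k + 1]);
        x[2 * k] = a * cos - b * sin;
        x[2 * k + 1] = a * sin + b * cos;
    }
}

fn main() {
    let mut q = vec![1.0, 0.0, 1.0, 0.0];
    apply_rope(&mut q, 1, 10_000.0);
    println!("{q:?}");
}
```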
Laurent Mazare ce0a4e3a85
Use the gelu-erf activation. (#969) 2023-09-26 22:30:21 +01:00
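Note: gelu-erf is the exact GELU formulation, 0.5 · x · (1 + erf(x/√2)), as opposed to the faster tanh approximation. Rust's std has no erf, so this sketch substitutes the Abramowitz–Stegun 7.1.26 polynomial approximation (max error about 1.5e-7); candle computes erf its own way:

```rust
// erf via the Abramowitz-Stegun 7.1.26 polynomial approximation.
fn erf(x: f32) -> f32 {
    let sign = x.signum();
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let poly = t * (0.254829592
        + t * (-0.284496736 + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    sign * (1.0 - poly * (-x * x).exp())
}

// Exact GELU: gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))).
fn gelu_erf(x: f32) -> f32 {
    0.5 * x * (1.0 + erf(x / std::f32::consts::SQRT_2))
}

fn main() {
    assert!(gelu_erf(0.0).abs() < 1e-6);
    println!("gelu_erf(1.0) = {}", gelu_erf(1.0)); // ~0.8413
}
```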
Laurent Mazare 1fcac4afed
Expose a function to clear the KV cache on mixformers. (#964) 2023-09-26 05:41:07 +01:00
Laurent Mazare a36d883254
Use a single flag for the point argument. (#958) 2023-09-25 12:53:24 +01:00
GeauxEric 7f2bbcf746
[segment-anything] Support multi-point as the prompt input (#945)
* [sam] Support multi-point prompts

* [segment-anything] Pass points by reference

* [segment-anything] Update example code and image

* Fix clippy lint.

---------

Co-authored-by: Yun Ding <yunding@nvidia.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-25 12:14:10 +01:00
Laurent Mazare 0007ae9c11
Add the quantized mixformer model. (#953)
* Add the quantized mixformer model.

* Add the quantized option in the phi example.
2023-09-24 15:03:48 +01:00
Laurent Mazare e15862cfdb
Shared the quantized var-builder code. (#952)
* Shared the quantized var-builder code.

* Fix compilation.
2023-09-24 12:55:07 +01:00
Laurent Mazare bb3471ea31
Adapt more examples to the updated safetensor api. (#947)
* Simplify the safetensor usage.

* Convert more examples.

* Move more examples.

* Adapt stable-diffusion.
2023-09-23 21:26:03 +01:00
Laurent Mazare 7582937a32
Add the causal mask in mixformer. (#937) 2023-09-23 09:50:26 +01:00
Laurent Mazare b54acfa3d0
Tracing for the phi model (#936)
* Add some tracing bits to mixformers.

* Add the missing file.

* Add the conv2d layer to with-tracing.

* Improve the tracing usage.
2023-09-23 09:19:34 +01:00
Laurent Mazare df6f5240ba
Complete the mixformer implementation. (#930)
* Complete the mixformers implementation.

* Tweak the attention.

* Add the phi-1.5 example.

* Improve the phi example.

* Bugfix.

* Get the phi example to work.
2023-09-22 20:03:16 +01:00
Laurent Mazare a46b1b4657
Mixformer (#929)
* Sketch the mixformer model.

* More modeling code.

* More mixformers.

* MixFormer creation.

* More mixformers.
2023-09-22 16:17:14 +01:00
Radamés Ajna 19e52e5007
T5 Wasm (#918)
* init t5 wasm model

* split workers for each model

* clean up

* add some ui

* readme

* index

* typo

* remove cache param, clear_kv_cache

* add max_length as param

* add model tasks option to ui

* add method to load quantized gguf from buffer

* Add quantized wasm module

* add quantized models to UI, dynamic import wasms

* link to quantized

* fix copy

* fix ModelEncoder

* fix README.md
2023-09-22 15:31:10 +01:00
Laurent Mazare 3b557765e8
T5 quantized example (#922)
* Load gguf files for the quantized t5.

* Add the quantized t5 example.

* Allow for loading local files.

* Add some support for quantizing safetensor files.

* Transpose before quantizing.

* Quantized t5.

* Retrieve the weights from the hub.
2023-09-21 12:33:15 +01:00
Laurent Mazare 2619c4307f
Add a quantized version of the t5 model. (#921) 2023-09-21 11:13:39 +01:00
Laurent Mazare c89b82b2d4
Add a clear cache function to the t5 model. (#919) 2023-09-21 09:01:06 +01:00
Laurent Mazare ab1d40ea97
Add more t5 tracing. (#915) 2023-09-20 20:20:54 +01:00
Laurent Mazare 3a0d3e05df
Add more t5 tracing. (#914)
* Add more t5 tracing.

* Revert the sm change.
2023-09-20 16:37:51 +01:00
Laurent Mazare 9b24d89d2d
Tracing mode for T5. (#913)
* Tracing mode for T5.

* Tracing for the linear layer.
2023-09-20 15:03:35 +01:00
Laurent Mazare fb1c2ac535
Add flash-attn support. (#912)
* Add flash-attn support.

* Add the use-flash-attn flag.

* Re-enable flash-attn.
2023-09-20 14:07:55 +01:00
Laurent Mazare f685b2231c
Add some missing biases. (#908) 2023-09-20 10:14:51 +01:00
Juarez Bochi 05626ef492
Flan T5: Read lm_head when word embeddings are not tied (#903)
* Read lm_head when word embeddings are not tied

* Fix formatting

* Address comments
2023-09-19 22:36:47 +01:00
Laurent Mazare 67a486d18d
Line-up the wuerstchen model with the python implementation. (#901)
* Line-up the wuerstchen model with the python implementation.

* Missing cos.

* Fix the picture denormalization.
2023-09-19 21:59:44 +01:00
Juarez Bochi 8696f64bae
Fix T5 kv cache (#899)
* Fix T5 kv cache

* Add argument for decoder prompt

* Fix range
2023-09-19 20:36:15 +01:00
Laurent Mazare 4f91c8e109
Improve the error message on shape mismatch for cat. (#897)
* Improve the error message on shape mismatch for cat.

* Cosmetic tweak.
2023-09-19 15:09:47 +01:00
Laurent Mazare 06e46d7c3b
Only use classifier free guidance for the prior. (#896)
* Only use classifier free guidance for the prior.

* Add another specific layer-norm structure.

* Tweaks.

* Fix the latent shape.

* Print the prior shape.

* More shape fixes.

* Remove some debugging `continue` statements.
2023-09-19 14:13:05 +01:00
Laurent Mazare 92db8cecd3
Specialized attention module for Wuerstchen. (#890)
* Specialized attention module for Wuerstchen.

* Reshaping ops.

* Attention processor.

* Finish the forward pass.

* Hook the new attention processor.

* Get the prior forward pass to work.

* Make it contiguous.
2023-09-18 21:16:09 +01:00
Laurent Mazare 82a98f6da0
Prior denoising. (#889) 2023-09-18 16:51:38 +01:00
Laurent Mazare 5082954c52
Fix the W clip embeddings. (#887)
* Fix the W clip embeddings.

* Add the specialized ddpm scheduler.
2023-09-18 14:50:14 +01:00
Laurent Mazare 7dd8e12472
Bump the crate versions to v0.2.3. (#886)
* Bump the crate version.

* Also update the python bindings.
2023-09-18 12:14:03 +01:00
Laurent Mazare c2b866172a
More Wuerstchen fixes. (#882)
* More Wuerstchen fixes.

* More shape fixes.

* Add more of the prior specific bits.

* Broadcast add.

* Fix the clip config.

* Add some masking options to the clip model.
2023-09-17 22:08:11 +01:00
Laurent Mazare 06cc329e71
Remove the parameters for the Wuerstchen layer-norm. (#879)
* Remove the parameters for the Wuerstchen layer-norm.

* Fixes.

* More fixes (including conv-transpose2d).

* More fixes.

* Again more fixes.
2023-09-17 15:59:27 +01:00
Laurent Mazare 5f83c13f17
Add the DDPM scheduler. (#877)
* Add the DDPM scheduler.

* Minor tweaks.
2023-09-17 15:03:01 +01:00
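Note: one reverse step of the DDPM scheduler added above removes a slice of predicted noise: x_{t-1} = (x_t − β_t/√(1−ᾱ_t) · ε̂) / √α_t + σ_t · z. A hedged sketch of just that update rule, with all schedule bookkeeping elided:

```rust
// One DDPM reverse step, where eps_hat is the model's noise prediction and
// z is fresh Gaussian noise (taken to be zero at the final step).
fn ddpm_step(x: &[f32], eps_hat: &[f32], z: &[f32], alpha: f32, alpha_bar: f32, sigma: f32) -> Vec<f32> {
    let beta = 1.0 - alpha;
    let coef = beta / (1.0 - alpha_bar).sqrt();
    x.iter()
        .zip(eps_hat)
        .zip(z)
        .map(|((&x, &e), &z)| (x - coef * e) / alpha.sqrt() + sigma * z)
        .collect()
}

fn main() {
    let x_prev = ddpm_step(&[1.0], &[0.5], &[0.0], 0.99, 0.9, 0.0);
    println!("{x_prev:?}");
}
```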
Laurent Mazare db3e9dae04
Wuerstchen main (#876)
* Wuerstchen main.

* More of the wuerstchen cli example.

* Paella creation.

* Build the prior model.

* Fix the weight file names.
2023-09-17 12:46:38 +01:00
Laurent Mazare 7f65af1f0d
Avoid re-encoding the input in the T5 example. (#875) 2023-09-17 10:25:54 +01:00
Laurent Mazare 1a276b5da7
Add a KV cache to T5. (#873)
* Add a KV cache to T5.

* Suggest using release mode.

* Use the kv cache in decoding.

* Add a comment.
2023-09-17 08:00:45 +01:00
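Note: the KV cache added above stores the keys and values from earlier decoding steps so each new token only computes its own projections. A minimal sketch of the pattern, including the clear operation a later commit exposes (plain Vecs stand in for tensors here):

```rust
// Minimal KV-cache sketch: keep keys/values from earlier steps and append
// each new step's entries instead of recomputing the whole prefix.
#[derive(Default)]
struct KvCache {
    keys: Vec<Vec<f32>>,
    values: Vec<Vec<f32>>,
}

impl KvCache {
    fn append(&mut self, k: Vec<f32>, v: Vec<f32>) -> (&[Vec<f32>], &[Vec<f32>]) {
        self.keys.push(k);
        self.values.push(v);
        (&self.keys, &self.values)
    }
    // Reset between unrelated sequences.
    fn clear(&mut self) {
        self.keys.clear();
        self.values.clear();
    }
}

fn main() {
    let mut cache = KvCache::default();
    cache.append(vec![0.1, 0.2], vec![0.3, 0.4]);
    let (k, _v) = cache.append(vec![0.5, 0.6], vec![0.7, 0.8]);
    assert_eq!(k.len(), 2); // attention now sees both steps' keys
}
```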
Juarez Bochi 3e49f8fce5
Implement T5 decoding (#864)
* Load t5 decoder

* Run enc, dec, and lm head, but no cross attn

* Cross-attention over key_value_states

* New arg for decoder input ids

* Add mask, don't forward position biases through decoder

* Update t5 examples

* Clippy + rustfmt
2023-09-15 22:05:12 +02:00
Laurent Mazare c2007ac88f
W fixes. (#862) 2023-09-15 15:11:11 +01:00
Laurent Mazare 30be5b6660
Replication pad (#861)
* Add the embed mapper convolutions.

* Add the replication pad layer.

* Use the replication-pad op.

* Tweak a todo.
2023-09-15 14:06:21 +01:00
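Note: replication padding, used above, extends a signal by repeating its edge values. A self-contained 1D sketch (the 2D op applies the same idea along both axes):

```rust
// Replication padding in 1D: repeat the first and last values `pad` times.
fn replication_pad1d(xs: &[f32], pad: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(xs.len() + 2 * pad);
    out.extend(std::iter::repeat(xs[0]).take(pad));
    out.extend_from_slice(xs);
    out.extend(std::iter::repeat(*xs.last().unwrap()).take(pad));
    out
}

fn main() {
    assert_eq!(
        replication_pad1d(&[1.0, 2.0, 3.0], 2),
        vec![1.0, 1.0, 1.0, 2.0, 3.0, 3.0, 3.0]
    );
}
```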
Laurent Mazare 107d3d9530
Add the embed mapper convolutions. (#860) 2023-09-15 11:38:38 +02:00
Laurent Mazare 2746f2c4be
DiffNeXt/unet (#859)
* DiffNeXt/unet

* Start adding the vae.

* VAE residual block.

* VAE forward pass.

* Add pixel shuffling.

* Actually use pixel shuffling.
2023-09-15 10:14:02 +01:00
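Note: pixel shuffling, added above for the VAE, rearranges a (c·r², h, w) tensor into (c, h·r, w·r), trading channel depth for spatial resolution. A self-contained sketch on flat arrays (candle expresses this with reshapes and transposes):

```rust
// Pixel shuffle with upscale factor r: input channel ci*r*r + dy*r + dx
// lands at output offset (dy, dx) within each upscaled r x r cell.
fn pixel_shuffle(input: &[f32], c: usize, h: usize, w: usize, r: usize) -> Vec<f32> {
    let mut out = vec![0.0; c * h * r * w * r];
    for ci in 0..c {
        for dy in 0..r {
            for dx in 0..r {
                let in_ch = ci * r * r + dy * r + dx;
                for y in 0..h {
                    for x in 0..w {
                        let src = (in_ch * h + y) * w + x;
                        let dst = (ci * h * r + y * r + dy) * w * r + x * r + dx;
                        out[dst] = input[src];
                    }
                }
            }
        }
    }
    out
}

fn main() {
    // 4 input channels, 1 logical output channel, r = 2: 1x1 becomes 2x2.
    let out = pixel_shuffle(&[1.0, 2.0, 3.0, 4.0], 1, 1, 1, 2);
    assert_eq!(out, vec![1.0, 2.0, 3.0, 4.0]);
}
```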
Laurent Mazare 130fe5a087
Add the upblocks. (#853) 2023-09-14 22:24:56 +01:00
Laurent Mazare 91ec546feb
More DiffNeXt. (#847)
* More DiffNeXt.

* Down blocks.
2023-09-14 22:16:31 +02:00
Laurent Mazare 0a647875ec
Use softmax-last-dim in the quantized example. (#848) 2023-09-14 17:29:24 +01:00
Laurent Mazare a0c6d5548c
Add the attention block. (#846)
* Add the attention block.

* Add more to clipnext.
2023-09-14 15:40:09 +01:00
Laurent Mazare 286f01db14
Start adding the Wuerstchen diffusion pipeline (#843)
* Wuerstchen common bits.

* Add the prior layer.

* Start adding diffnext.
2023-09-14 10:56:07 +01:00
Juarez Bochi 49d3f7f708
Add support to flan-t5 (#840) 2023-09-13 19:27:20 +02:00
Laurent Mazare 3e94324012
Add some sentence similarity part to the t5 example. (#835)
* Add some sentence similarity part to the t5 example.

* Clippy fix.
2023-09-13 10:44:02 +01:00
Laurent Mazare e4553fb355
T5 tweaks (#831)
* Use default values rather than options.

* Avoid exposing the device field.

* More tweaks.
2023-09-13 07:37:04 +01:00
Laurent Mazare d801e1d564
Clippy fix. (#830) 2023-09-13 07:16:20 +01:00
Juarez Bochi 9daa6dbe87
Extract T5 module and add main function to use it (#829)
* Extract t5 out of musicgen

* Add main for t5 module
2023-09-13 07:14:05 +01:00
Juarez Bochi 805bf9ffa7
Implement top_p / nucleus sampling (#819)
* Implement top_p / nucleus sampling

* Update changelog

* rustfmt

* Add tests

* Fix clippy warning

* Fix another clippy error
2023-09-12 18:10:16 +02:00
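Note: top-p (nucleus) sampling keeps the smallest set of tokens whose probabilities sum to at least p, then samples within that set. A dependency-free sketch, where the uniform draw `u` is passed in rather than generated (an assumption made to keep the example self-contained):

```rust
// Top-p / nucleus sampling: sort tokens by probability, keep the smallest
// prefix whose cumulative probability reaches p, renormalize, and sample.
fn sample_top_p(probs: &[f32], p: f32, u: f32) -> usize {
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].total_cmp(&probs[a])); // descending
    // Find the nucleus: smallest prefix with cumulative probability >= p.
    let mut cum = 0.0;
    let mut cutoff = idx.len();
    for (i, &t) in idx.iter().enumerate() {
        cum += probs[t];
        if cum >= p {
            cutoff = i + 1;
            break;
        }
    }
    let nucleus = &idx[..cutoff];
    let total: f32 = nucleus.iter().map(|&t| probs[t]).sum();
    // Sample from the renormalized nucleus using the uniform draw u in [0, 1).
    let mut target = u * total;
    for &t in nucleus {
        target -= probs[t];
        if target <= 0.0 {
            return t;
        }
    }
    nucleus[nucleus.len() - 1]
}

fn main() {
    let probs = [0.5, 0.3, 0.15, 0.05];
    // With p = 0.8 only the two most likely tokens are kept.
    assert_eq!(sample_top_p(&probs, 0.8, 0.9), 1);
    assert_eq!(sample_top_p(&probs, 0.8, 0.1), 0);
}
```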
Laurent Mazare 2257f4d475
Bump the crate version + update the changelog. (#822) 2023-09-12 06:39:24 +01:00
Laurent Mazare c5a058b169
Use the module trait in stable-diffusion. (#817) 2023-09-11 20:40:07 +01:00
Laurent Mazare d7b9fec849
Move the stable-diffusion modeling code so that it's easier to re-use. (#812) 2023-09-11 11:45:57 +01:00
Laurent Mazare 84ee870efd
Use softmax-last-dim in whisper. (#810) 2023-09-11 11:05:05 +01:00
Laurent Mazare 90e077e409
Return the low res mask in the wasm segment-anything module. (#798)
* Return the low res mask.

* Add some validations.
2023-09-10 13:03:02 +01:00
Laurent Mazare 584171cae1
Add a wasm module for the segment anything example. (#797) 2023-09-10 12:29:37 +01:00
Laurent Mazare 35f72514f5
Move more models to candle-transformers (#796)
* Move dinov2.

* Move efficientnet.

* Move the quantized llama model.

* Move segment-anything.
2023-09-10 10:20:18 +01:00
Laurent Mazare d3f05eae8c
Move some models to candle-transformers so that it's easier to re-use. (#794)
* Move some models to candle-transformers so that they can be shared.

* Also move falcon.

* Move Llama.

* Move whisper (partial).
2023-09-10 09:40:27 +01:00
Laurent Mazare 618f4e4c78
Add some documentation. (#673)
* Add some documentation.

* Bump the crate version.
2023-08-30 11:54:00 +01:00
Laurent Mazare a3f97c143d
Bump the crate version + update CHANGELOG. (#628) 2023-08-27 18:17:11 +01:00
Laurent Mazare 6e485f2deb
Add some optional repeat penalty. (#623)
* Add some optional repeat penalty.

* Add the missing files.
2023-08-27 10:48:45 +01:00
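Note: the repeat penalty above dampens the logits of tokens already present in the recent context (the CTRL-style rule: divide positive logits by the penalty, multiply negative ones, so both move toward "less likely"). A hedged sketch, not candle's exact helper:

```rust
// CTRL-style repeat penalty over the tokens seen in the recent context.
fn apply_repeat_penalty(logits: &mut [f32], penalty: f32, context: &[usize]) {
    for &tok in context {
        if let Some(l) = logits.get_mut(tok) {
            *l = if *l >= 0.0 { *l / penalty } else { *l * penalty };
        }
    }
}

fn main() {
    let mut logits = vec![2.0, -1.0, 0.5];
    apply_repeat_penalty(&mut logits, 1.5, &[0, 1]);
    assert_eq!(logits, vec![2.0 / 1.5, -1.5, 0.5]);
}
```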
Laurent Mazare aba1e90797
Add some group parameter to convolutions. (#566)
* Add some group parameter to convolutions.

* Avoid some unnecessary groups checks.

* Move the tensor convolution bits.

* Proper handling of groups.

* Bump the crate version.

* And add a changelog.
2023-08-23 12:58:55 +01:00
Laurent Mazare 3507e14c0c
Yolo v8 fixes (#542)
* Fixes for the yolo-v8 layout.

* Bugfixes.

* Another silly bugfix.

* Remove the hf-hub dependency.

* Remove the transformers dependency.
2023-08-21 21:05:40 +01:00
Laurent Mazare 912561614f
Better handling of zero temperatures. (#532) 2023-08-21 07:51:46 +01:00
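Note: with a zero temperature, scaling logits by 1/t degenerates, so generation typically falls back to greedy argmax. A minimal sketch of that dispatch (the sampling branch is elided):

```rust
// Zero (or negative) temperature means greedy decoding instead of sampling.
fn next_token(logits: &[f32], temperature: f64) -> usize {
    if temperature <= 0.0 {
        // Greedy: pick the highest logit directly.
        logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i)
            .unwrap()
    } else {
        // Scale logits by 1/temperature, then softmax-sample (elided here).
        unimplemented!("softmax sampling path")
    }
}

fn main() {
    assert_eq!(next_token(&[0.1, 2.3, -0.5], 0.0), 1);
}
```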
Laurent Mazare a8f61e66cc
Bump the crates version to 0.1.2. (#522) 2023-08-20 08:07:07 +01:00
Laurent Mazare 531f23b4d0
Rename vec-dot to vec-ops. (#449)
* Rename vec-dot to vec-ops.

* Also bump the crate version.

* Add a currently empty readme.
2023-08-15 10:48:57 +01:00
Laurent Mazare b278834267
Support the Accelerate BLAS on macOS. (#325)
* Add the accelerate feature.

* Ffi tweaks.
2023-08-05 17:25:24 +01:00
Laurent Mazare 4fe8a02f88
Update the repo location. (#305) 2023-08-02 11:12:18 +01:00
Laurent Mazare 03a421f714
Add some missing readme files. (#304) 2023-08-02 10:57:12 +01:00
Laurent Mazare d38943aadc
Add version numbers for all the candle crates (#303)
* Switch to candle-gemm for the time being.

* Add the missing versions.
2023-08-02 10:52:13 +01:00
Laurent Mazare 51e51da896
Rename the candle crate to candle-core (#301)
* Rename to candle-core.

* More candle-core renaming.
2023-08-02 08:20:22 +01:00
Laurent Mazare 3eb2bc6d07
Softmax numerical stability. (#267)
* Softmax numerical stability.

* Fix the flash-attn test.
2023-07-28 13:13:01 +01:00
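Note: the numerical-stability fix for softmax is the classic one: subtract the row maximum before exponentiating, which leaves the result unchanged mathematically but prevents overflow for large logits. A self-contained sketch:

```rust
// Numerically stable softmax: exp(x - max) <= 1, so no overflow.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

fn main() {
    // Without the max subtraction, exp(1000.0) would overflow to infinity.
    let probs = softmax(&[1000.0, 1000.0]);
    assert_eq!(probs, vec![0.5, 0.5]);
}
```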
Laurent Mazare c34f932319
Fix the mkl build. (#204)
* Fix the mkl build.

* Fix the build properly.
2023-07-19 19:41:11 +01:00
Nicolas Patry 439321745a Removing `candle-hub` internal to extract into `hf-hub` standalone. 2023-07-19 15:04:38 +02:00
Laurent Mazare b8abe2bb4b
Factorize the tokenizers version in the workspace cargo def. (#186) 2023-07-18 06:48:13 +01:00
Laurent Mazare 104f89df31
Centralize the dependency versions and inherit them. (#177) 2023-07-16 07:47:17 +01:00
Nicolas Patry 4ed56d7861 Removing cuda default.
Seems very important for a lot of exploring users, usually on laptops
without GPUs.

Adding more README instructions in a follow-up.
2023-07-14 16:52:15 +02:00
Laurent Mazare ba35d895e7
Sketch the candle-transformers crate. (#147)
* Sketch the candle-transformers crate.

* Format the empty files.
2023-07-12 13:49:31 +01:00