Commit Graph

1793 Commits

Author SHA1 Message Date
Sacha Arbonel 11ea7aac4d
tests (#1724) 2024-02-23 06:35:46 +01:00
Daniel Varga 32eb56d6b3
Fix typo in README (#1740) 2024-02-22 12:35:26 +01:00
Laurent Mazare 28057781aa
Make the cache for the llama model explicit too. (#1745) 2024-02-22 12:04:33 +01:00
laurent 544018b6d0 Explicit caching in llama2.c. 2024-02-22 10:22:03 +01:00
Laurent Mazare c753f72c85
Support for attention bias in gemma + refactor things a bit. (#1744)
* Support for attention bias in gemma + refactor things a bit.

* Fix the cuda tests.
2024-02-22 09:35:28 +01:00
Kirpal Grewal 8013b50829
Add grads for interpolate1d (#1742)
* add backprop for interpolate1d

* fix clippy lint

* correct fix clippy lint
2024-02-22 08:44:01 +01:00
Laurent Mazare 45d5322d62
Add the Gemma models. (#1741)
* Add the Gemma models.

* Add the gemma example.

* Adapt the RmsNorm.

* Get the 2b model to work.

* 7b support.

* Use the config head dim.

* Yet another fix.

* Make the matrices contiguous.

* Also get the 7b model to work.

* And add to the readme.
2024-02-21 22:02:50 +01:00
Laurent Mazare a2cb2edead
Add a couple backtraces on cpu errors. (#1738) 2024-02-20 19:54:13 +01:00
Laurent Mazare fc67d878bb
Bugfix for conv-transpose1d (#1734)
* Add a currently broken test.

* Bugfix + fix test.
2024-02-19 09:04:49 +01:00
Laurent Mazare 3ba37443e5
Bugfix for applying the bias in conv1d-transpose. (#1732) 2024-02-18 22:51:20 +01:00
Laurent Mazare 1fb728772d
Support for groups in conv-transpose1d. (#1731)
* Groups support in conv-transpose-1d.

* Remove dangling file.
2024-02-18 21:28:07 +01:00
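Several entries above touch conv-transpose1d (groups support, bias handling, the CUDA kernel). For reference, a minimal single-channel, stride-1 transposed convolution with no padding can be sketched as below; this is a generic illustration of the operation, not candle's kernel:

```rust
// Naive 1-D transposed convolution, single channel, stride 1, no padding:
// each input element scatters a scaled copy of the kernel into the output.
// Output length is input_len + kernel_len - 1.
fn conv_transpose1d(input: &[f32], kernel: &[f32]) -> Vec<f32> {
    let mut out = vec![0.0f32; input.len() + kernel.len() - 1];
    for (i, &x) in input.iter().enumerate() {
        for (k, &w) in kernel.iter().enumerate() {
            out[i + k] += x * w;
        }
    }
    out
}

fn main() {
    // [1, 2] with kernel [1, 1, 1]: each input spreads across 3 outputs.
    println!("{:?}", conv_transpose1d(&[1.0, 2.0], &[1.0, 1.0, 1.0]));
}
```

Groups, strides, padding, and bias (the subjects of the fixes above) all layer on top of this basic scatter.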
Laurent Mazare cb86b0c82c
Fix float unpickling. (#1730) 2024-02-18 19:33:55 +01:00
Laurent Mazare 6284ad784c
Module implementation for options. (#1728) 2024-02-18 14:12:55 +01:00
Laurent Mazare 678d44a7f6
Expose the weights and biases in transposed convolutions. (#1727) 2024-02-18 10:35:01 +01:00
Laurent Mazare 41416d2376
Expose more conv1d functions/structs. (#1726) 2024-02-17 18:50:55 +01:00
Laurent Mazare 5ebcfeaf0f
Make the r, k, v tensors contiguous. (#1719) 2024-02-16 09:17:35 +01:00
Laurent Mazare 7c7400fb63
Use the tokenizer-output-stream in the llama example. (#1715)
* Use the tokenizer-output-stream in the llama example.

* Also use tokenizer-output-stream for llama2-c.
2024-02-15 16:47:33 +01:00
Laurent Mazare 058a910d0e
Add a readme for rwkv. (#1712) 2024-02-14 15:31:33 +01:00
Laurent Mazare 26fe162ab5
Custom tokenizer for rwkv. (#1711)
* Custom tokenizer for rwkv.

* Custom tokenizer.

* Getting the tokenizer to work.
2024-02-14 15:11:38 +01:00
Laurent Mazare 121a71e01f
Fix the silu cuda kernel. (#1710) 2024-02-14 11:08:18 +01:00
Laurent Mazare 2d5f2a728d
Add the RWKV model (v5). (#1707)
* Start adding the RWKV model.

* More of the forward step.

* Handle rescaling.

* FeedForward.

* More work on RWKV.

* Better state tracking.

* Finish a first pass on forward.

* Fix the shape mismatches.

* Do not rescale in f32.

* Rename to rwkv-v5.

* Add the new models to the readme.
2024-02-14 10:58:32 +01:00
Jani Monoses 68f7655895
Add ConvNeXt-V2 and smaller model variants. (#1709) 2024-02-14 10:53:07 +01:00
OlivierDehaene b60064780d
feat: add silu activation function (#1706)
* feat: add silu activation function

* use silu/arg in grad

* update candle-nn

* use node
2024-02-14 10:27:22 +01:00
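The SiLU activation added here is defined as silu(x) = x · σ(x), where σ is the logistic sigmoid. A minimal scalar sketch of the function and the derivative used for the gradient (not candle's actual kernel):

```rust
// SiLU (a.k.a. swish): silu(x) = x * sigmoid(x).
fn silu(x: f64) -> f64 {
    x / (1.0 + (-x).exp())
}

// Derivative used for backprop: silu'(x) = s * (1 + x * (1 - s)), s = sigmoid(x).
fn silu_grad(x: f64) -> f64 {
    let s = 1.0 / (1.0 + (-x).exp());
    s * (1.0 + x * (1.0 - s))
}

fn main() {
    println!("silu(1.0)  = {:.4}", silu(1.0));
    println!("silu'(0.0) = {:.4}", silu_grad(0.0));
}
```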
Nicolas Patry 14010a8498
Update our cuda runner. (#1705)
* Update our cuda runner.

* Fix install rust.

* Simplify.

* Docker in docker.

* Install curl

* Install curl

* No sudo.

* devel

* Put curl again.

* Add missing deps.

* pkg-config.

* Cleanup.
2024-02-13 19:06:15 +01:00
Laurent Mazare 0de0795220
Qmetal tweaks (#1704)
* Add the dummy qmetal backend.

* Fix the metal compilation.
2024-02-13 18:11:17 +01:00
Nicolas Patry c1b418586c
Fixing quantized llama demo on metal. (#1703) 2024-02-13 16:28:56 +01:00
Laurent Mazare ad73e93da2
Detach the tensors on batch-norm eval. (#1702)
* Detach the tensors on batch-norm eval.

* Fix pyo3 bindings.

* Black tweak.

* Formatting.

* Also update the pyo3-onnx formatting.

* Apply black.
2024-02-13 14:26:32 +01:00
drbh 13c67226e6
feat: support microphone whisper streaming (#1678)
* feat: support microphone whisper streaming

* fix: cleanup print stmts and adjust how input is read

* fix: remove incorrect comment

* feat: split into new example and simplify

* fix: feature flag example file

* fix: fmt fixes

* feat: simplify and remove redundant files
2024-02-12 18:01:21 +01:00
Laurent Mazare d0aa197b07
ConvTranspose1d cuda support. (#1697)
* ConvTranspose1d cuda support.

* Add the conv-transpose1d kernel.

* Remove some unused variables.
2024-02-12 15:03:18 +01:00
Laurent Mazare 274bf11633
Support defaultdict in PyTorch checkpoints. (#1696)
* Support defaultdict in PyTorch checkpoints.

* Fix clippy lint.
2024-02-12 10:26:56 +01:00
Laurent Mazare 1e26d539d9
Improved mamba model optimized for inference (#1694)
* Sketch the mamba model for inference.

* Complete the forward pass.

* Add the mamba example.

* Optimize the selective-scan part.

* Fix a couple shape mismatches and get inference to work.

* Tweak the readmes.

* More readme tweaks.
2024-02-11 17:04:57 +01:00
Nicolas Patry 74497e6bf7
Fixing the qwen tokenizer location. (#1693)
Using the chatglm one causes a bug where the "<|endoftext|>" token is not
found.
2024-02-11 08:52:36 +01:00
Todsaporn Banjerdkit 8ab384e63d
docs: add trocr examples (#1692) 2024-02-10 16:14:50 +01:00
Laurent Mazare 27ffd644a9
Mention TrOCR in the readmes. (#1691) 2024-02-10 15:49:38 +01:00
Laurent Mazare bf20cc854c
Support sinusoidal embeddings in trocr. (#1690)
* Support sinusoidal embeddings in trocr.

* Support tie-word-embeddings.
2024-02-10 15:17:51 +01:00
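The sinusoidal embeddings added for trocr follow the classic transformer scheme: even dimensions carry sin(pos / 10000^(2i/d)) and odd dimensions the matching cosine. A generic sketch of that scheme, not the trocr implementation itself:

```rust
// Classic sinusoidal position embedding for one position: even indices get
// sin(pos / 10000^(2i/d)), odd indices the cosine at the same frequency.
fn sinusoidal_embedding(pos: usize, dim: usize) -> Vec<f32> {
    (0..dim)
        .map(|i| {
            // Pair index i/2 determines the frequency shared by sin/cos.
            let freq = 10000f32.powf((2 * (i / 2)) as f32 / dim as f32);
            let angle = pos as f32 / freq;
            if i % 2 == 0 { angle.sin() } else { angle.cos() }
        })
        .collect()
}

fn main() {
    // At position 0 all angles are zero: sin -> 0, cos -> 1.
    println!("{:?}", sinusoidal_embedding(0, 4));
}
```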
Laurent Mazare 42ce593ec6
Use the repo config for trocr rather than hardcoding it + small tweaks. (#1689)
* Use the repo config for trocr rather than hardcoding it + small tweaks.

* Add support for the printed models.

* Fail with an appropriate error message on missing position embeddings.
2024-02-10 13:15:03 +01:00
Laurent Mazare 67589791d2
Remove the unused pragma in vit + handle the final layernorm. (#1688) 2024-02-10 11:08:50 +01:00
Laurent Mazare 1c8d61f051
ChatGLM custom tokenizer. (#1687) 2024-02-10 10:47:04 +01:00
Laurent Mazare 90447bc993
Add the custom tokenizer. (#1686) 2024-02-09 17:36:50 +01:00
Laurent Mazare 40ce16001b
Use the proper endoftext token for qwen. (#1685) 2024-02-09 17:02:03 +01:00
Laurent Mazare 5657e596cd
Add the Qwen2 model (#1684)
* Initial check-in for the qwen2 model.

* More qwen2 inference.

* Polish the qwen example.

* Fix the rope basis.

* Get the inference to work.

* Support different model sizes.
2024-02-09 15:02:49 +01:00
Laurent Mazare 0dee8ea19b
Add the ChatGLM model. (#1237)
* Add the ChatGLM model.

* Rotary embeddings.

* Add to the forward pass.

* Add to the forward pass.

* Add the rotary embeddings.

* Add the KV cache.

* Add the chatglm example.

* Bugfix.

* More glm fixes.

* Fix some shape issues.

* Get the inference to work.
2024-02-09 11:51:38 +01:00
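The rotary embeddings mentioned in the ChatGLM and Qwen2 entries rotate consecutive pairs of query/key dimensions by a position-dependent angle. A minimal sketch for a single vector (generic RoPE, not the model-specific code above):

```rust
// Rotary position embedding (RoPE) applied to one vector: each pair
// (x[2i], x[2i+1]) is rotated by theta = pos / base^(2i/d).
fn apply_rope(x: &[f32], pos: usize, base: f32) -> Vec<f32> {
    let d = x.len();
    let mut out = x.to_vec();
    for i in 0..d / 2 {
        let theta = pos as f32 / base.powf((2 * i) as f32 / d as f32);
        let (sin, cos) = theta.sin_cos();
        let (a, b) = (x[2 * i], x[2 * i + 1]);
        out[2 * i] = a * cos - b * sin;
        out[2 * i + 1] = a * sin + b * cos;
    }
    out
}

fn main() {
    // Position 0 means every rotation angle is zero: the vector is unchanged.
    println!("{:?}", apply_rope(&[1.0, 2.0, 3.0, 4.0], 0, 10000.0));
}
```

The "rope basis" fix mentioned in the Qwen2 entry concerns exactly this base exponent term.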
drbh 9cadd4e644
feat: support multithread spectrogram and small perf tweaks (#1674)
* feat: support multithread spectrogram and small perf tweaks

* feat: clippy improvement for loop variable

* fix: add back speed up scale down logic

* fix: readd mirroring logic

* feat: prefer scoped thread and simplify/improve logic/traits
2024-02-08 21:54:12 +01:00
Laurent Mazare 020a979de2
Fix clippy lints for 1.76. (#1682) 2024-02-08 16:48:47 +01:00
Laurent Mazare cdc3823d8f
Pickle support: dig within the _rebuild_parameter calls. (#1681) 2024-02-08 13:09:49 +01:00
Dilshod Tadjibaev e5eb9602d0
Add support for loading Fortran contiguous tensors (#1672)
* Add support for loading Fortran contiguous tensors

This commit adds support for loading Fortran contiguous (column-major) tensors. Previously, tensor loading only handled C contiguous tensors and failed with an error otherwise; Fortran contiguous tensors are now handled by reversing their dimensions after loading, broadening compatibility with different tensor layouts.

- Check whether a tensor is Fortran contiguous using the `is_fortran_contiguous` flag.
- For Fortran contiguous tensors, reverse the dimensions after loading to correctly represent their layout in memory.
- Continue to bail out with an error for tensors that are neither C contiguous nor Fortran contiguous, maintaining the previous behavior.

* Add reshape step to handle fortran contiguous case

* Skip fortran contiguous fix if rank is < 2

* Fail on rank 0, 1 if contiguous
2024-02-07 21:49:59 +01:00
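The dimension-reversal trick described in this entry works because a column-major buffer of shape [r, c] is byte-for-byte a row-major buffer of shape [c, r]; a transpose (or an explicit copy) then restores the logical layout. A standalone sketch of the 2-D copy, independent of candle's actual loader:

```rust
// Copy a Fortran-order (column-major) 2-D buffer into C (row-major) order.
// In column-major storage, element (i, j) of a [rows, cols] tensor lives
// at offset i + j * rows; in row-major it lives at i * cols + j.
fn fortran_to_c_order(data: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(data.len(), rows * cols);
    let mut out = vec![0.0f32; data.len()];
    for i in 0..rows {
        for j in 0..cols {
            out[i * cols + j] = data[i + j * rows];
        }
    }
    out
}

fn main() {
    // Column-major storage of [[1, 2, 3], [4, 5, 6]] is [1, 4, 2, 5, 3, 6].
    println!("{:?}", fortran_to_c_order(&[1.0, 4.0, 2.0, 5.0, 3.0, 6.0], 2, 3));
}
```

The rank < 2 special cases in the follow-up commits exist because for rank 0 and 1 the two orders coincide, so no fix-up is needed.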
Dilshod Tadjibaev b75e8945bc
Enhance pickle to retrieve state_dict with a given key (#1671) 2024-02-06 21:17:33 +01:00
Daniël de Kok a90fc5ca5a
Add `VarBuilder::from_backend` (#1670)
`candle-nn` already exposes a trait to define custom backends. However,
it's not possible to actually construct a `VarBuilder` with a custom
backend because the constructor is not exposed.

This change makes the constructor public and renames it from `new` to
`from_backend` so that it is not mistaken for the primary constructor
(which could be confusing to users).
2024-02-06 15:26:11 +01:00
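The pattern this change enables can be sketched with hypothetical, heavily simplified trait and struct definitions (the real backend trait and `VarBuilder` in candle-nn carry dtype, shape, and device information):

```rust
// Hypothetical minimal backend trait standing in for candle-nn's custom
// backend trait: a source that can look up a named parameter.
trait Backend {
    fn get(&self, name: &str) -> Option<f32>;
}

// Simplified stand-in for VarBuilder: wraps a user-supplied backend,
// constructed via `from_backend` rather than `new` so the name makes
// clear it is not the primary constructor.
struct VarBuilder {
    backend: Box<dyn Backend>,
}

impl VarBuilder {
    fn from_backend(backend: Box<dyn Backend>) -> Self {
        Self { backend }
    }

    fn get(&self, name: &str) -> Option<f32> {
        self.backend.get(name)
    }
}

// Example custom backend that returns zero for every parameter.
struct ZeroBackend;
impl Backend for ZeroBackend {
    fn get(&self, _name: &str) -> Option<f32> {
        Some(0.0)
    }
}

fn main() {
    let vb = VarBuilder::from_backend(Box::new(ZeroBackend));
    println!("{:?}", vb.get("weight"));
}
```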
Laurent Mazare adfae2460a
Fix rustfmt. (#1669) 2024-02-06 12:06:06 +01:00
Guoqing Bao 678f64dd27
Fix token generation in bilingual models (non-English outputs) (#1668)
Co-authored-by: Guoqing Bao <guoqing.bao@enflame-tech.com>
2024-02-06 12:03:53 +01:00