* Add the Gemma models.
* Add the gemma example.
* Adapt the RmsNorm.
* Get the 2b model to work.
* 7b support.
* Use the config head dim.
* Yet another fix.
* Make the matrices contiguous.
* Also get the 7b model to work.
* And add to the readme.
* Start adding the RWKV model.
* More of the forward step.
* Handle rescaling.
* FeedForward.
* More work on RWKV.
* Better state tracking.
* Finish a first pass on forward.
* Fix the shape mismatches.
* Do not rescale in f32.
* Rename to rwkv-v5.
* Add the new models to the readme.
* feat: support microphone whisper streaming
* fix: cleanup print stmts and adjust how input is read
* fix: remove incorrect comment
* feat: split into new example and simplify
* fix: feature flag example file
* fix: fmt fixes
* feat: simplify and remove redundant files
* Sketch the mamba model for inference.
* Complete the forward pass.
* Add the mamba example.
* Optimize the selective-scan part.
* Fix a couple of shape mismatches and get inference to work.
* Tweak the readmes.
* More readme tweaks.
* Use the repo config for trocr rather than hardcoding it + small tweaks.
* Add support for the printed models.
* Fail with an appropriate error message on missing position embeddings.
* Initial check-in for the qwen2 model.
* More qwen2 inference.
* Polish the qwen example.
* Fix the rope basis.
* Get the inference to work.
* Support different model sizes.
* Add the ChatGLM model.
* Rotary embeddings.
* Add to the forward pass.
* Add the rotary embeddings.
* Add the KV cache.
* Add the chatglm example.
* Bugfix.
* More glm fixes.
* Fix some shape issues.
* Get the inference to work.
* feat: support multithread spectrogram and small perf tweaks
* feat: clippy improvement for loop variable
* fix: add back speed up scale down logic
* fix: re-add mirroring logic
* feat: prefer scoped thread and simplify/improve logic/traits
* Add support for loading Fortran contiguous tensors
This commit adds handling for Fortran contiguous (column-major) tensors in the tensor loading process. Previously, only tensors that were C contiguous in memory could be loaded, and any other layout failed with an error. Tensors identified as Fortran contiguous are now handled by reversing their dimensions after loading, extending the loader to a wider variety of tensor layouts. A sketch of the resulting logic follows the related fixes below.
- Check whether a tensor is Fortran contiguous using the `is_fortran_contiguous` flag.
- For Fortran contiguous tensors, reverse the dimensions after loading so the in-memory layout is represented correctly.
- Continue to bail out with an error for tensors that are neither C contiguous nor Fortran contiguous, preserving the previous behavior for layouts without explicit support.
* Add reshape step to handle fortran contiguous case
* Skip fortran contiguous fix if rank is < 2
* Fail on rank 0, 1 if contiguous
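
Not the actual candle implementation, just a minimal Rust sketch of the logic described in the entries above, with a hypothetical `tensor_from_raw` helper and an assumed raw `f32` buffer plus `is_fortran_contiguous` flag: interpret column-major data with reversed dimensions, then permute the axes back, skipping the fix for ranks below 2.

```rust
use candle_core::{Device, Result, Tensor};

// Hypothetical helper, not the actual candle code: build a tensor from a raw
// buffer that may be stored in Fortran (column-major) order.
fn tensor_from_raw(
    data: Vec<f32>,
    dims: &[usize],
    is_fortran_contiguous: bool,
    device: &Device,
) -> Result<Tensor> {
    // C contiguous buffers, and rank 0/1 tensors where both layouts coincide,
    // can be reshaped directly.
    if !is_fortran_contiguous || dims.len() < 2 {
        return Tensor::from_vec(data, dims, device);
    }
    // Column-major data is a valid row-major tensor once the dimensions
    // are reversed...
    let reversed: Vec<usize> = dims.iter().rev().copied().collect();
    let t = Tensor::from_vec(data, reversed, device)?;
    // ...then permuting the axes back yields the logical shape `dims`.
    let perm: Vec<usize> = (0..dims.len()).rev().collect();
    t.permute(perm)
}
```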
`candle-nn` already exposes a trait for defining custom backends. However,
it's not possible to actually construct a `VarBuilder` with a custom
backend because the constructor is not exposed.
This change makes the constructor public and renames it from `new` to
`from_backend` so that it is not mistaken for the primary
constructor (which could be confusing to users).
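
A rough usage sketch, not taken from the candle docs: `ZeroBackend` is made up, and the `SimpleBackend` trait methods and the `from_backend` signature shown here are assumptions that may differ slightly from the crate. The idea is to implement the backend trait and then build a `VarBuilder` through the newly public constructor.

```rust
use candle_core::{DType, Device, Result, Shape, Tensor};
use candle_nn::var_builder::SimpleBackend;
use candle_nn::{Init, VarBuilder};

// Made-up backend that materializes every requested tensor as zeros.
struct ZeroBackend;

impl SimpleBackend for ZeroBackend {
    fn get(&self, s: Shape, _name: &str, _h: Init, dtype: DType, dev: &Device) -> Result<Tensor> {
        Tensor::zeros(s, dtype, dev)
    }

    fn contains_tensor(&self, _name: &str) -> bool {
        true
    }
}

fn main() -> Result<()> {
    // The newly exposed constructor: a boxed custom backend plus dtype and device.
    let vb = VarBuilder::from_backend(Box::new(ZeroBackend), DType::F32, Device::Cpu);
    let w = vb.get((2, 3), "weight")?;
    println!("{w}");
    Ok(())
}
```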