Commit Graph

1950 Commits

Author SHA1 Message Date
Laurent Mazare 020a979de2
Fix clippy lints for 1.76. (#1682) 2024-02-08 16:48:47 +01:00
Laurent Mazare cdc3823d8f
Pickle support: dig within the _rebuild_parameter calls. (#1681) 2024-02-08 13:09:49 +01:00
Dilshod Tadjibaev e5eb9602d0
Add support for loading Fortran contiguous tensors (#1672)
* Add support for loading Fortran contiguous tensors

This commit introduces the ability to handle Fortran contiguous tensors in the tensor loading process. Previously, the code only supported loading tensors that were contiguous in memory, failing with an error for non-contiguous tensors. With this update, tensors identified as Fortran contiguous (column-major order) are now correctly handled by reversing their dimensions after loading. This enhancement ensures broader compatibility with different tensor layouts, improving the robustness of tensor loading operations.

- Check if a tensor is Fortran contiguous using the `is_fortran_contiguous` flag.
- For Fortran contiguous tensors, reverse the dimensions after loading to correctly represent their layout in memory.
- Continue to bail out with an error for tensors that are neither C contiguous nor Fortran contiguous, maintaining the previous behavior for non-contiguous tensors without explicit support.

This change addresses the issue of loading Fortran contiguous tensors, which was previously unsupported, thereby extending the functionality of the tensor loading mechanism to accommodate a wider variety of tensor layouts.

* Add reshape step to handle fortran contiguous case

* Skip fortran contiguous fix if rank is < 2

* Fail on rank 0, 1 if contiguous
2024-02-07 21:49:59 +01:00
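The layout check described in #1672 can be illustrated with a small standalone sketch. It is not the code from the commit: the helper names (`c_strides`, `f_strides`, `classify`, `fortran_to_c`) are hypothetical, strides are assumed to be expressed in elements rather than bytes, and the buffer is copied into row-major order instead of using the reverse-dimensions-then-reshape approach the commit message mentions.

```rust
/// Row-major (C order) strides, in elements, for `shape`.
fn c_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Column-major (Fortran order) strides, in elements, for `shape`.
fn f_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    for i in 1..shape.len() {
        strides[i] = strides[i - 1] * shape[i - 1];
    }
    strides
}

#[derive(Debug, PartialEq)]
enum Layout {
    C,
    Fortran,
    Other,
}

/// Classify a layout from its strides. For rank 0 and 1 the two orders
/// coincide, so such tensors are reported as `Layout::C`.
fn classify(shape: &[usize], strides: &[usize]) -> Layout {
    if strides == c_strides(shape).as_slice() {
        Layout::C
    } else if strides == f_strides(shape).as_slice() {
        Layout::Fortran
    } else {
        Layout::Other
    }
}

/// Copy a column-major buffer into row-major order.
fn fortran_to_c(data: &[f32], shape: &[usize]) -> Vec<f32> {
    let cs = c_strides(shape);
    let fs = f_strides(shape);
    let n: usize = shape.iter().product();
    let mut out = Vec::with_capacity(n);
    for flat in 0..n {
        // Decompose the row-major flat index into a multi-index, then
        // recompute the source offset using the column-major strides.
        let mut rem = flat;
        let mut src = 0;
        for (dim, &c) in cs.iter().enumerate() {
            src += (rem / c) * fs[dim];
            rem %= c;
        }
        out.push(data[src]);
    }
    out
}
```

For the 2x3 matrix `[[1, 2, 3], [4, 5, 6]]`, the column-major buffer is `[1, 4, 2, 5, 3, 6]` with strides `[1, 2]`; `fortran_to_c` returns the values in row-major order.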
Dilshod Tadjibaev b75e8945bc
Enhance pickle to retrieve state_dict with a given key (#1671) 2024-02-06 21:17:33 +01:00
Daniël de Kok a90fc5ca5a
Add `VarBuilder::from_backend` (#1670)
`candle-nn` already exposes a trait to define custom backends. However,
it's not possible to actually construct a `VarBuilder` with a custom
backend because the constructor is not exposed.

This change makes the constructor public and renames it from `new` to
`from_backend` so that it is not mistaken for the primary
constructor (which could be confusing to users).
2024-02-06 15:26:11 +01:00
Laurent Mazare adfae2460a
Fix rustfmt. (#1669) 2024-02-06 12:06:06 +01:00
Guoqing Bao 678f64dd27
Fix token generation in bilingual models (non-English outputs) (#1668)
Co-authored-by: Guoqing Bao <guoqing.bao@enflame-tech.com>
2024-02-06 12:03:53 +01:00
Laurent Mazare b545f54a19
Fix clippy lints. (#1667) 2024-02-06 09:03:36 +01:00
Roma Klapaukh 1ba11f22d6
Fix: pth files don't load on Windows (#1661)
* Don't treat zip path as OS path

* Add a test case

* Add code to generate test pth data
2024-02-06 08:50:55 +01:00
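The "Don't treat zip path as OS path" item in #1661 reflects a general property of ZIP archives: entry names always use `/` as the separator, whereas OS path joining on Windows produces `\`. A minimal sketch of the idea follows; `zip_entry_name` is a hypothetical helper, not the function changed in the commit.

```rust
/// Build a ZIP entry name by joining with '/', independent of the host OS.
/// Joining through `std::path::PathBuf` would insert '\\' on Windows,
/// and the resulting name would not match any entry in the archive.
fn zip_entry_name(dir: &str, file: &str) -> String {
    let dir = dir.trim_end_matches('/');
    if dir.is_empty() {
        file.to_string()
    } else {
        format!("{dir}/{file}")
    }
}
```

With this helper, `zip_entry_name("archive/data", "0")` yields `archive/data/0` on every platform.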
Jiayu Liu 982722019b
add roll function to tensor (#1666) 2024-02-06 08:49:45 +01:00
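As a rough illustration of what a roll operation does, here is a one-dimensional sketch on a plain slice. It assumes the usual convention that a positive shift moves elements toward higher indices with wrap-around; the function below is hypothetical and is not the tensor method added in #1666.

```rust
/// Roll the elements of `data` by `shift` positions, wrapping around.
/// Positive shifts move elements toward the end, negative toward the start.
fn roll<T: Clone>(data: &[T], shift: isize) -> Vec<T> {
    let n = data.len();
    if n == 0 {
        return Vec::new();
    }
    // Normalize the shift into 0..n so negative values wrap correctly.
    let k = shift.rem_euclid(n as isize) as usize;
    let mut out = data.to_vec();
    out.rotate_right(k);
    out
}
```

Rolling `[1, 2, 3, 4, 5]` by 2 gives `[4, 5, 1, 2, 3]`, and rolling it by -1 gives `[2, 3, 4, 5, 1]`.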
Laurent Mazare a83ca2ece0
Bump the crate version to 0.4.0. (#1658) 2024-02-04 19:08:01 +01:00
Tarek 153c940a9c
Update docs to reflect current usage of example (#1610)
modified:   candle-examples/examples/onnx/README.md
2024-02-04 11:59:47 +01:00
Laurent Mazare 50be8a98ba
Quantized support for stable-lm2. (#1654)
* Quantized support for stable-lm2.

* Quantized support for v2-zephyr.
2024-02-04 11:57:05 +01:00
Daniel Clough 58cc896e69
make llama derive clone (#1648)
Co-authored-by: danielclough <danielclough@users.noreply.github.com>
2024-02-04 11:56:03 +01:00
wanglong001 5cdd84e0f6
onnx: add the Flatten operator. (#1638)
* onnx: add the Flatten operator.

* onnx flatten: merge axis condition

---------

Co-authored-by: 王泽龙 <wangzelong@shenqishen.com>
2024-02-03 16:28:47 +01:00
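The ONNX Flatten operator reshapes a tensor into a 2-D matrix: the dimensions before `axis` are folded into the first output dimension and the remaining ones into the second. The shape computation can be sketched as below; `flatten_shape` is a hypothetical helper for illustration and is not taken from the candle-onnx code in #1638.

```rust
/// Output shape of an ONNX-style Flatten for input `dims` and `axis`.
/// A negative `axis` counts from the last dimension. Returns `None`
/// when `axis` falls outside the range [-rank, rank].
fn flatten_shape(dims: &[usize], axis: isize) -> Option<(usize, usize)> {
    let rank = dims.len() as isize;
    let axis = if axis < 0 { axis + rank } else { axis };
    if axis < 0 || axis > rank {
        return None;
    }
    let axis = axis as usize;
    // The product of an empty range is 1, which covers axis == 0
    // and axis == rank.
    let outer: usize = dims[..axis].iter().product();
    let inner: usize = dims[axis..].iter().product();
    Some((outer, inner))
}
```

For an input of shape `[2, 3, 4, 5]`, `axis = 1` gives `(2, 60)` and `axis = 0` gives `(1, 120)`.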
Laurent Mazare a510ddec4e
Mention the new models in the readme. (#1651) 2024-02-03 15:19:57 +01:00
Jani Monoses d32abbce53
Add StableLM-2, StableLM Code and Zephyr variants (#1650)
* Add StableLM Code and Zephyr variants

* Add V2 models

* Update README
2024-02-03 14:58:41 +01:00
Hubert Shelley dfab45e1c8
Supports more audio formats (#1628)
* Supports more audio formats

* Simplify the handling of the different buffer types.

* Check the sample rate.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2024-02-03 14:26:04 +01:00
Bayang 96bc704d17
Update mixformer.rs (#1601)
Update the source link for configuration_mixformer_sequential.py.
The file has been removed, but it is still available at d38e6f954ec29b96fe2cf033937dad64e279b5d9.
2024-02-03 13:42:16 +01:00
Jani Monoses a52d407ae6
Add ConvNeXt model. (#1604) 2024-02-03 13:34:28 +01:00
Laurent Mazare 9e824ec810
Explicit version for packages that are not in the workspace. (#1642) 2024-01-31 18:57:38 +01:00
Laurent Mazare beadb1b434
Explicit candle version so that cargo publish can be used easily. (#1641) 2024-01-31 18:42:22 +01:00
Christopher Fleetwood 6d83d42efb
Merge pull request #1606 from FL33TW00D/feature/larger-batches
fix: larger batches
2024-01-29 15:31:10 +00:00
FL33TW00D b6afb46601
chore: final 2024-01-22 15:15:19 +00:00
ivarflakstad fd7c856564
Merge pull request #1533 from huggingface/ivarflakstad/metal-prng 2024-01-22 07:30:20 +01:00
FL33TW00D 73d79e6092
chore: actual fix 2024-01-19 09:35:42 +00:00
FL33TW00D b1879f17f6
chore: switch to buffer 2024-01-19 08:57:49 +00:00
FL33TW00D 4f79f5df8a
fix: larger batches 2024-01-18 14:30:14 +00:00
ivarflakstad 1cf34368b7
Merge pull request #1602 from mimiquate/fix-metal-kernel-type
Metal: Use uint8_t as output type in int64_t binary op kernel
2024-01-18 08:40:34 +01:00
Gonzalo 17e6e2d7ee
Fixes metal kernel u8 type 2024-01-17 15:47:08 -03:00
Ivar Flakstad 80b1c689f9
Revert public EncoderParam 2024-01-17 18:09:28 +01:00
Ivar Flakstad db923517b3
Merge branch 'main' into ivarflakstad/metal-prng 2024-01-17 18:03:57 +01:00
Nicolas Patry 403680f17d
Quantized GGUF style (#1523)
* Metal quantized modifications proposal.

- Add a device param, wherever needed.
- Create new QMetal storage thing that implements QuantizedType.
- Update everywhere needed.

Fix Python.

Fixing examples.

Fix: fmt + clippy + stub.

Moving everything around.

Only missing the actual implems.

Fixing everything + adding dequantized kernels.

More work.

Fixing matmul.

Fmt + Clippy

Some clippy fixes.

Working state.

Q2K Metal -> Bugged (also present in GGML).
Q4K CPU -> Bugged (present previously, new test catch it).
Q5K CPU -> Bugged (present previously).
Q8_1 Both -> Never really implemented it seems
Q8K metal -> Never implemented in metal

Fixing Q2K bug (present in ggml).

* Cleanup.

* Fix the rebase.

* Removing the fences speeds everything up and *is* correct this time...

* Cleanup the fence.

* After rebase.

* Bad code removal.

* Rebase after phi2 merge + fix replit default to CPU.

* Making the CI happy.

* More happy tests.

---------

Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>
2024-01-17 10:27:58 +01:00
Ivar Flakstad 86a8e58897
Update metal random kernel and set_seed method
* set_seed via buffer content pointer copy + did_modify_range

* ensure random.metal kernel does not write outside of buffer range when tid==0
2024-01-17 09:12:44 +01:00
Jani Monoses 5270224f40
Add MobileOne model. (#1595)
* Add MobileOne model.

* Clippy fixes

* Remove a comment.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2024-01-16 06:34:16 +01:00
dependabot[bot] 7e3349d7c3
Update parquet requirement from 45.0.0 to 50.0.0 (#1592)
Updates the requirements on [parquet](https://github.com/apache/arrow-rs) to permit the latest version.
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/45.0.0...45.0.0)

---
updated-dependencies:
- dependency-name: parquet
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-15 22:35:01 +01:00
dependabot[bot] 1257fc6719
Update safetensors requirement from 0.3.1 to 0.4.1 (#1591)
Updates the requirements on [safetensors](https://github.com/huggingface/safetensors) to permit the latest version.
- [Release notes](https://github.com/huggingface/safetensors/releases)
- [Changelog](https://github.com/huggingface/safetensors/blob/main/RELEASE.md)
- [Commits](https://github.com/huggingface/safetensors/compare/v0.3.1...v0.3.3)

---
updated-dependencies:
- dependency-name: safetensors
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-15 22:34:40 +01:00
Laurent Mazare ea36f3b11f
Use the new phi model by default. (#1589) 2024-01-15 12:30:27 +01:00
Ivar Flakstad 79478ff5a1
Seed should be updated by random kernel result. 2024-01-15 11:58:25 +01:00
Laurent Mazare 86b7c01b30
Update gemm to the latest version. (#1587) 2024-01-15 09:44:51 +01:00
Laurent Mazare bdd8107fda
Expose the ndarray trait. (#1586) 2024-01-14 20:09:49 +01:00
Ivar Flakstad ecf88a6d38
Merge branch 'main' into ivarflakstad/metal-prng 2024-01-14 17:10:54 +01:00
Laurent Mazare e6d86b0819
Add the pow operator. (#1583)
* Add the pow operator.

* Support the pow operation in onnx.
2024-01-13 20:24:06 +01:00
Laurent Mazare 88618255cb
Fix the rotary embeddings for the new phi implementation. (#1582)
* Fix the rotary embeddings for the new phi implementation.

* Match the activation.

* KV cache fix.

* Use the config activation function.
2024-01-13 19:44:41 +01:00
Laurent Mazare 539ead927a
Update the Phi model to use the updated architecture. (#1580)
* Update the Phi model to use the updated architecture.

* Add more of the phi model.

* Repeat KV + caching.

* Apply the rotary embeddings.

* Add support for the new phi model in the phi example.

* Fix a couple glitches.

* Fix a couple more glitches.
2024-01-13 17:38:27 +01:00
SebastianRueClausen a46864bd56
Fix "Minimal Mamba" link in README. (#1577) 2024-01-12 17:47:07 +01:00
Nicolas Patry bafe95b660
Fix format. (#1576) 2024-01-12 14:23:17 +01:00
ivarflakstad a3d92ab226
Metal: Activate bfloat affine and add benchmark (#1543)
* Use cfg to separate benchmark results based on features

* Add bfloat affine and benchmarks

* Fix flops calculation

* Remove allow pragma

* Avoid some unnecessary returns.

* Improve benchmarks layout

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2024-01-12 11:19:49 +01:00
ivarflakstad e90bcdcc7c
Metal: f16 and bf16 where_cond + benchmark (#1545)
* Use cfg to separate benchmark results based on features

* Add metal where_cond for f16 and bf16. Add benchmark

* Remove allow pragma

* Avoid some unnecessary returns.

* Improve benchmarks layout

* Updated feature separated benchmarks

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2024-01-12 11:18:11 +01:00
Laurent Mazare 8e06bfb4fd
Mention VGG in the readme. (#1573) 2024-01-12 09:59:29 +01:00