* Add round, floor, ceil into FloatTensorOps trait; Impl round, floor, ceil for tensor, tch, ndarray, candle; Add tests for autodiff
* Add test for round, floor, ceil in burn tensor backend
* Add test for round, floor, ceil in burn candle backend
* Impl round, floor, ceil for burn fusion backend
* Update burn book
* Fix round gradient != 0 issue
* Add tests for halfway cases
* Use `round_ties_even` for `round` in ndarray backend
* Impl round to even for candle
* Add round, floor, ceil for burn router
* Add round, floor, ceil for jit backend; Upgrade cubecl
* Add round_ties_even for no-std in ndarray backend
* Be explicit about what rounding strategy is used
---------
Co-authored-by: Guillaume Lagrange <lagrange.guillaume.1@gmail.com>
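For reference, a minimal plain-Rust sketch of the ties-to-even strategy adopted above for `round` (std's `f32::round_ties_even`, Rust 1.77+, behaves the same); this is illustrative only, not the ndarray backend's implementation:

```rust
/// Illustrative round-half-to-even; halfway cases go to the nearest even integer.
fn round_ties_even(x: f32) -> f32 {
    let floor = x.floor();
    let diff = x - floor;
    if diff > 0.5 {
        floor + 1.0
    } else if diff < 0.5 {
        floor
    } else {
        // Exactly halfway: pick the even neighbour.
        if (floor as i64) % 2 == 0 { floor } else { floor + 1.0 }
    }
}

fn main() {
    assert_eq!(round_ties_even(2.5), 2.0); // halfway rounds to even
    assert_eq!(round_ties_even(3.5), 4.0);
    assert_eq!(round_ties_even(-2.5), -2.0);
    assert_eq!(2.5f32.round(), 3.0); // std `round` rounds halfway away from zero
}
```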
* Made compatible with thumbv6m-none-eabi
* Added example of no_std on rp2040
* Added documentation on usage in no_std
* Rename rp2040 example and add README.md
* Move QuantizationScheme to burn-tensor
* Refactor QuantizedTensorPrimitive to include the quantization strategy
* Fix QFloat tensor data display
* Refactor quantization methods to use scheme and qparams (on backend device)
* Fix clippy
* Fix fmt
* Add qtensor primitive tests
* Add QuantizationBackend, QTensorOps and QTensor
* Refactor QTensorOps as part of Backend trait
* Add tensor dequantize, QFloat dtype and default affine/symmetric quant
* Add ndarray default quantization implementation
* Fix clippy
* Add rayon parallel iter
* Add quantization operations to book
* Add q_shape and q_device ops to avoid converting the tensor just to get attributes
* Implement autodiff grad ops
* Mark autodiff todo for QAT
* Remove note
* Add q_inner and q_from_inner
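As a concrete picture of the affine scheme mentioned above, here is a minimal, self-contained sketch of per-tensor affine quantization to `i8` with scale and zero-point qparams; it only illustrates the math and is not Burn's quantization API:

```rust
// Illustrative affine (asymmetric) quantization to i8; not Burn's actual API.
fn quantize_affine(x: f32, scale: f32, zero_point: i8) -> i8 {
    let q = (x / scale).round() + zero_point as f32;
    q.clamp(i8::MIN as f32, i8::MAX as f32) as i8
}

fn dequantize_affine(q: i8, scale: f32, zero_point: i8) -> f32 {
    (q as i32 - zero_point as i32) as f32 * scale
}

fn main() {
    // Calibrate scale/zero-point from an observed range of [-1.0, 2.0].
    let (min, max) = (-1.0f32, 2.0f32);
    let scale = (max - min) / 255.0;
    let zero_point = (-128.0 - min / scale).round() as i8;

    let q = quantize_affine(0.5, scale, zero_point);
    let x = dequantize_affine(q, scale, zero_point);
    // Round-trip error is bounded by the quantization step.
    assert!((x - 0.5).abs() < scale);
}
```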
* Implement 3D and transposed 3D convolutions.
* Merge changes from onnx-ir #1921 pr
---------
Co-authored-by: Dilshod Tadjibaev <939125+antimora@users.noreply.github.com>
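For quick reference alongside the 3D convolution work above, a sketch of the standard output-size formulas for regular and transposed 3D convolution (illustrative plain Rust with `[depth, height, width]` ordering, not the Burn implementation):

```rust
// Standard output-size formulas for 3D and transposed 3D convolution; illustrative only.
fn conv3d_out(
    size: [usize; 3],
    kernel: [usize; 3],
    stride: [usize; 3],
    padding: [usize; 3],
    dilation: [usize; 3],
) -> [usize; 3] {
    core::array::from_fn(|i| {
        (size[i] + 2 * padding[i] - dilation[i] * (kernel[i] - 1) - 1) / stride[i] + 1
    })
}

fn conv_transpose3d_out(
    size: [usize; 3],
    kernel: [usize; 3],
    stride: [usize; 3],
    padding: [usize; 3],
    dilation: [usize; 3],
) -> [usize; 3] {
    core::array::from_fn(|i| {
        (size[i] - 1) * stride[i] + dilation[i] * (kernel[i] - 1) + 1 - 2 * padding[i]
    })
}

fn main() {
    // A 16x16x16 input with a 3x3x3 kernel, stride 1, no padding -> 14x14x14.
    assert_eq!(conv3d_out([16; 3], [3; 3], [1; 3], [0; 3], [1; 3]), [14; 3]);
    // The transposed convolution recovers the original spatial size.
    assert_eq!(conv_transpose3d_out([14; 3], [3; 3], [1; 3], [0; 3], [1; 3]), [16; 3]);
}
```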
* Move distribution to module
* Add new TensorData with serialization support
* Implement display and from for TensorData
* Add missing Cargo.lock
* Add missing bytemuck feature
* Add zeros, ones, full and random TensorData methods
* Refactor Data -> TensorData usage
* Fix tests
Since TensorData is no longer generic over the element type, the compiler can no longer infer it; we must explicitly cast the expected results to the backend's element type (see the TensorData sketch below).
* Remove commented line
* Fix import
* Add record-backward-compat
* Remove dim const generic from TensorData
* Support NestedValue de/serialization with TensorData
* Fix burn-jit tests
* Remove eprintln
* Refactor onnx import to use TensorData
* Fix tch from_data
* Fix nested value serialization for u8
* Fix missing import
* Fix reduce min onnx test
* Fix deprecated attribute
* Remove shape getter
* Remove strict assert in tests
* Add tensor data as_bytes
* Add tensor check for rank mismatch
* Fix typo (dimensions plural)
* Fix error message
* Update book examples with from_data and fix Display impl for TensorData
* Add deprecation note
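The gist of the TensorData refactor above, as a toy, self-contained sketch: values live in a byte buffer next to a runtime dtype and shape instead of a type generic over the element and a const rank, which is why tests now cast expected values explicitly. The types below are illustrative, not the burn-tensor ones:

```rust
// Toy illustration of the bytes + runtime dtype design; not Burn's actual types.
#[derive(Debug, PartialEq)]
enum DType { F32, I64 }

struct TensorData {
    bytes: Vec<u8>,
    shape: Vec<usize>,
    dtype: DType,
}

impl TensorData {
    fn from_f32(values: &[f32], shape: Vec<usize>) -> Self {
        let bytes = values.iter().flat_map(|v| v.to_ne_bytes()).collect();
        Self { bytes, shape, dtype: DType::F32 }
    }

    fn as_f32(&self) -> Vec<f32> {
        // A runtime dtype check replaces the old compile-time element generic.
        assert_eq!(self.dtype, DType::F32);
        self.bytes
            .chunks_exact(4)
            .map(|c| f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
            .collect()
    }
}

fn main() {
    let data = TensorData::from_f32(&[1.0, 2.0, 3.0], vec![3]);
    assert_eq!(data.shape, vec![3]);
    assert_eq!(data.as_f32(), vec![1.0, 2.0, 3.0]);
}
```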
* Element already implements One
* Add element module
* Add our own traits for Zero, One and ToPrimitive to support bool Element
* Fix typo
* Add basic tests for ToPrimitive with expected values
* The most important change of all
* Remove One + Zero identities
* Move zero/one outside mapv + refactor ToPrimitive -> ToElement trait
* Add num-traits to NOTICES.md
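A rough sketch of the motivation for the custom traits above: unlike num-traits' `ToPrimitive`, a crate-local `ToElement` can be implemented for `bool`, letting `bool` act as a tensor element. The trait below is an illustration, not the exact burn-tensor definition:

```rust
// Illustrative ToElement-style trait; having our own trait lets us impl it for bool.
trait ToElement {
    fn to_f32(&self) -> f32;
    fn to_i64(&self) -> i64;
}

impl ToElement for bool {
    fn to_f32(&self) -> f32 { if *self { 1.0 } else { 0.0 } }
    fn to_i64(&self) -> i64 { i64::from(*self) }
}

impl ToElement for f32 {
    fn to_f32(&self) -> f32 { *self }
    fn to_i64(&self) -> i64 { *self as i64 }
}

fn main() {
    assert_eq!(true.to_f32(), 1.0);
    assert_eq!(false.to_i64(), 0);
}
```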
* Upgrade Rust dependencies (#1747)
* Revert upgrade for tch
The update of tch on Windows gives an error:
INTEL MKL ERROR: The specified module could not be found. mkl_vml_avx2.1.dll.
Intel MKL FATAL ERROR: cannot load mkl_vml_avx2.1.dll or mkl_vml_def.1.dll.
* Keep only the .cargo/config.toml file, which works with Rust > 1.75
---------
Co-authored-by: Sylvain Benner <sylvain@benner.online>
* Move HandleContainer and tensor ops descriptions to burn-tensor
Move HandleContainer and the tensor operation descriptions to the burn-tensor crate.
Remove FusionDevice and replace it with a DeviceOps trait bound on Backend::Device.
For now, the modules added to burn-tensor are excluded from no-std, as they rely on Arc.
* [burn-tensor] Flatten module hierarchy for tensor representation
+ Add new `repr` feature to the Cargo file.
* Remove prefix on docstring
* [burn-fusion] Require default features of burn-tensor
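A rough sketch of the DeviceOps idea described above: the device type itself exposes an identifier, so fusion no longer needs a dedicated FusionDevice trait. Names and fields are approximations, not the exact burn-tensor definitions:

```rust
// Illustrative device identifier; fields are approximate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DeviceId {
    type_id: u16,
    value: u32,
}

// Sketch of a DeviceOps-style bound that Backend::Device can satisfy directly.
trait DeviceOps: Clone + Default + PartialEq {
    fn id(&self) -> DeviceId;
}

#[derive(Clone, Default, PartialEq)]
struct CpuDevice;

impl DeviceOps for CpuDevice {
    fn id(&self) -> DeviceId {
        DeviceId { type_id: 0, value: 0 }
    }
}

fn main() {
    assert_eq!(CpuDevice.id(), DeviceId { type_id: 0, value: 0 });
}
```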
* Add remainder_scalar op to numeric trait and associated int/float functions
* Update burn-tch crate
* Update ndarray crate
* Update jit crate
* Update candle crate
* Update fusion crate
* Update autodiff crate
* Forgot float.rs for fusion
* Add burn-tensor tests
* Redirect to the pre-existing modulus op
* Fix sign
* Remove mut from burn-tch
* Use sign trick to make the wgpu backend work (see the sketch after this entry)
* Add more unit tests to cover the bases
* Naming fix for burn-fusion
* Update tests with PyTorch link
* Use different WGSL instructions for remainder
* Redirect to remainder Operator instead of modulo
* Revert Modulo in instruction.rs
* Add training support for nearest interpolation
---------
Co-authored-by: yurzhang <yurzhang.oi@gmail.com>
Co-authored-by: Dilshod Tadjibaev <939125+antimora@users.noreply.github.com>
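The sign trick mentioned in the remainder entry above, in scalar plain-Rust form: a truncated modulo (sign follows the dividend, as with WGSL's `%`) is adjusted so the result's sign follows the divisor, matching PyTorch's `torch.remainder`. Illustrative only, not the WGSL kernel:

```rust
// Illustrative remainder with divisor-sign semantics, built from a truncated modulo.
fn remainder(lhs: f32, rhs: f32) -> f32 {
    let r = lhs % rhs; // truncated modulo: sign follows lhs
    if r != 0.0 && (r < 0.0) != (rhs < 0.0) {
        r + rhs // shift into the divisor's sign
    } else {
        r
    }
}

fn main() {
    assert_eq!(remainder(-3.0, 2.0), 1.0); // PyTorch: torch.remainder(-3, 2) == 1
    assert_eq!(remainder(3.0, -2.0), -1.0);
    assert_eq!(remainder(4.0, 2.0), 0.0);
}
```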
* Add int_random to int tensor ops
* Int random for tch backend
* Int random for burn-fusion
* int random for autodiff
* Int random for candle backend
* Int random for ndarray backend
* Int random for wgpu backend
* Merge imports
* Typo
* Shader file for int uniform distribution
* Create AutotuneOperationSet and public int_sum_dim_autotune
* Adjust bounds to 0..10
* Create uniform_int_kernel, unit tests, use new kernel (a scalar sketch of the mapping appears at the end of this entry)
* Reduction kernels for regular and shared memory sum_dim int operations
* Macro that accommodates wgpu IntElement
* Add autotuning to int_mean_dim
* Use correct macro for Int autotuning
* Add int_mean_dim_shared_memory
* Add int_mean_dim and unit test
* Create autotunables for mean_dim
* Run fmt
* Remove comment
* Finish resolving merge conflict, fix doc
* Make the element trait bound a parameter to reduce_tune_ops macro
* Update book
* Fix requested change
* Change range to [0, 255] and update test accordingly
* Forgot to include candle in last commit
* Fix comment
* Use correct int autotune for mean dim
* Fix typo - not sure how this passed earlier
* Resolve syntax issues from merge
* Fix cast_float
* Saving here
* Continue fixing merge conflicts, all tests pass locally
* Run fmt
* Change cast_float to cast_u32_to_float
* Make uniform_int_inner_loop safer
* Be even more explicit about u32 casts
* Skip an intermediate step and cast directly to u32
* Replace JitElement + Element with IntElement
* Run fmt
* This should fix the CI
* This time for sure
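For context on the `uniform_int_kernel` items above, a scalar sketch of mapping raw PRNG bits to a uniform integer in `[low, high)`; the helper name and the modulo mapping are illustrative assumptions (the actual kernel runs per GPU thread and scales via a float cast):

```rust
// Illustrative only: map random u32 bits to an integer uniform over [low, high).
fn uniform_int(bits: u32, low: i32, high: i32) -> i32 {
    let range = (high - low) as u32;
    // Modulo maps the bits into [0, range); for ranges much smaller than 2^32
    // the bias is negligible. The GPU kernel scales via a float cast instead.
    low + (bits % range) as i32
}

fn main() {
    for bits in [0u32, 12345, u32::MAX / 2, u32::MAX] {
        let v = uniform_int(bits, 0, 255);
        assert!((0..255).contains(&v));
    }
}
```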