Commit Graph

64 Commits

Author SHA1 Message Date
Nathaniel Simard 978ac6c4ec
Chore: Update to newer cubecl version (#2181) 2024-08-25 15:33:16 -04:00
Nathaniel Simard bb4a605ca6
Chore/integrate updated cubecl (#2142) 2024-08-08 16:19:39 -04:00
Nathaniel Simard 19cd67a9e2
Migration/cubecl (#2041) 2024-07-22 11:08:40 -04:00
Nathaniel Simard 35345de62a
Feat/cube/slice (#2004)
* Refactor Variable types

* Slice

* Implement slice wgsl

* handle lifetime correctly

* Add cuda impl

* Update cmma

* Cleanup

* Fix tests

* Fix slice signature
2024-07-11 11:28:53 -04:00
Nathaniel Simard e7a3cc4fba
Fix wgsl remainder definition (#1979) 2024-07-08 15:26:10 -04:00
Arthur Brussee 3f9e97946f
Feat: Dynamic cube count dispatch (#1975) 2024-07-06 19:17:01 -04:00
Nathaniel Simard b331290f8a
Refactor/jit/unary (#1965) 2024-07-05 19:47:24 -04:00
Sylvain Benner d6efb3ca17
Set DEFAULT_MAX_TASKS to 1 when running tests 2024-07-05 18:57:01 -04:00
Arthur Brussee 0928a52eea
Always derive Cube features from adapter (#1958) 2024-07-05 17:38:07 -04:00
Nathaniel Simard 51aea94a30
Dynamic memory management preset + updated wgpu buffer memory management (#1962)
---------

Co-authored-by: mepatrick73 <pameu17@ulaval.ca>
2024-07-04 16:47:08 -04:00
Nathaniel Simard f709858a8b
Revert "Perf: cube reuse shape and strides (#1939)" (#1967)
This reverts commit ad81a997af.
2024-07-04 16:16:17 -04:00
Nathaniel Simard 82a883a57d
Feat/cube/fma (#1947) 2024-07-02 08:32:39 -04:00
Nathaniel Simard cb6b5e7183
Feat/cube/cooperative matrix-multiply and accumulate. (#1943) 2024-07-02 08:31:00 -04:00
Nathaniel Simard ad81a997af
Perf: cube reuse shape and strides (#1939) 2024-07-02 08:28:32 -04:00
Arthur Brussee 849c8f453b
Consistent sync/async handling, allow more functions to be async for wasm. (#1936) 2024-07-02 08:25:28 -04:00
Arthur Brussee 14d1bbba64
Do not use default burn-compute features unless enabled. (#1908) 2024-06-19 10:12:11 -04:00
Nathaniel Simard 4f6db974a1
Perf/dynamic mm (#1906) 2024-06-18 08:41:07 -04:00
Arthur Brussee ac9f942a46
Remove GraphicsAPI generic for WgpuRuntime (#1888) 2024-06-17 09:04:25 -04:00
Nathaniel Simard 5e58ae1a02
Refactor the tuner to be used standalone (#1884)
* Refactor the tuner to be used standalone

* Add a name for the autotune cache

* Fix tests

* Fix typo
2024-06-13 13:23:58 -04:00
Arthur Brussee c873d87ac8
Add option to flush queue instead of waiting for completion. (#1864)
* Make sync_type an option on sync instead of adding submit
2024-06-13 09:56:08 -04:00
Arthur Brussee 4b174a88bd
Get resources from server (#1861) 2024-06-06 17:33:57 -04:00
Arthur Brussee 75e26d03c3
Speedup client.create for small allocations. (#1858)
* Speedup client.create for small allocations.
2024-06-06 17:09:01 -04:00
Arthur Brussee e0a1094f89
Add a feature to initialize from an existing wgpu adapter/device/queue (#1788)
* Add a feature to initialize from an existing wgpu adapter/device/queue

This is useful when interacting with other wgpu applications (e.g. displaying a burn tensor as a texture in egui). The existing devices are keyed by the wgpu Device ID. Alternatively, they could be keyed per adapter, which would be more in line with other burn WgpuDevices (one per adapter), but there is no real inherent reason to do so.

This also involves making Queue into an Arc. Alternatively, this could give up ownership of the queue, but it's helpful to be able to synchronize burn operations and custom wgpu operations.
2024-06-05 07:19:52 -04:00
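The commit body above mentions wrapping the queue in an `Arc` so that both Burn and the host application can keep submitting work to the same queue. A minimal sketch of that sharing pattern, using a hypothetical `Queue` stand-in rather than the real `wgpu::Queue`:

```rust
use std::sync::Arc;

// Hypothetical stand-in for wgpu::Queue, just to illustrate the ownership pattern.
struct Queue {
    label: String,
}

struct BurnRuntime {
    queue: Arc<Queue>,
}

struct HostApp {
    queue: Arc<Queue>,
}

fn main() {
    let queue = Arc::new(Queue { label: "shared".into() });

    // Both the runtime and the host application hold the same queue,
    // so either side can synchronize or submit work without taking ownership.
    let runtime = BurnRuntime { queue: Arc::clone(&queue) };
    let app = HostApp { queue: Arc::clone(&queue) };

    assert_eq!(Arc::strong_count(&queue), 3);
    assert_eq!(runtime.queue.label, app.queue.label);
}
```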
mepatrick73 36ed65a5cd
Feat/dynamic mm basic implementation + small refactor (#1844) 2024-06-04 17:01:33 -04:00
Nathaniel Simard 36d4bcd705
[Refactor - Breaking] Refactor cube operations with better names & Support subgroup operations (#1839) 2024-05-31 17:07:21 -04:00
Louis Fortier-Dubois de0b49e4a3
Cube: Topology constants (#1838)
---------

Co-authored-by: nathaniel <nathaniel.simard.42@gmail.com>
2024-05-30 12:03:30 -04:00
Guillaume Lagrange e4836241e1
Fix `DataSerialize` conversion for elements of the same type (#1832) 2024-05-28 18:12:44 -04:00
Louis Fortier-Dubois 033171920c
Cube: first ported kernel + comptime support + variable reuse + cleanup (#1797) 2024-05-22 14:08:21 -04:00
Louis Fortier-Dubois 76fe0ed881
Refactor/cube/vectorization (#1781) 2024-05-19 13:20:55 -04:00
Louis Fortier-Dubois 499ff0dd26
Feat/enable cube cl (#1777)
* Ben WIP

* Compile burn-jit

* WGPU works

* Remove old code

* move language cube stuff

* cleaning up

* some import reworking

* remove cube reexport

* template feature flag in cube

* ci

---------

Co-authored-by: nathaniel <nathaniel.simard.42@gmail.com>
2024-05-19 10:55:04 -04:00
Ahmed Yarub Hani Al Nuaimi 10737527d8
#1747 Upgrade Rust dependencies (#1748)
* #1747
Upgrade Rust dependencies

* Revert upgrade for tch

The update of tch on windows gives an error:

INTEL MKL ERROR: The specified module could not be found. mkl_vml_avx2.1.dll.
Intel MKL FATAL ERROR: cannot load mkl_vml_avx2.1.dll or mkl_vml_def.1.dll.

* Keep only .cargo/config.toml file which works with rust > 1.75

---------

Co-authored-by: Sylvain Benner <sylvain@benner.online>
2024-05-10 16:25:19 -04:00
Thierry Cantin-Demers b09d8431df
Fix Cargo.toml repository links (#1749)
* Fix wgpu github link

* Fix burn-train repo link

* Fix burn-tensor github repo

* Fix burn-tensor repo link

* Fix remaining repo links in crates Cargo.toml

---------

Co-authored-by: Jonathan Richard <47578360+jwric@users.noreply.github.com>
2024-05-09 15:40:05 -04:00
Sébastien Boisvert bd06b38fac
Refactor: replace trait TemplateKernel by existing trait JitKernel (#1737)
* Refactor: replace trait TemplateKernel by existing trait JitKernel

* Refactor: implement trait JitKernel for struct Kernel
2024-05-06 20:59:00 -04:00
Nathaniel Simard 587b8f80b3
First draft CUDA runtime (#1685)
Initial cuda runtime crate with a WIP compiler.
2024-04-30 09:46:29 -04:00
Guillaume Lagrange ce2429eb10
Refactor element type to be decoupled from runtime (#1693) 2024-04-26 08:53:55 -04:00
Nathaniel Simard 599a20d586
Upgrade wgpu (#1692) 2024-04-25 16:32:50 -04:00
Nathaniel Simard 886a1de235
Refactor/burn compute (#1580) 2024-04-23 13:05:15 -04:00
Sylvain Benner c579686a8a
Move HandleContainer and Tensor Ops descriptions from burn-fusion to burn-tensor (#1654)
* Move HandleContainer and Tensor Ops description to burn-tensor

Move HandleContainer and Tensor operations descriptions to burn-tensor crate.
Removed the FusionDevice and replaced it with a DeviceOps trait bound to Backend::Device.

For now added modules to burn-tensor are excluded from no-std as they rely on Arc.

* [burn-tensor] Flatten module hierarchy for tensor representation

+ Add new repr feature to cargo file.

* Remove prefix on docstring

* [burn-fusion] Require default features of burn-tensor
2024-04-23 11:27:54 -04:00
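The commit above describes replacing FusionDevice with a `DeviceOps` trait bound on `Backend::Device`. A hypothetical sketch of that trait-bound pattern; the names mirror the commit message but are not Burn's exact definitions:

```rust
// Hypothetical sketch: a DeviceOps trait bound on the backend's associated
// Device type, so generic code can rely on device capabilities directly.
trait DeviceOps: Clone {
    fn id(&self) -> usize;
}

trait Backend {
    type Device: DeviceOps;
}

#[derive(Clone)]
struct CpuDevice;

impl DeviceOps for CpuDevice {
    fn id(&self) -> usize {
        0
    }
}

struct Cpu;

impl Backend for Cpu {
    type Device = CpuDevice;
}

// Generic code no longer needs a separate fusion-specific device trait:
// every backend's device is guaranteed to implement DeviceOps.
fn device_id<B: Backend>(device: &B::Device) -> usize {
    device.id()
}

fn main() {
    assert_eq!(device_id::<Cpu>(&CpuDevice), 0);
}
```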
Mathias Insley 7377bbe31c
Feat/remainder (#1597)
* Add remainder_scalar op to numeric trait and associated int/float functions

* Update burn-tch crate

* Update ndarray crate

* Update jit crate

* Update candle crate

* Update fusion crate

* Update autodiff crate

* Forgot float.rs for fusion

* Add burn-tensor tests

* Redirect to the pre-existing modulus op

* Fix sign

* Remove mut from burn-tch

* Use sign trick to make wgpu backend work

* Add more unit tests to cover the bases

* Naming fix for burn-fusion

* Update tests w/PyTorch link

* Use different WGSL instructions for remainder

* Redirect to remainder Operator instead of modulo

* Revert Modulo in instruction.rs
2024-04-16 08:35:20 -04:00
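The "sign trick" mentioned in the commits above distinguishes the truncated remainder (what `%` yields in Rust and WGSL, sign following the dividend) from the floored remainder (sign following the divisor, matching PyTorch's `remainder`). A minimal standalone sketch of the trick, not the actual kernel code:

```rust
// Floored remainder via the sign trick: start from the truncated remainder,
// and shift by the divisor when the signs disagree.
fn rem_floored(a: f64, b: f64) -> f64 {
    let r = a % b; // truncated remainder: sign follows `a`
    if r != 0.0 && (r < 0.0) != (b < 0.0) {
        r + b // signs disagree: shift into the divisor's sign
    } else {
        r
    }
}

fn main() {
    assert_eq!(-7.0_f64 % 3.0, -1.0);        // truncated: sign follows a
    assert_eq!(rem_floored(-7.0, 3.0), 2.0); // floored: sign follows b
    assert_eq!(rem_floored(7.0, -3.0), -2.0);
    assert_eq!(rem_floored(7.0, 3.0), 1.0);
}
```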
Sylvain Benner e303e31c8b
Bump next version of Burn to 0.14.0 (#1618) 2024-04-12 17:14:45 -04:00
Guillaume Lagrange 264c167c11
Update licenses symlinks (#1613) 2024-04-12 14:43:58 -04:00
Louis Fortier-Dubois f5159b6d22
Refactor: split JitKernel and SourceKernel (#1569)
* refactor execute_dynamic into Execution

* minor change

* extension cfg

* jitkernel and sourcekernel

* add todo statement

* cleanup and docs

* update book

* fix server dependency on compiler

* refactor into shader information

* refactor to compile shader once

* clippy

* clippy

* clippy

* fix doc

* fix doc

* fmt

* rename feature flag

* refactor

* All broken

* compile at the right time

* todo done

* all dynamic

* all dynamic in template too

* fmt

* fix ci

---------

Co-authored-by: nathaniel <nathaniel.simard.42@gmail.com>
2024-04-05 12:58:10 -04:00
Nathaniel Simard efc3b2d243
[Breaking] add runtime options in wgpu init methods (#1505) 2024-03-28 12:44:38 -04:00
Nathaniel Simard 40a26bd2ea
Feat/backend bridge (#1529) 2024-03-26 19:24:45 -04:00
Louis Fortier-Dubois da5b0438ec
Migrate/jit/pooling (#1509)
* separate forward backward

* refactor with pool strategy

* refactor further

* pooling refactored

* refactoring for adaptive wip

* wip adaptive

* adaptive

* delete some wgsl

* avg pool backward

* clippy

* minor refactor
2024-03-25 16:04:58 -04:00
Louis Fortier-Dubois dd699a90a2
Migrate/jit/matmul tiling 2d (#1472)
* refactor matmul files

* wip refactor matmul

* everything is memco

* support local arrays

* advancing tiling2d

* advancing tiling2d

* advancing tiling2d

* tiling2d finished but buggy

* configurable unrolling

* not bugged

* fails on unroll

* stupid break

* tiling2d no assumption works

* clippy

* bounds check as bool

* lhs rhs as enum

* tiling 2d major refactor

* remove assign vec4

* variable declarations above loops

* fmt

* clippy

* Fix autotune + unroll

* move val

* clippy

* fmt

---------

Co-authored-by: nathaniel <nathaniel.simard.42@gmail.com>
2024-03-22 08:26:32 -04:00
Louis Fortier-Dubois 278fcb3dad
Migrate/jit/mask (#1456) 2024-03-12 12:43:05 -04:00
Louis Fortier-Dubois 02d37011ab
Fix/main/print (#1459) 2024-03-11 18:52:36 -04:00
Louis Fortier-Dubois 093cbd397d
JIT Migration: PRNG (#1433)
* wip bernoulli

* wip

* bernoulli works

* uniform works

* done

* remove old

* refactor prng traits

* forgot to save file

* allow

* clippy

* clippy

* scalar commutativity

* array instead of vec
2024-03-11 11:40:27 -04:00
Louis Fortier-Dubois 9eecc713a4
JIT: Fix min & max values (#1429)
* real min and max values

* fix

* fmt
2024-03-07 15:10:30 -05:00