Commit Graph

113 Commits

Author SHA1 Message Date
David Chavez 60c24430c6
chore(tests): allow overriding autographics backend (#1005) 2023-11-27 13:58:43 -05:00
Louis Fortier-Dubois 58273a8441
Feat/fusion/cmp (#992) 2023-11-23 12:52:37 -05:00
Nathaniel Simard 3d6c738776
Refactor/fusion/graph (#988) 2023-11-22 09:55:42 -05:00
Luni-4 445603401d
ci/Check dependencies (#895) 2023-11-19 10:35:03 -05:00
Nathaniel Simard 8f1526b9c9
Update readme (#962) 2023-11-17 13:04:41 -05:00
Zsombor c0859dde59
Implement fusing for recip() (#959) 2023-11-15 17:15:01 -05:00
Nathaniel Simard 24014aca33
WGPU: Support elemwise operation fusion (#948) 2023-11-15 15:13:37 -05:00
Zsombor 4fc0c27e31
Implement tensor.recip() function to calculate elementwise reciprocals (#953) 2023-11-15 09:17:32 -05:00
Louis Fortier-Dubois 831335ac2e
Perf/wgpu/reduce dim (#943)
* new reduce half working
* surprisingly working
* good on elongated matrix, bad on balanced ones
* working and clean
* autotune not tested, tests fail at non contiguous
* fixed
* autotune tested
* mean dim
* some fixes
* clippy
2023-11-13 07:20:50 -05:00
Nathaniel Simard 322480b744
Feat/op fusion decorator (#939)
* WIP
* Impl backend decorator
* WIP
* WIP
* WIP
* WIP
* WIP
* WIP
* Refactor
* Handle graph single ops execution
* WIP
* Starting to get concrete
* WIP
* Fix locator
* Implement add ops
* Start implementing ops
* Add more ops
* Add more ops
* More float ops
* Almost finish float ops
* Almost done with Int
* Some fix
* Into float
* Implement bool ops
* Almost done with MVP
* Fix adaptive pooling
* Add fusion as backend
* Fix memory leak
* Fix
* WIP Doc
* Doc all ops enum
* Initial docs
* Clippy
* Clippy v2
* Fix typos
* Fix doc
* Fix feature flags
* Add missing ops
* Some cleanup
* Revert u128 id
* cosmetic fixes
---------
Co-authored-by: louisfd <louisfd94@gmail.com>
2023-11-09 21:21:41 -05:00
Nathaniel Simard c4bc96e27f
Better settings (#933) 2023-11-07 07:34:39 -05:00
Louis Fortier-Dubois a0297530ea
Autotune: fix inputs (#926) 2023-11-06 08:59:31 -05:00
Louis Fortier-Dubois 1cc1844d32
Refactor/autotune/key (#924) 2023-11-03 08:46:25 -04:00
Louis Fortier-Dubois 35df31f700
Perf/wgpu/matmul unpadded (#922) 2023-11-01 16:37:33 -04:00
Louis Fortier-Dubois 8742d31d16
Perf/wgpu/matmul vec4rhs (#914) 2023-10-31 08:37:17 -04:00
Nathaniel Simard 96524d40a1
[Breaking] Refactor Backend Names (#904) 2023-10-29 18:27:49 -04:00
Louis Fortier-Dubois e2a3329997
Feat/wgpu/autotune compute (#906) 2023-10-29 16:44:59 -04:00
Nathaniel Simard 233922d60c
Chore: Bump version for next release (#900) 2023-10-24 19:31:13 -04:00
Louis Fortier-Dubois e76b6d47de
WGPU: matmul vec4 (#897) 2023-10-24 17:23:43 -04:00
nathaniel d021c7d7e8
Remove wrong comments 2023-10-24 11:55:39 -04:00
Louis Fortier-Dubois d96f73da0a
Feat/compute/autotune (#861)
* wip autotune compute
* too much generics
* wip
* megawip
* in progress
* first test passes
* first test passes
* fixed test
* refactor for cache hit and miss
* cleanup and fixes
* doc and stuff
* doc and stuff
* clippy
* format
* remove lifetime
* cleanup operation
* wip
* wip
* compiles
* wip mutable borrow
* refactor with autotune server
* wip tune benchmark
* test passes
* fix autotune key
* cache hit miss tests
* refactor wgpu to match burn-compute
* better operation execution
* cleanup & refactor
* test for parametered kernel
* fmt
* fmt
* clippy
* allow clippy
* fix no-std
* fmt
* review and ci
* Fix CI
* delete dummy benchmarks again
---------
Co-authored-by: nathaniel <nathaniel.simard.42@gmail.com>
2023-10-23 11:29:44 -04:00
Louis Fortier-Dubois e4d9d67526
make candle available (#886) 2023-10-23 10:00:39 -04:00
Mathias Insley 07c0cf146d
Wgpu/Clamp Kernels (#866)
* Update kernel mod.rs
* Wgpu crate implementations and add shader files
* Direct backends to the correct implementation
* Use mask method for candle
* Add index out of bounds protection
* Use a macro to avoid duplication
* Use unary_scalar templates
* New shaders for clamp and clamp_inplace
* Remove unnecessary clamp shaders
* Clamp implementation and test
* Use new clamp implementation for float and int ops
* Better variable names for clamp_min/max
* Revert changes to tensor/ops/tensor.rs
* Fix clamp.wgsl
* Fix shader types
* Use native candle clamp
* Use candle ops for clamp_min/max and revert tensor.rs
* Maximum/minimum were reversed
2023-10-23 07:49:24 -04:00
Nathaniel Simard d263968236
Refactor unfold4d + Add Module (#870) 2023-10-22 11:53:59 -04:00
Christophe Biocca 3eb7f380f3
Also consider devices of type Other when trying to find the best device. (#875) 2023-10-18 20:44:07 -04:00
Mathias Insley 255dfefab2
Feat/tensor unfold (#819) 2023-10-15 17:05:34 -04:00
Nathaniel Simard 809ad72843
Fix flaky init (#842)
* Fix flaky init
* Remove print
2023-10-03 13:57:16 -04:00
Louis Fortier-Dubois 163e48c969
wgpu: Yet another (faster) matmul (#836) 2023-10-02 14:05:53 -04:00
Nathaniel Simard 80731913c7
Speedup CI (#835) 2023-09-29 15:13:16 -04:00
Nathaniel Simard ca787d6446
Feat/async read (#833) 2023-09-28 17:09:58 -04:00
Louis Fortier-Dubois aa90fe8efb
Refactor/burn benchmark (#829) 2023-09-28 09:38:21 -04:00
Nathaniel Simard 95e660488e
Refactor/burn compute wgpu (#826) 2023-09-25 10:42:45 -04:00
Louis Fortier-Dubois 8c215e8be3
Bugfix/int swap dims (#823) 2023-09-22 08:38:38 -04:00
Juliano Decico Negri 293020aae6
#384 Include tests for int.rs and float.rs (#794) 2023-09-21 09:00:09 -04:00
Nathaniel Simard ac4adb54ea
Burn compute (#809) 2023-09-18 19:56:53 -04:00
Nathaniel Simard af0be5cfeb
Chore: bump version (#777) 2023-09-06 12:15:13 -04:00
Nathaniel Simard c95b34c511
Book: backend extension + custom wgpu kernel (#728) 2023-08-31 09:55:43 -04:00
Louis Fortier-Dubois c89f9969ed
Perf/tensor ops/tests (#710) 2023-08-28 12:53:17 -04:00
Mathias Insley d2aa4c0c9d
Perf/Empty Context Cache (#676)
* Add a pipeline_counter and methods for process of retaining best kernel
* Put a tune flag on the Context
* Put counts into cache instead of using pipeline_counter
* Formatting
* Add optimize_cache flag and rework ComputePipeline clearing process
* Update tune() so that it starts Context tuning and flags the Context as ready for clearing
* Consistent single quotes
* Use AtomicBool for is_tuning, prevent caching during tuning
* Collect TemplateIds during tuning and clean them out after tuning
* Fix comment
* Move cache cleanup to stop_tuning function
2023-08-28 10:04:05 -04:00
MOZGIII 7f558bdc46
Expose element traits (#700) 2023-08-27 09:02:39 -04:00
Louis Fortier-Dubois fb2a71bb81
remove to device (#694) 2023-08-25 09:55:18 -04:00
Jerome Robert edb3e9fc4b
Do not use default device when running kernel::matmul::tune (#684)
Use the device of the involved Tensor instead of Device::default
2023-08-24 14:01:27 -04:00
Nathaniel Simard d18d1b0bb9
Can configure wgpu max tasks (#603) 2023-08-23 12:20:27 -04:00
Caio Piccirillo 2fefc82099
Dilation maxpool (#668) 2023-08-21 14:14:25 -04:00
Nathaniel Simard bda03c6a76
Feat/avg pool/include pad config (#653) 2023-08-17 08:50:31 -04:00
Louis Fortier-Dubois d659f11639
Perf/wgpu/autotune (#609) 2023-08-15 11:26:00 -04:00
Nathaniel Simard c74e75f748
Fix/wgpu/max pool2d backward (#613) 2023-08-09 16:45:49 -04:00
Caio Piccirillo 1d3bbaab13
Typos (#608) 2023-08-08 17:57:51 -04:00
Nathaniel Simard 441a7011ce
Feat/tensor casting (#604) 2023-08-08 10:02:17 -04:00
Nathaniel Simard 8bc687e1bb
WGPU use best limits for the adaptor (#601) 2023-08-07 15:10:21 -04:00