candle/candle-flash-attn
Latest commit 75629981bc by OlivierDehaene (2023-10-16 15:37:38 +01:00)
feat: parse Cuda compute cap from env (#1066)

* feat: add support for multiple compute caps
* Revert to one compute cap
* fmt
* fix
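
The commit lets the build script pick the target CUDA compute capability from an environment variable. Below is a minimal sketch of that idea, not the crate's actual build.rs: the CUDA_COMPUTE_CAP variable name follows the commit's intent, while the fallback default and the way the flag is handed to nvcc are assumptions for illustration.

```rust
// build.rs sketch (assumed, not the crate's real build script): read the
// compute capability from CUDA_COMPUTE_CAP and turn it into an nvcc arch flag.
fn main() {
    // Rebuild when the variable changes.
    println!("cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP");

    // e.g. CUDA_COMPUTE_CAP=80 for Ampere, 89 for Ada (example values).
    let compute_cap: usize = std::env::var("CUDA_COMPUTE_CAP")
        .ok()
        .and_then(|v| v.trim().parse().ok())
        .unwrap_or(80); // assumed fallback when the variable is unset

    // A single compute cap is kept, per the commit history above; the flag
    // would be passed to the nvcc invocation that compiles the kernels.
    let arch = format!("--gencode=arch=compute_{cc},code=sm_{cc}", cc = compute_cap);
    println!("cargo:warning=compiling flash-attn kernels with {arch}");
}
```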
Name                Last commit                                        Date
cutlass@c4f6b8c6bc  Add flash attention (#241)                         2023-07-26 07:48:10 +01:00
kernels             Add back the bf16 flash-attn kernels. (#730)       2023-09-04 07:50:52 +01:00
src                 Properly set the is_bf16 flag. (#738)              2023-09-04 16:45:26 +01:00
tests               Flash attention without padding (varlen). (#281)   2023-07-31 09:45:39 +01:00
Cargo.toml          Bump the version to 0.3.0. (#1014)                 2023-10-01 13:51:57 +01:00
README.md           Add some missing readme files. (#304)              2023-08-02 10:57:12 +01:00
build.rs            feat: parse Cuda compute cap from env (#1066)      2023-10-16 15:37:38 +01:00

README.md

candle-flash-attn
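
A hedged usage sketch of the crate follows. The flash_attn function name and signature, the (batch, seq_len, num_heads, head_dim) tensor layout, and the candle_core import (renamed candle inside the workspace) are assumptions based on the 0.3.0-era API; the kernels require f16 or bf16 tensors on a CUDA device. Check the crate's src/ and tests for the authoritative interface.

```rust
// Usage sketch under the assumptions stated above; verify the real signature
// in candle-flash-attn's src/ before relying on it.
use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    let device = Device::new_cuda(0)?;
    let (b, seq_len, heads, head_dim): (usize, usize, usize, usize) = (1, 128, 8, 64);
    let shape = (b, seq_len, heads, head_dim);

    // Random query/key/value tensors, cast to bf16 for the kernels.
    let q = Tensor::randn(0f32, 1.0, shape, &device)?.to_dtype(DType::BF16)?;
    let k = Tensor::randn(0f32, 1.0, shape, &device)?.to_dtype(DType::BF16)?;
    let v = Tensor::randn(0f32, 1.0, shape, &device)?.to_dtype(DType::BF16)?;

    let softmax_scale = 1.0 / (head_dim as f32).sqrt();
    let causal = true; // mask attention to future positions
    let out = candle_flash_attn::flash_attn(&q, &k, &v, softmax_scale, causal)?;

    println!("output shape: {:?}", out.shape());
    Ok(())
}
```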