3 Commits

Author SHA1 Message Date
Laurent Mazare 30cdd769f9
Update the flash attn kernels. (#2333)
2024-07-15 20:37:36 +02:00
OlivierDehaene 8d1a57c9a0
chore: update flash attention kernels (#1518)
* chore: update flash attention kernels

* fmt

* remove unused kernels

* force f32

* correct stride
2024-01-05 18:28:55 +01:00
Laurent Mazare 2ce5f12513
Again set a few extra params in flash-attn. (#245)
* Again set a few extra params.

* Use the appropriate kernel sizes.

* Add all the kernel sizes.

* Parallel compiling.

* Reduce the amount of parallelism.

* Add the missing kernel.

* Fix a typo.

* Remove bf16 support for now.
2023-07-26 14:16:37 +01:00