DBCSR is a library designed to efficiently perform sparse
matrix-matrix multiplication, among other operations.
It is parallelized with MPI and OpenMP and can exploit NVIDIA and AMD GPUs via
CUDA and HIP.
DBCSR was developed as part of CP2K, where it provides core
functionality for linear-scaling electronic structure theory. It is
now released as a standalone library for integration in other projects.
This package requires an MPI implementation; however, it does not currently
work with mpich. Use openmpi instead.
* HIP and OpenCL support is still experimental