Abstract: With the increasing popularity of RISC-V in academia and industry, an ever-growing number of open-source implementations of the instruction set have become available. However, it ...
Abstract: State-of-the-art deep learning models rely on large GPU clusters and various parallelism strategies, which in turn depend on collective communication (CC) operators to synchronize data.