Abstract: Matrix multiplication is one of the most basic and important operation in many computation applications, which comes with high time complexity. Several parallel algorithms have been proposed ...
Abstract: This research proposes and evaluates a novel approach to optimizing matrix multiplication (MatMul) on Huawei Ascend NPUs, motivated by a key insight: during matrix-vector multiplication ...
* Program re-ordering for improved L2 cache hit rate. * Automatic performance tuning. # Motivations # Matrix multiplications are a key building block of most modern high-performance computing systems.