Matrix multiplication (hereafter we use the acronym MM) is among the most fundamental operations of modern computations. The efficiency of its performance depends on various factors, in particular vectorization, data movement and arithmetic complexity of the computations, but here we focus just on the study of the arithmetic cost and the impact of this study on other areas of modern computing. In the early 1970s it was expected that the straightforward cubic time algorithm for MM will soon be accelerated to enable MM in nearly quadratic arithmetic time, with some far fetched implications. While pursuing this goal the mainstream research had its focus on the decrease of the classical exponent 3 of the complexity of MM towards its lower bound 2, disregarding the growth of the input size required to support this decrease. Eventually, surprising combinations of novel ideas and sophisticated techniques enabled the decrease of the exponent to its benchmark value of about 2.38, but the supporting MM algorithms improved the straightforward one only for the inputs of immense sizes. Meanwhile, the communication complexity, rather than the arithmetic complexity, has become the bottleneck of computations in linear algebra. This development may seem to undermine the value of the past and future research aimed at the decrease of the arithmetic cost of MM, but we feel that the study should be reassessed rather than closed and forgotten. We review the old and new work in this area in the present day context, recall some major techniques introduced in the study of MM, discuss their impact on the modern theory and practice of computations for MM and beyond MM, and link one of these techniques to some simple algorithms for inner product and summation.
Pan, Victor, "Matrix Multiplication, Trilinear Decompositions, APA Algorithms, and Summation" (2015). CUNY Academic Works.