An asynchronous direct solver for banded linear systems

Banded linear systems occur frequently in mathematics and physics. However, direct solvers for large systems cannot be parallelized without communication between processors. The aim of this paper is to develop a general asymmetric banded solver with a direct approach that scales efficiently across many processors. The key mechanism is that the solver does not require reduction to row echelon form, and hence avoids fill-in. The method requires more floating-point operations than a standard solver such as LU decomposition, but leveraging multiple processors reduces the overall solution time. We present a solver using a superposition approach that decomposes the original linear system into q subsystems, where q is the number of superdiagonals. The method attains its optimal computational cost when q processors are available, because each subsystem can be solved in parallel asynchronously. This is followed by a q×q dense constraint matrix problem, which is solved before a final vectorized superposition is performed. The algorithm is first developed for tridiagonal systems and then extended to arbitrary banded systems. Accuracy and performance are compared with existing solvers, and software is provided in the supplementary material.
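
The superposition idea can be illustrated for the tridiagonal case (q = 1) with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: the function name `solve_tridiagonal_superposition`, the array layout, and the split into one particular and one homogeneous forward recurrence are all assumptions made for illustration. It also assumes every superdiagonal entry is nonzero, and it ignores the rounding-error growth that such recurrences can show for some matrices.

```python
def solve_tridiagonal_superposition(a, b, c, d):
    """Solve a tridiagonal system by superposing two forward recurrences.

    Hypothetical layout (an assumption, not taken from the paper):
    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    Every used entry of c must be nonzero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    def march(x0, rhs):
        # Rows 0..n-2 determine x[1..n-1] once x[0] is chosen,
        # so no elimination (and no fill-in) is involved.
        x = [x0, (rhs[0] - b[0] * x0) / c[0]]
        for i in range(1, n - 1):
            x.append((rhs[i] - a[i] * x[i - 1] - b[i] * x[i]) / c[i])
        return x

    # The two marches are independent and could run on separate workers.
    p = march(0.0, d)           # particular solution with x[0] = 0
    h = march(1.0, [0.0] * n)   # homogeneous solution with x[0] = 1

    # 1x1 constraint problem: the last row must hold for x = p + alpha * h.
    res_p = a[-1] * p[-2] + b[-1] * p[-1] - d[-1]
    res_h = a[-1] * h[-2] + b[-1] * h[-1]
    alpha = -res_p / res_h

    # Final superposition.
    return [pi + alpha * hi for pi, hi in zip(p, h)]
```

For a band with q superdiagonals, the same reasoning would give q homogeneous recurrences and a q×q constraint system taken from the last q rows, which matches the size of the dense problem mentioned in the abstract.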

Publication Title

Numerical Algorithms