Abstract

Sparse LU Factorization with Partial Pivoting on Distributed MemoryMachines

by: Cong Fu and Tao Yang

Abstract:

Sparse LU factorization with partial pivoting is important to many scientificapplications, but the effective parallelization of this algorithm is still anopen problem. The main difficulty is that partial pivoting operations makestructures of $L$ and $U$ factors unpredictable beforehand. This paperpresents a novel approach called $S^*$ for parallelizing this problem ondistributed memory machines. $S^*$ incorporates static symbolic factorizationto avoid run-time control overhead and uses nonsymmetric L/U supernodepartitioning and amalgamation strategies to maximize the use of BLAS-3routines. The irregular task parallelism embedded in sparse LU is exploitedusing the RAPID run-time system which optimizes asynchronous communication,overlaps computation with communication and balances processor loads. Theexperimental results on the Cray-T3D with a set of Harwell-Boeing nonsymmetricmatrices are very encouraging and good scalability has been achieved. Evencompared to a highly optimized sequential code, the parallel speedups are stillimpressive considering the current status of sparse LU research.

Keywords:

Sparse matrix, LU factorization, partial pivoting, irregularparallelism

Date:

July 1996

Document: 1996-18

Department of Computer Science

University of California, Santa Barbara

Abstract

Sparse LU Factorization with Partial Pivoting on Distributed MemoryMachines

by: Cong Fu and Tao Yang