Algorithmic redistribution methods for block-cyclic decompositions

Antoine P. Petitet, Jack J. Dongarra

    Research output: Contribution to journal › Article › peer-review

    Abstract

    This article presents various data redistribution methods for block-partitioned linear algebra algorithms operating on dense matrices that are distributed in a block-cyclic fashion. Because the algorithmic partitioning unit and the distribution blocking factor are most often chosen to be equal, severe alignment restrictions are induced on the operands, and the optimal values with respect to performance are architecture-dependent. The techniques presented in this paper redistribute data "on the fly," so that the user's data distribution blocking factor becomes independent of the architecture-dependent algorithmic partitioning. These techniques are applied to the matrix-matrix multiplication operation. A performance analysis, along with experimental results, shows that alignment restrictions can then be removed and that high performance can be maintained across platforms independently of the user's data distribution blocking factor.
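
As a rough illustration of why the blocking factor matters, the sketch below maps a global index to its owning process under the standard one-dimensional block-cyclic layout, then counts how many entries would change owner if the blocking factor changed. It is not the redistribution method of the paper; the function names `block_cyclic_owner` and `count_owner_changes` are hypothetical, and the layout is assumed to be 0-based with the first block on process 0.

```python
def block_cyclic_owner(g, nb, nprocs):
    """Map a 0-based global index g to (process, local index).

    Assumes a 1D block-cyclic layout with blocking factor nb,
    nprocs processes, and the first block stored on process 0.
    """
    block = g // nb                      # global block containing g
    proc = block % nprocs                # blocks are dealt out round-robin
    local = (block // nprocs) * nb + g % nb
    return proc, local


def count_owner_changes(n, nb_user, nb_algo, nprocs):
    """Count entries of a length-n vector whose owning process differs
    between the user's blocking factor and the algorithmic one."""
    return sum(
        block_cyclic_owner(g, nb_user, nprocs)[0]
        != block_cyclic_owner(g, nb_algo, nprocs)[0]
        for g in range(n)
    )


if __name__ == "__main__":
    # With 2 processes, compare a user blocking factor of 2
    # against an algorithmic blocking factor of 4 over 8 entries.
    print(count_owner_changes(8, nb_user=2, nb_algo=4, nprocs=2))
```

When the two blocking factors are equal, no entry changes owner; when they differ, some entries live on a different process than the algorithm's partitioning expects, which is the kind of misalignment the abstract describes.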
    Original language: English
    Pages (from-to): 1201-1216
    Number of pages: 15
    Journal: IEEE Transactions on Parallel and Distributed Systems
    Volume: 10
    Issue number: 12
    Publication status: Published - 1999
