DAGuE: A generic distributed DAG engine for High Performance Computing

George Bosilca, Aurelien Bouteiller, Anthony Danalis, Thomas Herault, Pierre Lemarinier, Jack Dongarra

    Research output: Contribution to journal › Article › peer-review

    Abstract

    The frenetic development of current architectures places a strain on state-of-the-art programming environments. Harnessing the full potential of such architectures is a tremendous task for the whole scientific computing community. We present DAGuE, a generic framework for architecture-aware scheduling and management of micro-tasks on distributed many-core heterogeneous architectures. The applications we consider can be expressed as a Directed Acyclic Graph (DAG) of tasks, with labeled edges designating data dependencies. DAGs are represented in a compact, problem-size-independent format that can be queried on demand to discover data dependencies in a totally distributed fashion. DAGuE assigns computation threads to the cores, overlaps communications and computations, and uses a dynamic, fully distributed scheduler based on cache awareness, data locality, and task priority. We demonstrate the efficiency of our approach using several micro-benchmarks to analyze the performance of different components of the framework, and a linear algebra factorization as a use case. © 2011 Elsevier B.V. All rights reserved.
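
    The idea of a compact, problem-size-independent DAG that is queried on demand can be sketched as follows. This is a minimal illustration, not DAGuE's actual input format or API: the choice of a tiled Cholesky factorization, the task names (POTRF, TRSM, SYRK, GEMM), and the functions predecessors, all_tasks, and schedule are all assumptions made for the example. The point it shows is that dependencies are computed from a task's indices alone, so the graph never has to be stored in full.

```python
from typing import Iterator, List, Tuple

# Hypothetical task encoding: ("POTRF", k), ("TRSM", k, m),
# ("SYRK", k, m), ("GEMM", k, m, n), with tile indices k < n < m.
Task = Tuple


def predecessors(task: Task) -> List[Task]:
    """Return the tasks whose output `task` consumes.

    The answer depends only on the task's own indices, so the rule set
    has constant size whatever the number of tiles.
    """
    kind, k, *rest = task
    if kind == "POTRF":
        return [("SYRK", k - 1, k)] if k > 0 else []
    if kind == "TRSM":
        (m,) = rest
        deps = [("POTRF", k)]
        if k > 0:
            deps.append(("GEMM", k - 1, m, k))
        return deps
    if kind == "SYRK":
        (m,) = rest
        deps = [("TRSM", k, m)]
        if k > 0:
            deps.append(("SYRK", k - 1, m))
        return deps
    if kind == "GEMM":
        m, n = rest
        deps = [("TRSM", k, m), ("TRSM", k, n)]
        if k > 0:
            deps.append(("GEMM", k - 1, m, n))
        return deps
    raise ValueError(f"unknown task kind: {kind!r}")


def all_tasks(nt: int) -> Iterator[Task]:
    """Enumerate every task for an nt x nt grid of tiles."""
    for k in range(nt):
        yield ("POTRF", k)
        for m in range(k + 1, nt):
            yield ("TRSM", k, m)
            yield ("SYRK", k, m)
            for n in range(k + 1, m):
                yield ("GEMM", k, m, n)


def schedule(nt: int) -> List[Task]:
    """Order tasks in waves, querying dependencies only when needed."""
    done = set()
    order: List[Task] = []
    pending = list(all_tasks(nt))
    while pending:
        ready = [t for t in pending if all(p in done for p in predecessors(t))]
        if not ready:
            raise RuntimeError("no runnable task: cycle or missing dependency")
        order.extend(ready)
        done.update(ready)
        pending = [t for t in pending if t not in done]
    return order
```

    Calling schedule(3) orders the ten tasks of a 3 x 3 tile grid, starting with ("POTRF", 0) and ending with ("POTRF", 2). A distributed engine would evaluate a rule like predecessors locally on each node instead of enumerating all tasks centrally; the central loop here only serves to check that the rules describe a valid DAG.
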
    Original language: English
    Pages (from-to): 37-51
    Number of pages: 14
    Journal: Parallel Computing
    Volume: 38
    Issue number: 1-2
    Publication status: Published - Jan 2012

    Keywords

    • Architecture aware scheduling
    • Heterogeneous architectures
    • HPC
    • Micro-task DAG
