Divide and conquer on hybrid GPU-accelerated multicore systems

Christof Vömel, Stanimire Tomov, Jack Dongarra

    Research output: Contribution to journal › Article › peer-review


    With the raw computing power of graphics processing units (GPUs) being more widely available in commodity multicore systems, there is an imminent need to harness their power for important numerical libraries such as LAPACK. In this paper, we consider the solution of dense symmetric and Hermitian eigenproblems by the LAPACK divide and conquer algorithm on such modern heterogeneous systems. We focus on how to make the best use of the individual strengths of the massively parallel manycore GPUs and multicore CPUs. The resulting algorithm overcomes performance bottlenecks faced by current implementations that are optimized for a homogeneous multicore. On a dual-socket quad-core Intel Xeon 2.33 GHz with an NVIDIA GTX 280 GPU, we typically obtain up to about a tenfold improvement in performance for the complete dense problem. The techniques described here thus represent an example of how to develop numerical software that uses heterogeneous architectures efficiently. As heterogeneity becomes more common in architecture design, the significance of and need for this work are expected to grow. © 2012 Society for Industrial and Applied Mathematics.
    Original language: English
    Pages (from-to): C70-C82
    Journal: SIAM Journal on Scientific Computing
    Issue number: 2
    Publication status: Published - 2012


    • GPU
    • Heterogeneous computing
    • Hybrid architecture
    • LAPACK
    • Multicore
    • Performance
    • Symmetric eigenvalue problem

