Experiences in autotuning matrix multiplication for energy minimization on GPUs

Hartwig Anzt, Blake Haugen, Jakub Kurzak, Piotr Luszczek, Jack Dongarra

    Research output: Contribution to journal, Article, peer-review

    Abstract

    In this paper, we report extensive results and analysis of autotuning the computationally intensive graphics processing unit (GPU) kernel for dense matrix-matrix multiplication in double precision. In contrast to traditional autotuning and/or optimization for runtime performance only, we also take energy efficiency into account. For kernels achieving equal performance, we show significant differences in their energy balance. We also identify memory throughput as the most influential metric that trades off performance and energy efficiency. As a result, the performance-optimal case ends up not being the most efficient kernel in overall resource use.

    Original language: English
    Pages (from-to): 5096-5113
    Journal: Concurrency and Computation: Practice & Experience
    Volume: 27
    Issue number: 17
    Publication status: Accepted/In press - 24 Mar 2015

    Keywords

    • Automatic software tuning
    • Energy
    • Hardware accelerators
    • Matrix multiplication
    • Power
