Abstract
This paper presents a GPU-accelerated Cholesky factorization for two different modes of operation. The first is the batch mode, where many independent factorizations of small matrices are performed concurrently. This mode supports both fixed-size and variable-size problems, and is found in many scientific applications. The second is the native mode, where one factorization is performed on a large matrix without any CPU involvement, which allows the CPU to do other useful work. We show that, despite the different workloads, both modes of operation share a common code base that uses the GPU only. We also show that the developed routines achieve significant speedups against a multicore CPU using the MKL library, and against a GPU implementation by cuSOLVER. This work is part of the MAGMA library.
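To make the batch mode concrete, the following is a minimal CPU reference sketch in plain Python. It is not the paper's GPU implementation or the MAGMA API. The function names `cholesky_lower` and `cholesky_batched` are hypothetical, and the sequential loop only stands in for the concurrent execution a GPU would provide. It illustrates that each matrix in a variable-size batch is factored independently as A = L·Lᵀ.

```python
import math


def cholesky_lower(a):
    """Return the lower-triangular factor L with L * L^T == a.

    `a` is a symmetric positive definite matrix given as a list of rows.
    This is the textbook unblocked algorithm, used here only for illustration.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of columns already computed.
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def cholesky_batched(matrices):
    """Factor every matrix in the batch; sizes may differ between matrices.

    The factorizations are independent of each other, which is what lets a
    GPU process them concurrently. Here they simply run one after another.
    """
    return [cholesky_lower(a) for a in matrices]


# A variable-size batch: one 2x2 and one 3x3 symmetric positive definite matrix.
batch = [
    [[4.0, 2.0], [2.0, 3.0]],
    [[25.0, 15.0, -5.0], [15.0, 18.0, 0.0], [-5.0, 0.0, 11.0]],
]
factors = cholesky_batched(batch)
```

In this sketch the batch is a list of differently sized matrices. A fixed-size batch is the special case where every matrix has the same dimension.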
Original language | English |
---|---|
Pages (from-to) | 85-93 |
Number of pages | 9 |
Journal | Journal of Computational Science |
Volume | 20 |
Early online date | 31 Dec 2016 |
DOIs | |
Publication status | Published - May 2017 |
Keywords
- GPU computing
- Cholesky factorization
- Batched execution