A cost-effective approach to improving performance of big genomic data analyses in clouds

Christopher Smowton, Andoena Balla, Demetris Antoniades, Crispin Miller, George Pallis, Marios D. Dikaiakos, Wei Xing

Research output: Contribution to journal › Article › peer-review

Abstract

With the rapidly growing demand for DNA analysis, the need for storing and processing large-scale genome data has presented significant challenges. This paper describes how the Genome Analysis Toolkit (GATK) can be deployed to an elastic cloud, and defines policy to drive elastic scaling of the application. We extensively analyse the GATK to expose opportunities for resource elasticity, demonstrate that it can be practically deployed at scale in a cloud environment, and demonstrate that applying elastic scaling improves the performance to cost tradeoff achieved in a simulated environment.

Original language: English
Pages (from-to): 368-381
Number of pages: 14
Journal: Future Generation Computer Systems
Volume: 67
Early online date: 17 Dec 2015
Publication status: Published - 1 Feb 2017

Keywords

  • Big data
  • Clouds
  • Performance

Research Beacons, Institutes and Platforms

  • Manchester Cancer Research Centre
