Cost-efficient Resource Management for Scientific Workflows on the Cloud

  • Ilia Pietri

Student thesis: Phd

Abstract

Scientific workflows are used in many scientific fields to abstract complex computations (tasks) and data or flow dependencies between them. High performance computing (HPC) systems have been widely used for the execution of scientific workflows. Cloud computing has gained popularity by offering users on-demand provisioning of resources and providing the ability to choose from a wide range of possible configurations. To do so, resources are made available in the form of virtual machines (VMs), described as a set of resource characteristics, e.g. amount of CPU and memory. The notion of VMs enables the use of different resource combinations which facilitates the deployment of the applications and the management of the resources. A problem that arises is determining the configuration, such as the number and type of resources, that leads to efficient resource provisioning. For example, allocating a large amount of resources may reduce application execution time however at the expense of increased costs. This thesis investigates the challenges that arise on resource provisioning and task scheduling of scientific workflows and explores ways to address them, developing approaches to improve energy efficiency for scientific workflows and meet the user's objectives, e.g. makespan and monetary cost. The motivation stems from the wide range of options that enable to select cost-efficient configurations and improve resource utilisation. The contributions of this thesis are the following. (i) A survey of the issues arising in resource management in cloud computing; The survey focuses on VM management, cost efficiency and the deployment of scientific workflows. (ii) A performance model to estimate the workflow execution time for a different number of resources based on the workflow structure; The model can be used to estimate the respective user and energy costs in order to determine configurations that lead to efficient resource provisioning and achieve a balance between various conflicting goals. (iii) Two energy-aware scheduling algorithms that maximise the number of completed workflows from an ensemble under energy and budget or deadline constraints; The algorithms address the problem of energy-aware resource provisioning and scheduling for scientific workflow ensembles. (iv) An energy-aware algorithm that selects the frequency to be used for each workflow task in order to achieve energy savings without exceeding the workflow deadline; The algorithm takes into account the different requirements and constraints that arise depending on the workflow and system characteristics. (v) Two cost-based frequency selection algorithms that choose the CPU frequency for each provisioned resource in order to achieve cost-efficient resource configurations for the user and complete the workflow within the deadline; Decision making is based on both the workflow characteristics and the pricing model of the provider.
Date of Award1 Aug 2016
Original languageEnglish
Awarding Institution
  • The University of Manchester
SupervisorRizos Sakellariou (Supervisor) & John Brooke (Supervisor)

Keywords

  • VM management
  • VM provisioning
  • cost efficiency
  • scientific workflows
  • workflow scheduling
  • cloud computing
  • energy efficiency

Cite this

'