Modern computers are parallel. From the most powerful supercomputers to desktop machines, computing environments with multiple processing resources are ubiquitous. Moreover, these resources are increasingly heterogeneous, differing widely from one another in speed, energy consumption, and other characteristics. To exploit this landscape, scientific computing applications are often expressed as graphs, with vertices representing discrete chunks of work called tasks and edges indicating the order in which they must be executed. This exposes the parallelism of the application, but provokes an immediate question: which tasks should each processor execute, and when? In other words, how do we find the optimal schedule for the processors to follow?

First, we consider the case of only two different types of processing resource. Computing environments of this type are widespread today, most commonly comprising multicore CPUs and one or more GPUs. However, many of the algorithms used to schedule such platforms do not exploit their unique properties. We investigate how a common heuristic framework, in which tasks are assigned priorities and scheduled in that order, can be optimized for accelerated platforms. We propose a suite of possibilities and compare their performance with existing methods through extensive simulation, finding that different choices are better in different situations.

Key to many scheduling heuristics is anticipating the critical path, the longest sequence of tasks in the schedule's execution. We describe how approximations of the critical path are computed and used in scheduling heuristics for generic heterogeneous platforms, and suggest new methods of our own. These are then evaluated through simulation in order to establish whether they are useful and, if so, when.

Given a schedule, how long will it actually take?
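The priority-based list scheduling framework described above can be sketched concretely. The following is a minimal illustration only, assuming a HEFT-style upward rank as the task priority and a platform with one CPU and one GPU; the task graph, costs, communication delays, and function names are invented for the example and are not taken from the thesis.

```python
# A minimal sketch of priority-based list scheduling on a two-resource
# (CPU/GPU) platform, in the spirit of HEFT-style heuristics. All data
# below is illustrative.

def upward_rank(graph, cpu_cost, gpu_cost, comm):
    """Rank each task by its mean cost plus the heaviest downward path.

    Tasks with the largest ranks approximate the critical path, so
    scheduling in descending rank order prioritises them.
    """
    ranks = {}

    def rank(t):
        if t not in ranks:
            mean = (cpu_cost[t] + gpu_cost[t]) / 2  # average over the two resource types
            ranks[t] = mean + max(
                (comm.get((t, s), 0) + rank(s) for s in graph.get(t, [])),
                default=0,
            )  # recursion suffices for small example graphs
        return ranks[t]

    for t in cpu_cost:
        rank(t)
    return ranks


def list_schedule(graph, cpu_cost, gpu_cost, comm):
    """Schedule tasks in descending rank, greedily minimising finish time."""
    ranks = upward_rank(graph, cpu_cost, gpu_cost, comm)
    preds = {}
    for u, succs in graph.items():
        for v in succs:
            preds.setdefault(v, []).append(u)

    finish, placed, ready = {}, {}, {"cpu": 0.0, "gpu": 0.0}
    # Descending upward rank is always a valid topological order.
    for t in sorted(cpu_cost, key=ranks.get, reverse=True):
        best = None
        for proc, cost in (("cpu", cpu_cost[t]), ("gpu", gpu_cost[t])):
            # Data is ready once every predecessor has finished; crossing
            # between processors incurs the communication delay.
            data_ready = max(
                (finish[u] + (comm.get((u, t), 0) if placed[u] != proc else 0)
                 for u in preds.get(t, [])),
                default=0.0,
            )
            done = max(ready[proc], data_ready) + cost
            if best is None or done < best[0]:
                best = (done, proc)
        finish[t], placed[t] = best
        ready[best[1]] = best[0]
    return finish, placed


# Illustrative diamond-shaped task graph (all numbers are made up).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
cpu_cost = {"A": 2, "B": 3, "C": 4, "D": 2}
gpu_cost = {"A": 1, "B": 1, "C": 8, "D": 1}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 2}
finish, placed = list_schedule(graph, cpu_cost, gpu_cost, comm)
```

In this toy instance the greedy earliest-finish rule places the GPU-friendly tasks on the GPU and the GPU-hostile task C on the CPU, illustrating how the processor-assignment phase, not just the priority phase, shapes the final schedule.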
This is a trivial question when the task execution times and communication delays can be predicted exactly. But in practice that is never true, so they are often modeled as random variables rather than scalars. However, computing the schedule duration then becomes an intractable problem. We suggest a heuristic framework that may be useful for computing approximate solutions and compare it to existing algorithms through numerical experiments. We conclude by considering the natural next question: how do we compute a schedule that is robust to variation in the task execution and communication times? After reviewing existing heuristics, with a focus on those that work by first transforming the problem into a deterministic one, we propose a modification to one such algorithm which appears to improve its rate of convergence.
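When task times are random variables, the schedule length is itself a random variable whose distribution is hard to compute exactly, since it is a maximum over sums along dependent paths. A standard baseline for approximating it (not the heuristic framework proposed in the thesis) is Monte Carlo sampling over a fixed task order. A minimal sketch, with an invented diamond-shaped graph and uniform task-time distributions:

```python
import random
import statistics

def makespan(preds, durations, order):
    """Schedule length for fixed durations: each task starts once all of
    its predecessors have finished, and the makespan is the latest finish."""
    finish = {}
    for t in order:  # `order` must be a topological order of the task graph
        finish[t] = max((finish[u] for u in preds.get(t, [])), default=0.0) + durations[t]
    return max(finish.values())

def monte_carlo_makespan(preds, samplers, order, trials=5000, seed=0):
    """Estimate the mean and standard deviation of the random makespan
    by repeatedly drawing task durations and replaying the schedule."""
    rng = random.Random(seed)
    samples = [
        makespan(preds, {t: draw(rng) for t, draw in samplers.items()}, order)
        for _ in range(trials)
    ]
    return statistics.mean(samples), statistics.stdev(samples)

# Diamond graph A -> {B, C} -> D with uniform task times (illustrative).
preds = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
order = ["A", "B", "C", "D"]
uniform = {t: (lambda rng: rng.uniform(1, 3)) for t in order}
mean, std = monte_carlo_makespan(preds, uniform, order)
```

Because the makespan is a maximum over dependent path sums, its expectation here exceeds the makespan obtained from the mean durations (which is 6); this gap is precisely why replacing random variables with scalars tends to underestimate schedule length.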
Date of Award: 1 Aug 2022
Awarding Institution: The University of Manchester
Supervisors: Nicholas Higham & Neil Walton