Exploring vectorisation for parallel breadth-first search on an advanced vector processor

  • Mireya Paredes Lopez

Student thesis: PhD


Modern applications generate massive amounts of data that are challenging to process and analyse. Graph algorithms have emerged as a solution for analysing this data because they can represent the entities that generate large-scale datasets as vertices and the relationships between them as edges. Graph analysis algorithms are used to find patterns within these relationships, with the aim of extracting information for further analysis.

Breadth-first search (BFS) is one of the main graph search algorithms used for graph analysis, and its optimisation has been widely researched on different parallel computers. However, parallelising BFS has been shown to be challenging because of its inherent characteristics, including irregular memory access patterns, data dependencies and workload imbalance, all of which limit its scalability.

This thesis investigates the optimisation of BFS on the Xeon Phi, a modern parallel architecture equipped with an advanced vector processor, using a custom development framework integrated with the Graph 500 benchmark. As a result, optimised parallel versions of two high-level BFS algorithms were created using vectorisation, starting with the conventional top-down BFS algorithm and building on it to reach the hybrid BFS algorithm. For a graph of one million vertices, the best implementations achieved speedups of 1.37x and 1.33x, respectively, over the state of the art. The hybrid BFS algorithm can be reused by other graph analysis algorithms, and the lessons learned from vectorisation can be applied to other algorithms targeting existing and future models of the Xeon Phi and other advanced vector architectures.
Date of Award: 1 Aug 2017
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Mikel Lujan Moreno (Supervisor) & Graham Riley (Supervisor)


  • breadth first search
  • parallel architecture
  • vectorisation
  • Xeon Phi
  • graph algorithms
  • Graph 500 benchmark
