An Effective and Efficient Authentication Framework for MapReduce in a Multiple Public Cloud Environment

  • Soontorn Sirapaisan

Student thesis: PhD


There is a growing trend towards inter-organisational collaborative Big Data sharing and analysis. For efficiency reasons, such analysis is usually carried out using distributed computing services deployed in public clouds. Executing Collaborative Big Data Computation (CBDC) in a Multiple Public Cloud (MPC) environment raises a number of open issues, one of which is how to maximise the level of security protection while minimising overhead costs. We set out to investigate this issue from the perspective of authentication, as authentication is the first line of defence in any computing system. The investigation has led to the design, prototyping, and evaluation of a novel authentication solution that takes into account the characteristics of the underlying system. To this end, this thesis makes the following contributions.

Firstly, the thesis formulates a generic use case model for CBDC-MPC. This model captures an extreme form of distributed computation in which multiple collaborators jointly perform CBDC on shared datasets using an example distributed computing framework, MapReduce (MR), deployed in an MPC environment. The model is used to gain a thorough understanding of the threats of impersonation, unauthorised access, and alteration of data in this context, and to guide the design of an effective, efficient, and scalable authentication solution for distributed systems.

Secondly, the thesis proposes a novel authentication framework for CBDC-MPC. The framework, called the Multi-domain Decentralised Authentication (MDA) framework, consists of two further novel components: the Multi-factor Interaction based Entity Authentication (MIEA) framework and the Communication Pattern based Data Authentication (CPDA) framework.

The MIEA framework provides risk-aware entity authentication for every interaction during the entire execution cycle of a data processing job. The framework has been analysed and evaluated both theoretically and experimentally. The results demonstrate that MIEA provides a stronger level of entity authentication than Kerberos, one of the most widely used entity authentication protocols in distributed computing environments, at the same level of overhead cost.

The CPDA framework provides data authenticity and non-repudiation of origin for every data object processed by the underlying system. To maximise the protection level while minimising the overhead cost, CPDA uses a novel idea: authentication data (for both generation and verification operations) and communication are aggregated according to the communication patterns of the underlying system, in conjunction with multiple cryptographic schemes. The theoretical and experimental evaluation results show that the CPDA approach offers the strongest level of data authenticity protection while reducing the overhead cost by up to 67% compared with the most closely related solution, which digitally signs every object individually.

Overall, the results demonstrate that tailoring the design of an authentication solution to the characteristics of the underlying system brings considerable benefit in supporting efficient and scalable authentication in a large-scale distributed system.
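The general idea of aggregating authentication data, as opposed to authenticating every object individually, can be illustrated with a minimal Python sketch. This is not the CPDA design, whose cryptographic schemes and aggregation rules are not given in this abstract. As assumptions made purely for illustration, HMAC-SHA256 stands in for the authentication primitive (a symmetric MAC does not provide non-repudiation of origin, which would need a digital signature), and the function names `tag_each`, `tag_batch`, and `verify_batch` are hypothetical.

```python
import hashlib
import hmac
from typing import List, Sequence


def tag_each(key: bytes, objects: Sequence[bytes]) -> List[bytes]:
    """Baseline: generate one authentication tag per data object."""
    return [hmac.new(key, obj, hashlib.sha256).digest() for obj in objects]


def tag_batch(key: bytes, objects: Sequence[bytes]) -> bytes:
    """Aggregated: hash each object, then tag the ordered digests once.

    Each SHA-256 digest is 32 bytes, so the concatenation encodes both
    the number of objects and their order without ambiguity.
    """
    digests = b"".join(hashlib.sha256(obj).digest() for obj in objects)
    return hmac.new(key, digests, hashlib.sha256).digest()


def verify_batch(key: bytes, objects: Sequence[bytes], tag: bytes) -> bool:
    """Recompute the aggregated tag and compare it in constant time."""
    return hmac.compare_digest(tag_batch(key, objects), tag)


if __name__ == "__main__":
    key = b"\x01" * 32  # illustrative key only; never hard-code real keys
    outputs = [b"record-1", b"record-2", b"record-3"]

    per_object_tags = tag_each(key, outputs)  # three tags to send and check
    batch_tag = tag_batch(key, outputs)       # one tag covers all three

    print(len(per_object_tags), "tags versus 1 aggregated tag")
    print("batch verifies:", verify_batch(key, outputs, batch_tag))
```

In this sketch the aggregated variant performs one keyed operation and transmits one tag per batch, at the cost of having to verify the batch as a whole: a single altered, missing, or reordered object invalidates the tag without identifying which object is at fault.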
Date of Award: 31 Dec 2021
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Ke Chen & Ning Zhang


  • Big Data
  • Authentication
  • Security
  • Cloud
  • MapReduce
  • Distributed Computing
