TY - JOUR
T1 - Communication Pattern based Data Authentication (CPDA) Designed for Big Data Processing in a Multiple Public Cloud Environment
AU - Sirapaisan, Soontorn
AU - Zhang, Ning
AU - He, Qian
N1 - Funding Information:
This work was supported by the University of Manchester under Dean’s Doctoral Scholar Awards.
Publisher Copyright:
© 2013 IEEE.
PY - 2020/6/9
Y1 - 2020/6/9
N2 - With the development of cloud computing, there is a growing trend of multi-cloud Collaborative Big Data Computation (CBDC). In this environment, threats from authorized insiders are of particular concern. Based on an extreme case of distributed computation where multiple collaborators jointly perform CBDC on shared datasets using an example distributed computing framework, MapReduce (MR), deployed in a Multiple Public Cloud (MPC) environment, this paper investigates how to protect the authenticity of data used during the computation in an efficient and scalable manner by proposing and evaluating a novel data authentication solution. The solution, called a Communication Pattern based Data Authentication (CPDA) framework, ensures data authenticity and non-repudiation of origin at the finest granularity without compromising efficiency and scalability. This is achieved by using an idea of communication pattern based authentication data aggregation. The framework has been comprehensively evaluated both theoretically and experimentally. The evaluation results show that the CPDA framework offers the strongest level of data authenticity protection (equivalent to that provided by digitally signing each data object individually) but introduces much lower overhead cost than the digital signature based solution. The results demonstrate that the idea of communication pattern based authentication data aggregation brings much benefit in terms of supporting efficient and scalable data authentication in a large-scale distributed system.
AB - With the development of cloud computing, there is a growing trend of multi-cloud Collaborative Big Data Computation (CBDC). In this environment, threats from authorized insiders are of particular concern. Based on an extreme case of distributed computation where multiple collaborators jointly perform CBDC on shared datasets using an example distributed computing framework, MapReduce (MR), deployed in a Multiple Public Cloud (MPC) environment, this paper investigates how to protect the authenticity of data used during the computation in an efficient and scalable manner by proposing and evaluating a novel data authentication solution. The solution, called a Communication Pattern based Data Authentication (CPDA) framework, ensures data authenticity and non-repudiation of origin at the finest granularity without compromising efficiency and scalability. This is achieved by using an idea of communication pattern based authentication data aggregation. The framework has been comprehensively evaluated both theoretically and experimentally. The evaluation results show that the CPDA framework offers the strongest level of data authenticity protection (equivalent to that provided by digitally signing each data object individually) but introduces much lower overhead cost than the digital signature based solution. The results demonstrate that the idea of communication pattern based authentication data aggregation brings much benefit in terms of supporting efficient and scalable data authentication in a large-scale distributed system.
KW - Big data
KW - MapReduce
KW - cloud
KW - data authentication
KW - distributed computing
UR - http://www.scopus.com/inward/record.url?scp=85087499501&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2020.3000989
DO - 10.1109/ACCESS.2020.3000989
M3 - Article
SN - 2169-3536
VL - 8
SP - 107716
EP - 107748
JO - IEEE Access
JF - IEEE Access
M1 - 9112146
ER -