TY - UNPB
T1 - A Terminology for Scientific Workflow Systems
AU - Suter, Frédéric
AU - Coleman, Tainã
AU - Altintaş, İlkay
AU - Badia, Rosa M.
AU - Balis, Bartosz
AU - Chard, Kyle
AU - Colonnelli, Iacopo
AU - Deelman, Ewa
AU - Di Tommaso, Paolo
AU - Fahringer, Thomas
AU - Goble, Carole
AU - Jha, Shantenu
AU - Katz, Daniel S.
AU - Köster, Johannes
AU - Leser, Ulf
AU - Mehta, Kshitij
AU - Oliver, Hilary
AU - Peterson, J. Luc
AU - Pizzi, Giovanni
AU - Pottier, Loïc
AU - Sirvent, Raül
AU - Suchyta, Eric
AU - Thain, Douglas
AU - Wilkinson, Sean R.
AU - Wozniak, Justin M.
AU - Ferreira da Silva, Rafael
PY - 2025/6/9
Y1 - 2025/6/9
N2 - The term scientific workflow has evolved over the last two decades to encompass a broad range of compositions of interdependent compute tasks and data movements. It has also become an umbrella term for processing in modern scientific applications. Today, many scientific applications can be considered as workflows made of multiple dependent steps, and hundreds of workflow management systems (WMSs) have been developed to manage and run these workflows. However, no turnkey solution has emerged to address the diversity of scientific processes and the infrastructure on which they are implemented. Instead, new research problems requiring the execution of scientific workflows with some novel feature often lead to the development of an entirely new WMS. A direct consequence is that many existing WMSs share some salient features, offer similar functionalities, and can manage the same categories of workflows but also have some distinct capabilities. This situation makes researchers who develop workflows face the complex question of selecting a WMS. This selection can be driven by technical considerations, to find the system that is the most appropriate for their application and for the resources available to them, or other factors such as reputation, adoption, strong community support, or long-term sustainability. To address this problem, a group of WMS developers and practitioners joined their efforts to produce a community-based terminology of WMSs. This paper summarizes their findings and introduces this new terminology to characterize WMSs. This terminology is composed of five axes: workflow characteristics, composition, orchestration, data management, and metadata capture. Each axis comprises several concepts that capture the prominent features of WMSs. Based on this terminology, this paper also presents a classification of 23 existing WMSs according to the proposed axes and terms.
AB - The term scientific workflow has evolved over the last two decades to encompass a broad range of compositions of interdependent compute tasks and data movements. It has also become an umbrella term for processing in modern scientific applications. Today, many scientific applications can be considered as workflows made of multiple dependent steps, and hundreds of workflow management systems (WMSs) have been developed to manage and run these workflows. However, no turnkey solution has emerged to address the diversity of scientific processes and the infrastructure on which they are implemented. Instead, new research problems requiring the execution of scientific workflows with some novel feature often lead to the development of an entirely new WMS. A direct consequence is that many existing WMSs share some salient features, offer similar functionalities, and can manage the same categories of workflows but also have some distinct capabilities. This situation makes researchers who develop workflows face the complex question of selecting a WMS. This selection can be driven by technical considerations, to find the system that is the most appropriate for their application and for the resources available to them, or other factors such as reputation, adoption, strong community support, or long-term sustainability. To address this problem, a group of WMS developers and practitioners joined their efforts to produce a community-based terminology of WMSs. This paper summarizes their findings and introduces this new terminology to characterize WMSs. This terminology is composed of five axes: workflow characteristics, composition, orchestration, data management, and metadata capture. Each axis comprises several concepts that capture the prominent features of WMSs. Based on this terminology, this paper also presents a classification of 23 existing WMSs according to the proposed axes and terms.
KW - Scientific workflows
KW - workflow management systems
KW - community-based terminology
KW - terminology
KW - workflow
U2 - 10.48550/arXiv.2506.07838
DO - 10.48550/arXiv.2506.07838
M3 - Preprint
BT - A Terminology for Scientific Workflow Systems
PB - arXiv
ER -