CWLProv - Interoperable Retrospective Provenance capture and its challenges

Farah Zaib Khan, Stian Soiland-Reyes, Michael R. Crusoe, Andrew Lonie, Richard Sinnott

Research output: Contribution to conference › Paper › peer-review

Abstract

The automation of data analysis in the form of scientific workflows is now a widely adopted practice in many fields of research. Computationally driven, data-intensive experiments using workflows enable Automation, Scaling, Adaption and Provenance support (ASAP). However, a number of challenges remain in the effective sharing, publication, understandability and reproducibility of such workflows, owing to the incomplete capture of provenance and the dependence on particular technical (software) platforms. This paper presents CWLProv, an approach for retrospective provenance capture that utilizes open source, community-driven standards and involves the application and customization of workflow-centric Research Objects (ROs). The ROs are produced as an output of a workflow enactment defined in the Common Workflow Language (CWL), using the CWL reference implementation and its data structures. The approach aggregates and annotates all the resources involved in the scientific investigation, including inputs, outputs, the workflow specification, command line tool specifications and input parameter settings. The resources are linked within the RO to enable re-enactment of an analysis without depending on external resources. The workflow provenance profile is represented in the W3C recommended standard PROV-N and in PROV-JSON format to capture retrospective provenance of the workflow enactment. The workflow-centric RO produced as an output of a CWL workflow enactment is expected to be interoperable, reusable, shareable and portable across different platforms. This paper describes the need and motivation for CWLProv and the lessons learned in applying it to ROs using CWL in the bioinformatics domain.
Original language: English
Number of pages: 14
Publication status: Submitted - 27 Mar 2018
Event: International Provenance and Annotation Workshop (IPAW) 2018 - ProvenanceWeek 2018, King's College London, London, United Kingdom
Duration: 9 Jul 2018 to 13 Jul 2018
Conference number: 7
http://provenanceweek2018.org/

Workshop

Workshop: International Provenance and Annotation Workshop (IPAW) 2018
Abbreviated title: IPAW
Country/Territory: United Kingdom
City: London
Period: 9/07/18 to 13/07/18

Keywords

  • Provenance
  • Retrospective provenance
  • Research Object
  • Common Workflow Language
  • PROV Data Model
  • PROV-N
  • PROV-JSON
