Enhancing and Abstracting Scientific Workflow Provenance for Data Publishing

Pinar Alper, Khalid Belhajjame, Carole Goble

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    Abstract

    Many scientists are using workflows to systematically design and run computational experiments. Once the workflow is executed, the scientist may want to publish the dataset generated as a result, to be, e.g., reused by other scientists as input to their experiments. In doing so, the scientist needs to curate such a dataset by specifying metadata that describes it, e.g., its derivation history, origins and ownership. To assist the scientist in this task, we explore in this paper the use of provenance traces collected by workflow management systems when enacting workflows. Specifically, we identify the shortcomings of such raw provenance traces in supporting the data publishing task, and propose an approach whereby distilled, yet more informative, provenance traces that are fit for the data publishing task can be derived.
    Original language: English
    Title of host publication: host publication
    Publication status: Published - 2013
    Event: International Workshop on Managing and Querying Provenance Data at Scale
    Duration: 1 Jan 1824 → …

