Computing Identity Co-Reference Across Drug Discovery Datasets

C Y A Brenninkmeijer, I Dunlop, Carole Goble, A J G Gray, S Pettifer, R Stevens, A Paschke (Editor), A Burger (Editor), P Romano (Editor), M S Marshall (Editor), A Splendiani (Editor)

    Research output: Chapter in Book/Conference proceeding, Conference contribution, peer-review

    Abstract

    This paper presents the rules used within the OpenPHACTS (http://www.openphacts.org) Identity Management Service to compute co-reference chains across multiple datasets. The web of (linked) data has encouraged a proliferation of identifiers for the concepts captured in datasets, with each dataset using its own identifier. A key data integration challenge is linking the co-referent identifiers, i.e. identifying and linking the equivalent concept in every dataset. Exacerbating this challenge, the datasets model the data differently, so when is one representation truly the same as another? Finally, different users have their own task- and domain-specific notions of equivalence that are driven by their operational knowledge. Consumers of the data need to be able to choose the notion of operational equivalence to be applied for the context of their application. We highlight the challenges of automatically computing co-reference and the need for capturing the context of the equivalence. This context is then used to control the co-reference computation. Ultimately, the context will enable data consumers to decide which co-references to include in their applications.
    Original language: English
    Title of host publication: Proceedings of the 6th International Workshop on Semantic Web Applications and Tools for Life Sciences (SWAT4LS)
    Editors: A Paschke, A Burger, P Romano, M S Marshall, A Splendiani
    Publisher: RWTH Aachen University
    Volume: 1114
    Publication status: Published - 2013

    Publication series

    Name: CEUR Workshop Proceedings - Semantic Web Applications and Technologies for Life sciences (SWAT4LS)
