Ambiguity and variability of database and software names in bioinformatics

Geraint Duck, David Robertson, Robert Stevens, Goran Nenadic

    Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

    Abstract

    There are now numerous options available to achieve various tasks in bioinformatics, but, as yet, little progress has been made to capture the common practice by analysing usage and mentions of databases and tools within the literature. In this paper we analyse the variability and ambiguity of database and software name mentions and provide a set of 30 full-text documents manually annotated on the mention level. Our analyses show that identification of mentions of databases and tools is not a task that can be achieved through dictionary matching alone: our baseline dictionary look-up achieved an F-score of just over 50%. This is primarily due to the high variability and ambiguity of database and software mentions in the literature and to the large number of new resources being introduced. We characterise the issues with various mention types and propose potential ways of capturing additional database and software mentions in the literature.
    Original language: English
    Title of host publication: SMBM 2012 - Proceedings of the 5th International Symposium on Semantic Mining in Biomedicine (SMBM - Proc. Int. Symp. Semantic Min. Biomed.)
    Place of Publication: http://www.zora.uzh.ch/64476/
    Pages: 2-9
    Number of pages: 7
    Publication status: Published - 2012
    Event: 5th International Symposium on Semantic Mining in Biomedicine, SMBM 2012 - Zurich
    Duration: 1 Jul 2012 → …
    https://www.escholar.manchester.ac.uk/uk-ac-man-scw:175435

    Conference

    Conference: 5th International Symposium on Semantic Mining in Biomedicine, SMBM 2012
    City: Zurich
    Period: 1/07/12 → …