Creating a focused corpus of factual outcomes from biomedical experiments

J Eales, G Demetriou, R Stevens

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    Abstract

    The results of an experiment are often described in a series of textual statements, the most concise of which is the title of the article. Here we implemented a novel approach, using standard data mining techniques, to collect a set of concise 'factual' statements about a research area. We compare two standard text classification approaches to identify 'factual' and 'non-factual' sentences in article titles: the first uses a statistical language-modelling approach, and the second a more sophisticated semantic and grammatical approach. We find that the simple approach provides more accurately classified titles, achieving 92% overall accuracy compared to 90% for the complex approach. We also implement a strategy to convert the phrasal dependencies in a 'factual' title into subject-predicate-object structures (triples). These triples can then be organised according to a schema provided by domain ontologies, which is done by mapping URIs to entities found in the textual labels.
    Original language: English
    Title of host publication: Proceedings of the Mining Complex Entities from Biomedical Data (MIND 2011), A Workshop held in Conjunction with PKDD 2011
    Publication status: Published - 2011
    Event: MIND workshop, ECML/PKDD 2011 - Athens
    Duration: 9 Sept 2011 to 10 Sept 2011

    Conference

    Conference: MIND workshop, ECML/PKDD 2011
    City: Athens
    Period: 9/09/11 to 10/09/11

    Keywords

    • text mining
    • data mining
    • knowledge network
