Abstract
Motivation: In recent years, there has been great progress in the field of automated curation of biomedical networks and models, aided by text mining methods that provide evidence from literature. Such methods must not only extract snippets of text that relate to model interactions, but also be able to contextualise the evidence and provide additional confidence scores for the interaction in question. While various approaches to calculating confidence scores have focused primarily on the quality of the extracted information, there has been little work on exploring the textual uncertainty conveyed by the author. Although textual uncertainty is acknowledged in biomedical text mining as an attribute of text-mined interactions (events), it is significantly understudied as a means of providing a confidence measure for interactions in pathways or other biomedical models. In this work, we focus on improving identification of textual uncertainty for events and explore how it can be used as an additional measure of confidence for biomedical models.
Results: We present a novel method for extracting uncertainty from the literature using a hybrid approach that combines rule induction and machine learning. Variations of this hybrid approach are then discussed, alongside their advantages and disadvantages. We use subjective logic theory to combine multiple uncertainty values extracted from different sources for the same interaction. Our approach achieves F-scores of 0.76 and 0.88 on the BioNLP-ST and Genia-MK corpora, respectively, making considerable improvements over previously published work. Moreover, we evaluate our proposed system on pathways related to two different areas, namely leukemia and melanoma cancer research.
Availability: The leukemia pathway model used is available in Pathway Studio, while the Ras model is available via PathwayCommons. An online demonstration of the uncertainty extraction system is available for research purposes at http://argo.nactem.ac.uk/test. The related code is available at https://github.com/czrv/uncertainty_components.git. Details on the above are available in the Supplementary Information.
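The abstract does not say which subjective logic operator is used to combine uncertainty values, so the following is only a minimal sketch of one common choice, cumulative fusion of two binomial opinions. The `Opinion` class, the `cumulative_fusion` function, the shared base rate of 0.5, and the example values are all assumptions made for illustration and are not taken from the paper or its code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Binomial opinion: belief + disbelief + uncertainty should sum to 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5  # assumed prior; shared by all opinions in this sketch

    def expected(self) -> float:
        # Projected probability: belief plus the prior's share of the uncertainty.
        return self.belief + self.base_rate * self.uncertainty


def cumulative_fusion(x: Opinion, y: Opinion) -> Opinion:
    """Fuse two opinions about the same interaction from independent sources.

    Simplification (assumption): both opinions share one base rate.
    """
    k = x.uncertainty + y.uncertainty - x.uncertainty * y.uncertainty
    if k == 0:
        # Both sources are fully certain; fall back to an equal-weight average.
        return Opinion(
            (x.belief + y.belief) / 2,
            (x.disbelief + y.disbelief) / 2,
            0.0,
            x.base_rate,
        )
    return Opinion(
        (x.belief * y.uncertainty + y.belief * x.uncertainty) / k,
        (x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / k,
        (x.uncertainty * y.uncertainty) / k,
        x.base_rate,
    )


# Two hypothetical sentences reporting the same interaction.
sentence_a = Opinion(belief=0.6, disbelief=0.1, uncertainty=0.3)
sentence_b = Opinion(belief=0.7, disbelief=0.1, uncertainty=0.2)
fused = cumulative_fusion(sentence_a, sentence_b)
print(fused, fused.expected())
```

With this operator, the fused uncertainty is lower than that of either input, which matches the intuition that several independent mentions of an interaction should raise confidence in it.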
Original language | English |
---|---|
Journal | Bioinformatics |
DOIs | |
Publication status | Published - 24 Jul 2017 |
Keywords
- Text mining
- uncertainty identification
- event extraction
- pathway curation
- interaction ranking
- Annotation
Fingerprint
Dive into the research topics of 'Using uncertainty to link and rank evidence from biomedical literature for model curation'. Together they form a unique fingerprint.
Projects
- 2 Finished
- S. Ananiadou: BBSRC: Japan Partnering Award. Text mining and bioinformatics platforms for metabolic pathway modelling
  Ananiadou, S. (PI) & Batista-Navarro, R. T. (CoI)
  1/05/17 → 30/04/21
  Project: Research
- Enriching Metabolic PATHwaY models with evidence from the literature (EMPATHY)
  Ananiadou, S. (PI) & Kell, D. (CoI)
  1/04/15 → 23/03/19
  Project: Research