Extracting Gene-Disease Relations from Text to Support Biomarker Discovery

Paul Thompson, Sophia Ananiadou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


The biomedical literature constitutes a rich source of evidence to
support the discovery of biomarkers. However, locating evidence
in huge volumes of text can be difficult, as typical keyword
queries cannot account for the meaning and structure of text.
Text mining (TM) methods carry out automated semantic
analysis of documents, to facilitate structured searching that can
more precisely match users’ information needs. We describe our
TM approach to the detection of sentence-level associations
between genes and diseases, as a first step towards developing a
sophisticated search system targeted at locating biomarker
evidence in the literature. We vary the sophistication of our
detection methodology according to sentence complexity, using
either co-occurring mentions of genes and diseases, or linguistic
patterns obtained using evidence from approximately 1 million
biomedical abstracts. We demonstrate that this method can
detect associations more successfully than applying a single
technique, with an accuracy that compares highly favourably to
related efforts. We also show that the identified relations can
complement those detected using alternative approaches.
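
The idea of switching between co-occurrence and linguistic patterns according to sentence complexity can be illustrated with a minimal sketch. It is not the authors' implementation. The toy lexicons, the trigger phrases, the function names, and the rule that a sentence counts as "simple" when it mentions exactly one gene and one disease are all assumptions made for illustration. The abstract does not specify how complexity is measured or how the patterns are represented.

```python
import re
from itertools import product

# Hypothetical toy lexicons; a real system would use trained entity recognisers.
GENES = {"BRCA1", "TP53", "EGFR"}
DISEASES = {"breast cancer", "lung cancer", "glioma"}

# Hypothetical trigger phrases standing in for patterns learned from abstracts.
TRIGGERS = ("is associated with", "is implicated in", "contributes to")


def find_mentions(sentence, lexicon):
    """Return the lexicon terms that occur as whole words in the sentence."""
    lowered = sentence.lower()
    return sorted(
        term for term in lexicon
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
    )


def pattern_match(sentence, gene, disease):
    """Check for '<gene> <trigger> <disease>' anywhere in the sentence."""
    lowered = sentence.lower()
    for trigger in TRIGGERS:
        body = r"\s+".join(
            re.escape(part) for part in (gene.lower(), trigger, disease.lower())
        )
        if re.search(r"\b" + body + r"\b", lowered):
            return True
    return False


def extract_relations(sentence):
    """Pick the detection strategy from an assumed notion of sentence complexity."""
    genes = find_mentions(sentence, GENES)
    diseases = find_mentions(sentence, DISEASES)
    if not genes or not diseases:
        return []
    # Assumed "simple" case: one gene and one disease, so co-occurrence suffices.
    if len(genes) == 1 and len(diseases) == 1:
        return [(genes[0], diseases[0], "co-occurrence")]
    # Assumed "complex" case: keep only pairs linked by a trigger pattern.
    return [
        (gene, disease, "pattern")
        for gene, disease in product(genes, diseases)
        if pattern_match(sentence, gene, disease)
    ]


if __name__ == "__main__":
    print(extract_relations("Mutations in BRCA1 are frequently observed in breast cancer."))
    print(extract_relations(
        "EGFR is associated with lung cancer, whereas TP53 was not studied in glioma."
    ))
```

In this sketch, a sentence with several genes and diseases yields only the pairs joined by a trigger phrase, whereas plain co-occurrence would link every gene with every disease in that sentence.
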

Original language: English
Title of host publication: Digital Health 2017: Global Public Health, Personalised Medicine, and Emergency Medicine in the Age of Big Data
Publisher: ACM Digital Library
Publication status: Published - 2 Jun 2018

Research Beacons, Institutes and Platforms

  • Manchester Institute of Biotechnology

