Mining semantically related terms from biomedical literature

    Research output: Contribution to journal › Article › peer-review


    Discovering links and relationships is one of the main challenges in biomedical research, as scientists are interested in uncovering entities that have similar functions, take part in the same processes, or are coregulated. This article discusses the extraction of such semantically related entities (represented by domain terms) from biomedical literature. The method combines various text-based aspects, such as lexical, syntactic, and contextual similarities between terms. Lexical similarities are based on the level of sharing of word constituents. Syntactic similarities rely on expressions (such as term enumerations and conjunctions) in which a sequence of terms appears as a single syntactic unit. Finally, contextual similarities are based on automatic discovery of relevant contexts shared among terms. The approach is evaluated using the Genia resources, and the results of experiments are presented. Lexical and syntactic links have shown high precision and low recall, while contextual similarities have resulted in significantly higher recall with moderate precision. By combining the three metrics, we achieved F measures of 68% for semantically related terms and 37% for highly related entities. © 2006 ACM.
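
To make the idea of combining the three similarities concrete, the sketch below shows one possible reading of it. The abstract does not give the formulas, so everything here is an assumption for illustration: the Dice coefficient stands in for "level of sharing of word constituents", the equal-weight linear combination stands in for however the article merges the three metrics, and the balanced F-measure is assumed because the abstract does not say how precision and recall are weighted. All function names are hypothetical.

```python
def constituents(term: str) -> set[str]:
    """Split a term into lower-cased word constituents (hyphens treated as spaces)."""
    return set(term.lower().replace("-", " ").split())


def lexical_similarity(a: str, b: str) -> float:
    """Dice coefficient over shared word constituents.

    Assumption: a stand-in measure, not the formula from the article.
    """
    wa, wb = constituents(a), constituents(b)
    if not wa or not wb:
        return 0.0
    return 2 * len(wa & wb) / (len(wa) + len(wb))


def combined_similarity(lexical: float, syntactic: float, contextual: float,
                        weights: tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted sum of the three similarities.

    Assumption: equal weights are illustrative; the article's weighting is not
    stated in the abstract.
    """
    w_lex, w_syn, w_ctx = weights
    return w_lex * lexical + w_syn * syntactic + w_ctx * contextual


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    lex = lexical_similarity("nuclear receptor", "orphan nuclear receptor")
    # Syntactic and contextual scores are placeholders, not measured values.
    print(lex, combined_similarity(lex, 0.0, 0.5))
```

Under these assumptions, terms that share most of their words score high lexically even when they never co-occur, which fits the abstract's observation that lexical links are precise but miss many related pairs.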
    Original language: English
    Pages (from-to): 22-43
    Number of pages: 21
    Journal: ACM Transactions on Asian Language Information Processing
    Issue number: 1
    Publication status: Published - 2006


    • Biomedical literature
    • Contextual patterns
    • Term similarities
    • Text mining


