Abstract
Discovering links and relationships is one of the main challenges in biomedical research, as scientists are interested in uncovering entities that have similar functions, take part in the same processes, or are coregulated. This article discusses the extraction of such semantically related entities (represented by domain terms) from biomedical literature. The method combines various text-based aspects, such as lexical, syntactic, and contextual similarities between terms. Lexical similarities are based on the level of sharing of word constituents. Syntactic similarities rely on expressions (such as term enumerations and conjunctions) in which a sequence of terms appears as a single syntactic unit. Finally, contextual similarities are based on automatic discovery of relevant contexts shared among terms. The approach is evaluated using the Genia resources, and the results of experiments are presented. Lexical and syntactic links have shown high precision and low recall, while contextual similarities have resulted in significantly higher recall with moderate precision. By combining the three metrics, we achieved F-measures of 68% for semantically related terms and 37% for highly related entities. © 2006 ACM.
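As a rough illustration of the lexical component described above (similarity based on shared word constituents), the sketch below scores two terms with a Dice coefficient over their word sets. The abstract does not give the article's actual formula, so the function name `lexical_similarity`, the choice of the Dice coefficient, and the whitespace tokenization are all assumptions made for illustration only.

```python
def lexical_similarity(term_a: str, term_b: str) -> float:
    """Dice coefficient over the lowercase word constituents of two terms.

    Assumption: this is an illustrative stand-in, not the measure
    defined in the article.
    """
    # Split each term into its set of lowercase word constituents.
    words_a = set(term_a.lower().split())
    words_b = set(term_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    # Count the constituents the two terms share.
    shared = words_a & words_b
    return 2 * len(shared) / (len(words_a) + len(words_b))


if __name__ == "__main__":
    # Two shared words out of 2 + 3 constituents.
    print(lexical_similarity("nuclear receptor", "orphan nuclear receptor"))
```

A measure of this kind only links terms that overlap in wording, which is consistent with the high precision and low recall the abstract reports for lexical links.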
Original language | English |
---|---|
Pages (from-to) | 22-43 |
Number of pages | 21 |
Journal | ACM Transactions on Asian Language Information Processing |
Volume | 5 |
Issue number | 1 |
Publication status | Published - 2006 |
Keywords
- Biomedical literature
- Contextual patterns
- Term similarities
- Text mining