Text Mining for Drug Discovery

  • Dimitrios Piliouras

Student thesis: Master of Philosophy


Over the last several decades, medical and biological research has risen to extraordinary levels, opening vast windows into the mechanisms underlying health and disease in living organisms. Integrating this knowledge into a unified framework to enhance our understanding and decision-making is a significant challenge for the research community. Efficient drug discovery and development requires methods for bridging pre-clinical data with patient data, as well as effective literature-mining, in order to estimate both efficacy and safety outcomes for new molecules and treatment approaches. Text mining is often regarded as an antidote to this exponential growth of biomedical publications.

In this thesis, text-mining and natural-language-processing techniques and tools, aiming to assist with various computational aspects of drug discovery, are presented. In particular, methods useful for modelling of pharmaco-kinetic parameters, a process by which the pharmaceutical effects of a drug can be simulated using mathematical models, are pursued. In order to fully realise the potential of such modelling, there is a tremendous need for databases of verified pharmaco-kinetic/dynamic properties of drugs. To that end, a context-free grammar, capable of capturing such parameters and their potential modifications, is proposed. The fully deterministic nature of a context-free grammar can be side-stepped by embedding a lexical analyser which is able to plug in external components for specialised sub-tasks (e.g., named entity recognition). The feasibility of this approach is evaluated against a gold-standard corpus, where it is shown to be both effective and efficient, with predictive accuracy approaching 90%.
Date of Award: 31 Dec 2014
Original language: English
Awarding Institution
  • The University of Manchester


Keywords
  • drug-NER
  • text-mining
  • pharmacology
