Protein classification using ontology classification

K. Wolstencroft, P. Lord, L. Tabernero, A. Brass, R. Stevens

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Motivation: The classification of proteins expressed by an organism is an important step in understanding the molecular biology of that organism. Traditionally, this classification has been performed by human experts, who can recognise the functional properties that are sufficient to place an individual gene product into a particular protein family group. Automation of this task usually fails to meet the 'gold standard' of the human annotator because the recognition stage is difficult. The growing number of genomes, the rapid changes in knowledge and the central role of classification in the annotation process, however, motivate the need to automate this process.

    Results: We capture human understanding of how to recognise members of the protein phosphatase family by domain architecture as an ontology. By describing protein instances in terms of the domains they contain, it is possible to use description logic reasoners and our ontology to assign those proteins to a protein family class. We have tested our system on classifying the protein phosphatases of the human and Aspergillus fumigatus genomes and found that our knowledge-based, automatic classification matches, and sometimes surpasses, that of the human annotators. We have made the classification process fast and reproducible and, where appropriate knowledge is available, the method can potentially be generalised for use with any protein family. © 2006 Oxford University Press.
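
The idea of assigning a protein to a family from the domains it contains can be illustrated with a minimal sketch. This is not the authors' ontology or a description logic reasoner: the class names, domain names and cardinality bounds below are hypothetical, and a plain Python check stands in for the reasoner. Each family is defined by necessary and sufficient domain counts, and any domain not named in a definition is treated as disallowed, which loosely mimics a closure axiom.

```python
from collections import Counter

# Hypothetical family definitions (illustrative only, not taken from the paper).
# Each family maps a domain name to (min, max) allowed occurrences.
# Domains not listed in a definition must be absent from the protein.
FAMILY_DEFINITIONS = {
    "ReceptorTyrosinePhosphatase": {
        "TyrosinePhosphataseCatalytic": (1, 2),
        "Transmembrane": (1, 1),
    },
    "CytosolicTyrosinePhosphatase": {
        "TyrosinePhosphataseCatalytic": (1, 1),
    },
}


def matches(definition, domain_counts):
    """Return True if the domain counts satisfy every bound in the definition."""
    for domain, (lowest, highest) in definition.items():
        count = domain_counts.get(domain, 0)
        if count < lowest or count > highest:
            return False
    # Closure: the protein may not contain domains the definition does not name.
    return all(domain in definition for domain in domain_counts)


def classify(domains):
    """Return the sorted names of all families whose definition the protein meets."""
    counts = Counter(domains)
    return sorted(
        name
        for name, definition in FAMILY_DEFINITIONS.items()
        if matches(definition, counts)
    )


if __name__ == "__main__":
    protein = ["TyrosinePhosphataseCatalytic", "Transmembrane",
               "TyrosinePhosphataseCatalytic"]
    print(classify(protein))
```

A protein whose domain composition meets no definition comes back with an empty list, which in this sketch simply marks it as unclassified and a candidate for expert review.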
    Original language: English
    Pages (from-to): e530-e538
    Journal: Bioinformatics
    Volume: 22
    Issue number: 14
    Publication status: Published - 15 Jul 2006
