Predicting enzyme class from protein structure without alignments

Paul D. Dobson, Andrew J. Doig

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Methods for predicting protein function from structure are becoming more important as the rate at which structures are solved increases more rapidly than experimental knowledge. As a result, protein structures now frequently lack functional annotations. The majority of methods for predicting protein function are reliant upon identifying a similar protein and transferring its annotations to the query protein. This method fails when a similar protein cannot be identified, or when any similar proteins identified also lack reliable annotations. Here, we describe a method that can assign function from structure without the use of algorithms reliant upon alignments. Using simple attributes that can be calculated from any crystal structure, such as secondary structure content, amino acid propensities, surface properties and ligands, we describe each enzyme in a non-redundant set. The set is split according to Enzyme Classification (EC) number. We combine the predictions of one-class versus one-class support vector machine models to make overall assignments of EC number to an accuracy of 35% with the top-ranked prediction, rising to 60% accuracy with the top two ranks. In doing so we demonstrate the utility of simple structural attributes in protein function prediction and shed light on the link between structure and function. We apply our methods to predict the function of every currently unclassified protein in the Protein Data Bank. © 2004 Elsevier Ltd. All rights reserved.
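The combination step described in the abstract, where pairwise (one-class versus one-class) model outputs are merged into a ranked list of EC classes and scored by top-1 and top-2 accuracy, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes simple majority voting across the pairwise models, six top-level EC classes, and an arbitrary tie-break by class number. The names `rank_classes`, `top_k_accuracy` and the `pairwise_winner` mapping are hypothetical, and no SVM training is shown.

```python
from itertools import combinations

# Assumption: the six top-level EC classes (oxidoreductases ... ligases).
EC_CLASSES = [1, 2, 3, 4, 5, 6]


def rank_classes(pairwise_winner, classes=EC_CLASSES):
    """Rank classes for one protein by the number of pairwise contests won.

    pairwise_winner maps each pair (a, b) with a < b to the class that a
    hypothetical binary model (a versus b) preferred for this protein.
    """
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_winner[(a, b)]] += 1
    # Most votes first; ties broken by class number (an arbitrary choice).
    return sorted(classes, key=lambda c: (-votes[c], c))


def top_k_accuracy(rankings, true_labels, k):
    """Fraction of proteins whose true class is within the top k ranks."""
    hits = sum(1 for ranked, truth in zip(rankings, true_labels)
               if truth in ranked[:k])
    return hits / len(true_labels)
```

Calling `top_k_accuracy` with `k=1` and `k=2` over the same rankings gives the kind of paired figures the abstract reports for the top-ranked prediction and the top two ranks.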
    Original language: English
    Pages (from-to): 187-199
    Number of pages: 12
    Journal: Journal of Molecular Biology
    Volume: 345
    Issue number: 1
    Publication status: Published - 7 Jan 2005

    Keywords

    • EC number
    • machine learning
    • protein function prediction
    • structural genomics
    • structure
