Distinguishing enzyme structures from non-enzymes without alignments.

Paul D. Dobson, Andrew J. Doig

    Research output: Contribution to journalArticlepeer-review


    The ability to predict protein function from structure is becoming increasingly important as the number of structures resolved is growing more rapidly than our capacity to study function. Current methods for predicting protein function are mostly reliant on identifying a similar protein of known function. For proteins that are highly dissimilar or are only similar to proteins also lacking functional annotations, these methods fail. Here, we show that protein function can be predicted as enzymatic or not without resorting to alignments. We describe 1178 high-resolution proteins in a structurally non-redundant subset of the Protein Data Bank using simple features such as secondary-structure content, amino acid propensities, surface properties and ligands. The subset is split into two functional groupings, enzymes and non-enzymes. We use the support vector machine-learning algorithm to develop models that are capable of assigning the protein class. Validation of the method shows that the function can be predicted to an accuracy of 77% using 52 features to describe each protein. An adaptive search of possible subsets of features produces a simplified model based on 36 features that predicts at an accuracy of 80%. We compare the method to sequence-based methods that also avoid calculating alignments and predict a recently released set of unrelated proteins. The most useful features for distinguishing enzymes from non-enzymes are secondary-structure content, amino acid frequencies, number of disulphide bonds and size of the largest cleft. This method is applicable to any structure as it does not require the identification of sequence or structural similarity to a protein of known function.
    Original languageEnglish
    Pages (from-to)771-783
    Number of pages12
    JournalJournal of molecular biology
    Issue number4
    Publication statusPublished - 18 Jul 2003


    Dive into the research topics of 'Distinguishing enzyme structures from non-enzymes without alignments.'. Together they form a unique fingerprint.

    Cite this