We have analysed a non-redundant set of 294 enzymes for differences in sequence and structural features between the six main Enzyme Commission (EC) classification groups. This systematic study of enzymes, and their active sites in particular, aims to increase understanding of how the structure of an enzyme relates to its functional role. Many features showed significant differences between the EC classes, including active-site polarity, enzyme size and active-site amino acid propensities. Many attributes correlate with each other, forming clusters of related features from which we chose representative features for further analysis. Oxidoreductases have more non-polar active sites, which can be attributed to cofactor binding and a preference for Glu over Asp in active sites in comparison to the other classes. Lyases form a significantly higher proportion of oligomers than any other class, whilst hydrolases form the largest proportion of monomers. The representative features were then used in a prediction model that classified each enzyme into its top-level EC class with an accuracy of 33.1%, an increase of 16.4 percentage points over random classification. Understanding the link between structure and function is critical to improving enzyme design and the prediction of protein function from structure without transfer of annotation from alignments. © 2008 Elsevier Ltd. All rights reserved.
- binding site
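
The reported gain over random classification can be checked with a short calculation. The sketch below assumes that "random classification" means a uniform guess across the six top-level EC classes; the abstract does not state the baseline explicitly, and the function names are illustrative only.

```python
def uniform_baseline(n_classes: int) -> float:
    """Expected accuracy (in percent) of guessing uniformly among n_classes.

    Assumption: each class is guessed with equal probability.
    """
    return 100.0 / n_classes


def gain_over_baseline(accuracy_pct: float, n_classes: int) -> float:
    """Difference, in percentage points, between a model and uniform guessing."""
    return accuracy_pct - uniform_baseline(n_classes)


# Values taken from the abstract: six EC classes, 33.1% accuracy.
baseline = uniform_baseline(6)
gain = gain_over_baseline(33.1, 6)

print(f"baseline: {baseline:.1f}%")        # baseline: 16.7%
print(f"gain: {gain:.1f} percentage points")  # gain: 16.4 percentage points
```

Under this assumption the baseline is about 16.7%, and the difference matches the 16.4 figure quoted in the abstract, which suggests the increase is an absolute difference in percentage points, not a relative improvement.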