TY - JOUR
T1 - Prediction of beta-strand packing interactions using the signature product
AU - Brown, W Michael
AU - Martin, Shawn
AU - Chabarek, Joseph P
AU - Strauss, Charlie
AU - Faulon, Jean-Loup
PY - 2006/2
Y1 - 2006/2
N2 - The prediction of beta-sheet topology requires the consideration of long-range interactions between beta-strands that are not necessarily consecutive in sequence. Since these interactions are difficult to simulate using ab initio methods, we propose a supplementary method able to assign beta-sheet topology using only sequence information. We envision using the results of our method to reduce the three-dimensional search space of ab initio methods. Our method is based on the signature molecular descriptor, which has been used previously to predict protein-protein interactions successfully, and to develop quantitative structure-activity relationships for small organic drugs and peptide inhibitors. Here, we show how the signature descriptor can be used in a Support Vector Machine to predict whether or not two beta-strands will pack adjacently within a protein. We then show how these predictions can be used to order beta-strands within beta-sheets. Using the entire PDB database with ten-fold cross-validation, we have achieved 74.0% accuracy in packing prediction and 75.6% accuracy in the prediction of edge strands. For the case of beta-strand ordering, we are able to predict the correct ordering accurately for 51.3% of the beta-sheets. Furthermore, using a simple confidence metric, we can determine those sheets for which accurate predictions can be obtained. For the top 25% highest confidence predictions, we are able to achieve 95.7% accuracy in beta-strand ordering.
AB - The prediction of beta-sheet topology requires the consideration of long-range interactions between beta-strands that are not necessarily consecutive in sequence. Since these interactions are difficult to simulate using ab initio methods, we propose a supplementary method able to assign beta-sheet topology using only sequence information. We envision using the results of our method to reduce the three-dimensional search space of ab initio methods. Our method is based on the signature molecular descriptor, which has been used previously to predict protein-protein interactions successfully, and to develop quantitative structure-activity relationships for small organic drugs and peptide inhibitors. Here, we show how the signature descriptor can be used in a Support Vector Machine to predict whether or not two beta-strands will pack adjacently within a protein. We then show how these predictions can be used to order beta-strands within beta-sheets. Using the entire PDB database with ten-fold cross-validation, we have achieved 74.0% accuracy in packing prediction and 75.6% accuracy in the prediction of edge strands. For the case of beta-strand ordering, we are able to predict the correct ordering accurately for 51.3% of the beta-sheets. Furthermore, using a simple confidence metric, we can determine those sheets for which accurate predictions can be obtained. For the top 25% highest confidence predictions, we are able to achieve 95.7% accuracy in beta-strand ordering.
U2 - 10.1007/s00894-005-0052-4
DO - 10.1007/s00894-005-0052-4
M3 - Article
C2 - 16365772
SN - 0948-5023
VL - 12
JO - Journal of Molecular Modeling
JF - Journal of Molecular Modeling
IS - 3
ER -