Abstract
PROMIS (protein machine induction system), a program for machine learning, was used to generalize rules that characterize the relationship between primary and secondary structure in globular proteins. These rules can be used to predict an unknown secondary structure from a known primary structure. The symbolic induction method used by PROMIS was specifically designed to produce rules that are meaningful in terms of chemical properties of the residues. The rules found were compared with existing knowledge of protein structure: some features of the rules were already recognized (e.g. amphipathic nature of α-helices). Other features are not understood, and are under investigation. The rules produced a prediction accuracy for three states (α-helix, β-strand and coil) of 60% for all proteins, 73% for proteins of known α domain type, 62% for proteins of known β domain type and 59% for proteins of known α/β domain type. We conclude that machine learning is a useful tool in the examination of the large databases generated in molecular biology. © 1990 Academic Press Limited.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 441-457 |
Number of pages | 16 |
Journal | Journal of Molecular Biology |
Volume | 216 |
Issue number | 2 |
Publication status | Published - 1990 |