Protein secondary structure prediction using logic-based machine learning

S. Muggleton, R. D. King, M. J E Sternberg

    Research output: Contribution to journal, Article, peer-review

    Abstract

    Many attempts have been made to solve the problem of predicting protein secondary structure from the primary sequence, but the best performance results are still disappointing. In this paper, the use of a machine learning algorithm which allows relational descriptions is shown to lead to improved performance. The Inductive Logic Programming computer program, Golem, was applied to learning secondary structure prediction rules for α/α domain type proteins. The input to the program consisted of 12 non-homologous proteins (1612 residues) of known structure, together with background knowledge describing the chemical and physical properties of the residues. Golem learned a small set of rules that predict which residues are part of the α-helices, based on their positional relationships and chemical and physical properties. The rules were tested on four independent non-homologous proteins (416 residues), giving an accuracy of 81% (±2%). This is an improvement, on identical data, over the previously reported result of 73% by King and Sternberg (1990, J. Mol. Biol., 216, 441–457) using the machine learning program PROMIS, and of 72% using the standard Garnier-Osguthorpe-Robson method. The best previously reported result in the literature for the α/α domain type is 76%, achieved using a neural net approach. Machine learning also has the advantage over neural network and statistical methods of producing more understandable results.
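
The abstract does not say how the ±2% on the 81% test accuracy was obtained. One plausible reading, offered here only as an assumption, is a binomial standard error over the 416 test residues, treating each residue prediction as an independent trial. The short sketch below checks that this reading gives a figure of about the right size; the function name `binomial_standard_error` is illustrative and does not come from the paper.

```python
import math


def binomial_standard_error(accuracy: float, n_residues: int) -> float:
    """Standard error of a proportion, assuming independent per-residue trials.

    This is an assumed model, not a method stated in the abstract.
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / n_residues)


# Figures taken from the abstract: 81% accuracy on 416 test residues.
golem_se = binomial_standard_error(0.81, 416)

# Same test set, previously reported PROMIS accuracy of 73%.
promis_se = binomial_standard_error(0.73, 416)

print(f"Golem:  81% +/- {100 * golem_se:.1f}%")
print(f"PROMIS: 73% +/- {100 * promis_se:.1f}%")
```

Under this assumption the standard error for Golem comes out just under two percentage points, in line with the quoted ±2%. Neighbouring residues in a protein are not truly independent, so the real uncertainty may be somewhat larger.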
    Original language: English
    Pages (from-to): 647-657
    Number of pages: 10
    Journal: Protein Engineering
    Volume: 5
    Issue number: 7
    Publication status: Published - 1992
