Semisupervised Gaussian Process for Automated Enzyme Search.

Joseph Mellor, Ioana Grigoras, Pablo Carbonell, Jean-Loup Faulon

    Research output: Contribution to journal › Article › peer-review


    Abstract

    Synthetic biology is today harnessing the design of novel and greener biosynthesis routes for the production of added-value chemicals and natural products. The design of novel pathways often requires a detailed selection of enzyme sequences to import into the chassis at each of the reaction steps. To address such design requirements in an automated way, we present here a tool for exploring the space of enzymatic reactions. Given a reaction and an enzyme, the tool provides a probability estimate that the enzyme catalyzes the reaction. Our tool first considers the similarity of a reaction to known biochemical reactions with respect to signatures around their reaction centers. Signatures are defined based on chemical transformation rules by using extended connectivity fingerprint descriptors. A semisupervised Gaussian process model associated with the similar known reactions then provides the probability estimate. The Gaussian process model uses information about both the reaction and the enzyme in providing the estimate. These estimates were validated experimentally by applying the Gaussian process model to a newly identified metabolite in Escherichia coli in order to search for the enzymes catalyzing its associated reactions. Furthermore, we show with several pathway design examples how the ability to assign probability estimates to enzymatic reactions can assist in bioengineering applications, providing experimental validation of our proposed approach. To the best of our knowledge, the proposed approach is the first application of Gaussian processes dealing with biological sequences and chemicals. The use of a semisupervised Gaussian process framework is also novel in the context of machine learning applied to bioinformatics. However, the ability of an enzyme to catalyze a reaction depends on the affinity between the substrates of the reaction and the enzyme. This affinity is generally quantified by the Michaelis constant KM. Therefore, we also demonstrate the use of Gaussian process regression to predict KM given a substrate-enzyme pair.
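The regression step mentioned at the end of the abstract can be illustrated with a minimal sketch of plain (fully supervised) Gaussian process regression over binary fingerprints. This is not the authors' model: the Tanimoto kernel, the noise level, the function names `tanimoto_kernel` and `gp_predict`, and the toy fingerprints and log10(KM) values are all assumptions made for illustration, and the semisupervised extension described in the abstract is not shown.

```python
import numpy as np

def tanimoto_kernel(A, B):
    """Tanimoto similarity between rows of binary matrices A (n, d) and B (m, d)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    inter = A @ B.T                                  # shared "on" bits
    union = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - inter
    safe_union = np.where(union > 0, union, 1.0)     # avoid division by zero
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, inter / safe_union, 1.0)

def gp_predict(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean and variance of a GP with a constant (sample-mean) prior mean."""
    y_train = np.asarray(y_train, dtype=float)
    y_mean = y_train.mean()
    K = tanimoto_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = tanimoto_kernel(X_test, X_train)
    L = np.linalg.cholesky(K)                        # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = y_mean + K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    prior_var = np.diag(tanimoto_kernel(X_test, X_test))
    var = prior_var - np.sum(v ** 2, axis=0)
    return mean, var

# Hypothetical data: random 16-bit fingerprints standing in for
# substrate-enzyme pair descriptors, with made-up log10(KM) targets.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 2, size=(6, 16))
y_train = np.array([-3.1, -2.4, -4.0, -3.5, -2.9, -3.8])
X_test = rng.integers(0, 2, size=(2, 16))

mean, var = gp_predict(X_train, y_train, X_test)
print("predicted log10(KM):", mean)
print("predictive variance:", var)
```

The predictive variance is what makes a Gaussian process attractive for this kind of screening: a candidate pair that resembles nothing in the training set gets a variance close to the prior value, which flags the prediction as unreliable.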
    Original language: English
    Journal: ACS Synthetic Biology (ACS Synth Biol)
    Early online date: 23 Mar 2016
    Publication status: Published - 2016

    Keywords

    • Gaussian process regression
    • enzyme kinetics
    • enzyme screening
    • metabolic engineering
    • reaction fingerprint
    • semisupervised Gaussian process

    Research Beacons, Institutes and Platforms

    • Manchester Institute of Biotechnology
