Lysine and arginine content of proteins: Computational analysis suggests a new tool for solubility design

Jim Warwicker, Spyros Charonis, Robin A. Curtis

    Research output: Contribution to journal > Article > peer-review


    Prediction and engineering of protein solubility is an important but imprecise area. While some features are routinely used, such as the avoidance of extensive non-polar surface area, scope remains for benchmarking of sequence and structural features with experimental data. We study properties in the context of experimental solubilities, protein gene expression levels, and families of abundant proteins (serum albumin and myoglobin) and their less abundant paralogues. A common feature that emerges for proteins with elevated solubility and at higher expression and abundance levels is an increased ratio of lysine content to arginine content. We suggest that the same properties of arginine that give rise to its recorded propensity for specific interaction surfaces also lead to favorable interactions at nonspecific contacts, and thus lysine is favored for proteins at relatively high concentration. A survey of protein therapeutics shows that a significant subset possesses a relatively low lysine to arginine ratio, and therefore may not be favored for high protein concentration. We conclude that modulation of lysine and arginine content could prove a useful and relatively simple addition to the toolkit available for engineering protein solubility in biotechnological applications. © 2013 American Chemical Society.
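The lysine to arginine ratio discussed in the abstract can be illustrated with a short calculation. The following is a minimal sketch, not the authors' method: it assumes the ratio is simply the number of lysine (K) residues divided by the number of arginine (R) residues in a one-letter amino acid sequence, which the abstract does not spell out. The function name `lys_arg_ratio` and the example sequence are hypothetical.

```python
from collections import Counter


def lys_arg_ratio(sequence: str) -> float:
    """Return count(K) / count(R) for a one-letter amino acid sequence.

    Assumption: the ratio is a plain residue-count ratio; the paper may
    normalise or weight it differently.
    """
    counts = Counter(sequence.upper())
    lys, arg = counts["K"], counts["R"]
    if arg == 0:
        # No arginine: the ratio is unbounded if lysine is present,
        # and undefined if neither residue occurs.
        return float("inf") if lys > 0 else float("nan")
    return lys / arg


# Made-up toy sequence for illustration only (3 lysines, 1 arginine).
toy_sequence = "MKKLRAKE"
print(lys_arg_ratio(toy_sequence))
```

Under this assumed definition, a higher value corresponds to the lysine-rich composition that the abstract associates with elevated solubility and abundance; the zero-arginine case is handled explicitly so that short or unusual sequences do not raise a division error.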
    Original language: English
    Pages (from-to): 294-303
    Number of pages: 10
    Journal: Molecular Pharmaceutics
    Issue number: 1
    Early online date: 20 Nov 2013
    Publication status: Published - 6 Jan 2014


    • amino acid side chain charge
    • bioinformatics
    • biologics
    • protein aggregation
    • solubility prediction

