TY - JOUR
T1 - CONSeQuence
T2 - Prediction of reference peptides for absolute quantitative proteomics using consensus machine learning approaches
AU - Eyers, Claire E.
AU - Lawless, Craig
AU - Wedge, David C.
AU - Lau, King Wai
AU - Gaskell, Simon J.
AU - Hubbard, Simon J.
PY - 2011/11
Y1 - 2011/11
N2 - Mass spectrometric based methods for absolute quantification of proteins, such as QconCAT, rely on internal standards of stable-isotope labeled reference peptides, or "Q-peptides," to act as surrogates. Key to the success of this and related methods for absolute protein quantification (such as AQUA) is selection of the Q-peptide. Here we describe a novel method, CONSeQuence (consensus predictor for Q-peptide sequence), based on four different machine learning approaches for Q-peptide selection. CONSeQuence demonstrates improved performance over existing methods for optimal Q-peptide selection in the absence of prior experimental information, as validated using two independent test sets derived from yeast. Furthermore, we examine the physicochemical parameters associated with good peptide surrogates, and demonstrate that in addition to charge and hydrophobicity, peptide secondary structure plays a significant role in determining peptide "detectability" in liquid chromatography-electrospray ionization experiments. We relate peptide properties to protein tertiary structure, demonstrating a counterintuitive preference for buried status for frequently detected peptides. Finally, we demonstrate the improved efficacy of the general approach by applying a predictor trained on yeast data to sets of proteotypic peptides from two additional species taken from an existing peptide identification repository. © 2011 by The American Society for Biochemistry and Molecular Biology, Inc.
AB - Mass spectrometric based methods for absolute quantification of proteins, such as QconCAT, rely on internal standards of stable-isotope labeled reference peptides, or "Q-peptides," to act as surrogates. Key to the success of this and related methods for absolute protein quantification (such as AQUA) is selection of the Q-peptide. Here we describe a novel method, CONSeQuence (consensus predictor for Q-peptide sequence), based on four different machine learning approaches for Q-peptide selection. CONSeQuence demonstrates improved performance over existing methods for optimal Q-peptide selection in the absence of prior experimental information, as validated using two independent test sets derived from yeast. Furthermore, we examine the physicochemical parameters associated with good peptide surrogates, and demonstrate that in addition to charge and hydrophobicity, peptide secondary structure plays a significant role in determining peptide "detectability" in liquid chromatography-electrospray ionization experiments. We relate peptide properties to protein tertiary structure, demonstrating a counterintuitive preference for buried status for frequently detected peptides. Finally, we demonstrate the improved efficacy of the general approach by applying a predictor trained on yeast data to sets of proteotypic peptides from two additional species taken from an existing peptide identification repository. © 2011 by The American Society for Biochemistry and Molecular Biology, Inc.
U2 - 10.1074/mcp.M110.003384
DO - 10.1074/mcp.M110.003384
M3 - Article
C2 - 21813416
SN - 1535-9476
VL - 10
JO - Molecular and Cellular Proteomics
JF - Molecular and Cellular Proteomics
IS - 11
M1 - M110.003384
ER -