pKa prediction from "quantum chemical topology" descriptors

    Research output: Contribution to journal › Article › peer-review


    Knowing the pKa of a compound gives insight into many properties relevant to many industries, in particular the pharmaceutical industry during drug development processes. In light of this, we have used the theory of Quantum Chemical Topology (QCT) to provide ab initio descriptors that are able to accurately predict pKa values for 228 carboxylic acids. This Quantum Topological Molecular Similarity (QTMS) study involved the comparison of 5 increasingly expensive levels of theory to conclude that HF/6-31G(d) and B3LYP/6-311+G(2d,p) provided an accurate representation of the compounds studied. We created global and subset models for the carboxylic acids using Partial Least Squares (PLS), Support Vector Machines (SVM), and Radial Basis Function Neural Networks (RBFNN). The models were extensively validated using 4-, 7-, and 10-fold cross-validation, with the validation sets selected based on systematic and random sampling. HF/6-31G(d) in conjunction with SVM provided the best statistics when taking into account the large increase in CPU time required to optimize the geometries at the B3LYP/6-311+G(2d,p) level. The SVM models provided an average q² value of 0.886 and an RMSE value of 0.293 for all the carboxylic acids, a q² of 0.825 and RMSE of 0.378 for the ortho-substituted acids, a q² of 0.923 and RMSE of 0.112 for the para- and meta-substituted acids, and a q² of 0.906 and RMSE of 0.268 for the aliphatic acids. Our method compares favorably to ACD/Laboratories, VCCLAB, SPARC, and ChemAxon's pKa prediction software based on the RMSE calculated by the leave-one-out method. © 2009 American Chemical Society.
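
    The validation statistics quoted in the abstract (q² and RMSE from k-fold cross-validation with systematic or random fold selection) can be illustrated with a minimal sketch. This is not the authors' code: the single-descriptor least-squares line stands in for their PLS, SVM, and RBFNN models, the function names are hypothetical, the data in the usage lines are synthetic, and q² is taken here as 1 - PRESS/SS_tot about the overall mean, which is one common definition and is assumed rather than confirmed by the abstract.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x (stand-in model, not SVM)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def systematic_folds(n, k):
    """Systematic sampling: every k-th compound goes to the same fold."""
    return [list(range(start, n, k)) for start in range(k)]


def random_folds(n, k, seed=0):
    """Random sampling: shuffle the indices, then deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[start::k] for start in range(k)]


def cross_validate(xs, ys, folds):
    """Return (q2, rmse) computed from held-out predictions only."""
    preds = [0.0] * len(ys)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(ys)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = a + b * xs[i]
    # PRESS: sum of squared errors over all held-out predictions.
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot, math.sqrt(press / len(ys))


if __name__ == "__main__":
    # Synthetic descriptor/pKa pairs for illustration only (not from the paper).
    rng = random.Random(42)
    xs = [0.1 * i for i in range(40)]
    ys = [4.8 - 0.5 * x + rng.gauss(0.0, 0.1) for x in xs]
    for k in (4, 7, 10):
        print(k, cross_validate(xs, ys, systematic_folds(len(xs), k)))
        print(k, cross_validate(xs, ys, random_folds(len(xs), k)))
```

    Calling `cross_validate` with one fold per compound reduces to the leave-one-out scheme that the abstract mentions for the software comparison.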
    Original language: English
    Pages (from-to): 1914-1924
    Number of pages: 10
    Journal: Journal of Chemical Information and Modeling
    Issue number: 8
    Publication status: Published - 24 Aug 2009

