The Prediction of Mutagenicity and pKa for Pharmaceutically Relevant Compounds Using "Quantum Chemical Topology" Descriptors

  • Alexander Harding

Student thesis: Doctor of Engineering


Quantum Chemical Topology (QCT) descriptors, calculated from ab initio wave functions, have been utilised to model pKa and mutagenicity for data sets of pharmaceutically relevant compounds. The pKa of a compound is a pivotal property in both life science and chemistry, since the propensity of a compound to donate or accept a proton is fundamental to understanding chemical and biological processes. The prediction of mutagenicity, specifically as determined by the Ames test, is important in helping medicinal chemists select compounds that avoid this potential pitfall in drug design. Carbocyclic and heterocyclic aromatic amines were chosen because this compound class is synthetically very useful but also prone to positive outcomes in the battery of genotoxicity assays.
The importance of pKa and genotoxic characteristics cannot be overestimated in drug design, where the multivariate optimisation of properties that influence the Absorption-Distribution-Metabolism-Excretion-Toxicity (ADMET) profile now features very early in the drug discovery process.
Models of pKa were constructed for carboxylic acids using the Quantum Topological Molecular Similarity (QTMS) method. The models produced Root Mean Square Error of Prediction (RMSEP) values of less than 0.5 pKa units and compared favourably to other pKa prediction methods. The ortho-substituted benzoic acids had the largest RMSEP, which was significantly improved by splitting the compounds into high-correlation subsets. For these subsets, single-term equations containing one ab initio bond length were able to predict pKa accurately. The pKa prediction equations were extended to phenols and anilines.
Quantitative Structure-Activity Relationship (QSAR) models of acceptable quality were built from literature data to predict the mutagenic potency (LogMP) of carbo- and heterocyclic aromatic amines using QTMS. However, these models failed to predict Ames test values for compounds screened at GSK.
Contradictory internal and external data for several compounds motivated us to determine the fidelity of the Ames test for this compound class. The systematic investigation involved recrystallisation to purify compounds, analytical methods to measure the purity and, finally, comparative Ames testing. Unexpectedly, the Ames test results were very reproducible when 14 representative repurified molecules were tested as the freebase and as the hydrochloride salt in two different solvents (water and DMSO). This work formed the basis for the analysis of Ames data at GSK and a systematic Ames testing programme for aromatic amines. So far, an unprecedentedly large list of 400 compounds has been made available to guide medicinal chemists. We constructed a model for the subset of 100 meta-/para-substituted anilines that could predict 70% of the Ames classifications. The experimental values of several of the model outliers appeared questionable on closer inspection, and three of these have been retested so far. The retests led to the reclassification of two of them and thereby to an improved model accuracy of 78%. This demonstrates the power of the iterative process of model building, critical analysis of experimental data, retesting outliers and rebuilding the model.
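As an illustration of the single-term pKa equations and the RMSEP statistic mentioned in the abstract, the following is a minimal Python sketch. It assumes the single-term equation is a linear least-squares fit of pKa against one bond length, which the abstract does not state explicitly. The function names `fit_single_term` and `rmsep` are hypothetical, and all numerical values are synthetic placeholders, not data from the thesis.

```python
import math


def fit_single_term(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmsep(observed, predicted):
    """Root Mean Square Error of Prediction over an external test set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Synthetic placeholder values (assumption): NOT taken from the thesis.
train_bond_lengths = [1.340, 1.345, 1.350, 1.355, 1.360]  # angstrom
train_pka = [3.0, 3.5, 4.0, 4.5, 5.0]
test_bond_lengths = [1.342, 1.358]
test_pka = [3.3, 4.7]

# Fit on the training set, then predict the held-out compounds.
slope, intercept = fit_single_term(train_bond_lengths, train_pka)
predicted = [slope * r + intercept for r in test_bond_lengths]
print(f"RMSEP = {rmsep(test_pka, predicted):.2f} pKa units")
```

The RMSEP is computed only on compounds held out from the fit, which is what distinguishes it from a fitting error on the training set.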
Date of Award: 1 Aug 2011
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Paul Popelier


  • ab initio
  • cross validation
  • anilines
  • carboxylic acids
  • phenols
  • bond length
  • Atoms in Molecules
  • mutagenicity
  • toxicity
  • pKa
  • Quantum Chemical Topology
  • genotoxicity
