Filtered datasets of benzene, ethanol, formic acid dimer and fomepizole

Dataset

Description

The uploaded datasets contain filtered geometries of benzene (BZ), ethanol (ETL), formic acid dimer (FAD), and fomepizole (FPL). They were used in a recently submitted paper to demonstrate the transferability of hyperparameters in anisotropic Gaussian process regression (GPR) models. These models are trained on the atomic energies and charges (and, in general, any multipole moment) of topological quantum atoms in roughly 10 000 conformations of each molecule of interest. Once trained, the models were deployed in FFLUX simulations, which suggested that transfer learning models perform as well as direct learning ones despite being trained much faster (by up to an order of magnitude). The datasets can be used to train any machine learning model intended to reproduce atomic energies and multipole moments. They also contain the total electronic energy of each conformation, computed at the B3LYP/aug-cc-pVTZ [BZ and ETL] and B3LYP/6-31+G(d,p) [FAD and FPL] levels of theory. These energies can be used directly to create surrogate potential energy surface (PES) models.
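As a rough illustration of what an anisotropic GPR model looks like in practice, the sketch below fits a Gaussian process with one length scale per input feature using plain NumPy. It is not the FFLUX training code and makes several assumptions: the arrays are synthetic stand-ins for geometric features and an atomic property, the length scales are fixed by hand rather than optimised or transferred, and the names `anisotropic_rbf`, `fit_gpr`, and `predict_gpr` are hypothetical. The file layout of the uploaded datasets is not described here, so no loading step is shown.

```python
import numpy as np


def anisotropic_rbf(xa, xb, length_scales):
    """RBF kernel with a separate length scale for each input feature."""
    a = xa / length_scales
    b = xb / length_scales
    # Squared Euclidean distances between all scaled pairs of points.
    sq = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0))


def fit_gpr(x_train, y_train, length_scales, noise=1e-6):
    """Solve (K + noise * I) alpha = y - mean via a Cholesky factorisation."""
    mean = y_train.mean()
    k = anisotropic_rbf(x_train, x_train, length_scales)
    k[np.diag_indices_from(k)] += noise  # jitter for numerical stability
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - mean))
    return {"x": x_train, "alpha": alpha, "mean": mean, "ls": length_scales}


def predict_gpr(model, x_new):
    """Posterior mean of the GP at new inputs."""
    k_star = anisotropic_rbf(x_new, model["x"], model["ls"])
    return model["mean"] + k_star @ model["alpha"]


# ASSUMPTION: synthetic data standing in for features and an atomic property.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(3.0 * x[:, 0]) + 0.1 * x[:, 1]  # third feature is irrelevant

# ASSUMPTION: hand-picked hyperparameters; a long length scale marks a
# feature the target barely depends on.
length_scales = np.array([0.5, 2.0, 10.0])

model = fit_gpr(x[:150], y[:150], length_scales)
pred = predict_gpr(model, x[150:])
rmse = float(np.sqrt(np.mean((pred - y[150:]) ** 2)))
print(f"held-out RMSE: {rmse:.4f}")
```

In a hyperparameter-transfer setting, the `length_scales` array would be reused from a previously trained model instead of being optimised again, which is where the training-time saving would come from.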
Date made available: 25 Apr 2024
Publisher: Mendeley Data
