TY - JOUR
T1 - Kriging atomic properties with a variable number of inputs
AU - Davie, Stuart
AU - Di Pasquale, Nicodemo
AU - Popelier, Paul
PY - 2016
Y1 - 2016
N2 - A new force field called FFLUX uses the machine learning technique kriging to capture the link between the properties (energies and multipole moments) of topological atoms (i.e., output) and the coordinates of the surrounding atoms (i.e., input). Here we present a novel, general method of applying kriging to chemical systems that do not possess a fixed number of (geometrical) inputs. Unlike traditional kriging methods, which require an input system to be of fixed dimensionality, the method presented here can be readily applied to molecular simulation, where an interaction cutoff radius is commonly used and the number of atoms or molecules within the cutoff radius is not constant. The method described here is general and can be applied to any machine learning technique that normally operates under a fixed number of inputs. In particular, the method described here is also useful for interpolating methods other than kriging, which may suffer from difficulties stemming from identical sets of inputs corresponding to different outputs or input biasing. As a demonstration, the new method is used to predict 54 energetic and electrostatic properties of the central water molecule of a set of 5000, 4 Å radius water clusters, with a variable number of water molecules. The results are validated against equivalent models from a set of clusters composed of a fixed number of water molecules (set to ten, i.e., decamers) and against models created by using a naïve method of treating the variable number of inputs problem presented. Results show that the 4 Å water cluster models, utilising the method presented here, return similar or better kriging models than the decamer clusters for all properties considered and perform much better than the truncated models.
U2 - 10.1063/1.4962197
DO - 10.1063/1.4962197
M3 - Article
SN - 0021-9606
JO - The Journal of Chemical Physics
JF - The Journal of Chemical Physics
ER -