Computational Chemistry, Short talk
CC-025

Automated force-field refinement for compound families

M. Pereira Oliveira¹, P. H. Hünenberger¹*
¹Laboratory of Physical Chemistry, ETH Zürich, Switzerland

Classical (force-field based) atomistic simulations of condensed-phase systems now play a key role in all areas of the natural sciences. Owing to computational and methodological advances over the past few decades, and to the resulting increase in the usefulness of simulation in, e.g., materials and drug design, the last few years have seen a massive exploratory effort towards the automation of these calculations and, in particular, of force-field development procedures. In this context, our goal is the design, implementation and application of an integrated scheme for the automated refinement of force-field parameters against experimental condensed-phase (predominantly thermodynamic) data, considering entire classes of organic molecules constructed from a fragment library via combinatorial isomer enumeration. The main features and objectives of the proposed approach are as follows: (i) keep the force-field design focused on the central building blocks of physical organic chemistry, the chemical functional groups; (ii) treat the force-field parameters as empirical quantities, to be optimized primarily against experimental data rather than against the results of quantum-mechanical calculations; (iii) enable complete automation of the topology-construction and parameter-optimization procedures; (iv) construct force fields with a broad (though not exhaustive) coverage of chemical space, optimized against an extensive experimental dataset; (v) enable the comparison of different choices of force-field functional form at an optimal parametrization level, so as to guide the refinement of this functional form towards the most relevant improvements; (vi) provide chemical insight into the specific properties of the classes of organic compounds considered.
As a first application, this workflow is illustrated here for two molecule families: saturated haloalkanes and non-hydrogen-bonding oxygen-containing compounds (ethers, esters, aldehydes and ketones) with up to 6 carbon atoms. Considering 300 and 123 molecules, respectively, in the two families, the force-field parameters (based on the GROMOS force-field functional form) are systematically optimized against a total of 607 and 233 experimental values, respectively, for the liquid density and the vaporization enthalpy. After 10 to 25 refinement iterations for the non-bonded interaction parameters (about 2 weeks of calculation time using 400 CPUs), the final RMSDs against experiment are 55.65 kg/m³ and 3.48 kJ/mol for the haloalkanes, and 24.87 kg/m³ and 4.62 kJ/mol for the oxygenated compounds. Further sets of 320 and 444 molecules (600 and 726 experimental data points) are used for validation, with only slightly larger RMSDs relative to experiment.
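
The RMSDs quoted above are reported separately for each property (liquid density and vaporization enthalpy). As a minimal sketch, assuming the usual root-mean-square definition over paired simulated and experimental values (the abstract does not spell out the formula), such a deviation could be computed as follows. The function name `rmsd` and all numerical values are hypothetical placeholders, not data from this work.

```python
import math


def rmsd(simulated, experimental):
    """Root-mean-square deviation between paired simulated and experimental values."""
    if len(simulated) != len(experimental) or not simulated:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(s - e) ** 2 for s, e in zip(simulated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical placeholder liquid densities in kg/m^3 (NOT data from this work).
rho_sim = [1000.0, 1210.0, 1490.0]
rho_exp = [1010.0, 1200.0, 1500.0]

print(rmsd(rho_sim, rho_exp))  # 10.0
```

In a refinement loop of the kind described, one such value per property would be tracked over the iterations; how the two properties are weighted in the actual objective function is not stated here.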