Reducing Compute Costs of Generating Training Data for Excitation Energy Prediction Using Multifidelity Methods

Vinod, V., Maity, S., Zaspel, P., & Kleinekathöfer, U. (2023). Multifidelity machine learning for molecular excitation energies. Journal of Chemical Theory and Computation, 19(21), 7658-7670. https://doi.org/10.1021/acs.jctc.3c00882

A major challenge to accurate prediction of quantum chemical (QC) properties with machine learning (ML) methods is the scarcity of high-accuracy data, since generating high-accuracy training data is computationally expensive. The multifidelity machine learning (MFML) method combines cheaper, less accurate data with a small amount of high-accuracy data to yield a model with improved accuracy in predicting high-fidelity data. In this work, the MFML method is benchmarked for vertical excitation energies, a QC property vital to understanding elementary life processes such as photosynthesis. Numerical results indicate a time benefit of more than a factor of 30. This is a strong step towards ML methods for QC that reduce the compute cost of generating a training set. This work is authored by Vivin Vinod, Sayan Maity, Peter Zaspel, and Ulrich Kleinekathöfer and has been published in the Journal of Chemical Theory and Computation.
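
The idea of pairing plentiful cheap data with scarce accurate data can be illustrated with a small toy sketch. The example below is an assumption-laden simplification, not the scheme from the paper: it uses only two fidelities, hypothetical one-dimensional functions `f_lo` and `f_hi` in place of QC calculations, a hand-written kernel ridge regression, and arbitrarily chosen length scales. A model trained on many low-fidelity samples is corrected by a second model trained on the difference between the two fidelities at a few shared points, and the result is compared with a model trained on the few high-fidelity samples alone.

```python
import numpy as np

# Hypothetical stand-ins for a cheap and an expensive calculation (not QC data).
def f_lo(x):
    return np.sin(6.0 * np.pi * x)

def f_hi(x):
    return f_lo(x) + 0.3 * x + 0.2  # high fidelity = low fidelity + smooth correction

def gaussian_kernel(a, b, length_scale):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * length_scale**2))

def krr_fit(x, y, length_scale, reg=1e-6):
    # Solve (K + reg * I) alpha = y for the regression coefficients.
    K = gaussian_kernel(x, x, length_scale)
    return np.linalg.solve(K + reg * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, length_scale):
    return gaussian_kernel(x_new, x_train, length_scale) @ alpha

# Many cheap samples, few expensive ones; the expensive points are a subset.
x_lo = np.linspace(0.0, 1.0, 49)
x_hi = x_lo[::12]  # 5 points: 0, 0.25, 0.5, 0.75, 1
x_test = np.linspace(0.0, 1.0, 200)

# Baseline: model trained on the few high-fidelity samples only.
ls = 0.1  # assumed length scale for the oscillatory targets
alpha_single = krr_fit(x_hi, f_hi(x_hi), ls)
pred_single = krr_predict(x_hi, alpha_single, x_test, ls)

# Two-fidelity model: low-fidelity fit plus a fit of the fidelity difference.
alpha_lo = krr_fit(x_lo, f_lo(x_lo), ls)
ls_delta = 0.5  # assumed length scale for the smooth correction
alpha_delta = krr_fit(x_hi, f_hi(x_hi) - f_lo(x_hi), ls_delta)
pred_two = (krr_predict(x_lo, alpha_lo, x_test, ls)
            + krr_predict(x_hi, alpha_delta, x_test, ls_delta))

mae_single = np.mean(np.abs(pred_single - f_hi(x_test)))
mae_two = np.mean(np.abs(pred_two - f_hi(x_test)))
print(f"MAE, high-fidelity data only: {mae_single:.4f}")
print(f"MAE, two-fidelity model:      {mae_two:.4f}")
```

In this toy setting the correction between fidelities is smooth, so five expensive samples suffice to learn it, while the same five samples cannot resolve the oscillating target on their own. Whether a comparable gain appears for real excitation energies depends on how the fidelities relate, which is what the benchmark in the paper examines.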