Si-Yu Hu, Jie Zhu, Chen Wang, Lin-Wang Wang, Hong-Liang Liu, Wei-Le Jia, Guang-Ming Tan. A Two-Stage Equivariant Training Framework Enables Quantum-Accurate Force Fields with Reduced DFT Training Data[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-026-6106-z

A Two-Stage Equivariant Training Framework Enables Quantum-Accurate Force Fields with Reduced DFT Training Data

  • Neural network potentials (NNPs) have emerged as a powerful tool for materials simulations, offering speedups of several orders of magnitude over ab initio molecular dynamics (MD) while retaining ab initio accuracy. This capability promises to transform the paradigm of computational materials research. However, the development of NNPs is often constrained by the high cost of generating density functional theory (DFT) reference data, underscoring a critical need for accurate, data-efficient NNPs that can be trained from only limited DFT inputs. NNPs fall into two categories: dedicated and pretrained models. Dedicated NNPs deliver high accuracy for specific systems but are data-inefficient, whereas pretrained NNPs provide broad transferability at the expense of task-specific precision. Here, we introduce a two-stage training framework that combines the strengths of both model types. In the first stage (finetuning), we apply a degree-parity-independent low-rank adaptation (LoRA) strategy to enhance the accuracy of pretrained NNPs. In the second stage (data augmentation), the finetuned model generates high-quality synthetic data, expanding chemical-space coverage and enabling more effective training of dedicated NNPs. Experiments demonstrate that our framework is both parameter- and data-efficient. The finetuning strategy yields an average accuracy improvement of 22.0% over the baseline method, outperforming full-parameter finetuning on 26 of 27 datasets. Meanwhile, data augmentation enables dedicated NNPs to achieve 7.1%–61.0% higher accuracy (36.3% on average) with only 50 DFT reference points, while reducing the number of training iterations by at least a factor of five.
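The core idea of the first stage is LoRA: the pretrained weights are frozen and only a pair of small low-rank factor matrices is trained, so the trainable parameter count stays tiny. The sketch below is a minimal, generic PyTorch illustration of this idea, not the authors' implementation; in particular, it applies a single shared rank to a plain linear layer and ignores the irrep (degree/parity) structure of an equivariant NNP, and all names here (LoRALinear, rank, alpha) are hypothetical.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Wraps a frozen pretrained linear layer with a trainable low-rank
    # update: y = W x + (alpha / r) * B A x, where W is frozen and the
    # factors A (r x in) and B (out x r) are the only trained parameters.
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: update starts at 0
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_A.T) @ self.lora_B.T

# Hypothetical usage: wrap one layer of a pretrained model, then optimize
# only the parameters with requires_grad=True (the LoRA factors).
layer = LoRALinear(nn.Linear(64, 64), rank=4)
y = layer(torch.randn(10, 64))

For the second stage, one plausible reading (the abstract does not detail the sampling procedure) is that the finetuned model acts as a cheap surrogate labeler: sampled configurations are annotated with its predicted energies and forces to form the synthetic training set for the dedicated NNP.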
