Machine-learning interatomic potentials for materials science
TL;DR Summary
This review surveys interatomic potentials for materials science. It contrasts traditional potentials with newer ML-based potentials, which are trained on quantum-mechanical data and achieve near-DFT accuracy at far lower computational cost. Hybrid potentials combining ML and physics for improved transferability are presented as a promising path forward.
Abstract
Acta Materialia 214 (2021) 116980. Keywords: Atomistic simulation; Interatomic potential; Machine-learning.
Large-scale atomistic computer simulations of materials rely on interatomic potentials providing computationally efficient predictions of energy and Newtonian forces. Traditional potentials have served in this capacity for over three decades. Recently, a new class of potentials has emerged, which is based on a radically different philosophy. The new potentials are constructed using machine-learning (ML) methods.
In-depth Reading
1. Bibliographic Information
- Title: Machine-learning interatomic potentials for materials science
- Authors: Y. Mishin. Affiliated with the Department of Physics and Astronomy, George Mason University, Fairfax, VA, USA.
- Journal/Conference: Published in Acta Materialia. This is a highly reputable, peer-reviewed journal in the field of materials science, known for publishing significant and high-impact research.
- Publication Year: 2021 (Accepted 6 May 2021, Available online 19 May 2021).
- Abstract: The paper reviews the field of interatomic potentials used in large-scale atomistic simulations. It contrasts traditional, physics-based potentials with the newer class of machine-learning (ML) potentials, which are constructed by interpolating massive quantum-mechanical databases. The author discusses the strengths and weaknesses of both approaches and introduces a third, hybrid class: physically-informed ML potentials. This new class aims to combine the accuracy of ML with the transferability of physics-based models. The review focuses on applications in materials science and outlines future directions for the field.
- Original Source Link: /files/papers/68ef4a95e77486f6f3192eb6/paper.pdf (Formally published paper).
2. Executive Summary
- Background & Motivation (Why):
- Core Problem: Atomistic simulations, which are crucial for understanding material properties at the microscopic level, require a way to calculate the energy and forces between atoms. The most accurate method, quantum mechanics (specifically Density Functional Theory, or DFT), is computationally too expensive for large systems (more than a few hundred atoms) or long simulation times (more than picoseconds). For decades, this gap has been filled by traditional interatomic potentials, which are fast but have limited accuracy and flexibility.
- Gaps & Challenges: Traditional potentials are based on simplified physical models and struggle to describe complex materials or systems with mixed chemical bonding. Their development is often considered an "art" and is not systematically improvable. Recently, machine-learning (ML) potentials have emerged as a powerful alternative, offering near-DFT accuracy. However, they are often "black-box" models that perform poorly when simulating atomic environments not seen during training (poor transferability/extrapolation).
- Fresh Angle: This paper provides a comprehensive review comparing these two philosophies and, most importantly, introduces and advocates for a third class of potentials: physically-informed ML potentials. This hybrid approach seeks to merge the high accuracy of ML with the robust physical foundation and transferability of traditional models, representing a "best of both worlds" solution.
- Main Contributions / Findings (What):
- Systematic Classification: The paper categorizes interatomic potentials into three distinct classes: Traditional, Mathematical ML, and Physically-Informed ML, providing a clear framework for understanding the field.
- Comparative Analysis: It offers a detailed comparison of these classes across multiple criteria, including physical foundation, accuracy, transferability, computational speed, and reliance on human expertise. This is concisely summarized in Table 1.
- Introduction to Physically-Informed ML: The paper champions a hybrid approach where an ML model learns to predict the parameters of a physics-based potential based on the local atomic environment. This novel concept aims to overcome the critical limitation of poor extrapolation in standard ML potentials.
- Future Outlook: The author provides a historical perspective on the field's evolution and offers a vision for its future, predicting a shift from methodological development towards using these advanced potentials for new materials discovery and scientific insight.
3. Prerequisite Knowledge & Related Work
- Foundational Concepts:
- Interatomic Potential (or Force Field): A mathematical function that describes the potential energy of a system of atoms based on their positions. From this energy, the force acting on each atom can be calculated as $\mathbf{F}_i = -\partial E/\partial \mathbf{r}_i$, which is essential for simulating how atoms move (see the sketch after this list).
- Atomistic Simulations: Computer methods like Molecular Dynamics (MD) and Monte Carlo (MC) that use interatomic potentials to simulate the behavior of materials at the atomic scale. MD simulates the trajectory of atoms over time by solving Newton's equations of motion, while MC uses statistical methods to sample different atomic configurations.
- Density Functional Theory (DFT): A first-principles quantum mechanical method used to calculate the electronic structure of materials. It is considered the "gold standard" for accuracy in calculating energies and forces but is computationally very intensive, with its cost typically scaling as $O(N^3)$ with the number of atoms $N$.
- Potential Energy Surface (PES): A high-dimensional surface that maps every possible arrangement of atoms in a system to its corresponding potential energy. The goal of an interatomic potential is to provide a fast and accurate approximation of this surface.
- Machine Learning (ML): In this context, it refers to supervised learning, specifically regression. A model is trained on a large dataset of inputs (atomic configurations) and corresponding outputs (DFT-calculated energies/forces) to learn the underlying relationship, allowing it to predict the energy for new, unseen configurations.
- Accuracy vs. Transferability: Accuracy refers to how well a potential reproduces the data it was trained on (interpolation). Transferability (or extrapolation) refers to its ability to make physically reasonable predictions for atomic configurations that are significantly different from the training data.
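To make the energy-to-forces relationship concrete, here is a minimal sketch (not from the paper) in Python/NumPy using a Lennard-Jones pair potential; the functional form and parameter values are illustrative stand-ins for a real interatomic potential.

```python
import numpy as np

def lj_energy_forces(positions, epsilon=1.0, sigma=1.0):
    """Energy and forces for a Lennard-Jones pair potential.

    E = sum_{i<j} 4*eps*[(sigma/r)^12 - (sigma/r)^6],
    F_i = -dE/dr_i (analytic derivative).
    """
    n = len(positions)
    energy = 0.0
    forces = np.zeros_like(positions)
    for i in range(n):
        for j in range(i + 1, n):
            rij = positions[i] - positions[j]
            r = np.linalg.norm(rij)
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 ** 2 - sr6)
            # dE/dr, then project onto the bond direction
            dEdr = 4.0 * epsilon * (-12.0 * sr6 ** 2 + 6.0 * sr6) / r
            fpair = -dEdr * rij / r   # force on atom i from atom j
            forces[i] += fpair
            forces[j] -= fpair
    return energy, forces

# Usage: a dimer at the LJ minimum distance r = 2^(1/6) * sigma
pos = np.array([[0.0, 0.0, 0.0], [2 ** (1 / 6), 0.0, 0.0]])
E, F = lj_energy_forces(pos)
print(E, F)  # E ≈ -1.0 and forces ≈ 0 at the minimum
```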
- Previous Works & Technological Evolution: The paper traces the evolution of interatomic potentials from the 1980s to the present.
- Traditional Potentials (1980s-present): These were the first generation of many-body potentials. They are based on physical intuition about chemical bonding.
- For metals, models like the Embedded-Atom Method (EAM) were developed, which describe energy as a sum of pairwise interactions and an "embedding" energy term that depends on the local electron density.
- For covalent materials like silicon, models like the Tersoff and Stillinger-Weber potentials were introduced to account for the directional nature of covalent bonds.
- These models have a fixed functional form with a small number of parameters (~10-20) fitted to experimental data (e.g., lattice constant, elastic constants) and a few DFT calculations.
- Machine-Learning Potentials (2010s-present): This new wave represents a paradigm shift from physics-based modeling to data-driven regression.
- Pioneered by Behler and Parrinello in 2007, these models do not assume any physical form for the interaction. Instead, they learn the complex relationship between atomic structure and energy directly from a massive DFT database.
- Key examples include Neural Network Potentials (NNPs), Gaussian Approximation Potentials (GAPs), Moment Tensor Potentials (MTPs), and Spectral Neighbor Analysis Potentials (SNAPs).
- They rely on a two-step process: first describing the local atomic environment with mathematical "fingerprints" (descriptors), and then using a flexible regression model to map these fingerprints to an energy value.
- Differentiation: The paper's core contribution is its clear differentiation between the three classes of potentials:
- Traditional: Physics-based, computationally very fast, reasonably transferable, but with limited accuracy and not systematically improvable.
- Mathematical ML: Data-driven, highly accurate (near-DFT), systematically improvable by adding more data, but computationally slower and suffering from poor transferability (unreliable extrapolation).
- Physically-Informed ML: A hybrid model that uses an ML framework to inform a physics-based potential. It aims to achieve the DFT-level accuracy of ML models while retaining the physical grounding and better transferability of traditional models.
4. Methodology (Core Technology & Implementation)
The paper categorizes and explains the methodology behind the three classes of potentials.
4.1. The Traditional Interatomic Potentials
- Principle: The potential energy is expressed as a sum of atomic energies, $E = \sum_i E_i$, where each $E_i$ is determined by a physically-motivated analytical function that depends on the positions of neighboring atoms and a small set of fixed global parameters.
- Steps & Procedures:
- A functional form is chosen based on the physics of the material (e.g., EAM for metals).
- A small number of parameters are optimized by fitting to a small database of experimental properties (e.g., cohesive energy, elastic constants) and some DFT data.
- Once fitted, the parameters are fixed and used for all subsequent simulations.
- Flowchart Visualization (Figure 1):
Figure 1 is a flowchart of the total-energy calculation with a traditional interatomic potential: the energy $E_i$ of atom $i$ is computed from the coordinates of the atoms within its cutoff radius (green region) together with a fixed, pre-determined set of potential parameters. The atomic energies of all atoms are then summed ($\Sigma$) to obtain the total energy of the system. A minimal EAM-style sketch of this pipeline follows below.
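As an illustration of the fixed-parameter pipeline in Figure 1, here is a minimal EAM-style sketch; the pair, density, and embedding functions below are toy choices, not a fitted potential from the paper.

```python
import numpy as np

def eam_energy(positions, cutoff=2.5):
    """Total energy in an EAM-like form:
    E = sum_i [ F(rho_i) + 1/2 * sum_j phi(r_ij) ].

    phi, f, and F below are toy functions; real EAM potentials
    fit them to experimental and DFT data.
    """
    phi = lambda r: np.exp(-2.0 * (r - 1.0)) - 2.0 * np.exp(-(r - 1.0))  # Morse-like pair term
    f = lambda r: np.exp(-r)         # neighbor contribution to the host electron density
    F = lambda rho: -np.sqrt(rho)    # embedding energy (square-root form, as in FS/EAM models)

    n = len(positions)
    total = 0.0
    for i in range(n):
        rho_i, pair_i = 0.0, 0.0
        for j in range(n):
            if j == i:
                continue
            r = np.linalg.norm(positions[i] - positions[j])
            if r < cutoff:
                rho_i += f(r)
                pair_i += 0.5 * phi(r)
        total += F(rho_i) + pair_i   # atomic energy E_i
    return total

print(eam_energy(np.array([[0, 0, 0], [1.1, 0, 0], [0, 1.1, 0]], dtype=float)))
```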
4.2. Machine-Learning Potentials
- Principle: ML potentials replace the explicit physics-based function with a high-dimensional mathematical regression that interpolates a large database of DFT calculations. The process is purely data-driven.
- Steps & Procedures:
- Descriptor Generation: The local atomic environment around an atom $i$, defined by the neighbor positions $\mathbf{r}_{ij}$, is first converted into a fixed-length feature vector of local structural parameters (or "fingerprints") $(G_1, \ldots, G_m)$. These descriptors are designed to be invariant to translation, rotation, and permutation of identical atoms (see the code sketch after this list). Examples include:
  - Gaussian descriptors (Behler-Parrinello symmetry functions)
  - Smooth Overlap of Atomic Positions (SOAP)
  - Spectral Neighbor Analysis Potential (SNAP) descriptors
  - Moment Tensor Potential (MTP) descriptors
  - Atomic Cluster Expansion (ACE)
- Regression: A flexible regression model is trained to map the descriptor vector to the atomic energy $E_i$. Common regression models include:
  - Artificial Neural Networks (NNs): Highly flexible non-linear models.
  - Gaussian Process Regression: Used in Gaussian Approximation Potentials (GAPs).
- Energy Calculation: The total energy of the system is the sum of the predicted atomic energies. The overall mapping is
  $$\{\mathbf{r}_{ij}\} \rightarrow (G_1, \ldots, G_m) \rightarrow E_i, \qquad E = \sum_i E_i.$$
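To illustrate the descriptor step, here is a hedged sketch of Behler-Parrinello-style radial ("G2") symmetry functions; the η grid, cutoff radius, and random cluster are illustrative choices, not values from the paper.

```python
import numpy as np

def cutoff_fn(r, rc):
    """Behler-Parrinello cosine cutoff: smoothly vanishes at r = rc."""
    return np.where(r < rc, 0.5 * (np.cos(np.pi * r / rc) + 1.0), 0.0)

def radial_symmetry_functions(positions, i, etas, r_s=0.0, rc=6.0):
    """Radial 'G2' symmetry functions for atom i:
    G2(eta) = sum_j exp(-eta * (r_ij - r_s)^2) * fc(r_ij).
    Invariant to translation, rotation, and permutation of neighbors.
    """
    rij = np.linalg.norm(positions - positions[i], axis=1)
    rij = rij[rij > 1e-12]            # drop the self-distance
    fc = cutoff_fn(rij, rc)
    return np.array([np.sum(np.exp(-eta * (rij - r_s) ** 2) * fc) for eta in etas])

# Usage: a 4-component descriptor vector for atom 0 of a small cluster
pos = np.random.default_rng(0).normal(scale=2.0, size=(10, 3))
G = radial_symmetry_functions(pos, i=0, etas=[0.1, 0.5, 1.0, 2.0])
print(G)   # fixed-length fingerprint of atom 0's environment
```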
- Flowchart Visualization (Figure 3):
Figure 3 is a flowchart of the total-energy calculation with an ML interatomic potential. The atomic positions around atom $i$ are first transformed into local structural parameters (descriptors). These parameters are then fed into a regression model, which outputs the atomic energy $E_i$. Summing $E_i$ with the energies of all other atoms ($\Sigma$) produces the total energy, i.e., one point on the Potential Energy Surface (PES).
- Neural Network Example (Figure 4):
Figure 4 is a schematic of the feed-forward neural network (NN) with two hidden layers typically used in ML potentials. The input layer receives the descriptor vector $\mathbf{x}$. The information propagates through the hidden layers, where it is transformed by weight matrices $W^{(1)}, W^{(2)}, W^{(3)}$ and bias vectors $\mathbf{b}^{(1)}, \mathbf{b}^{(2)}, \mathbf{b}^{(3)}$. The output layer produces the final prediction, which can be the energy (in the diagram, a generic output vector $\mathbf{y}$). The numerous weights and biases are the fitting parameters optimized during training.
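A minimal sketch of such a two-hidden-layer network mapping a descriptor vector to an atomic energy (untrained, with made-up layer sizes) might look like:

```python
import numpy as np

def nn_atomic_energy(G, params):
    """Feed-forward NN with two hidden layers (as in Fig. 4):
    descriptor vector G -> hidden -> hidden -> scalar atomic energy.
    params = (W1, b1, W2, b2, W3, b3); the weights and biases
    are the fitting parameters optimized during training.
    """
    W1, b1, W2, b2, W3, b3 = params
    h1 = np.tanh(W1 @ G + b1)      # first hidden layer
    h2 = np.tanh(W2 @ h1 + b2)     # second hidden layer
    return (W3 @ h2 + b3).item()   # linear output: E_i

# Usage with random (untrained) parameters; layer sizes are illustrative
rng = np.random.default_rng(1)
n_in, n_h = 4, 16
params = (rng.normal(size=(n_h, n_in)), np.zeros(n_h),
          rng.normal(size=(n_h, n_h)), np.zeros(n_h),
          rng.normal(size=(1, n_h)), np.zeros(1))
print(nn_atomic_energy(np.ones(n_in), params))
```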
- Training and Loss Function: The model parameters are trained by minimizing a loss function, which measures the difference between the model's predictions and the reference DFT data. A typical loss function has the form
  $$\mathcal{L} = \frac{1}{N_s}\sum_{s=1}^{N_s}\left[\left(\frac{E_s - E_s^{\mathrm{DFT}}}{n_s}\right)^2 + \frac{\mu}{3 n_s}\sum_{i=1}^{n_s}\left|\mathbf{F}_{is} - \mathbf{F}_{is}^{\mathrm{DFT}}\right|^2 + \nu\left\|\boldsymbol{\sigma}_s - \boldsymbol{\sigma}_s^{\mathrm{DFT}}\right\|^2\right] + \frac{\tau}{N_p}\sum_{k=1}^{N_p} w_k^2.$$
  A code sketch of this loss follows the symbol list below.
- Symbol Explanation:
  - $\mathcal{L}$: The loss function to be minimized.
  - $N_s$: Total number of atomic configurations (supercells) in the database.
  - $s$: Index for a specific supercell.
  - $E_s$, $\mathbf{F}_{is}$, $\boldsymbol{\sigma}_s$: The energy, forces, and stress tensor predicted by the potential for supercell $s$.
  - $E_s^{\mathrm{DFT}}$, $\mathbf{F}_{is}^{\mathrm{DFT}}$, $\boldsymbol{\sigma}_s^{\mathrm{DFT}}$: The reference values from DFT calculations.
  - $n_s$: Number of atoms in supercell $s$.
  - $\mu$, $\nu$, $\tau$: Hyperparameters that weight the contribution of forces, stresses, and regularization to the total loss.
  - $N_p$: Total number of fitting parameters in the model.
  - $w_k$: The $k$-th fitting parameter. The final term is a regularization term that prevents overfitting by keeping parameter values small.
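A direct transcription of the loss above into Python/NumPy (assuming a simple dict-based data layout, which is an implementation choice, not the paper's):

```python
import numpy as np

def loss(pred, ref, mu=0.1, nu=0.01, tau=1e-6, weights=None):
    """Weighted loss over supercells (energy per atom, forces, stresses)
    plus an L2 regularization term, mirroring the equation above.
    pred/ref: lists of dicts with keys 'E', 'F' (n_s x 3), 'sigma' (3x3), 'n'.
    """
    L = 0.0
    for p, r in zip(pred, ref):
        n_s = r["n"]
        L += ((p["E"] - r["E"]) / n_s) ** 2                      # energy per atom
        L += mu * np.sum((p["F"] - r["F"]) ** 2) / (3 * n_s)     # forces
        L += nu * np.sum((p["sigma"] - r["sigma"]) ** 2)         # stresses
    L /= len(ref)
    if weights is not None:              # regularization keeps parameters small
        L += tau * np.sum(weights ** 2) / len(weights)
    return L

# Usage with one illustrative supercell of 2 atoms
ref = [{"E": -3.40, "F": np.zeros((2, 3)), "sigma": np.zeros((3, 3)), "n": 2}]
pred = [{"E": -3.38, "F": 0.05 * np.ones((2, 3)), "sigma": np.zeros((3, 3)), "n": 2}]
print(loss(pred, ref, weights=np.ones(100)))
```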
4.3. Physically-Informed Machine-Learning Potentials
- Principle: This hybrid approach uses an ML model not to predict the energy directly, but to predict the parameters of a physics-based potential that are specific to an atom's local environment. This injects physical constraints into the model, improving transferability.
- Steps & Procedures:
- Descriptor Generation: As in standard ML potentials, the local environment is encoded into a descriptor vector $(G_1, \ldots, G_m)$.
- Regression to Parameters: An ML regression model is trained to map the descriptor vector to a set of local potential parameters $\mathbf{p}_i$.
- Physics-Based Energy Calculation: The atomic energy is then calculated using a physics-based potential function, but with the locally predicted parameters $\mathbf{p}_i$. The overall mapping is
  $$\{\mathbf{r}_{ij}\} \rightarrow (G_1, \ldots, G_m) \rightarrow \mathbf{p}_i \rightarrow E_i.$$
- Flowchart Visualization (Figure 5):
Figure 5 is a flowchart of the total-energy calculation with a physically-informed ML potential. The local environment of atom $i$ is encoded into local structural parameters, which a regression model maps to the parameters of a physics-based potential. These dynamic, locally-aware parameters are then combined with the atomic coordinates in the potential function to compute the energy $E_i$; summing over all atoms ($\Sigma$) gives the total energy $E$.
- Example: PINN (Physically-Informed Neural Network): The paper highlights the PINN method as an implementation of this idea.
- It uses a Neural Network as the regression model.
- It uses a general-purpose Bond-Order Potential (BOP) as the physics-based function. The BOP is flexible enough to describe both metallic and covalent bonding.
- The NN predicts local adjustments to a globally pre-fitted set of parameters, making the model's predictions primarily guided by physics while using ML for fine-tuning. A toy sketch of this parameter-prediction pipeline follows below.
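The following toy sketch illustrates the physically-informed idea of descriptors → NN → local parameters → physics-based energy. For simplicity it uses a Morse-like bond energy in place of the actual bond-order potential, and all descriptor and network choices are illustrative, not PINN's.

```python
import numpy as np

def physically_informed_energy(positions, i, nn_params, rc=5.0):
    """Physically-informed ML sketch: a small NN maps atom i's descriptors
    to the parameters (D, a, r0) of a Morse-like bond energy, which then
    computes E_i. The Morse form stands in for the BOP used by PINN.
    """
    # Step 1: descriptors of atom i's environment (two toy radial moments)
    rij = np.linalg.norm(positions - positions[i], axis=1)
    rij = rij[(rij > 1e-12) & (rij < rc)]
    G = np.array([np.mean(rij), np.std(rij)])

    # Step 2: NN regression from descriptors to local potential parameters
    W1, b1, W2, b2 = nn_params
    h = np.tanh(W1 @ G + b1)
    D, a, r0 = np.exp(W2 @ h + b2)   # exp keeps the parameters positive

    # Step 3: physics-based energy with the locally predicted parameters;
    # extrapolation is now constrained to physically shaped (Morse-like) curves
    return np.sum(D * ((1.0 - np.exp(-a * (rij - r0))) ** 2 - 1.0))

# Usage with random (untrained) network parameters and a random cluster
rng = np.random.default_rng(2)
nn_params = (rng.normal(size=(8, 2)), np.zeros(8),
             rng.normal(scale=0.1, size=(3, 8)), np.zeros(3))
pos = rng.normal(scale=1.5, size=(12, 3))
print(physically_informed_energy(pos, i=0, nn_params=nn_params))
```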
5. Experimental Setup
As a review paper, it does not present a single experimental setup. Instead, it compares the typical setups and evaluation criteria for the different classes of potentials. The central comparison is summarized in Table 1.
- Datasets:
- Traditional: Small databases consisting of experimental properties (lattice constants, elastic constants, defect energies) and a limited number of DFT calculations.
- ML & Physically-Informed ML: Large databases containing thousands of atomic configurations with corresponding energies, forces, and stresses calculated via high-throughput DFT. The diversity of structures in the database is critical for the potential's quality.
- Evaluation Metrics: The paper evaluates potentials qualitatively based on several criteria. The key performance aspects are Accuracy (interpolation) and Transferability (extrapolation). Accuracy is often quantitatively measured by the two metrics below (see the code sketch after this list):
- Root-Mean-Square Error (RMSE): Measures the average magnitude of the errors in energy or force predictions.
- Mean Absolute Error (MAE): The average of the absolute errors.
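Both metrics are one-liners; a minimal sketch with illustrative numbers:

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-square error, e.g. in eV/atom for energies."""
    return np.sqrt(np.mean((np.asarray(pred) - np.asarray(ref)) ** 2))

def mae(pred, ref):
    """Mean absolute error over the same predictions."""
    return np.mean(np.abs(np.asarray(pred) - np.asarray(ref)))

# Usage: per-atom energy errors of a fitted potential vs. DFT (values illustrative)
e_pot = [0.012, -0.003, 0.008]   # eV/atom
e_dft = [0.010, -0.001, 0.009]
print(rmse(e_pot, e_dft), mae(e_pot, e_dft))
```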
- Baselines: The three classes of potentials serve as baselines for each other. The paper systematically compares their performance characteristics.
6. Results & Analysis
The core result of the paper is the comparative analysis, best illustrated by Table 1 and Figure 2.
- Comparison of Potential Classes (Table 1):
(This table has been transcribed from the paper's text.)
Table 1: Comparison of three classes of interatomic potentials.
| Potential type | Traditional | ML | Physically-informed ML |
| --- | --- | --- | --- |
| Physical foundation | Strong | None | Strong |
| Number of fitting parameters | ~10 | ≥10³ | ≥10³ |
| Computational speed | Very high | Slowerª | Slowerª |
| Reference database | Small | Large | Large |
| Accuracy (interpolation) | Limited | ~1 meV/atom | ~1 meV/atom |
| Transferability (extrapolation) | Reasonable | Poor | Reasonable |
| Reliance on human expertise | Strong | Weakerᵇ | Weakerᵇ |
| Extension to chemistries | Challenge | Challenge | Challenge |
| Specific to class of materials? | Yes | No | No |
| Systematically improvable? | No | Yes | Yes |
| Can be made artificial? | Yes | Maybeᶜ | Maybeᶜ |

- ª but orders of magnitude faster than straight DFT calculations.
- ᵇ Some steps of database selection and training can be partially automatized.
- ᶜ Not impossible in principle but we are not aware of attempts.
Analysis of Table 1:
- Physical Foundation: Traditional and Physically-informed ML potentials are grounded in physical models of bonding, whereas standard ML potentials are purely mathematical interpolators.
- Accuracy vs. Transferability: ML and Physically-informed ML potentials achieve near-DFT accuracy (~1 meV/atom) on training data, far surpassing traditional potentials. However, the purely mathematical ML potentials have poor transferability. The key claim is that Physically-informed ML restores reasonable transferability, similar to traditional potentials, by enforcing physical rules on extrapolation.
- Flexibility and Improvability: ML-based potentials are universal (not specific to a material class) and can be systematically improved by adding more data to the training set, which is not possible for traditional potentials.
- Accuracy and Transferability Visualization (Figure 2):
Figure 2 schematically compares the accuracy and transferability of (a) traditional, (b) mathematical ML, and (c) physically-informed ML potentials: the energy-volume (E-V) relation of a particular structure computed by DFT (points) is compared with each potential's prediction (curve). This figure visualizes the core trade-off. The filled circles are training data, and the open circles are validation data outside the training domain (extrapolation).
- (a) Traditional: Captures the general physical trend but has limited accuracy even on the training data.
- (b) Mathematical ML: Fits the training data perfectly (high accuracy) but produces unphysical oscillations and errors outside the training domain (poor transferability).
- (c) Physically-informed ML: Achieves high accuracy on the training data like the mathematical ML potential, but also provides a physically reasonable extrapolation, similar to the traditional potential. This illustrates the "best of both worlds" argument.
- Performance of a PINN Potential (Figure 6):
Figure 6 is a multi-panel figure spanning phonon dispersions, thermal expansion compared with experiment, solid-liquid interfaces, crack propagation, generalized stacking-fault energies, and diffusion barriers. It showcases the performance of a PINN potential for tantalum (Ta) and aluminum (Al) across a wide range of properties, demonstrating its general-purpose capability:
- (a) Phonon dispersion curves match DFT calculations.
- (b) Thermal expansion matches experimental data up to the melting point.
- (c, d) Simulations of complex phenomena like solid-liquid interfaces and crack propagation interacting with defects.
- (e, f) Generalized stacking fault energy surfaces, critical for predicting plastic deformation, match DFT results.
- (g, h) Defect properties like vacancy diffusion barriers are accurately predicted. This evidence supports the claim that physically-informed ML potentials can be both highly accurate and broadly applicable.
7. Conclusion & Reflections
- Conclusion Summary: The author concludes that while traditional potentials have served the community well, the field is rapidly moving towards ML-based approaches. Purely mathematical ML potentials offer unprecedented accuracy but are hampered by their unreliability outside their training domain. The newly proposed class of physically-informed ML potentials offers a promising path forward by integrating the flexibility and accuracy of ML with the robustness and physical intuition of traditional models. This hybrid approach has the potential to produce general-purpose potentials with both high accuracy and reliable transferability.
- Limitations & Future Work: The author identifies several challenges and future directions:
- Multicomponent Systems: Developing potentials for alloys remains a significant challenge. A key issue is the lack of "inheritance" in most ML potentials, meaning an elemental potential cannot be easily reused within a binary or ternary system. This leads to a proliferation of potentials and makes systematic alloy studies difficult.
- Shift in Focus: The field is currently in a "hype" phase, focused on methodological development. The author envisions a future phase where the focus will shift from making potentials to using them to generate new scientific knowledge about materials, as demonstrated by several recent studies cited in the paper.
- Integration with Physics: The ultimate future lies in turning back to physics, using ML as a powerful tool to build more robust, physically-grounded models rather than as a pure black-box interpolator.
- Personal Insights & Critique:
- This paper is an exceptionally clear and well-structured review that serves as an excellent introduction to the field of interatomic potentials for materials science. Its greatest strength is the lucid classification of potentials into three distinct categories, which clarifies the ongoing evolution in the field.
- The proposal of "physically-informed ML potentials" is the paper's most significant contribution. It addresses the single biggest weakness of standard ML potentials—poor transferability—by offering a conceptually elegant and practical solution. This "third way" is compelling because it doesn't discard decades of physical understanding but instead leverages it within a modern machine-learning framework.
- The critique regarding the "hype cycle" and the need to move from methodology to discovery is astute and timely. It serves as a crucial reminder to the community that the ultimate goal of developing these tools is to solve real-world materials science problems.
- The paper's focus on materials science applications (defects, mechanical properties, etc.) makes it particularly relevant to metallurgists, physicists, and engineers. The discussion of challenges like multicomponent systems highlights key areas for future research. Overall, this is a landmark review that not only summarizes the state of the art but also charts a clear and promising path for the future.