Fitting SNAP using GaN data
This example:
- Uses the atomic positions in the folder
example_GaN_data
- Generates surrogate DFT data based on the GaN model presented in
10.1088/1361-648x/ab6cbe
- Uses
snap.jl
and 80% of the GaN data to create the matrix A. The matrix is generated only with the energy block. - Uses 80% of the GaN data to create the vector b. The reference energy ($E_{\rm ref}$) is assumed to be zero.
- Uses backslash to fit the parameters $\boldsymbol{\beta}$, thus, solves $\mathbf{A \cdot \boldsymbol{\beta}=y}$
- Uses 20% of the GaN data and the bispectrum components provided by
snap.jl
and the fitted parameters $\boldsymbol{\beta}$ to validate the fitting. - The error computed during the validation is < ~5%.
TODO:
- Check if the GaN model is correct.
- Calculate reference energy ($E_{\rm ref}$).
- Check ill-conditioned system.
Install and import dependencies
First let's make sure we have all required packages installed
using Pkg
pkg"add LAMMPS"
We start off by importing the necessary packages
using LAMMPS
using LinearAlgebra: norm, pinv
using Printf
Estimation of the SNAP coefficients
The choice of the coefficients $\boldsymbol{\beta}=(\beta_0^1, \tilde{\beta}^1, \dots, \beta_0^l, \tilde{\beta}^l)$ is based on a system of linear equations which considers a large number of atomic configurations and $l$ atom types. The matrix formulation for this system $\mathbf{A \cdot \boldsymbol{\beta}=y}$ is defined in the following equations (see 10.1016/j.jcp.2014.12.018):
\[\begin{equation*} \mathbf{A}= \begin{pmatrix} \vdots & & & & & & & & \\ N_{s_1} & \sum_{i=1}^{N_{s_1}} B_1^i & \dots & \sum_{i=1}^{N_{s_1}} B_k^i & \dots & N_{s_L} & \sum_{i=1}^{N_{s_L}} B_1^i & \dots & \sum_{i=1}^{N_{s_L}} B_k^i\\ \vdots & & & & & & & & \\ 0 & -\sum_{i=1}^{N_{s_1}} \frac{\partial B_1^i}{\partial r_j^{\alpha}} & \dots & -\sum_{i=1}^{N_{s_1}} \frac{\partial B_k^i}{\partial r_j^{\alpha}} & \dots & 0 & -\sum_{i=1}^{N_{s_l}} \frac{\partial B_1^i}{\partial r_j^{\alpha}} & \dots & -\sum_{i=1}^{N_{s_l}} \frac{\partial B_k^i}{\partial r_j^{\alpha}} \\ \vdots & & & & & & & & \\ 0 & - \sum_{j=1}^{N_{s_1}} r^j_{\alpha} \sum_{i=1}^{N_{s_1}} \frac{\partial B_1^i}{\partial r_j^{\beta}} & \dots & - \sum_{j=1}^{N_{s_1}} r^j_{\alpha} \sum_{i=1}^{N_{s_1}} \frac{\partial B_k^i}{\partial r_j^{\beta}} & \dots & 0 & - \sum_{j=1}^{N_{s_l}} r^j_{\alpha} \sum_{i=1}^{N_{s_l}} \frac{\partial B_1^i}{\partial r_j^{\beta}} & \dots & - \sum_{j=1}^{N_{s_l}} r^j_{\alpha} \sum_{i=1}^{N_{s_l}} \frac{\partial B_k^i}{\partial r_j^{\beta}}\\ \vdots & & & & & & & & \\ \end{pmatrix} \end{equation*}\]
The indexes $\alpha, \beta = 1,2,3$ depict the $x$, $y$ and $z$ spatial component, $j$ is an individual atom, and $s$ a particular configuration. All atoms in each configuration are considered. The number of atoms of type $m$ in the configuration $s$ is $N_{s_m}$.
The RHS of the linear system is computed as:
\[\begin{equation*} \mathbf{y}= \begin{pmatrix} \vdots \\ E^s_{\rm qm} - E^s_{\rm ref} \\ \vdots \\\\ F^{s,j,\alpha}_{\rm qm} - F^{s,j,\alpha}_{\rm ref} \\ \vdots \\ W_{\rm qm}^{s,\alpha,\beta} - W_{\rm ref}^{s,\alpha,\beta} \\ \vdots \\ \end{pmatrix} \end{equation*}\]
Calculate $A$ using snap.jl
This example only fits the energy, thus, the first block of the matrix A.
include(joinpath(dirname(pathof(LAMMPS)), "..", "examples", "snap.jl"))
A
48×62 Matrix{Float64}:
96.0 714.892 175.705 125.324 … 3157.79 330.568 1336.49 432.028
96.0 741.638 171.502 122.601 3101.62 340.641 1254.11 545.843
96.0 736.281 172.097 123.056 3099.71 340.618 1260.01 536.707
96.0 736.39 172.848 123.5 3100.18 340.488 1254.05 548.47
96.0 743.85 171.078 122.015 3097.8 339.628 1250.89 543.843
96.0 736.916 172.51 123.381 … 3101.99 340.895 1256.86 546.525
96.0 743.904 170.969 122.077 3099.39 341.379 1251.63 549.245
96.0 738.596 171.981 122.985 3104.01 336.768 1259.6 535.521
96.0 739.471 172.462 123.217 3107.2 338.89 1259.06 548.391
96.0 740.706 171.673 122.861 3105.18 341.492 1255.34 551.494
⋮ ⋱ ⋮
96.0 743.088 171.117 122.303 3101.88 344.48 1252.19 555.025
96.0 737.216 172.745 123.571 … 3102.71 337.652 1256.42 545.619
96.0 738.869 173.318 123.551 3103.85 334.767 1252.77 547.766
96.0 741.594 171.769 122.864 3111.31 342.38 1257.92 557.53
96.0 737.911 171.982 122.856 3104.09 340.038 1259.47 536.737
96.0 738.979 171.821 122.92 3109.29 341.452 1259.55 545.03
96.0 741.679 170.768 122.231 … 3100.44 343.354 1254.12 548.632
96.0 737.134 172.699 123.294 3097.01 336.846 1253.92 539.131
96.0 738.856 172.706 123.366 3104.08 340.743 1254.96 551.47
Calculate $y$
A molecular mechanics model for the interaction of gallium and nitride is used to generate surrogate DFT data (see 10.1088/1361-648x/ab6cbe). The total potential energy is computed for each configuration
\[E = \sum_{i \lt j; r_{i,j} \leq rcut } E_{GaN}(r_{i,j}) \ \text{ where } \\ E_{GaN}(r_{i,j}) = \left\{ \begin{array}{ll} E_{C}(r_{i,j}) + E_{LJ}(r_{i,j}, \epsilon_{Ga,Ga}, \sigma_{Ga,Ga}) & \mbox{if both atoms are Ga} \\ E_{C}(r_{i,j}) + E_{LJ}(r_{i,j}, \epsilon_{N,N}, \sigma_{N,N}) & \mbox{if both atoms are N} \\ E_{C}(r_{i,j}) + E_{BM}(r_{i,j}, A_{Ga,N}, \rho_{Ga,N}) & \mbox{if one atom is Ga and the other is N}\\ \end{array} \right.\]
Note: this is a work in progress, this model should be checked.
const N = N1 + N2
const M1 = M
const M2 = 61
const ε_Ga_Ga = 0.643
const σ_Ga_Ga = 2.390
const ε_N_N = 1.474
const σ_N_N = 1.981
const A_Ga_N = 608.54
const ρ_Ga_N = 0.435
const q_Ga = 3.0
const q_N = -3.0
const ε0 = 55.26349406 # e2⋅GeV−1⋅fm−1 ToDo: check this
const E_ref = zeros(M1)
E_LJ(r, ε = 1.0, σ = 1.0) = 4.0 * ε * ((σ / norm(r))^12 - (σ / norm(r))^6)
E_BM(r, A = 1.0, ρ = 1.0) = A * exp(-norm(r) / ρ)
E_C(r) = q_Ga * q_N / (4.0 * π * ε0 * norm(r))
function read_atomic_conf(m, N)
rs = []
open(joinpath(DATA, string(m), "DATA")) do f
for i = 1:23
readline(f)
end
for i = 1:N
s = split(readline(f))
r = [parse(Float64, s[3]),
parse(Float64, s[4]),
parse(Float64, s[5])]
push!(rs, r)
end
end
return rs
end
function calc_tot_energy(rcut, rs, N, ε_Ga_Ga, σ_Ga_Ga, ε_N_N, σ_N_N, A_Ga_N, ρ_Ga_N)
E_tot_acc = 0.0
for i = 1:N
for j = i:N
r_diff = rs[i] - rs[j]
if norm(r_diff) <= rcut && norm(r_diff) > 0.0
if i <= N1 && j <= N1
E_tot_acc += E_C(r_diff) + E_LJ(r_diff, ε_Ga_Ga, σ_Ga_Ga)
elseif i > N1 && j > N1
E_tot_acc += E_C(r_diff) + E_LJ(r_diff, ε_N_N, σ_N_N)
else
E_tot_acc += E_C(r_diff) + E_BM(r_diff, A_Ga_N, ρ_Ga_N)
end
end
end
end
return E_tot_acc
end
calc_tot_energy (generic function with 1 method)
Vector $y$ is calculated using the DFT data and $E_{\rm ref}$
function calc_y(rcut, M1, N, ε_Ga_Ga, σ_Ga_Ga, ε_N_N, σ_N_N, A_Ga_N, ρ_Ga_N, E_ref)
y = zeros(M1)
for m = 1:M1
rs = read_atomic_conf(m, N)
y[m] = calc_tot_energy(rcut, rs, N, ε_Ga_Ga, σ_Ga_Ga,
ε_N_N, σ_N_N, A_Ga_N, ρ_Ga_N) - E_ref[m]
end
return y
end
y = calc_y(rcut, M1, N, ε_Ga_Ga, σ_Ga_Ga, ε_N_N, σ_N_N, A_Ga_N, ρ_Ga_N, E_ref)
48-element Vector{Float64}:
1722.4498771523417
1677.2635491164112
1711.3148873863208
1732.267614125171
1708.435054676101
1690.4950201294132
1684.066527325373
1746.948996355329
1710.9274852177402
1715.0902572123912
⋮
1744.267721178441
1681.2684856943063
1721.9940387385514
1736.3547928530331
1745.397940711828
1726.9880294606958
1705.9038135458254
1697.3739427062326
1715.5343234240943
Calculate the optimal solution
The optimal solution $\widehat{\mathbf{\beta}}$ is, thus:
\[\begin{equation*} \widehat{\mathbf{\beta}} = \mathrm{argmin}_{\mathbf{\beta}} ||\mathbf{A \cdot \boldsymbol{\beta} - y}||^2 = \mathbf{A^{-1} \cdot y} \end{equation*}\]
β = A \ y
62-element Vector{Float64}:
10.186840372409128
-153.97924361444737
32.105482990505585
-48.040124411822326
92.71701861106854
-5.5374135409148195
193.0568779355885
-123.4210657200344
-130.9828131774663
-90.14059965823128
⋮
-195.9265740329176
174.71930890601402
75.46114960910315
20.530494605269546
-89.75357933645647
-77.4138645009578
30.71316571463391
107.16067919386244
-13.9531148168796
Check results
The local energy can be decomposed into separate contributions for each atom. SNAP energy can be written in terms of the bispectrum components of the atoms and a set of coefficients. $K$ components of the bispectrum are considered so that $\mathbf{B}^{i}=\{ B^i_1, \dots, B_K^i\}$ for each atom $i$, whose SNAP energy is computed as follows:
\[ E^i_{\rm SNAP}(\mathbf{B}^i) = \beta_0^{\alpha_i} + \sum_{k=1}^K \beta_k^{\alpha_i} B_k^i = \beta_0^{\alpha_i} + \boldsymbol{\tilde{\beta}}^{\alpha_i} \cdot \mathbf{B}^i\]
where $\alpha_i$ depends on the atom type.
function calc_fitted_tot_energy(path, β, ncoeff, N1, N)
# Calculate y
lmp = LMP(["-screen","none"])
bs = run_snap(lmp, path, rcut, twojmax)
E_tot_acc = 0.0
for n in 1:N1
E_atom_acc = β[1]
for k in 2:ncoeff
k2 = k - 1
E_atom_acc += β[k] * bs[k2, n]
end
E_tot_acc += E_atom_acc
end
for n in N1+1:N
E_atom_acc = β[ncoeff+1]
for k in ncoeff+2:2*ncoeff
k2 = k - ncoeff - 1
E_atom_acc += β[k] * bs[k2, n]
end
E_tot_acc += E_atom_acc
end
command(lmp, "clear")
return E_tot_acc
end
calc_fitted_tot_energy (generic function with 1 method)
Comparison between the potential energy calculated based on the DFT data, and the SNAP potential energy calculated based on the bispectrum components provided by $snap.jl$ and the fitted coefficients $\mathbf{\beta}$.
@printf("Potential Energy, Fitted Potential Energy, Error (%%)\n")
for m = M1+1:M2
path = joinpath(DATA, string(m), "DATA")
rs = read_atomic_conf(m, N)
E_tot = calc_tot_energy(rcut, rs, N, ε_Ga_Ga, σ_Ga_Ga, ε_N_N, σ_N_N, A_Ga_N, ρ_Ga_N)
E_tot_fit = calc_fitted_tot_energy(path, β, ncoeff, N1, N)
@printf("%0.2f, %0.2f, %0.2f\n", E_tot, E_tot_fit,
abs(E_tot - E_tot_fit) / E_tot * 100.)
end
Potential Energy, Fitted Potential Energy, Error (%)
1695.02, 1668.98, 1.54
1716.77, 1746.47, 1.73
1727.48, 1756.01, 1.65
1690.29, 1646.75, 2.58
1709.51, 1654.80, 3.20
1749.57, 1695.39, 3.10
1752.94, 1767.68, 0.84
1765.52, 1728.84, 2.08
1725.46, 1628.16, 5.64
1732.92, 1688.68, 2.55
1687.87, 1712.33, 1.45
1726.56, 1611.73, 6.65
1685.35, 1731.82, 2.76
This page was generated using Literate.jl.