DFT Notes

These are notes.

Algorithmically, DFT differs from Hartree-Fock primarily in that we need to compute two additional quantities:

the exchange-correlation (XC) energy, \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\), and
the XC potential, \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\),

both of which are functionals of the electron density of the system, \(\rho\left(\vec{r}\right)\). By definition, \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\) is:

\[ \begin{align}\begin{aligned}\newcommand{\density}{\rho\left(\vec{r}\right)} \newcommand{\exc}[1]{E^{XC}\left[#1\right]} \newcommand{\vxc}[1]{V^{XC}\left[#1\right]}\\\vxc{\density} \equiv \frac{\partial \exc{\density}}{\partial \density}\end{aligned}\end{align} \]

so that, to first order, a change \(\delta\rho\left(\vec{r}\right)\) in the density changes \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) by:

\[\delta\exc{\density} = \int \vxc{\density}\,\delta\density\, d\vec{r}.\]

Conceptually, \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) is the XC contribution to the electronic energy and \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\) is the potential needed to make the non-interacting system have the same density as the real system. Unfortunately, we do not know the analytic form of \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) (or equivalently \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\)) and likely never will.

To make progress, we need to introduce an ansatz for either \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) or \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\). To do this we define a quantity \(f\left[\rho\left(\vec{r}\right), \cdots\right]\), called the XC energy density, which is the XC energy per infinitesimal volume \(d\vec{r}\). In terms of \(f\left[\rho\left(\vec{r}\right), \cdots\right]\), \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) is written as:

\[ \begin{align}\begin{aligned}\newcommand{\edensity}[1]{f\left[#1\right]} \newcommand{\eparticle}[1]{\epsilon\left[#1\right]}\\\exc{\density} = \int \edensity{\density,\cdots} d\vec{r}.\end{aligned}\end{align} \]

Related to \(f\left[\rho\left(\vec{r}\right), \cdots\right]\) is a quantity \(\epsilon\left[\rho\left(\vec{r}\right), \cdots\right]\) which is the XC energy per unit particle. The exact relationship between the two is:

\[\edensity{\density,\cdots} = \eparticle{\density,\cdots}\density.\]

It should be noted that, unlike \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\) and \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\), \(f\left[\rho\left(\vec{r}\right), \cdots\right]\) and \(\epsilon\left[\rho\left(\vec{r}\right), \cdots\right]\) will in general depend on additional functionals beyond the density.

Warning

Colloquially speaking, “a DFT functional” (e.g., when someone says they used “the BLYP DFT functional”) can refer to the analytic form of \(f\left[\rho\left(\vec{r}\right), \cdots\right]\) or \(\epsilon\left[\rho\left(\vec{r}\right), \cdots\right]\). Making matters worse, \(\epsilon\) is commonly used to denote both quantities. When comparing equations it is critical to distinguish between these two quantities.

XC functionals are typically classified by the quantities that \(f\left[\rho\left(\vec{r}\right), \cdots\right]\) or \(\epsilon\left[\rho\left(\vec{r}\right), \cdots\right]\) depend on. More specifically, dependence on:

only \(\rho\left(\vec{r}\right)\) defines the local density approximation (LDA),
\(\rho\left(\vec{r}\right)\) and \(\left|\nabla\rho\left(\vec{r}\right)\right|^2\) (the square of the gradient of \(\rho\left(\vec{r}\right)\)) defines the generalized gradient approximation (GGA), and
\(\rho\left(\vec{r}\right)\), \(\left|\nabla\rho\left(\vec{r}\right)\right|^2\), the Laplacian of \(\rho\left(\vec{r}\right)\), and the kinetic energy density defines a meta GGA.

For an LDA we have:

\[\begin{split}\vxc{\density} =& \frac{\partial \exc{\density}}{\partial \density}\\ =& \frac{\partial \edensity{\density}}{\partial \density}\end{split}\]
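As a concrete illustration of the LDA relations, the sketch below uses the Slater (Dirac) exchange functional for a spin-unpolarized density. That functional is not discussed in these notes and is chosen only as a familiar example; the function names are hypothetical. Here \(\epsilon = -C_x\rho^{1/3}\) with \(C_x = \frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3}\), so \(f = \epsilon\rho\), and the potential is the derivative of \(f\) (not of \(\epsilon\)) with respect to the density. A finite-difference estimate is included as a check on the analytic derivative.

```python
import math

# Slater exchange constant for a spin-unpolarized density (atomic units).
# Assumption: this functional is only an illustrative choice of LDA.
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def eps_x(rho):
    """XC energy per particle, epsilon[rho]."""
    return -C_X * rho ** (1.0 / 3.0)


def f_x(rho):
    """XC energy density, f[rho] = epsilon[rho] * rho."""
    return eps_x(rho) * rho


def v_x(rho):
    """LDA potential, the analytic derivative df/drho."""
    return -(4.0 / 3.0) * C_X * rho ** (1.0 / 3.0)


def v_x_numeric(rho, h=1.0e-6):
    """Central finite-difference estimate of df/drho, used as a check."""
    return (f_x(rho + h) - f_x(rho - h)) / (2.0 * h)


rho = 0.3
print(f_x(rho), v_x(rho), v_x_numeric(rho))
```

For this particular functional the potential equals \(\frac{4}{3}\epsilon\), which shows why confusing \(f\) with \(\epsilon\) when differentiating gives the wrong potential.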

Introduction of AOs

In Kohn-Sham DFT we solve the Kohn-Sham equations in an orbital basis that is obtained as a linear combination of atomic orbitals (AOs). Assume that there are \(N_b\) AOs and let \(\phi_\mu\left(\vec{r}\right)\) be the \(\mu\)-th AO. The equation for \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) remains unchanged, except that \(\rho\left(\vec{r}\right)\) is now:

\[ \begin{align}\begin{aligned}\newcommand{\bf}[1]{\phi_{#1}\left(\vec{r}\right)}\\\density = \sum_{\mu}^{N_b}\sum_{\nu}^{N_b} \bf{\mu}P_{\mu\nu}\bf{\nu}\end{aligned}\end{align} \]

where \(P_{\mu\nu}\) is the \(\mu\nu\)-th element of the density matrix in the AO basis. In the AO basis the \(\mu\nu\)-th element of \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\) is:

\[V^{XC}_{\mu\nu} = \int \bf{\mu} \vxc{\density} \bf{\nu} d\vec{r}.\]

In the LDA this becomes:

\[V^{XC}_{\mu\nu} = \int \bf{\mu} \frac{\partial \edensity{\density}}{\partial \density} \bf{\nu} d\vec{r}.\]

For most DFT functionals, analytic solutions for the above integrals are not known and \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) and \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\) must be evaluated by quadrature.

Introduction of Quadrature

In solving an integral by quadrature, we make the following approximation:

\[\int g(\vec{r}) d\vec{r} \approx \sum_{i=1}^{N_g} w_i g(\vec{r_i}).\]

That is, we define a quadrature rule \(\mathcal{Q}\), which is a set of \(N_g\) pairs of the form \(\lbrace w_i, \vec{r}_i\rbrace\). Here, \(w_i\) and \(\vec{r}_i\) are respectively the quadrature weight and real-space location of the \(i\)-th grid point.
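A minimal sketch of such a rule is shown below. It makes several simplifying assumptions that are not part of these notes: the integrand is spherically symmetric, so the angular integral is done analytically and folded into the weights as \(4\pi r_i^2\), and the radial points come from a plain midpoint rule rather than the atom-centered grids used in practice. The test integrand is the hydrogen 1s density, whose integral is the electron count, 1. All names are hypothetical.

```python
import math


def radial_rule(n_points, r_max):
    """Return (weight, radius) pairs for a midpoint rule on [0, r_max].

    The factor 4*pi*r**2 is the analytic angular integral, so the rule is
    only valid for spherically symmetric integrands.
    """
    dr = r_max / n_points
    rule = []
    for i in range(n_points):
        r = (i + 0.5) * dr
        rule.append((4.0 * math.pi * r * r * dr, r))
    return rule


def integrate(g, rule):
    """Approximate the integral of g as sum_i w_i * g(r_i)."""
    return sum(w * g(r) for w, r in rule)


def rho_1s(r):
    """Density of a hydrogen 1s orbital in atomic units."""
    return math.exp(-2.0 * r) / math.pi


n_electrons = integrate(rho_1s, radial_rule(2000, 20.0))
print(n_electrons)
```

Keeping the rule as a list of pairs mirrors the definition of \(\mathcal{Q}\): the integrand never needs to know how the points and weights were generated.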

At this point it is helpful to define:

\[ \begin{align}\begin{aligned}\newcommand{\densityg}[1]{\rho_{#1}} \newcommand{\bfg}[1]{\phi_{#1}}\\\begin{split} \densityg{i}\equiv&\rho\left(\vec{r_i}\right)\\ \bfg{\mu i}\equiv&\phi_{\mu}\left(\vec{r_i}\right)\end{split}\end{aligned}\end{align} \]

which respectively are the values of the density and the \(\mu\)-th AO evaluated at the \(i\)-th grid point. Similarly, we define:

\[ \begin{align}\begin{aligned}\newcommand{\edensityg}[1]{f_{#1}} \newcommand{\dedensitygdrho}[1]{f_{#1}^{\left(\rho\right)}}\\\begin{split}\edensityg{i}\equiv&\edensity{\densityg{i}}\\ \dedensitygdrho{i}\equiv& \left. \frac{\partial \edensity{\density{}}} {\partial \density{}} \right|_{\density{}=\densityg{i}}\end{split}\end{aligned}\end{align} \]

which are the energy density, and the “derivative of the energy density with respect to the density” evaluated at \(\rho_i\).

Using these quantities, \(\rho_i\) is then given by:

\[\begin{split}\densityg{i} =& \sum_{\mu}^{N_b} \sum_{\nu}^{N_b} \bfg{\mu i}P_{\mu \nu}\bfg{\nu i}\\ =& \sum_{\mu}^{N_b} \bfg{\mu i}X_{\mu i}\end{split}\]

where in the second line we defined the common intermediate (the collocation matrix):

\[X_{\mu i} = \sum_{\nu}^{N_b} P_{\mu\nu}\bfg{\nu i}\]
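The two-step evaluation of the density can be sketched with nested lists as below. The toy values of \(\phi_{\mu i}\) and \(P_{\mu\nu}\) are invented for illustration, and the function names are hypothetical; a real code would use a dense linear algebra library for both steps.

```python
def form_x(P, phi):
    """X[mu][i] = sum_nu P[mu][nu] * phi[nu][i]."""
    n_b, n_g = len(phi), len(phi[0])
    return [[sum(P[mu][nu] * phi[nu][i] for nu in range(n_b))
             for i in range(n_g)] for mu in range(n_b)]


def form_rho(phi, X):
    """rho[i] = sum_mu phi[mu][i] * X[mu][i]."""
    n_b, n_g = len(phi), len(phi[0])
    return [sum(phi[mu][i] * X[mu][i] for mu in range(n_b))
            for i in range(n_g)]


# Toy data (assumed): 2 AOs evaluated at 3 grid points, and a symmetric
# density matrix.
phi = [[1.0, 0.5, 0.0],
       [0.0, 0.5, 1.0]]
P = [[2.0, 1.0],
     [1.0, 2.0]]

X = form_x(P, phi)
rho = form_rho(phi, X)
print(rho)
```

Forming \(X_{\mu i}\) first costs \(N_b^2 N_g\) operations once, after which each density value needs only a sum over \(\mu\).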

Using \(\mathcal{Q}\), \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) becomes:

\[\exc{\density{}} = \sum_i^{N_g} w_i\edensityg{i}\]

and \(V^{XC}\left[\rho\left(\vec{r}\right)\right]\) becomes:

\[\begin{split}V_{\mu\nu}^{XC} =& \sum_i^{N_g} w_i \bfg{\mu i} \dedensitygdrho{i} \bfg{\nu i}\\ =& \sum_i^{N_g} w_i \bfg{\mu i} Z_{\nu i}\end{split}\]

where we defined the intermediate:

\[Z_{\mu i} =\dedensitygdrho{i} \bfg{\mu i}.\]
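The quadrature expressions for the energy and the potential matrix can be sketched as follows. The Slater exchange functional is again assumed purely as an example LDA, and the weights, AO values, and densities are invented toy numbers; the function name is hypothetical.

```python
import math

# Slater exchange constant (assumed example functional, atomic units).
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def xc_energy_and_potential(w, phi, rho):
    """Return (E_xc, V_xc) from weights w[i], AOs phi[mu][i], density rho[i]."""
    n_b, n_g = len(phi), len(w)
    # f_i: energy density at each grid point.
    f = [-C_X * r ** (4.0 / 3.0) for r in rho]
    # f_i^(rho): derivative of the energy density with respect to the density.
    df = [-(4.0 / 3.0) * C_X * r ** (1.0 / 3.0) for r in rho]
    e_xc = sum(w[i] * f[i] for i in range(n_g))
    # Z[mu][i] = f_i^(rho) * phi[mu][i]
    Z = [[df[i] * phi[mu][i] for i in range(n_g)] for mu in range(n_b)]
    # V[mu][nu] = sum_i w_i * phi[mu][i] * Z[nu][i]
    V = [[sum(w[i] * phi[mu][i] * Z[nu][i] for i in range(n_g))
          for nu in range(n_b)] for mu in range(n_b)]
    return e_xc, V


# Toy data (assumed): 3 grid points and 2 AOs.
w = [0.2, 0.6, 0.2]
phi = [[1.0, 0.5, 0.0],
       [0.0, 0.5, 1.0]]
rho = [2.0, 1.5, 2.0]

e_xc, V = xc_energy_and_potential(w, phi, rho)
print(e_xc, V)
```

Because \(Z_{\mu i}\) is just \(\phi_{\mu i}\) scaled point by point, the resulting \(V^{XC}_{\mu\nu}\) is symmetric in \(\mu\) and \(\nu\).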

As a Sparse Map Problem

While the previous sections have described DFT as a tensor problem, it is usually not solved as one, for two reasons:

Large tensors. Grids use at least about 1000 grid points per atom (higher-quality grids tend to use on the order of 10,000) and most AO basis sets have on the order of 10 basis functions per atom. Tensors like \(\phi_{\mu i}\) then have at least “10,000 times the number of atoms squared” elements, meaning the tensor for 100 atoms already has \(10^8\) elements, or about 0.8 GB in double precision, and roughly ten times that on a higher-quality grid.
Sparsity. Most DFT quantities are local, so if the basis functions indexing a tensor element are spatially far apart, the element is usually close to zero.
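The size estimate for a dense \(\phi_{\mu i}\) can be reproduced with a few lines of arithmetic. The defaults below are the rough per-atom counts quoted above, and 8 bytes per element assumes double precision; the function name is hypothetical.

```python
def collocation_bytes(n_atoms, grid_per_atom=1000, aos_per_atom=10,
                      bytes_per_element=8):
    """Bytes needed to store a dense phi[mu][i] in double precision."""
    n_grid = grid_per_atom * n_atoms
    n_aos = aos_per_atom * n_atoms
    return n_grid * n_aos * bytes_per_element


for n_atoms in (10, 100, 1000):
    gib = collocation_bytes(n_atoms) / 2 ** 30
    print(f"{n_atoms:5d} atoms: {gib:10.3f} GiB")
```

The quadratic growth with the number of atoms is what makes the dense representation impractical for large systems.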

To describe the sparsity we introduce sparse maps. Given two basis sets, \(A\) and \(B\), the sparse map \(L\) maps each basis function in \(A\) to a subset of the basis functions in \(B\). Assume we have some tensor with elements \(T_{ab}\) where \(a\) indexes basis functions in \(A\) and \(b\) indexes basis functions in \(B\). For a given value of \(a\), the non-zero elements of \(T_{ab}\) are those such that \(b\) is in \(L(a)\).

In DFT, we use atom-centered grids and AOs. It is therefore common to define sparse maps \(L(A\rightarrow i)\) and \(L(A\rightarrow \mu)\) which respectively map atom indices to grid points and atom indices to AOs. Using these maps the equation for the density becomes:

\[\densityg{i_A} = \sum_{\mu_A} \bfg{\mu_A i_A}X_{\mu_A i_A}\]

where an index like \(i_A\) is shorthand for restricting the value of \(i\) to those afforded by the sparse map \(L(A\rightarrow i)\). Applying the same logic to the other DFT quantities:

\[\begin{split}X_{\mu_A i_A} =& \sum_{\nu_A} P_{\mu_A\nu_A}\bfg{\nu_A i_A}\\ Z_{\mu_A i_A} =& \dedensitygdrho{i_A} \bfg{\mu_A i_A}\\ V_{\mu_A\nu_A}^{XC} =& \sum_{i_A} w_{i_A}\bfg{\mu_A i_A} Z_{\nu_A i_A}.\end{split}\]

Finally, the equation for \(E^{XC}\left[\rho\left(\vec{r}\right)\right]\) becomes:

\[\exc{\density{}} = \sum_{A}\sum_{i_A} w_{i_A}\edensityg{i_A}\]

Of note, for a given grid we expect the number of grid points associated with an atom to be roughly constant. Similarly, for a given AO basis set we expect the number of AOs associated with an atom to also be roughly constant. This means that the cost to form all quantities will scale linearly with the number of atoms.
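A minimal sketch of the sparse-map evaluation of the density is given below, with the maps \(L(A\rightarrow i)\) and \(L(A\rightarrow \mu)\) stored as plain dictionaries. The toy system is an assumption made for illustration: two well-separated atoms, each with one AO and two grid points, constructed so that restricting the sums to each atom's own AOs is exact. All names are hypothetical.

```python
def sparse_density(atom_to_grid, atom_to_aos, phi, P):
    """Evaluate rho on each atom's grid points using only that atom's AOs.

    atom_to_grid and atom_to_aos play the roles of L(A -> i) and L(A -> mu).
    """
    rho = {}
    for atom, grid_points in atom_to_grid.items():
        aos = atom_to_aos[atom]
        for i in grid_points:
            # X[mu] = sum_nu P[mu][nu] * phi[nu][i], restricted to local AOs.
            X = {mu: sum(P[mu][nu] * phi[nu][i] for nu in aos) for mu in aos}
            rho[i] = sum(phi[mu][i] * X[mu] for mu in aos)
    return rho


# Toy data (assumed): AO 0 lives on atom 0, AO 1 on atom 1, with no overlap.
phi = [[1.0, 0.5, 0.0, 0.0],
       [0.0, 0.0, 0.5, 1.0]]
P = [[2.0, 0.0],
     [0.0, 2.0]]
atom_to_grid = {0: [0, 1], 1: [2, 3]}
atom_to_aos = {0: [0], 1: [1]}

rho = sparse_density(atom_to_grid, atom_to_aos, phi, P)
print(rho)
```

The work per atom depends only on the sizes of its two lists, so the total cost grows linearly with the number of atoms as long as those sizes stay roughly constant.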