Introduction to DFT

Introduction to DFT#

A comprehensive overview of DFT is beyond the scope of this book, as excellent reviews on these subjects are readily found in the literature, and are suggested reading in the following paragraph. Instead, this chapter is intended to provide a useful starting point for a non-expert to begin learning about and using DFT in the manner used in this book. Much of the information presented here is standard knowledge among experts, but a consequence of this is that it is rarely discussed in current papers in the literature. A secondary goal of this chapter is to provide new users with a path through the extensive literature available and to point out potential difficulties and pitfalls in these calculations.

A modern and practical introduction to density functional theory can be found in Sholl and Steckel cite:sholl-2009-densit-funct-theor. A fairly standard textbook on DFT is the one written by Parr and Yang cite:parr-yang. The Chemist’s Guide to DFT cite:koch2001 is more readable and contains more practical information for running calculations, but both of these books focus on molecular systems. The standard texts in solid state physics are by Kittel cite:kittel and Ashcroft and Mermin cite:ashcroft-mermin. Both have their fine points, the former being more mathematically rigorous and the latter more readable. However, neither of these books is particularly easy to relate to chemistry. For this, one should consult the exceptionally clear writings of Roald Hoffman cite:hoffmann1987,RevModPhys.60.601, and follow these with the work of N\o rskov and coworkers cite:hammer2000:adv-cat,greeley2002:elect.

In this chapter, only the elements of DFT that are relevant to this work will be discussed. An excellent review on other implementations of DFT can be found in Reference citealp:freeman1995:densit, and details on the various algorithms used in DFT codes can be found in Refs. citealp:payne1992:iterat,Kresse199615.

One of the most useful sources of information has been the dissertations of other students, perhaps because the difficulties they faced in learning the material are still fresh in their minds. Thomas Bligaard, a coauthor of Dacapo, wrote a particularly relevant thesis on exchange/correlation functionals cite:bligaard2000:exchan-correl-funct and a dissertation illustrating the use of DFT to design new alloys with desirable thermal and mechanical properties cite:bligaard2003:under-mater-proper-basis-densit. The Ph.D. thesis of Ari Seitsonen contains several useful appendices on k-point setups, and convergence tests of calculations, in addition to a thorough description of DFT and analysis of calculation output cite:seitsonen2000:phd. Finally, another excellent overview of DFT and its applications to bimetallic alloy phase diagrams and surface reactivity is presented in the PhD thesis of Robin Hirschl cite:hirschl2002:binar-trans-metal-alloy-their-surfac.

Background#

In 1926, Erwin Schrödinger published the first accounts of his now famous wave equation cite:pauling1963. He later shared the Nobel prize with Paul A. M. Dirac in 1933 for this discovery. Schrödinger’s wave function seemed extremely promising, as it contains all of the information available about a system. Unfortunately, most practical systems of interest consist of many interacting electrons, and the effort required to find solutions to Schrödinger’s equation increases exponentially with the number of electrons, limiting this approach to systems with a small number of relevant electrons, \(N \lesssim O(10)\) cite:RevModPhys.71.1253. Even if this rough estimate is off by an order of magnitude, a system with 100 electrons is still very small, for example, two Ru atoms if all the electrons are counted, or perhaps ten Pt atoms if only the valence electrons are counted. Thus, the wave function method, which has been extremely successful in studying the properties of small molecules, is unsuitable for studies of large, extended solids. Interestingly, this difficulty was recognized by Dirac as early as 1929, when he wrote “The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the application of these laws leads to equations much too complicated to be soluble.” cite:dirac1929:quant-mechan-many-elect-system.

In 1964, Hohenberg and Kohn showed that the ground state total energy of a system of interacting electrons is a unique functional of the electron density cite:PhysRev.136.B864. By definition, a function returns a number when given a number. For example, in \(f(x)=x^2\), \(f(x)\) is the function, and it equals four when \(x=2\). A functional returns a number when given a function. Thus, in \(g(f(x))=\int_0^\pi f(x) dx\), \(g(f(x))\) is the functional, and it is equal to two when \(f(x)=\sin(x)\). Hohenberg and Kohn further identified a variational principle that appeared to reduce the problem of finding the ground state energy of an electron gas in an external potential (i.e., in the presence of ion cores) to that of the minimization of a functional of the three-dimensional density function. Unfortunately, the definition of the functional involved a set of 3N-dimensional trial wave functions.

In 1965, Kohn and Sham made a significant breakthrough when they showed that the problem of many interacting electrons in an external potential can be mapped exactly to a set of noninteracting electrons in an effective external potential cite:PhysRev.140.A1133. This led to a set of self-consistent, single particle equations known as the Kohn-Sham (KS) equations:

\begin{equation}\label{eq:KS} \biggl(-\frac{1}{2}\nabla^2 + v_{eff}(\mathbf{r}) -\epsilon_j \biggr) \varphi_j (\mathbf{r})=0, \end{equation}

with

\begin{equation}\label{eq:veff} v_{eff}(\mathbf{r})=v(\mathbf{r})+\int \frac{n(\mathbf{r'})}{|\mathbf{r-r'}|}d\mathbf{r'} + v_{xc}(\mathbf{r}), \end{equation}

where \(v(\mathbf{r})\) is the external potential and \(v_{xc}(\mathbf{r})\) is the exchange-correlation potential, which depends on the entire density function. Thus, the density needs to be known in order to define the effective potential so that Eq. eqref:eq:KS can be solved. \(\varphi_j(\mathbf{r})\) corresponds to the \(j^{th}\) KS orbital of energy \(\epsilon_j\).

The ground state density is given by:

\begin{equation}\label{eq:density} \displaystyle n(\mathbf{r})=\sum_{j=1}^{N}|\varphi_j(\mathbf{r})|^2 \end{equation}

To solve Eq. \eqref{eq:KS} then, an initial guess is used for \(\varphi_j(r)\) which is used to generate Eq. \eqref{eq:density}, which is subsequently used in Eq. \eqref{eq:veff}. This equation is then solved for \(\varphi_j(\mathbf{r})\) iteratively until the \(\varphi_j(\mathbf{r})\) that result from the solution are the same as the \(\varphi_j(\mathbf{r})\) that are used to define the equations, that is, the solutions are self-consistent. Finally, the ground state energy is given by:

\begin{equation}\label{eq:dftEnergy} \displaystyle E=\sum_j\epsilon_j + E_{xc}[n(\mathbf{r})]-\int v_{xc}(\mathbf{r}) n(\mathbf{r}) d\mathbf{r} -\frac{1}{2}\int{\frac{n(\mathbf{r})n(\mathbf{r'})}{|\mathbf{r-r'}|}d\mathbf{r'}\mathbf{r}}, \end{equation}

where \(E_{xc}[n(\mathbf{r})]\) is the exchange-correlation energy functional. Walter Kohn shared the Nobel prize in Chemistry in 1998 for this work cite:RevModPhys.71.1253. The other half of the prize went to John Pople for his efforts in wave function based quantum mechanical methods cite:RevModPhys.71.1267. Provided the exchange-correlation energy functional is known, Eq. (ref:eq:dftEnergy) is exact. However, the exact form of the exchange-correlation energy functional is not known, thus approximations for this functional must be used.

Exchange correlation functionals#

The two main types of exchange/correlation functionals used in DFT are the local density approximation (LDA) and the generalized gradient approximation (GGA). In the LDA, the exchange-correlation functional is defined for an electron in a uniform electron gas of density \(n\) cite:PhysRev.140.A1133. It is exact for a uniform electron gas, and is anticipated to be a reasonable approximation for slowly varying densities. In molecules and solids, however, the density tends to vary substantially in space. Despite this, the LDA has been very successfully used in many systems. It tends to predict overbonding in both molecular and solid systems cite:fuchs1998:pseud, and it tends to make semiconductor systems too metallic (the band gap problem) cite:perdew1982:elect-kohn-sham.

The generalized gradient approximation includes corrections for gradients in the electron density, and is often implemented as a corrective function of the LDA. The form of this corrective function, or “exchange enhancement” function determines which functional it is, e.g. PBE, RPBE, revPBE, etc. cite:hammer1999:improv-pbe. In this book the PBE GGA functional is used the most. N\o{}rskov and coworkers have found that the RPBE functional gives superior chemisorption energies for atomic and molecular bonding to surfaces, but that it gives worse bulk properties, such as lattice constants compared to experimental data cite:hammer1999:improv-pbe.

Finally, there are increasingly new types of functionals in the literature. The so-called hybrid functionals, such as B3LYP, are more popular with gaussian basis sets (e.g. in Gaussian), but they are presently inefficient with planewave basis sets. None of these other types of functionals were used in this work. For more details see Chapter 6 in Ref. citealp:koch2001 and Thomas Bligaard’s thesis on exchange and correlation functionals cite:bligaard2000:exchan-correl-funct.

Basis sets#

Briefly, VASP utilizes planewaves as the basis set to expand the Kohn-Sham orbitals. In a periodic solid, one can use Bloch’s theorem to show that the wave function for an electron can be expressed as the product of a planewave and a function with the periodicity of the lattice cite:ashcroft-mermin:

(1)#\[\begin{equation} \psi_{n\mathbf{k}}(\mathbf{r})=\exp({i\mathbf{k}\cdot\mathbf{r}})u_{n\mathbf{k}}(\mathbf{r}) \end{equation}\]

where \(\mathbf{r}\) is a position vector, and \(\mathbf{k}\) is a so-called wave vector that will only have certain allowed values defined by the size of the unit cell. Bloch’s theorem sets the stage for using planewaves as a basis set, because it suggests a planewave character of the wave function. If the periodic function \(u_{n\mathbf{k}}(\mathbf{r})\) is also expanded in terms of planewaves determined by wave vectors of the reciprocal lattice vectors, \(\mathbf{G}\), then the wave function can be expressed completely in terms of a sum of planewaves cite:payne1992:iterat:

(2)#\[\begin{equation} \psi_i(\mathbf{r})=\sum_\mathbf{G} c_{i,\mathbf{k+G}} \exp(i\mathbf{(k+G)\cdot r}). \end{equation}\]

where \(c_{i,\mathbf{k+G}}\) are now coefficients that can be varied to determine the lowest energy solution. This also converts Eq. eqref:eq:KS from an integral equation to a set of algebraic equations that can readily be solved using matrix algebra.

In aperiodic systems, such as systems with even one defect, or randomly ordered alloys, there is no periodic unit cell. Instead one must represent the portion of the system of interest in a supercell, which is then subjected to the periodic boundary conditions so that a planewave basis set can be used. It then becomes necessary to ensure the supercell is large enough to avoid interactions between the defects in neighboring supercells. The case of the randomly ordered alloy is virtually hopeless as the energy of different configurations will fluctuate statistically about an average value. These systems were not considered in this work, and for more detailed discussions the reader is referred to Ref. citealp:makov1995:period-bound-condit. Once a supercell is chosen, however, Bloch’s theorem can be applied to the new artificially periodic system.

To get a perfect expansion, one needs an infinite number of planewaves. Luckily, the coefficients of the planewaves must go to zero for high energy planewaves, otherwise the energy of the wave function would go to infinity. This provides justification for truncating the planewave basis set above a cutoff energy. Careful testing of the effect of the cutoff energy on the total energy can be done to determine a suitable cutoff energy. The cutoff energy required to obtain a particular convergence precision is also element dependent, shown in Table ref:tab:pwcut. It can also vary with the “softness” of the pseudopotential. Thus, careful testing should be done to ensure the desired level of convergence of properties in different systems. Table ref:tab:pwcut refers to convergence of total energies. These energies are rarely considered directly, it is usually differences in energy that are important. These tend to converge with the planewave cutoff energy much more quickly than total energies, due to cancellations of convergence errors. In this work, 350 eV was found to be suitable for the H adsorption calculations, but a cutoff energy of 450 eV was required for O adsorption calculations.

Precision	Low	High
Mo	168	293
O	300	520
O_sv	1066	1847

Bloch’s theorem eliminates the need to calculate an infinite number of wave functions, because there are only a finite number of electrons in the unit (super) cell. However, there are still an infinite number of discrete k points that must be considered, and the energy of the unit cell is calculated as an integral over these points. It turns out that wave functions at k points that are close together are similar, thus an interpolation scheme can be used with a finite number of k points. This also converts the integral used to determine the energy into a sum over the k points, which are suitably weighted to account for the finite number of them. There will be errors in the total energy associated with the finite number of k, but these can be reduced and tested for convergence by using higher k-point densities. An excellent discussion of this for aperiodic systems can be found in Ref. citealp:makov1995:period-bound-condit.

The most common schemes for generating k points are the Chadi-Cohen scheme cite:PhysRevB.8.5747, and the Monkhorst-Pack scheme cite:PhysRevB.13.5188. The use of these k point setups amounts to an expansion of the periodic function in reciprocal space, which allows a straight-forward interpolation of the function between the points that is more accurate than with other k point generation schemes cite:PhysRevB.13.5188.

Pseudopotentials#

The core electrons of an atom are computationally expensive with planewave basis sets because they are highly localized. This means that a very large number of planewaves are required to expand their wave functions. Furthermore, the contributions of the core electrons to bonding compared to those of the valence electrons is usually negligible. In fact, the primary role of the core electron wave functions is to ensure proper orthogonality between the valence electrons and core states. Consequently, it is desirable to replace the atomic potential due to the core electrons with a pseudopotential that has the same effect on the valence electrons cite:PhysRevB.43.1993. There are essentially two kinds of pseudopotentials, norm-conserving soft pseudopotentials cite:PhysRevB.43.1993 and Vanderbilt ultrasoft pseudopotentials cite:PhysRevB.41.7892. In either case, the pseudopotential function is generated from an all-electron calculation of an atom in some reference state. In norm-conserving pseudopotentials, the charge enclosed in the pseudopotential region is the same as that enclosed by the same space in an all-electron calculation. In ultrasoft pseudopotentials, this requirement is relaxed and charge augmentation functions are used to make up the difference. As its name implies, this allows a “softer” pseudopotential to be generated, which means fewer planewaves are required to expand it.

The pseudopotentials are not unique, and calculated properties depend on them. However, there are standard methods for ensuring the quality and transferability (to different chemical environments) of the pseudopotentials cite:PhysRevB.56.15629.

VASP provides a database of PAW potentials cite:PhysRevB.50.17953,PhysRevB.59.1758.

Fermi Temperature and band occupation numbers#

At absolute zero, the occupancies of the bands of a system are well-defined step functions; all bands up to the Fermi level are occupied, and all bands above the Fermi level are unoccupied. There is a particular difficulty in the calculation of the electronic structures of metals compared to semiconductors and molecules. In molecules and semiconductors, there is a clear energy gap between the occupied states and unoccupied states. Thus, the occupancies are insensitive to changes in the energy that occur during the self-consistency cycles. In metals, however, the density of states is continuous at the Fermi level, and there are typically a substantial number of states that are close in energy to the Fermi level. Consequently, small changes in the energy can dramatically change the occupation numbers, resulting in instabilities that make it difficult to converge to the occupation step function. A related problem is that the Brillouin zone integral (which in practice is performed as a sum over a finite number of k points) that defines the band energy converges very slowly with the number of k points due to the discontinuity in occupancies in a continuous distribution of states for metals cite:gillan1989:calcul,Kresse199615. The difficulty arises because the temperature in most DFT calculations is at absolute zero. At higher temperatures, the DOS is smeared across the Fermi level, resulting in a continuous occupation function over the distribution of states. A finite-temperature version of DFT was developed cite:PhysRev.137.A1441, which is the foundation on which one solution to this problem is based. In this solution, the step function is replaced by a smoothly varying function such as the Fermi-Dirac function at a small, but non-zero temperature cite:Kresse199615. The total energy is then extrapolated back to absolute zero.

Spin polarization and magnetism#

There are two final points that need to be discussed about these calculations, spin polarization and dipole corrections. Spin polarization is important for systems that contain net spin. For example, iron, cobalt and nickel are magnetic because they have more electrons with spin “up” than spin “down” (or vice versa). Spin polarization must also be considered in atoms and molecules with unpaired electrons, such as hydrogen and oxygen atoms, oxygen molecules and radicals. For example, there are two spin configurations for an oxygen molecule, the singlet state with no unpaired electrons, and the triplet state with two unpaired electrons. The oxygen triplet state is lower in energy than the oxygen singlet state, and thus it corresponds to the ground state for an oxygen atom. A classically known problem involving spin polarization is the dissociation of a hydrogen molecule. In this case, the molecule starts with no net spin, but it dissociates into two atoms, each of which has an unpaired electron. See section 5.3.5 in Reference citealp:koch2001 for more details on this.

In VASP, spin polarization is not considered by default; it must be turned on, and an initial guess for the magnetic moment of each atom in the unit cell must be provided (typically about one Bohr-magneton per unpaired electron). For Fe, Co, and Ni, the experimental values are 2.22, 1.72, and 0.61 Bohr-magnetons, respectively cite:kittel and are usually good initial guesses. See Reference citealp:PhysRevB.56.15629 for a very thorough discussion of the determination of the magnetic properties of these metals with DFT. For a hydrogen atom, an initial guess of 1.0 Bohr-magnetons (corresponding to one unpaired electron) is usually good. An oxygen atom has two unpaired electrons, thus an initial guess of 2.0 Bohr-magnetons should be used. The spin-polarized solution is sensitive to the initial guess, and typically converges to the closest solution. Thus, a magnetic initial guess usually must be provided to get a magnetic solution. Finally, unless an adsorbate is on a magnetic metal surface, spin polarization typically does not need to be considered, although the gas-phase reference state calculation may need to be done with spin-polarization.

The downside of including spin polarization is that it essentially doubles the calculation time.