7.5 The Extended Koopmans' theorem

The calculation of excited state energies in solids using QMC methods is a fairly new area of research. The method described here obtains a large number of excitation energies from a single QMC calculation involving averages over the ground state wave function.

The Extended Koopmans' Theorem (EKT) was derived apparently independently by Smith, Day and Garrod[98] and by Morrell, Parr and Levy.[99] The EKT is closely related to the earlier work of Feynman[131] on calculating excitation energies in the superfluid state of He

, although the quantum chemists appear to have developed the theory independently. The EKT has been shown to give very good excitation energies for simple molecular systems,[132] and has been applied to atomic and diatomic systems.[133,134] It appears particularly well suited to QMC calculations in which explicitly correlated many-body wave functions are used. The derivation is reviewed following Ref. [98], and a variation suitable for QMC is formulated.

7.5.1 Valence band energies

In this method the band energies are calculated as ionisation energies. The starting point is an approximation to the normalized ground state wave function, $\psi^N$ . The wave function for the

-1 electron system is approximated by the Ansatz of eliminating an orbital from $\psi^N$ :

The valence orbital to be eliminated,

, will be determined variationally. This Ansatz is reminiscent of a quasiparticle wave function for the state, although the formulation is for

-1 particle eigenstates of the Hamiltonian. Expressing equation 7.9 in second quantization yields

The ionisation energy is given by the difference in the expectation values of the Hamiltonian calculated with the

and

-1 electron wave functions:[99,133]

The denominator in equation 7.12 is the one-body density matrix. Solution of this equation is facilitated by expanding in a set of orbitals, $\{\phi_i\}$ , so that $u_v({\bf r})= \sum c_{iv} \phi_i({\bf r})$ , and $\hat{\mathcal{O}}_v=\sum c_{iv} \hat{a}_i$ , where $\hat{a}_i$ is the destruction operator for $\phi_i$ . The condition for a stationary value of $\epsilon_v$ generates a secular equation

The matrix ${\bf S}^v$ is the one-body density matrix, and the elements of ${\bf V}^v$ are

$\hat{H}_1$ consists of the terms in the

-electron Hamiltonian involving coordinate ${\bf r}_1$ , so that

where $\hat{h}$ consists of the one-body kinetic energy operator and ionic potential, including both local and non-local pseudopotential components, and

is the electron-electron interaction potential.

If a HF wave function is used for $\psi^N$ and the density matrix is expanded in a basis set of HF orbitals, then $V^v_{ij}$ reduces to a matrix with the HF

-particle eigenvalues on the occupied part of the diagonal, and zeros everywhere else. The resulting excitation energies are those given by the well-known Koopmans' theorem.[112] The contents of the EKT method are now reasonably clear. The method consists of a quasiparticle-like Ansatz for the wave function of the

-1 particle system, which is used to calculate the ionisation energies of the system. Electron correlations are included, but no allowance is made for relaxation of the other orbitals in the presence of the excitation. Although this relaxation can be important in small systems, it is expected to be much less important for excitations in extended systems such as the silicon crystal studied here.

7.5.2 Conduction band energies

An analogous theory exists for the conduction band energies. The wave function for the

+1 electron system is approximated by the Ansatz of adding an orbital to $\psi^N$ :

where $\hat{\mathcal{A}}$ is the antisymmetriser and the orbital

is to be determined variationally. In second quantization we have

Expanding in a set of orbitals gives $u_c({\bf r})= \sum c_{ic} \phi_i({\bf r})$ and $\hat{\mathcal{O}}^\dagger_c=\sum c_{ic} \hat{a}^\dagger_i$ . The coefficients $c_{ic}$ are the solutions of the secular equation

where the matrix elements are $V^c_{ij}=\langle \psi^N\vert\hat{a}_i[\hat H,\hat{a}^\dagger_j]\vert\psi^N\rangle$ and $S^c_{ij}=\langle \psi^N\vert\hat{a}_i\hat{a}^\dagger_j\vert\psi^N \rangle = \delta_{ij}-\rho_{ij}$ .

It is instructive to introduce a new potential with matrix elements $V_{ij}=V^v_{ij}+V^c_{ij}$ :

$\displaystyle V_{ij}$	$\textstyle =\int$	$\displaystyle \phi_i({\bf r}_0)\hat{h}_0 \phi^*_j({\bf r}_0)d{\bf r}_0$
	$\textstyle +N\int$	$\displaystyle \phi_i({\bf r}_0)\phi^*_j({\bf r}_0)v({\bf r}_0,{\bf r}_1)$
		$\displaystyle \times \vert\psi({\bf r}_1,\dots,{\bf r}_N)\vert^2d{\bf r}_0 d{\bf r}_1 \ldots d{\bf r}_N$
	$\textstyle -N\int$	$\displaystyle \phi_i({\bf r}_0)\phi^*_j({\bf r}_1)v({\bf r}_0,{\bf r}_1) \psi({\bf r}_0,{\bf r}_2,\dots,{\bf r}_N)$
		$\displaystyle \times \psi^*({\bf r}_1,\dots,{\bf r}_N) d{\bf r}_0 d{\bf r}_1 \ldots d{\bf r}_N \,.$	(7.21)

If an HF wave function a basis of HF orbitals is used, $V_{ij}$ reduces to a matrix with the HF

-particle eigenvalues on the diagonal, and zeros everywhere else. In this case one can readily identify the second and third terms in equation 7.21 as, respectively, the Hartree and exchange terms. (Similarly $V^c_{ij}$ reduces to the HF energy eigenvalues on the unoccupied part of the diagonal.)

7.5.3 VMC formulation of the EKT

The matrix elements of $V^v_{ij}$ and $V_{ij}$ were accumulated, subsequently forming the matrix $V^c_{ij} = V_{ij} - V^v_{ij}$ . The matrix elements, $V^v_{ij}$ , are given by

$\displaystyle {V^v_{ij}=N\int \phi_i({\bf r}_1)\phi^*_j({\bf r}^\prime) \frac{\hat{H}_1 \psi({\bf r}_1,\dots,{\bf r}_N)} {\psi({\bf r}_1,\dots,{\bf r}_N)} }$
		$\displaystyle \times \frac{\psi( {\bf r}^\prime,{\bf r}_2,\ldots,{\bf r}_{N})} ... ...\bf r}_1,\dots,{\bf r}_N)\vert^2 d{\bf r}^\prime d{\bf r}_1\ldots d{\bf r}_N\,.$	(7.22)

$\displaystyle V^v_{ij}= \left<\,\sum_{n=1}^N \int \phi^*_i({\bf r}_n)\phi_j({\b... ...t{H}_n \psi({\bf r}_1,\dots,{\bf r}_N)}{\psi({\bf r}_1,\dots,{\bf r}_N)}\right.$
$\displaystyle \left. \times \frac{\psi(\ldots,{\bf r}^\prime, \ldots)}{\psi( \ldots,{\bf r}_n,\ldots)} d{\bf r}^\prime \right>_{\vert\psi\vert^2},$			(7.23)

so that

values are accumulated at each step. The terms contributing to $\hat{H}_n\psi/\psi$ are already available in a VMC calculation, allowing $V^v_{ij}$ to be accumulated with virtually no additional cost beyond that required for the density matrix. An analogous VMC formulation for calculating $V_{ij}$ exists. The single particle terms, $\hat{h}_0$ , appearing in the first term in equation 7.21 were evaluated directly without use of Monte Carlo integration. Using Monte Carlo integration for all the terms resulted in a small increase in the variance of the matrix elements. It is therefore preferable, but less elegant, to calculate the $\hat{h}_0$ terms directly. The matrix elements $V^v_{ij}$ and $V_{ij}$ were accumulated at the same time as the elements $\rho_{ij}$ . The full crystal symmetry was again used, which reduces the number of matrix elements which must be accumulated and ensures the correct symmetry in the presence of statistical noise.

7.5.4 Results

7.5.4.1 Full EKT

The correlation lengths along the VMC walks for $V^v_{ij}$ and $V_{ij}$ were found to be similar to those of the local energy and density matrix, and the distribution of statistical errors was similar to that for the density matrix. The elements of $V^v_{ij}$ and $V^c_{ij}$ with the largest statistical errors were the diagonal elements. The statistical error bars on these elements are estimated to be $\pm 0.2$ eV.

The matrix equations 7.13 and 7.20 were diagonalised using a generalized eigenvalue solver. The ratios of the statistical error bars to the mean values were significantly larger for the diagonal elements of $V^v_{ij}$ and $V^c_{ij}$ than for $\rho_{ij}$ . Just as for the density matrix, small perturbations were applied to estimate the statistical errors in the eigenvalues and eigenvectors. The numerical stability of the diagonalisation was improved by explicitly zeroing elements of $V^v_{ij}$ and $V^c_{ij}$ within statistical error of zero. The accuracy of the eigenvalues was further verified by gradually increasing the number of bands in the diagonalisation procedure. The valence and low lying conduction band energies were stable to within $\pm 0.4$ eV.

The resulting band energies are shown in figure 7.3 and given in table 7.2, with the energy at the top of the valence band set to zero. Of the k-points computed here, the available experimental data (Exp) is limited to the $\Gamma$ -point. Because of this we also give empirical pseudopotential (Emp) data [135] in table 7.2 and in figure 7.3. The empirical pseudopotential data should provide a good interpolation between this data. Results for the Ewald and MPC interactions are in good agreement, indicating that the finite size errors are not significant at the level of statistical accuracy obtained here. The energies are in good qualitative agreement with the empirical data at all k-points except for the upper $\Sigma_1$ valence band state, and the $\Delta_{2^\prime}$ conduction band state. The source of the error is principally the value of the diagonal matrix elements for these states, $V^v_{ii}$ and $V^c_{ii}$ (see next section).

**Figure 7.3:** The band structure of silicon, as computed within the VMC Extended Koopmans' Theorem using the MPC interaction ( $\Box$ ) and the Ewald interaction ( $\circ$ ), and within the LDA ( $\times$ ). The results for the MPC interaction are shown with statistical error bars. The statistical error bars for the Ewald results have been omitted for clarity, but they are the same size as for the MPC interaction. As a guide to the eye, the empirical pseudopotential data of [135] is shown (solid lines), which is in good agreement with the available experimental data.
$\includegraphics [width=10cm]{Figures/ekt_fig3.eps}$

Table 7.2: Band energies of silicon in eV. a,b- VMC-EKT energies using the MPC and Ewald interactions, respectively. The statistical error bars are $\pm 0.4$ eV. c,d- Diagonal approximation to the VMC-EKT energies using the MPC and Ewald interactions, respectively. The statistical error bars are $\pm 0.2$ eV. e- Direct DMC calculations, with statistical error bars of $\pm 0.2$ eV, from Ref. [103]. f- Ref. [105]. g- This work. h- Ref. [136]. i- Ref. [135] j- From the compilation given in Ref. [136].

Band	VMC				DMC	HF	LDA		Emp	Exp
	EKT	EKT	DEKT	DEKT
$\Gamma_{2^\prime}$	4.4	4.5	5.1	5.2	4.6	9.0	3.22	3.89	4.1	4.23,4.1
$\Gamma_{15}$	3.8	3.9	4.4	4.4	3.7	8.0	2.51	3.36	3.4	3.40,3.05
$\Gamma_{25^\prime}$	0.0	0.0	0.0	0.0	0.0	0.0	0.00	0.00	0.0	0.00
$\Gamma_{1}$	-12.9	-13.1	-13.3	-13.4	-13.6	-18.9	-11.98	-11.95	-12.36	-12.5 $\pm$ 0.6
$\Lambda_3$	4.0	4.7	5.4	5.7		9.0	3.46		4.5
$\Lambda_1$	3.0	3.1	3.6	3.6		6.8	1.68		2.5
$\Lambda_{3}$	-0.6	-1.2	-1.3	-1.5		-1.7	-1.01		-0.9
$\Lambda_1$	-5.0	-5.0	-5.4	-5.4		-6.3	-5.38		-5.2
$\Lambda_1$	-11.6	-11.9	-12.3	-12.5		-16.4	-10.60		-10.6
$\Delta_{2^\prime}$	4.7	4.8	5.7	6.0		7.2	1.88		2.4
$\Delta_{1}$	2.1	2.3	2.7	3.0		5.5	0.61		1.3
$\Delta_5$	-1.7	-1.8	-2.4	-2.5		-3.8	-2.45		-2.5
$\Delta_{2^\prime}$	-4.9	-4.8	-5.5	-5.7		-7.8	-5.01		-5.0
$\Delta_{1}$	-11.6	-12.0	-11.7	-12.2		-16.0	-10.08		-10.2
$\Sigma_1$	6.4	5.4	6.7	6.1			5.73		6.0
$\Sigma_3$	3.4	3.1	4.1	4.0			1.45		2.0
$\Sigma_2$	-1.1	-1.6	-1.2	-2.0			-2.14		-2.1
$\Sigma_1$	-1.8	-2.5	-1.9	-2.5			-4.50		-4.6
$\Sigma_3$	-6.1	-6.3	-6.4	-6.8			-6.79		-6.7
$\Sigma_1$	-8.9	-9.0	-9.5	-9.8			-8.70		-8.7

The EKT band energies at the $\Gamma$ -point are in good agreement with the available experimental data, and are also in good agreement with the DMC data from direct calculations of the excitation energies.[103] The EKT valence band width of

eV is smaller than the value of

eV obtained from the DMC calculations and is in good agreement with the experimental value of

eV. In comparison the HF data shows the well known overestimation of band gaps and band widths which is due to the neglect of correlation energy, the LDA gives excellent valence band energies while the conduction bands are too low in energy by 0.7-1.0 eV, and the

data is in very good agreement with experiment.

The EKT is a formulation for the eigenstates of the

-1 and

+1 electron systems, while the direct method is aimed at calculating the eigenstates of the

electron system. The EKT and direct results should therefore differ by the exciton binding energy, but this energy is small for silicon and cannot be resolved at the level of statistical accuracy obtained. Depending on the application one would like to be able to choose whether to include excitonic effects, so that it is advantageous to have the different QMC techniques available. Clearly further refinement of excited state QMC methods is required, but the results from the EKT and direct approaches are promising for the study of more strongly correlated systems, for which the LDA and

methods are less reliable.

7.5.5 Approximations to the EKT

If the off-diagonal elements of $V^v_{ij}$ , $V^c_{ij}$ , and $\rho_{ij}$ are neglected then the valence and conduction band energies can be approximated by

where the superscript DEKT denotes the diagonal approximation to the EKT. This approximation has been used within VMC and without reference to the EKT by Fahy et al. [121] to calculate quasihole energies in silicon and also by Tanaka[122] to calculate the quasihole energies in NiO. This approximation has the computational advantage that far fewer matrix elements are required, and also that the problems of statistical noise are reduced, because the values of only two matrix elements enter the calculation of each band energy. However the results are basis set dependent, and could differ significantly between, for example, a HF and LDA basis.

If an HF wave function and basis set of HF orbitals is used then the matrices $V^v_{ij}$ , $V^c_{ij}$ , and $\rho_{ij}$ are all diagonal, and consequently the full EKT and the DEKT are equivalent. In general, for correlated wave functions, the DEKT gives neither an upper nor lower bound to the energy obtained from the full EKT. Comparison with the data from the DEKT is made in table 7.2. The DEKT values are close to the full EKT values, including the two cases mentioned above where the agreement with experiment is poor. The DEKT works quite well for the valence bands and slightly less well for the conduction bands, because the off-diagonal matrix elements of $V^c_{ij}$ coupling the unoccupied states are more significant. In similar VMC calculations for silicon using the DEKT and a smaller simulation cell of 64 electrons, Fahy et al. [121] calculated a valence band width of 14.5(4) eV, which is larger than the value of 13.3(2) eV and the experimental value of 12.5(6) eV. Fahy et al. [121] also obtained occupation numbers of 1.96(4) and 1.92(3) for the $\Gamma_1$ and $\Gamma_{25^\prime}$ valence band states respectively, compared with the values of 1.9817(2) and 1.9375(5) obtained in the larger cell of this study.

The scaling with system size of the diagonal approximation is very advantageous compared with direct evaluation of excitation energies. In direct methods, the fractional energy change due to the promotion of an electron is considered. This energy change is inversely proportional to the number of electrons in the system, i.e., a ` $\frac{1}{N}$ ' effect. The precision of the calculation must therefore be sufficient to resolve the energy change amid the statistical noise. This requirement is very challenging for small band gap materials, as the system must also be sufficiently large to approximate the infinite solid. In the DEKT, the band energies are computed directly as averages over the square of the ground state wave function, rather than as the difference between averages over the squares of the ground and excited state wave functions. This greatly improves the sampling statistics. We have found that the errors in $V^v_{ii}$ , $V^c_{ii}$ and $\rho_{ii}$ scale as the inverse of the square root of the number of independent electron configurations, independently of the system size. For sufficiently large systems, it therefore requires fewer statistically independent samples in the DEKT to obtain excitation energies to a given statistical accuracy than with direct methods. The scaling behavior of the full EKT is more difficult to analyze because we are required to diagonalise noisy matrix equations, where the number of off-diagonal matrix elements increases as the square of the system size, and where there are certainly statistical correlations between matrix elements.

$\displaystyle V^v_{ij}$	$\textstyle =$	$\displaystyle N \int \phi_i({\bf r}_1)\phi^_j({\bf r}^\prime) \psi^({\bf r}_1,\dots,{\bf r}_N)$
		$\displaystyle \times \hat{H}_1 \psi({\bf r}^\prime,{\bf r}_2,\dots,{\bf r}_N)d{\bf r}^\prime d{\bf r}_1\ldots d{\bf r}_N\,.$	(7.15)