In [1, p. 61, exercise 3.6] we are asked to assume that the variance is to be estimated as well as the mean under the conditions of [1, p. 60, exercise 3.4] (see also [2, solution of exercise 3.4]). We are asked to prove that, for the vector parameter \mathbf{\theta}=\left[\mu_x \; \sigma^2_x\right]^T, the Fisher information matrix is
\mathbf{I}_{\theta}=\left[\begin{array}{cc} \frac{N}{\sigma^2_x} & 0 \\ 0 & \frac{N}{2\sigma^4_x} \end{array}\right]

Furthermore, we are asked to find the Cramér-Rao (CR) bound and to determine whether the sample mean \hat{\mu}_x is efficient. If, additionally, the variance is estimated as
\hat{\sigma}^2_x=\frac{1}{N-1}\sum\limits_{n=0}^{N-1}(x[n]-\hat{\mu}_x)^2

then we are asked to determine whether this estimator is unbiased and efficient. Hint: We are instructed to use the result that
\frac{(N-1)\hat{\sigma}^2_x}{\sigma^2_x} \sim \chi^2_{N-1}


Solution: We have already obtained the joint pdf f(\mathbf{x};\mathbf{\theta}) of the N independent samples drawn from the normal distribution N(\mu_x,\sigma^2_x), and the natural logarithm of the joint pdf is given by [3, relation (3)]:
\ln{f(\mathbf{x};\mathbf{\theta})}=-N\ln(\sqrt{2\pi}\sigma_x)-\frac{1}{2}\sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma_x}\right)^2
=-\frac{N}{2}\ln(2\pi \sigma^{2}_x)-\frac{1}{2}\sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma_x}\right)^2

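As a quick numerical sanity check of this log-likelihood (a minimal Python sketch, not part of [1]; it assumes NumPy and SciPy are available and uses hypothetical parameter values), we can compare the closed form above with the sum of the per-sample normal log-densities:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
N, mu, sigma2 = 100, 1.5, 2.0           # hypothetical parameter values
x = rng.normal(mu, np.sqrt(sigma2), N)  # N independent N(mu, sigma2) samples

# Closed-form log-likelihood from the relation above.
loglik = -0.5 * N * np.log(2 * np.pi * sigma2) - 0.5 * np.sum((x - mu) ** 2) / sigma2

# Reference value: sum of the individual normal log-densities.
loglik_ref = norm.logpdf(x, loc=mu, scale=np.sqrt(sigma2)).sum()

print(np.isclose(loglik, loglik_ref))  # expected: True
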
From this relation we can find the gradient with respect to the vector parameter:
\nabla_{\mathbf{\theta}} \ln f(\mathbf{x};\mathbf{\theta})=\left[ \begin{array}{cc} \frac{\partial \ln f(\mathbf{x};\mathbf{\theta})}{\partial \mu_{x}} &   \frac{\partial \ln f(\mathbf{x};\mathbf{\theta})}{\partial \sigma^{2}_{x}}  \end{array} \right]^{T}
=\left[ \begin{array}{cc} \sum_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma^{2}_x}\right) &  -\frac{N}{2\sigma_{x}^{2}}+\frac{1}{2} \sum_{i=0}^{N-1}\frac{\left(x[i]-\mu_x\right)^2}{\sigma^{4}_x}\end{array} \right]^{T}

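The analytic gradient can likewise be checked against a central finite-difference approximation of the log-likelihood (again a sketch with hypothetical values, assuming NumPy is available):

import numpy as np

rng = np.random.default_rng(1)
N, mu, sigma2 = 50, 0.7, 1.3            # hypothetical parameter values
x = rng.normal(mu, np.sqrt(sigma2), N)

def loglik(mu_, s2_):
    # Log-likelihood of the N samples as a function of (mu, sigma^2).
    return -0.5 * N * np.log(2 * np.pi * s2_) - 0.5 * np.sum((x - mu_) ** 2) / s2_

# Analytic score vector from the gradient above.
score = np.array([
    np.sum(x - mu) / sigma2,
    -N / (2 * sigma2) + 0.5 * np.sum((x - mu) ** 2) / sigma2 ** 2,
])

# Central finite differences with respect to mu and sigma^2.
h = 1e-6
score_fd = np.array([
    (loglik(mu + h, sigma2) - loglik(mu - h, sigma2)) / (2 * h),
    (loglik(mu, sigma2 + h) - loglik(mu, sigma2 - h)) / (2 * h),
])

print(np.allclose(score, score_fd, rtol=1e-4))  # expected: True
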
Thus Fisher’s information matrix is given by [1, p. 47, (3.22)]:
\mathbf{I}_{\theta}=E\left\{ \nabla_{\mathbf{\theta}} \ln f(\mathbf{x};\mathbf{\theta}) \left(\nabla_{\mathbf{\theta}} \ln f(\mathbf{x};\mathbf{\theta})\right)^{T}\right\}

Considering that the samples x[i],\; i=0,\ldots,N-1, are independent, we find that the individual elements I_{ij} of the matrix are given by:
I_{11}=E\left\{ \left(\sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma^{2}_x}\right) \right)^{2} \right\}
=E\left\{ \sum\limits_{i=0}^{N-1}\frac{\left(x[i]-\mu_x\right)^{2}}{\sigma^{4}_x} +\sum\limits_{i=0}^{N-1}\sum\limits_{j=0,j\neq i}^{N-1}\frac{\left(x[i]-\mu_x\right)\left(x[j]-\mu_x\right)}{\sigma^{4}_x}   \right\}
=\sum\limits_{i=0}^{N-1}\frac{E\left\{ \left(x[i]-\mu_x\right)^{2}\right\}}{\sigma^{4}_x}
 +\sum\limits_{i=0}^{N-1}\sum\limits_{j=0,j\neq i}^{N-1}\frac{E\left\{\left(x[i]-\mu_x\right)\right\}E\left\{\left(x[j]-\mu_x\right)\right\}}{\sigma^{4}_x}
=\frac{N\sigma^{2}_x}{\sigma^{4}_x} +0
=\frac{N}{\sigma^{2}_x}  (1)
I_{12}=I_{21}=E\left\{  \left(\sum\limits_{i=0}^{N-1}\frac{x[i]-\mu_x}{\sigma^{2}_x} \right) \left(-\frac{N}{2\sigma_{x}^{2}}+\frac{1}{2} \sum\limits_{j=0}^{N-1}\frac{\left(x[j]-\mu_x\right)^2}{\sigma^{4}_x}\right) \right\}
=E\left\{ -\frac{N}{2 \sigma_{x}^{2} } \sum\limits_{i=0}^{N-1} \left(\frac{x[i]-\mu_x}{\sigma^{2}_x}\right)\right\}
 +E\left\{ \left(\sum\limits_{i=0}^{N-1}\frac{x[i]-\mu_x}{\sigma^{2}_x} \right) \left(\frac{1}{2}\sum\limits_{j=0}^{N-1}\frac{\left( x[j]-\mu_x \right)^2}{\sigma^{4}_x}\right) \right\}
=-\frac{N}{2 \sigma_{x}^{2} } \sum\limits_{i=0}^{N-1} \left(\frac{E\left\{x[i]\right\}-\mu_x}{\sigma^{2}_x}\right)
+\frac{1}{2}E\left\{ \sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3}\right\}
 + \frac{1}{2}E\left\{\sum\limits_{i=0}^{N-1} \sum\limits_{j=0,j\neq i}^{N-1}\frac{\left(x[j]-\mu_x\right)\left( x[i]-\mu_x \right)^2}{\sigma^{6}_x} \right\}
=0+\frac{1}{2} E\left\{ \sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3}\right\}
 +\frac{1}{2}\sum\limits_{i=0}^{N-1} \sum\limits_{j=0,j\neq i}^{N-1}\frac{ E\left\{\left(x[j]-\mu_x\right)\left( x[i]-\mu_x \right)^2\right\}}{\sigma^{6}_x}
=\frac{1}{2}E\left\{ \sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3}\right\}
 +\frac{1}{2}\sum\limits_{i=0}^{N-1} \sum\limits_{j=0,j\neq i}^{N-1}\frac{ E\left\{\left(x[j]-\mu_x\right)\right\}E\left\{\left( x[i]-\mu_x \right)^2\right\}}{\sigma^{6}_x}
=\frac{1}{2} E\left\{ \sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3}\right\} +0
=\frac{1}{2} \sum\limits_{i=0}^{N-1}E\left\{\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3} \right\} (2)

We note that h(x[i])=\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3} is an odd function about \mu_{x} (that is, h(\mu_{x}+x)=-h(\mu_{x}-x)), while the Gaussian density is an even function about \mu_{x}. Thus the mean of this function, an integral of an odd-symmetric integrand extending from -\infty to \infty, is equal to zero. This fact can be shown by the following approach:
E\left\{\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3} \right\}=\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{-\infty}^{+\infty}\left(\frac{x-\mu_x}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{x-\mu_{x}}{\sigma_{x}}\right)^{2}} dx
=\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{-\infty}^{\mu_{x}}\left(\frac{x-\mu_x}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{x-\mu_{x}}{\sigma_{x}}\right)^{2}} dx
 +\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{\mu_{x}}^{+\infty}\left(\frac{x-\mu_x}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{x-\mu_{x}}{\sigma_{x}}\right)^{2}} dx

Let u=x-\mu_{x} in both integrals:
E\left\{\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3} \right\} =\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{-\infty}^{0}\left(\frac{u}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{u}{\sigma_{x}}\right)^{2}} du
 +\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{0}^{+\infty}\left(\frac{u}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{u}{\sigma_{x}}\right)^{2}} du

If we set v=-u in the second integral (so that u^{3}\,du=v^{3}\,dv and the limits 0,+\infty map to 0,-\infty), we obtain:
E\left\{\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3} \right\} =\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{-\infty}^{0}\left(\frac{u}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{u}{\sigma_{x}}\right)^{2}} du
 +\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{0}^{-\infty}\left(\frac{v}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{v}{\sigma_{x}}\right)^{2}} dv
 =\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{-\infty}^{0}\left(\frac{u}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{u}{\sigma_{x}}\right)^{2}} du
 -\frac{1}{\sqrt{2\pi}\sigma_{x}}\int_{-\infty}^{0}\left(\frac{v}{\sigma^{2}_x} \right)^{3}e^{-\frac{1}{2}\left(\frac{v}{\sigma_{x}}\right)^{2}} dv

The two integrals cancel, and therefore
E\left\{\left(\frac{x[i]-\mu_x}{\sigma^{2}_x} \right)^{3} \right\}=0 (3)

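The vanishing odd moment (3) can also be confirmed symbolically; the following sketch (assuming SymPy is available) evaluates the defining integral directly:

import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)

# Gaussian pdf and the odd integrand ((x - mu)/sigma^2)^3 from (3).
pdf = sp.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sp.sqrt(2 * sp.pi) * sigma)
moment = sp.integrate(((x - mu) / sigma ** 2) ** 3 * pdf, (x, -sp.oo, sp.oo))

print(sp.simplify(moment))  # expected: 0
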
Using (2) in conjunction with (3) we obtain
I_{12}=I_{21}=0 . (4)

Finally, it remains to obtain the value of I_{22}:
I_{22}=E\left\{\left( -\frac{N}{2\sigma_{x}^{2}}+\frac{1}{2} \sum\limits_{i=0}^{N-1}\frac{\left(x[i]-\mu_x\right)^2}{\sigma^{4}_x}\right)^{2}\right\}
=\frac{1}{4\sigma_{x}^{4}} E\left\{\left( \sum\limits_{i=0}^{N-1}\left(\frac{x[i]-\mu_x}{\sigma_x}\right)^{2}  -N\right)^{2}\right\} (5)

We note that v_{i}=\frac{x[i]-\mu_x}{\sigma_x} in (5) is a standard (zero-mean, unit-variance) Gaussian random variable, and that the sum of squares of N such independent variables, \chi^{2}=\sum_{i=0}^{N-1}v_{i}^{2}, has a chi-square distribution [4, p. 682] with N degrees of freedom, with mean E\left\{\chi^{2}\right\}=N and variance E\left\{\left(\chi^{2}-N\right)^{2}\right\}=2N. Thus
I_{22}=\frac{N}{2\sigma_{x}^{4}} (6)

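The chi-square moments used here are easy to verify by simulation (a sketch with NumPy; the values of N and the number of trials are hypothetical):

import numpy as np

rng = np.random.default_rng(2)
N, trials = 10, 200_000

# Sum of squares of N standard normal variables, simulated many times.
chi2 = np.sum(rng.standard_normal((trials, N)) ** 2, axis=1)

print(chi2.mean())  # expected: close to N = 10
print(chi2.var())   # expected: close to 2N = 20
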
From (1), (4) and (6) we obtain finally that Fisher’s information matrix is equal to:
\mathbf{I}_{\theta}=\left[ \begin{array}{cc} \frac{N}{\sigma_{x}^{2}} & 0 \\ 0 &\frac{N}{2\sigma_{x}^{4}} \end{array} \right]. (7)

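As an overall check of (7), the expectation in [1, p. 47, (3.22)] can be approximated by Monte Carlo, averaging the outer product of simulated score vectors (a sketch assuming NumPy; the parameter values are hypothetical):

import numpy as np

rng = np.random.default_rng(3)
N, mu, sigma2, trials = 20, 0.0, 1.5, 200_000
x = rng.normal(mu, np.sqrt(sigma2), (trials, N))  # 'trials' records of N samples

# Score vector of each record, from the gradient derived above.
s1 = np.sum(x - mu, axis=1) / sigma2
s2 = -N / (2 * sigma2) + 0.5 * np.sum((x - mu) ** 2, axis=1) / sigma2 ** 2
scores = np.stack([s1, s2], axis=1)

# Monte Carlo estimate of E{ grad ln f (grad ln f)^T }.
I_hat = scores.T @ scores / trials

print(I_hat)                              # expected: approximately diagonal
print(N / sigma2, N / (2 * sigma2 ** 2))  # theoretical diagonal entries from (7)
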
We have shown in [2, solution of exercise 3.4, relation (5)] that the mean and the variance of the estimator \hat{\mu}_{x} are given by:
E\left\{\hat{\mu}_{x}\right\}=\mu_{x}
Var\left\{\hat{\mu}_{x}\right\} =E\left\{\left(\hat{\mu}_{x}- \mu_{x} \right)^{2}\right\}=\frac{\sigma^{2}_{x}}{N}

Since \hat{\mu}_{x} is unbiased and its variance \frac{\sigma^{2}_{x}}{N} equals the Cramér-Rao bound I_{11}^{-1}=\frac{\sigma^{2}_{x}}{N} obtained from (1), the sample mean is an efficient estimator.

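The efficiency of the sample mean can be illustrated numerically (a sketch with NumPy and hypothetical values):

import numpy as np

rng = np.random.default_rng(4)
N, mu, sigma2, trials = 25, 2.0, 1.2, 200_000

x = rng.normal(mu, np.sqrt(sigma2), (trials, N))
mu_hat = x.mean(axis=1)  # sample mean of each record

print(mu_hat.mean())  # expected: close to mu (unbiased)
print(mu_hat.var())   # expected: close to the CR bound sigma2 / N
print(sigma2 / N)
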
The mean and the variance of the estimator \hat{\sigma}^{2}_{x} are obtained as follows (repeatedly using the independence of the random variables x[n], x[k] for k\neq n):
E\left\{\hat{\sigma}^{2}_{x}\right\}=\frac{1}{N-1}\sum\limits_{n=0}^{N-1}E\left\{(x[n]-\hat{\mu}_{x})^{2}\right\}
=\frac{1}{N-1}\sum\limits_{n=0}^{N-1}E\left\{((x[n]-\mu_{x})- (\hat{\mu}_{x}-\mu_{x}))^{2}\right\}
=\frac{1}{N-1}\sum\limits_{n=0}^{N-1}E\left\{(x[n]-\mu_{x})^{2}\right\}+\frac{1}{N-1}\sum\limits_{n=0}^{N-1} E\left\{(\hat{\mu}_{x}-\mu_{x})^{2}\right\}
 -  \frac{2}{N-1}\sum\limits_{n=0}^{N-1}E\left\{(x[n]-\mu_{x})(\hat{\mu}_{x}-\mu_{x})\right\}
=\frac{N\sigma^{2}_{x}}{N-1} + \frac{\sigma^{2}_{x}}{N-1} -   \frac{2}{N-1}\sum\limits_{n=0}^{N-1}E\left\{(x[n]-\mu_{x})(\hat{\mu}_{x}-\mu_{x})\right\}
=\frac{(N+1)\sigma^{2}_{x}}{N-1} -  \frac{2}{N-1}\sum\limits_{n=0}^{N-1}E\left\{(x[n]-\mu_{x})(\hat{\mu}_{x}-\mu_{x})\right\}
=\frac{(N+1)\sigma^{2}_{x}}{N-1} -  \frac{2}{N-1}\sum\limits_{n=0}^{N-1}E\left\{(x[n]-\mu_{x})(\frac{1}{N}\sum\limits_{i=0}^{N-1}x[i]-\mu_{x})\right\}
=\frac{(N+1)\sigma^{2}_{x}}{N-1} -  \frac{2}{N-1}\sum\limits_{n=0}^{N-1}E\left\{\frac{1}{N}(x[n]-\mu_{x})(\sum\limits_{i=0}^{N-1}(x[i]-\mu_{x}))\right\}
=\frac{(N+1)\sigma^{2}_{x}}{N-1}
 - \frac{2}{N-1}\sum\limits_{n=0}^{N-1}\frac{1}{N}\left( E\left\{(x[n]-\mu_{x})^{2}\right\} +\sum\limits_{i=0,i\neq n}^{N-1}E\left\{(x[n]-\mu_{x})(x[i]-\mu_{x})\right\} \right)
=\frac{(N+1)\sigma^{2}_{x}}{N-1}
 - \frac{2}{N-1}\sum\limits_{n=0}^{N-1}\frac{1}{N}\left( \sigma_{x}^{2} +\sum\limits_{i=0,i\neq n}^{N-1}E\left\{x[n]-\mu_{x}\right\}E\left\{x[i]-\mu_{x}\right\} \right)
=\frac{(N+1)\sigma^{2}_{x}}{N-1} -  \frac{2}{N-1}\sum\limits_{n=0}^{N-1}\frac{1}{N}(\sigma_{x}^{2}+0)
=\frac{(N+1)\sigma^{2}_{x}}{N-1} -  \frac{2\sigma_{x}^{2}}{N-1}

From the previous relation we obtain finally:
E\left\{\hat{\sigma}^{2}_{x}\right\}=\sigma^{2}_{x}. (8)

Considering that \frac{(N-1)\hat{\sigma}^2_x}{\sigma^2_x} \sim \chi^2_{N-1} we can also obtain the variance of the estimator \hat{\sigma}_{x}^{2}:
Var\left\{\hat{\sigma}^{2}_{x}\right\}=Var\left\{\frac{\sigma_{x}^{2}}{N-1}\chi^2_{N-1}\right\}
=\frac{\sigma_{x}^{4}}{(N-1)^{2}} Var\left\{\chi^2_{N-1} \right \}
=\frac{\sigma_{x}^{4}}{(N-1)^{2}} 2 (N-1)
=\frac{2 \sigma_{x}^{4}}{N-1}. (9)

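Relations (8) and (9), and the comparison with the Cramér-Rao bound discussed next, can be checked by simulation (a sketch with NumPy and hypothetical values):

import numpy as np

rng = np.random.default_rng(5)
N, mu, sigma2, trials = 25, 0.0, 2.0, 200_000

x = rng.normal(mu, np.sqrt(sigma2), (trials, N))
var_hat = x.var(axis=1, ddof=1)  # the 1/(N-1) estimator around the sample mean

print(var_hat.mean())             # expected: close to sigma2, confirming (8)
print(var_hat.var())              # expected: close to 2*sigma2**2/(N-1), confirming (9)
print(2 * sigma2 ** 2 / (N - 1))  # estimator variance from (9)
print(2 * sigma2 ** 2 / N)        # CR bound 1/I_22, strictly smaller
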
Because the Cramér-Rao bound for the estimator \hat{\sigma}^{2}_{x} is given by the corresponding diagonal element of the inverse of Fisher's information matrix, that is, by I_{22}^{-1} from (6), we obtain, using (9):
Var\left\{\hat{\sigma}^{2}_{x}\right\} \geq I_{22}^{-1}
\frac{2 \sigma_{x}^{4}}{N-1} > \frac{2\sigma_{x}^{4}}{N} (10)

Thus the estimator \hat{\sigma}^{2}_{x} is unbiased because of (8), but it is not efficient, because by the strict inequality in (10) its variance does not attain the Cramér-Rao bound. QED.

[1] Steven M. Kay: “Modern Spectral Estimation – Theory and Applications”, Prentice Hall, ISBN: 0-13-598582-X.
[2] Chatzichrisafis: “Solution of exercise 3.4 from Kay’s Modern Spectral Estimation - Theory and Applications”.
[3] Chatzichrisafis: “Solution of exercise 3.5 from Kay’s Modern Spectral Estimation - Theory and Applications”.
[4] Granino A. Korn and Theresa M. Korn: “Mathematical Handbook for Scientists and Engineers”, Dover, ISBN: 978-0-486-41147-7.