Paper 4, Section II, 28K

Principles of Statistics
Part II, 2018

Let $g: \mathbb{R} \rightarrow \mathbb{R}$ be an unknown function, twice continuously differentiable with $\left|g''(x)\right| \leqslant M$ for all $x \in \mathbb{R}$. For some $x_0 \in \mathbb{R}$, we know the value $g(x_0)$ and we wish to estimate its derivative $g'(x_0)$. To do so, we have access to a pseudo-random number generator that gives $U_1^*, \ldots, U_N^*$ i.i.d. uniform over $[0,1]$, and a machine that takes input $x_1, \ldots, x_N \in \mathbb{R}$ and returns $g(x_i) + \varepsilon_i$, where the $\varepsilon_i$ are i.i.d. $\mathcal{N}(0, \sigma^2)$.

(a) Explain how this setup allows us to generate $N$ independent $X_i = x_0 + hZ_i$, where the $Z_i$ take value $1$ or $-1$ with probability $1/2$, for any $h > 0$.
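One standard construction, given here as a sketch rather than a full justification: set $Z_i = \mathbf{1}\{U_i^* > 1/2\} - \mathbf{1}\{U_i^* \leqslant 1/2\}$ and $X_i = x_0 + hZ_i$. Each $Z_i$ then takes the values $1$ and $-1$ with probability $1/2$ each, and the independence of the $U_i^*$ carries over to the $X_i$.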

(b) We denote by $Y_i$ the output $g(X_i) + \varepsilon_i$. Show that, for some independent $\xi_i \in \mathbb{R}$,

$$Y_i - g(x_0) = hZ_i\, g'(x_0) + \frac{h^2}{2} g''(\xi_i) + \varepsilon_i$$
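A sketch of the key step: Taylor's theorem with Lagrange remainder gives, for some $\xi_i$ between $x_0$ and $X_i$,

$$g(X_i) = g(x_0) + hZ_i\, g'(x_0) + \frac{h^2 Z_i^2}{2} g''(\xi_i),$$

and since $Z_i^2 = 1$, adding the noise $\varepsilon_i$ to both sides yields the display above.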

(c) Using the intuition given by the least-squares estimator, justify the use of the estimator $\hat{g}_N$ given by

$$\hat{g}_N = \frac{1}{N} \sum_{i=1}^{N} \frac{Z_i\left(Y_i - g(x_0)\right)}{h}$$
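For orientation, a sketch of that intuition: part (b) can be read as a linear model $Y_i - g(x_0) = \theta \cdot hZ_i + \text{error}$ with slope $\theta = g'(x_0)$, whose least-squares estimator is

$$\hat{\theta} = \frac{\sum_{i=1}^{N} hZ_i\,(Y_i - g(x_0))}{\sum_{i=1}^{N} h^2 Z_i^2} = \frac{1}{N} \sum_{i=1}^{N} \frac{Z_i(Y_i - g(x_0))}{h},$$

using $Z_i^2 = 1$; this is exactly $\hat{g}_N$.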

(d) Show that

$$\mathbb{E}\left[\left|\hat{g}_N - g'(x_0)\right|^2\right] \leqslant \frac{h^2 M^2}{4} + \frac{\sigma^2}{N h^2}.$$
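One route to this bound, sketched: by part (b),

$$\hat{g}_N - g'(x_0) = \underbrace{\frac{1}{N}\sum_{i=1}^{N} \frac{h}{2}\, Z_i\, g''(\xi_i)}_{A} + \underbrace{\frac{1}{N}\sum_{i=1}^{N} \frac{Z_i \varepsilon_i}{h}}_{B},$$

where $|A| \leqslant hM/2$ pointwise, $\mathbb{E}[B^2] = \sigma^2/(Nh^2)$, and $\mathbb{E}[AB] = 0$ since the $\varepsilon_i$ are centred and independent of the $Z_i$ (and hence of the $\xi_i$). Expanding $\mathbb{E}[(A+B)^2]$ gives the stated bound.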

Show that for some choice $h_N$ of the parameter $h$, this implies

$$\mathbb{E}\left[\left|\hat{g}_N - g'(x_0)\right|^2\right] \leqslant \frac{\sigma M}{\sqrt{N}}.$$
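One choice that achieves this, as a sketch: the bound $h^2 M^2/4 + \sigma^2/(Nh^2)$ is minimised where its two terms balance, at

$$h_N = \left(\frac{4\sigma^2}{N M^2}\right)^{1/4}, \qquad \text{so that} \qquad h_N^2 = \frac{2\sigma}{M\sqrt{N}},$$

and substituting back makes each term equal to $\sigma M/(2\sqrt{N})$, giving the bound $\sigma M/\sqrt{N}$.

The full procedure is also easy to check numerically. The following is a minimal Python sketch under illustrative assumptions: the function $g$, the noise level $\sigma$, and the `machine` below are stand-ins invented for this example, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)

sigma = 0.5          # noise standard deviation (illustrative)
M = 1.0              # with g = sin, |g''| = |sin| <= 1
x0 = 0.3             # point at which we estimate g'(x0)
N = 10_000           # number of queries to the machine

def g(x):
    # Hypothetical unknown function; here g'(x) = cos(x).
    return np.sin(x)

def machine(x):
    # The noisy oracle: returns g(x_i) + eps_i, eps_i i.i.d. N(0, sigma^2).
    return g(x) + rng.normal(0.0, sigma, size=x.shape)

# Part (a): turn uniforms into Rademacher signs Z_i, then set X_i = x0 + h Z_i.
h = (4 * sigma**2 / (N * M**2)) ** 0.25   # the choice h_N from part (d)
U = rng.uniform(size=N)                   # stands in for U_1^*, ..., U_N^*
Z = np.where(U > 0.5, 1.0, -1.0)
X = x0 + h * Z

# Parts (b)-(c): query the machine and form the estimator g_hat_N.
Y = machine(X)
g_hat = np.mean(Z * (Y - g(x0)) / h)

print(f"estimate: {g_hat:.4f}   true g'(x0): {np.cos(x0):.4f}")
print(f"part (d) bound on RMSE: {np.sqrt(sigma * M / np.sqrt(N)):.4f}")
```

In this setup the root-mean-square error bound is $(\sigma M)^{1/2} N^{-1/4} \approx 0.07$, and the estimate typically lands within that distance of $\cos(0.3) \approx 0.955$.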