Provide a html document using RMarkdown for these Exercises. Both the code and the explanations of the code will be evaluated. The html file should be sent back by email (tom.rohmer@inrae.fr) by February 16th. The work can be done in pairs (please specify the names in the html file.)

Exercise 1: Gradient descent algorithm

Questions: Explanations of your code is to provide, either directly in the code or outside the chunk.

For \(n=2000\), simulate a sample \(y_1,\ldots,y_n\) from a gamma distribution with shape parameter \(\alpha= 3\) and rate parameter \(\beta = 5\).
Implement a gradient descent algorithm to estimate the shape and rate parameters of the gamma distribution using MLE.

Note that:

the density of the Gamma distribution with shape parameter \(\alpha\) and rate parameter \(\beta\) is defined by \[ f(x;\alpha,\beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}exp(-\beta x),\quad x>0, \alpha>0, \beta>0, \]

where the Gamma function \(\Gamma\) is defined by

\[ \Gamma(z)=\int_0^\infty t^{z-1}e^{-t}dt. \] The derivatives of the log-likelihood function are

\[ \frac{\partial}{\partial \alpha} \log \mathcal L(y_1,\ldots,y_n, \alpha , \beta) = n\log(\beta) - n\psi(\alpha) + \sum_{i=1}^n\log(y_i) \] where \(\psi\) is the digamma function (use the digamma R function): \(\psi(\alpha)= \Gamma'(\alpha)/\Gamma(\alpha)\), and

\[ \frac{\partial}{\partial \beta} \log \mathcal L(y_1,\ldots,y_n, \alpha , \beta) = n\frac{\alpha}{\beta} - \sum_{i=1}^ny_i. \]

Determine the MLE using the \(fidistrplus\) R package and compare with your result. Again using the \(fidistrplus\) R package, draw the contour of the loglikelihood function and add the successive parameter values of the algorithm.

Exercise 2: Monte Carlo approximation

approximation of \(\pi\)

The objective of this TP is to approximate \(\pi\). Let recall that the equation of the circle is given by \(x^2+y^2=1\). Let sample random vectors \(U_1=(U_{1,1},U_{1,2})\) and \(U_2=(U_{2,1},U_{2,2})\) according to independent uniform variables on \([0,1]\). Then we look for the position of \(U_1\) and \(U_2\) in the unit square:

Here,

\(U_1=(U_{1,1},U_{1,2})=(0.27, 0.37)\) is under the curve of the quarter circle: \(U_{1,1}^2+U_{1,2}^2 = 0.27^2 + 0.37^2 = 0.21<1\).
\(U_2=(U_{2,1},U_{2,2})=(0.57, 0.91)\) is over the curve of the quarter circle: \(U_{2,1}^2+U_{2,2}^2 = 0.57^2+ 0.91^2 = 1.15 >1\).

The R code to produce the previous figure is the following:

set.seed(1)
plot(function(x) sqrt(1-x^2),ylab="",xlab="")
U1=runif(2,0,1)
U2=runif(2,0,1)
points(U1[1],U1[2],col="red",pch=3)
text(U1[1]+0.04,U1[2]+0.01,"U1",col="red")
segments(x0=0,y0=U1[2],x1=U1[1],y1=U1[2],lty=2,col="red")
segments(x=U1[1],y0=0,x1=U1[1],y1=U1[2],lty=2,col="red")

points(U2[1],U2[2],col="blue",pch=3)
text(U2[1]+0.04,U2[2]+0.01,"U2",col="blue")
segments(x0=0,y0=U2[2],x1=U2[1],y1=U2[2],lty=2,col="blue")
segments(x=U2[1],y0=0,x1=U2[1],y1=U2[2],lty=2,col="blue")

print(U1[1]^2 + U1[2]^2)
print(U2[1]^2 + U2[2]^2)

Question:

Explain briefly in the previous chunk, what do the different R code lines.

Remember that the area of the circle is \(\pi\) so the area of the quarter circle is \(\pi/4\). To estimate \(\pi/4\), the Monte Carlo procedure consists in simulating \(M\) independent random vectors \(U_m=(U_{m,1},U_{m,2})\), \(m=1,\ldots,M\), where each coordinate \(U_{m,1}\) and \(U_{m,2}\) are drawn from an uniform distribution on \([0,1]\), and evaluating the proportion of points \(U_m\) that lie under the curve of the quarter circle, that is to say, such that \(U^2_{m,1}+U^2_{m,2}<1\).

Questions: Explanations of your code is to provide, either directly in the code or outside the chunk

For \(M=10^4\), simulate \(M\) such random vectors
plot the points which are under the curve of the quarter circle in red and the points which are over the curve of the quarter circle in blue (following the next figure):

Then, \(\pi/4\) can be approximated by the average of points which are under the curve of the quarter circle (among the total number of point) \[ \pi/4 \approx \frac{\text{number of red points}}{\text{total number of points}}. \]

Evaluate \(\pi/4\) then \(\pi\)
Produce the following R graph to illustrate the almost surely convergence of the Monte Carlo estimator of \(\pi\):

Approximation of the area under the curve of the Gamma density

Questions: Explanations of your code is to provide, either directly in the code or outside the chunk.

In the the same way of exercise 2, propose a Monte Carlo(MC) algorithm to evaluate the area under the Gaussian density between \(a=-1\) and \(b=5\), that is to say the probability \(P(a\leq X\leq b) = P(X\leq b) - P(X\leq a)\) for \(X\) a random variable with standard Gaussian distribution.

Plot the density of the Gaussian distribution on the interval \((-10,10)\).
In the same graphic, add the vertical lines \(a=-1\) and \(b=5\).
Add the points of the MC algorithm which are under the Gaussian curve in red and the points of the MC algorithm which are above the Gaussian curve in blue (following the next figure):

Then probability \(P(a\leq X\leq b)\) can be approximated using

\[ P(a\leq X\leq b) \approx \frac{\text{number of red points}}{\text{total number of points}}\times(b-a) \]

Evaluate the value of the area under the Gaussian density between \(a=-1\) and \(b=5\) using the Monte Carlo algorithm and compare with the theoretical value obtained using \(pnorm\).
Illustrate the (almost surely) convergence of the Monte Carlo estimation of the area under the Gaussian curve.

R project, deterministic and stochastic algorithms

Exercise 1: Gradient descent algorithm

Exercise 2: Monte Carlo approximation

approximation of \(\pi\)

Approximation of the area under the curve of the Gamma density