# Exercize 5 **Numerically compute an integral value via Monte Carlo approaches** The integral to be evaluated is the following: $$ I = \int\limits_0^1 dx \, e(x) $$ whose exact value is 1.7182818285... The three most popular MC methods where applied: plain Monte Carlo, Miser and Vegas (besides this popularity fact, these three method were chosen for being the only ones implemented in the GSL library). ## Plain Monte Carlo When the integral $I$ over a volume $V$ of a function $f$ must be evaluated in a $n-$dimensional space, the simplest MC method approach is to sample $N$ points $x_i$ evenly distributed in $V$ and approx $I$ as: $$ I \sim I_N = \frac{V}{N} \sum_{i=1}^N f(x_i) = V \cdot \langle f \rangle $$ with $I_N \rightarrow I$ for $N \rightarrow + \infty$ for the law of large numbers. Hence, the variance can be merely extimated as: $$ \sigma^2 = \frac{1}{N-1} \sum_{i = 1}^N \left( f(x_i) - \langle f \rangle \right)^2 $$ thus, the error on $I_N$ decreases as $1/\sqrt{N}$. Unlike in deterministic methods, the estimate of the error is not a strict error bound: random sampling may not uncover all the important features of the integrand that can result in an underestimate of the error. Since the proximity of $I_N$ to $I$ is related to $N$, the accuracy of the method is determined by the number of function calls when implemented (proof in @tbl:MC). ----------------------------------------------------------------- 500'000 calls 5'000'000 calls 50'000'000 calls --------- ----------------- ------------------ ------------------ result 1.7166435813 1.7181231109 1.7183387184 $\sigma$ 0.0006955691 0.0002200309 0.0000695809 ----------------------------------------------------------------- Table: MC results and errors with different numbers of function calls. {#tbl:MC} ## Miser The MISER algorithm is based on recursive stratified sampling. On each recursion step the integral and the error are estimated using a plain Monte Carlo algorithm. If the error estimate is larger than the required accuracy, the integration volume is divided into sub-volumes and the procedure is recursively applied to sub-volumes. This technique aims to reduce the overall integration error by concentrating integration points in the regions of highest variance. The idea of stratified sampling begins with the observation that for two disjoint regions $a$ and $b$ with Monte Carlo estimates of the integral $E_a (f)$ and $E_b (f)$ and variances $\sigma_a^2 (f)$ and $\sigma_b^2 (f)$, the variance $V (f)$ of the combined estimate $E (f)$: $$ E (f)= \frac {1}{2} \left( E_a (f) + E_b (f) \right) $$ is given by, $$ V(f) = \frac{\sigma_a^2 (f)}{4 N_a} + \frac{\sigma_b^2 (f)}{4 N_b} $$ It can be shown that this variance is minimized by distributing the points such that, $$ \frac{N_a}{N_a + N_b} = \frac{\sigma_a}{\sigma_a + \sigma_b} \cdot \frac{N_a}{N_a + N_b} = \frac{\sigma_a}{\sigma_a + \sigma b} $$ Hence the smallest error estimate is obtained by allocating sample points in proportion to the standard deviation of the function in each sub-region. --- --------------------------------------------------------- calls plain MC Miser Vegas ------------ -------------- -------------- -------------- 500'000 1.7166435813 1.7182850738 1.7182818354 5'000'000 1.7181231109 1.7182819143 1.7182818289 50'000'000 1.7183387184 1.7182818221 1.7182818285 --------------------------------------------------------- Table: Results of the three methods. {#tbl:results} --------------------------------------------------------- calls plain MC Miser Vegas ------------ -------------- -------------- -------------- 500'000 0.0006955691 0.0000021829 0.0000000137 5'000'000 0.0002200309 0.0000001024 0.0000000004 50'000'000 0.0000695809 0.0000000049 0.0000000000 --------------------------------------------------------- Table: $\sigma$s of the three methods. {#tbl:sigmas}