analistica/notes/sections/5.md
2020-03-06 02:24:32 +01:00

114 lines
4.0 KiB
Markdown

# Exercize 5
**Numerically compute an integral value via Monte Carlo approaches**
The integral to be evaluated is the following:
$$
I = \int\limits_0^1 dx \, e(x)
$$
whose exact value is 1.7182818285...
The three most popular MC methods where applied: plain Monte Carlo, Miser and
Vegas (besides this popularity fact, these three method were chosen for being
the only ones implemented in the GSL library).
## Plain Monte Carlo
When the integral $I$ over a volume $V$ of a function $f$ must be evaluated in
a $n-$dimensional space, the simplest MC method approach is to sample $N$
points $x_i$ evenly distributed in $V$ and approx $I$ as:
$$
I \sim I_N = \frac{V}{N} \sum_{i=1}^N f(x_i) = V \cdot \langle f \rangle
$$
with $I_N \rightarrow I$ for $N \rightarrow + \infty$ for the law of large
numbers. Hence, the variance can be merely extimated as:
$$
\sigma^2 =
\frac{1}{N-1} \sum_{i = 1}^N \left( f(x_i) - \langle f \rangle \right)^2
$$
thus, the error on $I_N$ decreases as $1/\sqrt{N}$. Unlike in deterministic
methods, the estimate of the error is not a strict error bound: random sampling
may not uncover all the important features of the integrand that can result in
an underestimate of the error.
Since the proximity of $I_N$ to $I$ is related to $N$, the accuracy of the
method is determined by the number of function calls when implemented (proof
in @tbl:MC).
-----------------------------------------------------------------
500'000 calls 5'000'000 calls 50'000'000 calls
--------- ----------------- ------------------ ------------------
result 1.7166435813 1.7181231109 1.7183387184
$\sigma$ 0.0006955691 0.0002200309 0.0000695809
-----------------------------------------------------------------
Table: MC results and errors with different numbers of function
calls. {#tbl:MC}
## Miser
The MISER algorithm is based on recursive stratified sampling.
On each recursion step the integral and the error are estimated using a plain
Monte Carlo algorithm. If the error estimate is larger than the required
accuracy, the integration volume is divided into sub-volumes and the procedure
is recursively applied to sub-volumes.
This technique aims to reduce the overall integration error by concentrating
integration points in the regions of highest variance.
The idea of stratified sampling begins with the observation that for two
disjoint regions $a$ and $b$ with Monte Carlo estimates of the integral
$E_a (f)$ and $E_b (f)$ and variances $\sigma_a^2 (f)$ and $\sigma_b^2 (f)$,
the variance $V (f)$ of the combined estimate $E (f)$:
$$
E (f)= \frac {1}{2} \left( E_a (f) + E_b (f) \right)
$$
is given by,
$$
V(f) = \frac{\sigma_a^2 (f)}{4 N_a} + \frac{\sigma_b^2 (f)}{4 N_b}
$$
It can be shown that this variance is minimized by distributing the points such that,
$$
\frac{N_a}{N_a + N_b} = \frac{\sigma_a}{\sigma_a + \sigma_b}
\cdot \frac{N_a}{N_a + N_b}
= \frac{\sigma_a}{\sigma_a + \sigma b}
$$
Hence the smallest error estimate is obtained by allocating sample points in proportion
to the standard deviation of the function in each sub-region.
---
---------------------------------------------------------
calls plain MC Miser Vegas
------------ -------------- -------------- --------------
500'000 1.7166435813 1.7182850738 1.7182818354
5'000'000 1.7181231109 1.7182819143 1.7182818289
50'000'000 1.7183387184 1.7182818221 1.7182818285
---------------------------------------------------------
Table: Results of the three methods. {#tbl:results}
---------------------------------------------------------
calls plain MC Miser Vegas
------------ -------------- -------------- --------------
500'000 0.0006955691 0.0000021829 0.0000000137
5'000'000 0.0002200309 0.0000001024 0.0000000004
50'000'000 0.0000695809 0.0000000049 0.0000000000
---------------------------------------------------------
Table: $\sigma$s of the three methods. {#tbl:sigmas}