ex-5: complete importance sampling

Giù Marcer 2020-03-15 21:42:41 +01:00, committed by rnhmjoj
parent 4d6f18ae07
commit 01a5f06cc5

@ -108,8 +108,7 @@ $$
$$
$$
\sigma_i^2 = \frac{1}{n_i - 1} \sum_j \left( x_j - \bar{x}_i \right)^2
\thus
{\sigma^2_x}_i = \frac{1}{n_i^2} \sum_j \sigma_i^2 = \frac{\sigma_i^2}{n_i}
$$
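As a quick numerical check of the formulas above, the following Python sketch
(not part of the exercise code; the distribution and the sample sizes are
arbitrary choices) verifies that the variance of the sample mean indeed scales
as $\sigma_i^2 / n_i$:

```python
import numpy as np

# Minimal sketch: check that the variance of the sample mean scales as
# sigma^2 / n. Distribution and sizes are arbitrary, for illustration only.
rng = np.random.default_rng(0)
n, repeats = 1000, 500

# n points drawn from an exponential with scale 1, whose true variance is 1
means = np.array([rng.exponential(1.0, n).mean() for _ in range(repeats)])

print(means.var(ddof=1))  # empirical variance of the sample mean
print(1.0 / n)            # predicted sigma^2 / n: the two should agree
```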
@ -225,11 +224,14 @@ diff, seems to seesaw around the correct value.
## Importance sampling
In statistics, importance sampling is a method which draws the sample points
from a distribution concentrated where the integrand $f$ is largest, so that
the points cluster in the regions that make the largest contribution to the
integral.
Recall that $I = V \cdot \langle f \rangle$, hence only $\langle f \rangle$
must be estimated. Consider then a sample of $n$ points {$x_i$} generated
according to a probability distribution function $P$, which gives the
following expected value:
$$
E [x, P] = \frac{1}{n} \sum_i x_i
@ -238,21 +240,46 @@ $$
with variance:
$$
\sigma^2 [E, P] = \frac{\sigma^2 [x, P]}{n}
\with \sigma^2 [x, P] = \frac{1}{n -1} \sum_i \left( x_i - E [x, P] \right)^2
$$
where $i$ runs over the sample.
In the case of plain MC, $\langle f \rangle$ is estimated as the expected
value of the points {$f(x_i)$} drawn with uniform probability $P (x_i) = 1
\quad \forall i$, since they are evenly distributed in $\Omega$. The idea is
to sample the points from a different distribution in order to lower the
variance of $E[x, P]$, which amounts to lowering $\sigma^2 [x, P]$. This is
accomplished by choosing a random variable $y$ and defining a new probability
$P^{(y)}$ which satisfies:
$$
E [x, P] = E \left[ \frac{x}{y}, P^{(y)} \right]
$$
which is to say:
$$
I = \int \limits_{\Omega} dx f(x) =
\int \limits_{\Omega} dx \, \frac{f(x)}{g(x)} \, g(x)=
\int \limits_{\Omega} dx \, w(x) \, g(x)
$$
where $E \, \longleftrightarrow \, I$ and:
$$
\begin{cases}
f(x) \, \longleftrightarrow \, x \\
1 \, \longleftrightarrow \, P
\end{cases}
\et
\begin{cases}
w(x) \, \longleftrightarrow \, \frac{x}{y} \\
g(x) \, \longleftrightarrow \, y = P^{(y)}
\end{cases}
$$
where the symbol $\longleftrightarrow$ points out the correspondence between
the variables and $w(x)$ is called the 'importance function': a good
importance function is large where the integrand is large and small
otherwise. This new estimate is better than the former if:
$$
\sigma^2 \left[ \frac{x}{y}, P^{(y)} \right] < \sigma^2 [x, P]
@ -261,55 +288,32 @@ $$
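To make the change of probability concrete, here is a minimal Python sketch
(again not part of the exercise code: the integrand $f(x) = 5x^4$ on $[0, 1]$,
for which $I = 1$, and the density $g(x) = 4x^3$ are arbitrary illustrative
choices) comparing the plain MC estimate with the importance-sampled one:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

f = lambda x: 5 * x**4              # illustrative integrand, I = 1 on [0, 1]

# plain MC: x_i uniform in [0, 1], average the f(x_i)
x = rng.random(n)
plain = f(x)

# importance sampling: draw x_i from g(x) = 4x^3 by inversion (x = u^{1/4}),
# then average the weights w(x) = f(x)/g(x)
x = rng.random(n) ** 0.25
weighted = f(x) / (4 * x**3)

for name, s in [("plain", plain), ("importance", weighted)]:
    print(name, s.mean(), s.std(ddof=1) / np.sqrt(n))
```

Both estimates converge to 1, but the weights $w(x_i)$ fluctuate far less
than the $f(x_i)$, so the importance-sampled error comes out several times
smaller for the same $n$.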
The best variable $y$ would be:
$$
y^{\star} = \frac{x}{E [x, P]} \, \longleftrightarrow \, \frac{f(x)}{I}
\thus \frac{x}{y^{\star}} = E [x, P]
$$
and even a single sample under $P^{(y^{\star})}$ would be sufficient to give
its value. Obviously, this choice cannot be made in practice, since $E [x,
P]$ is not given a priori.
However, it gives an insight into what importance sampling does. In fact,
given that:
$$
E [x, P] = \int \limits_{- \infty}^{+ \infty} a \, P(x \in [a, a + da])
$$
the best probability change $P^{(y^{\star})}$ redistributes the law of $x$ so
that the sample frequencies are directly proportional to their weights in
$E[x, P]$, namely:
$$
P^{(y^{\star})}(x \in [a, a + da]) = \frac{1}{E [x, P]} \, a \, P (x \in [a, a + da])
$$
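Continuing the sketch above (same hypothetical integrand $f(x) = 5x^4$,
pretending its integral $I = 1$ is known), the optimal change of probability
makes every weight identical, so that even one sample returns the exact value:

```python
import numpy as np

rng = np.random.default_rng(0)

f = lambda x: 5 * x**4        # hypothetical integrand on [0, 1], I = 1

# optimal density g*(x) = f(x)/I = 5x^4, sampled by inversion: x = u^{1/5}
x = rng.random(5) ** 0.2
w = f(x) / (5 * x**4)         # w(x) = I for every point: zero variance

print(w)                      # [1. 1. 1. 1. 1.]
```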
### VEGAS \textcolor{red}{WIP}