ex-5: complete importance sampling

parent 4d6f18ae07 · commit 01a5f06cc5

@@ -108,8 +108,7 @@ $$

$$
\sigma_i^2 = \frac{1}{n_i - 1} \sum_j \left( x_j - \bar{x}_i \right)^2
\thus
{\sigma^2_x}_i = \frac{1}{n_i^2} \sum_j \sigma_i^2 = \frac{\sigma_i^2}{n_i}
$$
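
For instance, a minimal numerical check of the two formulas (assuming NumPy; the subsample is simulated here, since the surrounding context of the hunk is not shown):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# simulated points of a single subsample i (any distribution works for the check)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)
n_i = len(x)

# sigma_i^2 = 1/(n_i - 1) * sum_j (x_j - xbar_i)^2
sigma2_i = np.sum((x - x.mean()) ** 2) / (n_i - 1)

# variance of the subsample mean: {sigma^2_x}_i = sigma_i^2 / n_i
sigma2_mean_i = sigma2_i / n_i

print(sigma2_i)       # ~ 9, the true variance of the sampled distribution
print(sigma2_mean_i)  # ~ 9 / 10_000
```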

@@ -225,11 +224,14 @@ diff, seems to seesaw around the correct value.

## Importance sampling

In statistics, importance sampling is a method which draws the sample points
from a distribution shaped like the integrand $f$ itself, so that the points
cluster in the regions that make the largest contribution to the integral.

Recall that $I = V \cdot \langle f \rangle$, so only $\langle f \rangle$ must
be estimated. Consider, then, a sample of $n$ points {$x_i$} generated
according to a probability distribution function $P$, which thereby gives the
following expected value:

$$
E [x, P] = \frac{1}{n} \sum_i x_i
$$

@@ -238,21 +240,46 @@ $$

with variance:

$$
\sigma^2 [E, P] = \frac{\sigma^2 [x, P]}{n}
\with \sigma^2 [x, P] = \frac{1}{n - 1} \sum_i \left( x_i - E [x, P] \right)^2
$$

where $i$ runs over the sample.
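
As a concrete sketch of these two estimators (assuming NumPy; the integrand and the interval are illustrative choices, not taken from the exercise):

```python
import numpy as np

rng = np.random.default_rng(seed=2)

f = lambda x: np.exp(-x * x)      # illustrative integrand
a, b = 0.0, 1.0                   # Omega = [0, 1], hence V = 1
n = 100_000

x = rng.uniform(a, b, size=n)     # plain MC: points drawn uniformly in Omega
fx = f(x)

E = fx.mean()                                # E[x, P] with x_i -> f(x_i)
sigma2_x = np.sum((fx - E) ** 2) / (n - 1)   # sigma^2[x, P]
sigma2_E = sigma2_x / n                      # sigma^2[E, P]

I = (b - a) * E                   # I = V * <f>
print(I, np.sqrt(sigma2_E))       # ~ 0.7468 with its standard error
```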

In the case of plain MC, $\langle f \rangle$ is estimated as the expected value
of the points {$f(x_i)$} drawn with $P (x_i) = 1 \quad \forall i$, since they
are evenly distributed in $\Omega$. The idea is to sample the points from a
different distribution in order to lower the variance of $E[x, P]$, which
amounts to lowering $\sigma^2 [x, P]$. This is accomplished by choosing a random
variable $y \geq 0$ such that $E[y, P] = 1$ and defining a new probability
$P^{(y)}$ which satisfies:

$$
E [x, P] = E \left[ \frac{x}{y}, P^{(y)} \right]
$$

which is to say:

$$
I = \int \limits_{\Omega} dx \, f(x) =
\int \limits_{\Omega} dx \, \frac{f(x)}{g(x)} \, g(x) =
\int \limits_{\Omega} dx \, w(x) \, g(x)
$$

where $w(x)$ is called the 'importance function' (a good importance function is
large where the integrand is large and small otherwise), $E \,
\longleftrightarrow \, I$ and:

$$
\begin{cases}
f(x) \, \longleftrightarrow \, x \\
1 \, \longleftrightarrow \, P
\end{cases}
\et
\begin{cases}
w(x) \, \longleftrightarrow \, \frac{x}{y} \\
g(x) \, \longleftrightarrow \, y = P^{(y)}
\end{cases}
$$

where the symbol $\longleftrightarrow$ points out the correspondence between the
variables. This new estimate is better than the former if:

$$
\sigma^2 \left[ \frac{x}{y}, P^{(y)} \right] < \sigma^2 [x, P]
$$
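
A minimal numerical sketch of this inequality (assuming NumPy; the integrand $f(x) = e^x$ on $[0, 1]$ and the density $g(x) = 2(1 + x)/3$, roughly shaped like $f$, are illustrative choices, not taken from the exercise):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

f = lambda x: np.exp(x)           # illustrative integrand, I = e - 1
n = 100_000

# plain MC: P uniform on [0, 1]
plain = f(rng.uniform(size=n))

# importance sampling with g(x) = 2(1 + x)/3, drawn by inverting its
# CDF  G(x) = (2x + x^2)/3  ->  x = sqrt(1 + 3u) - 1  with u uniform
u = rng.uniform(size=n)
x = np.sqrt(1.0 + 3.0 * u) - 1.0
weighted = f(x) / (2.0 * (1.0 + x) / 3.0)   # w(x) = f(x)/g(x)

for name, s in (("plain", plain), ("importance", weighted)):
    print(name, s.mean(), s.var(ddof=1))
# both means ~ 1.71828; the weighted variance is ~ 0.027 against ~ 0.242
```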

@@ -261,55 +288,32 @@ $$

The best variable $y$ would be:

$$
y^{\star} = \frac{x}{E [x, P]} \, \longleftrightarrow \, \frac{f(x)}{I}
\thus \frac{x}{y^{\star}} = E [x, P]
$$

and even a single sample under $P^{(y^{\star})}$ would be sufficient to give its
value. Obviously, it is not possible to make exactly this choice, since $E [x,
P]$ is not given a priori.
However, this gives an insight into what importance sampling does. In fact,
given that:

$$
E [x, P] = \int \limits_{a = - \infty}^{a = + \infty}
a \, P(x \in [a, a + da])
$$

the best change of probability $P^{(y^{\star})}$ redistributes the law of $x$ so
that the sample frequencies are distributed directly according to their weights
in $E[x, P]$, namely:

$$
P^{(y^{\star})}(x \in [a, a + da]) = \frac{1}{E [x, P]} \, a \, P (x \in [a, a + da])
$$
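
To see this on the same toy integrand as above ($f(x) = e^x$, an illustrative choice, not taken from the exercise), the optimal change of probability makes every single sample equal to the true value:

```python
import numpy as np

rng = np.random.default_rng(seed=4)

f = lambda x: np.exp(x)                  # illustrative integrand, I = e - 1
I = np.e - 1.0
g = lambda x: f(x) / I                   # optimal density: g = f / I

# draw from g by inverting its CDF  G(x) = (e^x - 1)/(e - 1)
u = rng.uniform(size=10)
x = np.log(1.0 + u * (np.e - 1.0))

w = f(x) / g(x)                          # w(x) = f(x)/g(x) = I for every sample
print(w)                                 # ten identical values: e - 1
```
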
### VEGAS \textcolor{red}{WIP}