ex-5: complete writing

2020-03-15 22:48:56 +01:00 · 2020-03-15 22:48:56 +01:00 · 2e2a4eacf9
commit 2e2a4eacf9
parent 01a5f06cc5
1 changed files with 34 additions and 69 deletions
--- a/notes/sections/5.md
+++ b/notes/sections/5.md
@@ -219,7 +219,8 @@ Table: MISER results with different numbers of function calls. Be careful:
       is divided into subsections. {#tbl:MISER}

 This time the error, although it always lies in the same order of magnitude of
-diff, seems to seesaw around the correct value.
+diff, seems to seesaw around the correct value, which is much closer to
+the expected one.


 ## Importance sampling
@@ -311,81 +312,45 @@ $$
  P^{(y^{\star})}(x \in [a, a + da]) = \frac{1}{E [x, P]} a P (x \in [a, a + da])
 $$

-
---
+In conclusion, since certain values of $x$ have more impact on $E [x, P]$ than
+others, these "important" values must be emphasized by sampling them more
+frequently. As a consequence, the estimator variance will be reduced.


-### VEGAS \textcolor{red}{WIP}
-
-The VEGAS algorithm is based on importance sampling. It samples points from the
-probability distribution described by the function $f$, so that the points are
-concentrated in the regions that make the largest contribution to the integral.
-
-In general, if the MC integral of $f$ is sampled with points distributed
-according to a probability distribution $g$, the following estimate of the integral
-is obtained:
-
-$$
-  E (f|g \, , \, N) \with \sigma^2(f|g \, , \, N)
-$$
-
-If the probability distribution is chosen as $g = f$, it can be shown that the
-variance vanishes, and the error in the estimate will therefore be zero.  
-In practice, it is impossible to sample points from the exact distribution: only
-a good approximation can be achieved. In GSL, the VEGAS algorithm approximates
-the distribution by histogramming the function $f$ in different subregions. Each
-histogram is used to define a sampling distribution for the next pass, which
-consists in doing the same thing recorsively: this procedure converges
-asymptotically to the desired distribution.
-
-In order to avoid the number of histogram bins growing like $K^d$, the
-probability distribution is approximated by a separable function:
-
-$$
-  f (x_1, x_2, \ldots) = f_1(x_1) f_2(x_2) \ldots
-$$
-
-so that the number of bins required is only $Kd$. This is equivalent to locating
-the peaks of the function from the projections of the integrand onto the
-coordinate axes. The efficiency of VEGAS depends on the validity of this
-assumption. It is most efficient when the peaks of the integrand are
-well-localized. If an integrand can be rewritten in a form which is
-approximately separable this will increase the efficiency of integration with
-VEGAS.
-
-VEGAS incorporates a number of additional features, and combines both stratified
-sampling and importance sampling. The integration region is divided into a number
-of “boxes”, with each box getting a fixed number of points (the goal is 2). Each
-box can then have a fractional number of bins, but if the ratio of bins-per-box is
-less than two, Vegas switches to a kind variance reduction (rather than importance
-sampling).
+### VEGAS


+The VEGAS algorithm is based on importance sampling. It aims to reduce the
+integration error by concentrating points in the regions that make the largest
+contribution to the integral.

+As stated before, in practice it is impossible to sample points from the best
+distribution $P^{(y^{\star})}$: only a good approximation can be achieved. In
+GSL, the VEGAS algorithm approximates the distribution by histogramming the
+function $f$ in different subregions. Each histogram is used to define a
+sampling distribution for the next pass, which repeats the same procedure
+recursively: this converges asymptotically to the desired
+distribution. It follows that a better estimate is achieved with a greater
+number of function calls.  
+The integration uses a fixed number of function calls. The result and its
+error estimate are based on a weighted average of independent samples, as for
+MISER.  
+For this particular sample, results are shown in @tbl:VEGAS.

---
+-------------------------------------------------------------------------
+                   500'000 calls     5'000'000 calls    50'000'000 calls
+----------------- ----------------- ------------------ ------------------
+$I^{\text{oss}}$     1.7182818354      1.7182818289       1.7182818285

---------------------------------------------------------
-       calls     plain MC        Miser          Vegas    
------------ -------------- -------------- --------------
-    500'000   1.7166435813   1.7182850738   1.7182818354 
+$\sigma$             0.0000000137      0.0000000004       0.0000000000

-  5'000'000   1.7181231109   1.7182819143   1.7182818289 
+diff                 0.0000000069      0.0000000004       0.0000000000
+-------------------------------------------------------------------------

- 50'000'000   1.7183387184   1.7182818221   1.7182818285 
---------------------------------------------------------
-
-Table: Results of the three methods. {#tbl:results}
-
---------------------------------------------------------
-       calls     plain MC        Miser          Vegas    
------------ -------------- -------------- --------------
-    500'000   0.0006955691   0.0000021829   0.0000000137
-
-  5'000'000   0.0002200309   0.0000001024   0.0000000004
-
- 50'000'000   0.0000695809   0.0000000049   0.0000000000
---------------------------------------------------------
-
-Table: $\sigma$s of the three methods. {#tbl:sigmas}
+Table: VEGAS results with different numbers of
+       function calls. {#tbl:VEGAS}

+This time, the error estimate is notably close to diff for each number of
+function calls, meaning that the estimates of both the integral and its
+error turn out to be very accurate, much more so than those obtained with
+both the plain Monte Carlo method and stratified sampling.
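
The histogram-based importance sampling described in the added VEGAS text can be illustrated with a minimal Python sketch. It is not GSL's VEGAS implementation: it keeps a fixed grid of equal-width bins and only adapts their probabilities, it works in one dimension, and it returns the estimate of the last pass instead of a weighted average over passes. The integrand $e^x$ on $[0, 1]$ is an assumption, chosen because its integral $e - 1 \approx 1.7182818285$ matches the values in the table. The function names `plain_mc` and `histogram_importance_mc` and the parameters `bins` and `passes` are hypothetical.

```python
import math
import random


def plain_mc(f, n, rng):
    """Plain Monte Carlo estimate of the integral of f over [0, 1].

    Returns (estimate, estimated standard error).
    """
    values = [f(rng.random()) for _ in range(n)]
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


def histogram_importance_mc(f, n, rng, bins=50, passes=3):
    """Importance sampling on [0, 1] with a piecewise-constant density.

    Simplified, VEGAS-inspired scheme (an assumption of this sketch, not the
    GSL algorithm): every pass draws n points from the current histogram
    density, then rebuilds the histogram from the average |f| seen per bin.
    Returns (estimate, estimated standard error) of the last pass.
    """
    probs = [1.0 / bins] * bins  # first pass: uniform sampling
    mean, sigma = 0.0, 0.0
    for _ in range(passes):
        weights = []
        sums = [0.0] * bins
        counts = [0] * bins
        # Pick a bin for every sample according to the current histogram.
        for k in rng.choices(range(bins), weights=probs, k=n):
            x = (k + rng.random()) / bins  # uniform point inside bin k
            density = probs[k] * bins      # value of the sampling density at x
            fx = f(x)
            weights.append(fx / density)   # importance weight f(x) / g(x)
            sums[k] += abs(fx)
            counts[k] += 1
        mean = sum(weights) / n
        var = sum((w - mean) ** 2 for w in weights) / (n - 1)
        sigma = math.sqrt(var / n)
        # Rebuild the histogram; empty bins fall back to the global average
        # so that no bin ever gets zero probability.
        fallback = sum(sums) / n
        means = [sums[k] / counts[k] if counts[k] else fallback
                 for k in range(bins)]
        total = sum(means)
        probs = [m / total for m in means]
    return mean, sigma


if __name__ == "__main__":
    rng = random.Random(1)
    exact = math.e - 1.0
    for name, method in (("plain MC", plain_mc),
                         ("histogram IS", histogram_importance_mc)):
        estimate, sigma = method(math.exp, 20000, rng)
        print(f"{name:>12}: I = {estimate:.10f}  sigma = {sigma:.10f}  "
              f"diff = {abs(estimate - exact):.10f}")
```

With the same number of function calls per pass, the adapted density should give a visibly smaller `sigma` than plain Monte Carlo, which is the qualitative behaviour the table reports for VEGAS.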