ex-5: complete writing

2020-03-15 22:48:56 +01:00 · 2020-03-15 22:48:56 +01:00 · 2e2a4eacf9
commit 2e2a4eacf9
parent 01a5f06cc5
1 changed files with 34 additions and 69 deletions
--- a/notes/sections/5.md
+++ b/notes/sections/5.md
@@ -219,7 +219,8 @@ Table: MISER results with different numbers of function calls. Be careful:
       is divided into subsections. {#tbl:MISER}
 This time the error, although it always lies in the same order of magnitude as
-diff, seems to seesaw around the correct value.
+diff, seems to seesaw around the correct value, which is much closer to
 the expected one.
 ## Importance sampling
@@ -311,81 +312,45 @@ $$
  P^{(y^{\star})}(x \in [a, a + da]) = \frac{1}{E [x, P]} a P (x \in [a, a + da])
 $$
-
+In conclusion, since certain values of $x$ have more impact on $E [x, P]$ than
---
+others, these "important" values must be emphasized by sampling them more
 frequently. As a consequence, the estimator variance will be reduced.
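The effect of emphasizing the "important" values can be checked numerically. The sketch below is only an illustration: it assumes the integrand is $e^x$ on $[0, 1]$ (whose integral, $e - 1 \approx 1.71828$, matches the values in the tables), and the sampling density $g(x) = (1 + x)/1.5$ is a hypothetical choice that roughly follows the integrand. The function names are not taken from the notes.

```python
import math
import random


def mean_and_error(values):
    """Sample mean and standard error of the mean."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


def plain_mc(f, n, rng):
    """Plain Monte Carlo on [0, 1]: points are drawn uniformly."""
    return mean_and_error([f(rng.random()) for _ in range(n)])


def importance_mc(f, n, rng):
    """Importance sampling on [0, 1] with density g(x) = (1 + x) / 1.5.

    Points are drawn by inverting the CDF G(x) = (x + x^2 / 2) / 1.5,
    which gives x = sqrt(1 + 3u) - 1 for u uniform in [0, 1).
    """
    values = []
    for _ in range(n):
        x = math.sqrt(1.0 + 3.0 * rng.random()) - 1.0
        g = (1.0 + x) / 1.5
        values.append(f(x) / g)  # each point is weighted by 1 / g(x)
    return mean_and_error(values)


rng = random.Random(0)
plain_estimate, plain_error = plain_mc(math.exp, 100_000, rng)
is_estimate, is_error = importance_mc(math.exp, 100_000, rng)
```

Since $f/g$ varies much less than $f$ itself over the interval, `is_error` should come out smaller than `plain_error` for the same number of calls.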
-### VEGAS \textcolor{red}{WIP}
+### VEGAS
 The VEGAS algorithm is based on importance sampling. It samples points from the
 probability distribution described by the function $f$, so that the points are
 concentrated in the regions that make the largest contribution to the integral.
 In general, if the MC integral of $f$ is sampled with points distributed
 according to a probability distribution $g$, the following estimate of the integral
 is obtained:
 $$
  E (f|g \, , \, N) \with \sigma^2(f|g \, , \, N)
 $$
 If the probability distribution is chosen as $g = |f| / I(|f|)$, where $I(|f|)$
 is the integral of $|f|$, it can be shown that the variance vanishes, and the
 error in the estimate will therefore be zero.  
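A short worked step may help to see why the variance vanishes. It assumes the usual importance-sampling estimator, with the points $x_i$ drawn from $g$; the symbol $I$ is introduced here for the exact value of the integral and does not appear in the original notes.

$$
  E (f|g \, , \, N) = \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{g(x_i)}
$$

For $f \geq 0$ and $g$ normalised as $g = f / I$, every term of the sum is

$$
  \frac{f(x_i)}{g(x_i)} = I
$$

so the estimate equals $I$ whatever points are drawn, and the spread of the terms, hence $\sigma^2(f|g \, , \, N)$, is zero. The catch is that normalising $g$ requires knowing $I$ in advance.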
 In practice, it is impossible to sample points from the exact distribution: only
 a good approximation can be achieved. In GSL, the VEGAS algorithm approximates
 the distribution by histogramming the function $f$ in different subregions. Each
 histogram is used to define a sampling distribution for the next pass, which
 repeats the same procedure recursively: this converges asymptotically to the
 desired distribution.
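The histogram-and-repeat idea can be sketched in one dimension. This is a simplified illustration and not the GSL implementation: the bins keep equal width and only their probabilities are updated after each pass. The function name, the parameters and the test integrand $e^x$ on $[0, 1]$ are all assumptions made for the example.

```python
import bisect
import itertools
import math
import random


def adaptive_histogram_mc(f, n_bins=50, n_calls=10_000, n_iter=5, seed=1):
    """Integrate f on [0, 1] with a histogram-based sampling density.

    Returns a list of (estimate, error) pairs, one per pass.
    """
    rng = random.Random(seed)
    probs = [1.0 / n_bins] * n_bins  # first pass: uniform sampling
    history = []
    for _ in range(n_iter):
        cumulative = list(itertools.accumulate(probs))
        sum_w = sum_w2 = 0.0
        bin_sum = [0.0] * n_bins  # accumulated |f| per bin
        bin_count = [0] * n_bins
        for _ in range(n_calls):
            # pick a bin according to probs, then a uniform point inside it
            k = min(bisect.bisect_right(cumulative, rng.random()), n_bins - 1)
            x = (k + rng.random()) / n_bins
            fx = f(x)
            w = fx / (probs[k] * n_bins)  # f(x) / g(x)
            sum_w += w
            sum_w2 += w * w
            bin_sum[k] += abs(fx)
            bin_count[k] += 1
        mean = sum_w / n_calls
        var = max(sum_w2 / n_calls - mean * mean, 0.0) / (n_calls - 1)
        history.append((mean, math.sqrt(var)))
        # next pass: bin probability proportional to the average |f| seen in it
        heights = [s / c if c else 0.0 for s, c in zip(bin_sum, bin_count)]
        floor = 1e-3 * sum(heights) / n_bins  # keep every bin reachable
        heights = [max(h, floor) for h in heights]
        total = sum(heights)
        if total == 0.0:  # f vanished on every sampled point: nothing to adapt
            break
        probs = [h / total for h in heights]
    return history


history = adaptive_histogram_mc(math.exp)
```

The first entry of `history` comes from uniform sampling; later entries use the refined histogram, so their error estimates should be noticeably smaller.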
 In order to avoid the number of histogram bins growing like $K^d$, the
 probability distribution is approximated by a separable function:
 $$
  f (x_1, x_2, \ldots) = f_1(x_1) f_2(x_2) \ldots
 $$
 so that the number of bins required is only $Kd$. This is equivalent to locating
 the peaks of the function from the projections of the integrand onto the
 coordinate axes. The efficiency of VEGAS depends on the validity of this
 assumption. It is most efficient when the peaks of the integrand are
 well-localized. If an integrand can be rewritten in a form which is
 approximately separable this will increase the efficiency of integration with
 VEGAS.
 VEGAS incorporates a number of additional features, and combines both stratified
 sampling and importance sampling. The integration region is divided into a number
 of “boxes”, with each box getting a fixed number of points (the goal is 2). Each
 box can then have a fractional number of bins, but if the ratio of bins-per-box is
 less than two, VEGAS switches to a kind of variance reduction (rather than importance
 sampling).
 The VEGAS algorithm is based on importance sampling. It aims to reduce the
 integration error by concentrating points in the regions that make the largest
 contribution to the integral.
 As stated before, in practice it is impossible to sample points from the best
 distribution $P^{(y^{\star})}$: only a good approximation can be achieved. In
 GSL, the VEGAS algorithm approximates the distribution by histogramming the
 function $f$ in different subregions. Each histogram is used to define a
 sampling distribution for the next pass, which repeats the same procedure
 recursively: this converges asymptotically to the desired distribution. It
 follows that a better estimate is achieved with a greater number of function
 calls.  
 The integration uses a fixed number of function calls. The result and its
 error estimate are based on a weighted average of independent samples, as for
 MISER.  
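A weighted average of independent estimates can be sketched as follows. The notes only say "weighted average", so the inverse-variance weights used here are an assumption (they are the usual choice when each estimate comes with its own $\sigma$), and `combine_estimates` is a hypothetical helper, not a GSL function.

```python
import math


def combine_estimates(estimates):
    """Inverse-variance weighted average of (value, sigma) pairs.

    Returns the combined value and its standard deviation.
    """
    weights = [1.0 / (sigma * sigma) for _, sigma in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, math.sqrt(1.0 / total)


# Hypothetical per-pass results, for illustration only.
value, sigma = combine_estimates([(1.0, 0.1), (2.0, 0.2)])
```

With these weights, the more precise estimates dominate the average and the combined $\sigma$ is never larger than the smallest individual one.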
 For this particular sample, results are shown in @tbl:VEGAS.
---
+-------------------------------------------------------------------------
                   500'000 calls     5'000'000 calls    50'000'000 calls
 ----------------- ----------------- ------------------ ------------------
 $I^{\text{oss}}$     1.7182818354      1.7182818289       1.7182818285
+$\sigma$             0.0000000137      0.0000000004       0.0000000000
+diff                 0.0000000069      0.0000000004       0.0000000000
 -------------------------------------------------------------------------
+Table: VEGAS results with different numbers of
+       function calls. {#tbl:VEGAS}
---------------------------------------------------------
       calls     plain MC        Miser          Vegas    
 ------------ -------------- -------------- --------------
    500'000   1.7166435813   1.7182850738   1.7182818354 
-  5'000'000   1.7181231109   1.7182819143   1.7182818289 
- 50'000'000   1.7183387184   1.7182818221   1.7182818285 
---------------------------------------------------------
 Table: Results of the three methods. {#tbl:results}
 ---------------------------------------------------------
       calls     plain MC        Miser          Vegas    
 ------------ -------------- -------------- --------------
    500'000   0.0006955691   0.0000021829   0.0000000137
  5'000'000   0.0002200309   0.0000001024   0.0000000004
 50'000'000   0.0000695809   0.0000000049   0.0000000000
 ---------------------------------------------------------
 Table: $\sigma$s of the three methods. {#tbl:sigmas}
 This time, the error estimate is notably close to diff for each number of
 function calls, meaning that the estimates of both the integral and its
 error turn out to be very accurate, much more so than those obtained with
 both the plain Monte Carlo method and stratified sampling.