diff --git a/notes/sections/5.md b/notes/sections/5.md
index 2d6cf47..be503c3 100644
--- a/notes/sections/5.md
+++ b/notes/sections/5.md
@@ -1,8 +1,6 @@
 # Exercise 5
 
-**Numerically compute an integral value via Monte Carlo approaches**
-
-The integral to be evaluated is the following:
+The following integral must be evaluated:
 
 $$
   I = \int\limits_0^1 dx \, e^x
@@ -143,7 +141,8 @@ For this reason, stratified sampling is used as a method of
 variance reduction when MC methods are used to estimate population
 statistics from a known population.
 
-**MISER**
+
+### MISER
 
 The MISER technique aims to reduce the integration error through the use of
 recursive stratified sampling.
@@ -224,7 +223,95 @@ This time the error, although it always lies in the same order
 of magnitude as diff, seems to seesaw around the correct value.
 
-## VEGAS \textcolor{red}{WIP}
+## Importance sampling
+
+In statistics, importance sampling is a technique for estimating properties of
+a given distribution while only having samples generated from a different
+distribution than the one of interest.
+Consider a sample of $n$ points $\{x_i\}$ generated according to a probability
+distribution function $P$, which thereby gives the following expected value:
+
+$$
+  E [x, P] = \frac{1}{n} \sum_i x_i
+$$
+
+with variance:
+
+$$
+  \sigma^2 [E, P] = \frac{\sigma^2 [x, P]}{n}
+$$
+
+where $i$ runs over the sample and $\sigma^2 [x, P]$ is the variance of the
+sampled points.
+The idea is to sample them from a different distribution in order to lower
+the variance of $E[x, P]$. This is accomplished by choosing a random variable
+$y \geq 0$ such that $E[y, P] = 1$.
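To make the variance-reduction idea concrete, here is a minimal, hypothetical Python sketch (not part of the notes' own code): it estimates $I = \int_0^1 dx \, e^x$ first with plain uniform sampling, then with points drawn from an illustrative density $g(x) = 2(1 + x)/3$ that roughly follows $e^x$ on $[0, 1]$; its weight $y = g(x)$ satisfies $E[y, P] = \int_0^1 g(x) \, dx = 1$.

```python
import math
import random

random.seed(42)
n = 100_000

# Plain Monte Carlo: x_i ~ U(0, 1), estimate E[x, P] = (1/n) sum_i f(x_i)
plain = [math.exp(random.random()) for _ in range(n)]
plain_est = sum(plain) / n
plain_var = sum((v - plain_est) ** 2 for v in plain) / (n - 1)
plain_se = math.sqrt(plain_var / n)      # sigma[x, P] / sqrt(n)

# Importance sampling with the illustrative density g(x) = 2(1 + x)/3
# (an assumption for this sketch, not a choice made in the notes).
# Points are drawn from g via its inverse CDF, and the sample mean of
# f(x) / g(x) estimates the same integral.
weighted = []
for _ in range(n):
    x = -1.0 + math.sqrt(1.0 + 3.0 * random.random())        # x ~ g
    weighted.append(math.exp(x) / (2.0 * (1.0 + x) / 3.0))   # f(x) / g(x)
is_est = sum(weighted) / n
is_var = sum((v - is_est) ** 2 for v in weighted) / (n - 1)
is_se = math.sqrt(is_var / n)

print(f"exact      : {math.e - 1:.5f}")
print(f"plain MC   : {plain_est:.5f} +- {plain_se:.5f}")
print(f"importance : {is_est:.5f} +- {is_se:.5f}")
```

With this choice the ratio $e^x / g(x)$ varies far less over $[0, 1]$ than $e^x$ itself, so the standard error of the weighted estimate comes out smaller at the same $n$.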
+Then, a new probability $P^{(y)}$ is defined in order to satisfy:
+
+$$
+  E [x, P] = E \left[ \frac{x}{y}, P^{(y)} \right]
+$$
+
+This new estimate is better than the former one if:
+
+$$
+  \sigma^2 \left[ \frac{x}{y}, P^{(y)} \right] < \sigma^2 [x, P]
+$$
+
+The best variable $y$ would be:
+
+$$
+  y^{\star} = \frac{x}{E [x, P]} \thus \frac{x}{y^{\star}} = E [x, P]
+$$
+
+and a single sample under $P^{(y^{\star})}$ suffices to give its value.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+---
+
+The logic underlying importance sampling lies in a simple rearrangement of
+terms in the integral to be computed:
+
+$$
+  I = \int \limits_{\Omega} dx \, f(x) =
+      \int \limits_{\Omega} dx \, \frac{f(x)}{g(x)} \, g(x) =
+      \int \limits_{\Omega} dx \, w(x) \, g(x)
+$$
+
+where $w(x)$ is called the 'importance function': a good importance function
+will be large when the integrand is large and small otherwise.
+
+---
+
+
+For example, at some of the sampled points the function value is lower than at
+others, and such points therefore contribute less to the whole integral.
+
+### VEGAS \textcolor{red}{WIP}
 
 The VEGAS algorithm is based on importance sampling. It samples points from
 the probability distribution described by the function $f$, so that the points are