diff --git a/notes/images/6-emd-hist-0.pdf b/notes/images/6-emd-hist-0.pdf
deleted file mode 100644
index f2bcc5b..0000000
Binary files a/notes/images/6-emd-hist-0.pdf and /dev/null differ
diff --git a/notes/images/6-emd-hist-0.05.pdf b/notes/images/6-noise-emd-0.005.pdf
similarity index 70%
rename from notes/images/6-emd-hist-0.05.pdf
rename to notes/images/6-noise-emd-0.005.pdf
index 57e7f3b..bbb8632 100644
Binary files a/notes/images/6-emd-hist-0.05.pdf and b/notes/images/6-noise-emd-0.005.pdf differ
diff --git a/notes/images/6-emd-hist-0.005.pdf b/notes/images/6-noise-emd-0.01.pdf
similarity index 69%
rename from notes/images/6-emd-hist-0.005.pdf
rename to notes/images/6-noise-emd-0.01.pdf
index b0f5350..9ca04ca 100644
Binary files a/notes/images/6-emd-hist-0.005.pdf and b/notes/images/6-noise-emd-0.01.pdf differ
diff --git a/notes/images/6-emd-hist-0.01.pdf b/notes/images/6-noise-emd-0.05.pdf
similarity index 66%
rename from notes/images/6-emd-hist-0.01.pdf
rename to notes/images/6-noise-emd-0.05.pdf
index 042f617..9826392 100644
Binary files a/notes/images/6-emd-hist-0.01.pdf and b/notes/images/6-noise-emd-0.05.pdf differ
diff --git a/notes/images/6-original.pdf b/notes/images/6-original.pdf
index e7a58e2..bc597a8 100644
Binary files a/notes/images/6-original.pdf and b/notes/images/6-original.pdf differ
diff --git a/notes/images/6-smoothed.pdf b/notes/images/6-smoothed.pdf
index 9517d83..4e7bc6d 100644
Binary files a/notes/images/6-smoothed.pdf and b/notes/images/6-smoothed.pdf differ
diff --git a/notes/sections/1.md b/notes/sections/1.md
index 91b55d5..bf08e46 100644
--- a/notes/sections/1.md
+++ b/notes/sections/1.md
@@ -304,7 +304,7 @@ $$
 
 ![Example of a Moyal distribution density obtained by the KDE method. The rug
 plot shows the original sample used in the reconstruction.
-](images/landau-kde.pdf)
+](images/1-landau-kde.pdf)
 
 On the other hand, obtaining a good estimate of the FWHM from a sample is much
 more difficult. In principle, it could be measured by binning the data and
diff --git a/notes/sections/4.md b/notes/sections/4.md
index e0281a5..0ac0f63 100644
--- a/notes/sections/4.md
+++ b/notes/sections/4.md
@@ -181,7 +181,7 @@ where 'floor' is the function which gives the bigger integer smaller than its
 argument and the bins are counted starting from zero.
 Then, a vector in which the $j^{\text{th}}$ entry contains both the sum $S_j$
 of all the $|P_v|$s relative to each $P_h$ fallen into the $j^{\text{th}}$ bin
-itself and the number num$_j$ of the bin entries was reiteratively updated. At
+itself and the number num$_j$ of the bin entries was iteratively updated. At
 the end, the average value of $|P_v|_j$ was computed as $S_j / \text{num}_j$.
 For the sake of clarity, for each sampled couple the procedure is the
 following. At first $S_j = 0 \wedge \text{num}_j = 0 \, \forall \, j$, then:
diff --git a/notes/sections/5.md b/notes/sections/5.md
index d5fdbe4..bf6d141 100644
--- a/notes/sections/5.md
+++ b/notes/sections/5.md
@@ -68,7 +68,7 @@ In this case $f(x) = e^{x}$ and $\Omega = [0,1]$, hence $V = 1$.
 
 Since the proximity of $I_N$ to $I$ is related to $N$, the accuracy of the
 method lies in how many points are generated, namely how many function calls
-are executed when the reiterative method is implemented. In @fig:MC and
+are executed when the iterative method is implemented. In @fig:MC and
 @fig:MI, results obtained with the plain MC method are shown in red.
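To make the dependence on the number of calls concrete, here is a minimal, self-contained C sketch of a plain Monte Carlo estimate of $I = \int_0^1 e^x dx$. It only illustrates how the estimate and its standard error scale with $N$; it is not the code used in the exercise, and the three call counts are arbitrary.

```c
/* Illustrative sketch (not the exercise code): plain MC estimate of
 * I = int_0^1 e^x dx = e - 1 with N uniform points, showing how the
 * standard error shrinks roughly as 1/sqrt(N). */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
  size_t calls[] = {50, 5000, 500000};        /* arbitrary sample sizes */
  srand(123);
  for (int k = 0; k < 3; k++) {
    size_t N = calls[k];
    double sum = 0, sum2 = 0;
    for (size_t i = 0; i < N; i++) {
      double x = rand() / (double)RAND_MAX;   /* uniform point in [0, 1] */
      double f = exp(x);
      sum  += f;
      sum2 += f * f;
    }
    double I   = sum / N;                     /* V = 1, so I is the mean */
    double err = sqrt((sum2 / N - I * I) / N);/* standard error of the mean */
    printf("N = %6zu  I = %.6f +- %.6f  (exact %.6f)\n",
           N, I, err, exp(1.0) - 1.0);
  }
  return 0;
}
```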
 In @tbl:MC, some of them are listed: the estimated integrals $I^{\text{oss}}$
 are compared to the expected value $I$ and the differences diff between them are
@@ -333,7 +333,7 @@ the largest contribution to the integral.
 As stated before, in practice it is impossible to sample points from the best
 distribution $P^{(y)}$: only a good approximation can be achieved. In GSL, the
 VEGAS algorithm approximates the distribution by histogramming the function $f$
-in different subregions with a reiterative method [@lepage78], namely:
+in different subregions with an iterative method [@lepage78], namely:
 
   - a fixed number of points (function calls) is evenly generated in the whole
     region;
@@ -371,7 +371,7 @@ where $I_i$ and $\sigma_i$ are are the integral and standard deviation
 estimated in each interval $\Delta x_i$ of the last iteration using the plain
 MC technique.
 For the results to be reliable, the chi-squared per degree of freedom
-$\chi_r^2$ must be consistent with 1. While performing the reiterations, if
+$\chi_r^2$ must be consistent with 1. While performing the iterations, if
 the $\chi_r^2$ value exceed 1.5, the cycle breaks.
 
 Clearly, once again a better estimation is achieved with a greater number of
diff --git a/notes/sections/6.md b/notes/sections/6.md
index 007672c..647d739 100644
--- a/notes/sections/6.md
+++ b/notes/sections/6.md
@@ -85,21 +85,21 @@ omitted:
 
 The so obtained sample was binned and stored in a histogram with a customizable
 number $n$ of bins (default set $n = 150$) ranging from $\theta = 0$ to $\theta
-= \pi/2$ bacause of the system symmetry. In @fig:original an example is shown.
+= \pi/2$ because of the system symmetry. In @fig:original an example is shown.
 
 ![Example of intensity histogram.](images/6-original.pdf){#fig:original}
 
 ## Gaussian convolution {#sec:convolution}
 
-In order to simulate the instrumentation response, the sample was then to be
-smeared with a Gaussian function with the aim to recover the original sample
-afterwards, implementing a deconvolution routine.
+In order to simulate the instrumentation response, the sample was then smeared
+with a Gaussian function, with the aim of recovering the original sample
+afterwards by implementing a deconvolution routine.
 For this purpose, a 'kernel' histogram with an even number $m$ of bins and the
 same bin width of the previous one, but a smaller number of them ($m \sim 6\%
-\, n$), was created according to a Gaussian distribution with mean $\mu$,
+\, n$), was generated according to a Gaussian distribution with mean $\mu = 0$
 and variance $\sigma$. The reason why the kernel was set this way will be
-descussed lately.
+discussed shortly.
 Then, the original histogram was convolved with the kernel in order to obtain
 the smeared signal. As an example, the result obtained for $\sigma = \Delta
 \theta$, where $\Delta \theta$ is the bin width, is shown in @fig:convolved.
@@ -121,11 +121,11 @@ down to an element wise product between $s$ and the flipped histogram of $k$
 histograms. Namely, if $c_i$ is the $i^{\text{th}}$ bin of the convolved
 histogram:
 $$
-  c_i = \sum_j k_j s_{i - j} = \sum_{j'} k_{m - j'} s_{i - m + j'}
-  \with j' = m - j
+  c_i = \sum_{j = 0}^{m - 1} k_j s_{i - j}
+      = \sum_{j' = m - 1}^{0} k_{m - 1 - j'} s_{i - m + 1 + j'}
+  \with j' = m - 1 - j
$$
-where $j$ runs over the bins of the kernel.
 
 For a better understanding, see @fig:dot_conv: the third histogram turns out
 with $n + m - 1$ bins, a number greater than the original one.
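As an illustration of the formula above, a small stand-alone C sketch of the direct convolution, producing the $n + m - 1$ bins just mentioned; the toy signal and kernel are arbitrary and this is not the exercise code.

```c
/* Illustrative sketch (not the exercise code): direct convolution of a
 * signal s[n] with a kernel k[m], c_i = sum_j k[j] * s[i - j], giving a
 * result with n + m - 1 bins. */
#include <stdio.h>
#include <stddef.h>

void convolve(double *c, const double *s, size_t n, const double *k, size_t m)
{
  for (size_t i = 0; i < n + m - 1; i++) {
    c[i] = 0;
    for (size_t j = 0; j < m; j++)        /* j runs over the kernel bins   */
      if (i >= j && i - j < n)            /* skip terms outside the signal */
        c[i] += k[j] * s[i - j];
  }
}

int main(void)
{
  double s[] = {0, 1, 2, 3, 2, 1, 0};     /* toy signal, n = 7 */
  double k[] = {0.25, 0.5, 0.25};         /* toy kernel, m = 3 */
  double c[7 + 3 - 1];
  convolve(c, s, 7, k, 3);
  for (size_t i = 0; i < 9; i++)
    printf("c[%zu] = %g\n", i, c[i]);
  return 0;
}
```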
@@ -187,8 +187,8 @@ with $n + m - 1$ bins, a number greater than the original one.
 % nodes
     \node [above] at (2.25,-5.5) {$c_i$};
     \node [above] at (3.25,0) {$s_i$};
-    \node [above] at (1.95,0) {$s_{i-3}$};
-    \node [below] at (1.75,-1) {$k_3$};
+    \node [above] at (1.95,0) {$s_{i-j}$};
+    \node [below] at (1.75,-1) {$k_j$};
   \end{tikzpicture}
 \caption{Element wise product as a step of the convolution between the
 original signal (above) and the kernel (center). The final result is the lower
@@ -210,13 +210,13 @@ $$
 
 where $\hat{F}[\quad]$ stands for the Fourier transform of its argument.
 Being the histogram a discrete set of data, the Discrete Fourier Transform (DFT)
-was appied. When dealing with arrays of discrete values, the theorems still
-holds if the two arrays have the same lenght and a cyclical convolution is
+was applied. When dealing with arrays of discrete values, the theorem still
+holds if the two arrays have the same length and a cyclical convolution is
 applied. For this reason the kernel was 0-padded in order to make it the same
-lenght of the original signal. Besides, the 0-padding allows to avoid unpleasant
+length as the original signal. Besides, the 0-padding helps avoid unpleasant
 side effects due to the cyclical convolution.
 
-In order to accomplish this procedure, every histogram was transformed into a
-vector. The implementation lies in the computation of the Fourier trasform of
+In order to accomplish this procedure, both histograms were transformed into
+vectors. The implementation consists in computing the Fourier transform of
 the smeared signal and the kernel, the ratio between their transforms and the
 anti-transformation of the result:
 $$
@@ -225,16 +225,17 @@ $$
 $$
 
 The FFT are efficient algorithms for calculating the DFT. Given a set of $n$
-values {$z_i$}, each one is transformed into:
+values {$z_j$}, each one is transformed into:
 $$
   x_j = \sum_{k=0}^{n-1} z_k \exp \left( - \frac{2 \pi i j k}{n} \right)
 $$
+where $i$ is the imaginary unit.
 
 The evaluation of the DFT is a matrix-vector multiplication $W \vec{z}$. A
 general matrix-vector multiplication takes $O(n^2)$ operations. FFT algorithms,
-instad, use a divide-and-conquer strategy to factorize the matrix into smaller
-sub-matrices. If $n$ can be factorized into a product of integers $n_1$, $n_2
-\ldots n_m$, then the DFT can be computed in $O(n \sum n_i) < O(n^2)$
+instead, use a *divide-and-conquer* strategy to factorize the matrix into
+smaller sub-matrices. If $n$ can be factorized into a product of integers $n_1$,
+$n_2 \ldots n_m$, then the DFT can be computed in $O(n \sum n_i) < O(n^2)$
 operations, hence the name. The inverse Fourier transform is thereby defined
 as:
 $$
@@ -243,13 +244,13 @@ $$
 $$
 
 In GSL, `gsl_fft_complex_forward()` and `gsl_fft_complex_inverse()` are
-functions which allow to compute the foreward and inverse transform,
+functions which compute the forward and inverse transform,
 respectively.
 
 The inputs and outputs for the complex FFT routines are packed arrays of
-floating point numbers. In a packed array the real and imaginary parts of
-each complex number are placed in alternate neighboring elements. In this
-special case, the sequence of values to be transformed is made of real numbers,
-but the Fourier transform is not real: it is a complex sequence wich satisfies:
+floating point numbers. In a packed array, the real and imaginary parts of each
+complex number are placed in alternate neighboring elements. In this special
+case, the sequence of values to be transformed is made of real numbers, hence
+the Fourier transform is a complex sequence which satisfies:
 $$
   z_k = z^*_{n-k}
 $$
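A minimal usage sketch of the packed-array format and of the symmetry $z_k = z^*_{n-k}$ just stated, based on `gsl_fft_complex_forward()`; the test signal, the `RE`/`IM` macros and the printout are illustrative choices, not part of the exercise code.

```c
/* Illustrative sketch (not the exercise code): transform a short real
 * sequence with gsl_fft_complex_forward() and inspect the symmetry
 * z_k = z*_{n-k} expected for real input. The data are stored in the
 * packed format described above: real and imaginary parts alternate. */
#include <stdio.h>
#include <gsl/gsl_fft_complex.h>

#define N 8
#define RE(z, i) ((z)[2 * (i)])
#define IM(z, i) ((z)[2 * (i) + 1])

int main(void)
{
  double data[2 * N];
  for (int i = 0; i < N; i++) {          /* a real-valued test signal */
    RE(data, i) = i < 3 ? 1.0 : 0.0;
    IM(data, i) = 0.0;
  }

  gsl_fft_complex_wavetable *wt = gsl_fft_complex_wavetable_alloc(N);
  gsl_fft_complex_workspace *ws = gsl_fft_complex_workspace_alloc(N);
  gsl_fft_complex_forward(data, 1, N, wt, ws);

  for (int k = 1; k < N; k++)            /* compare z_k with z*_{n-k} */
    printf("z_%d = %+.3f %+.3fi   z*_%d = %+.3f %+.3fi\n",
           k, RE(data, k), IM(data, k),
           N - k, RE(data, N - k), -IM(data, N - k));

  gsl_fft_complex_wavetable_free(wt);
  gsl_fft_complex_workspace_free(ws);
  return 0;
}
```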
@@ -260,7 +261,7 @@ forward transform (from real to half-complex) and inverse transform (from
 half-complex to real). As a consequence, the routines are divided into two
 sets: `gsl_fft_real` and `gsl_fft_halfcomplex`. The symmetry of the
 half-complex sequence requires only half of the complex numbers in the output
 to be stored.
-This works for all lengths: when the length is even, the middle value is real.
+This works for all lengths: when the length is even, the middle value $z_{n/2}$ is also real.
 Thus, only $n$ real numbers are required to store the half-complex sequence
 (half for the real part and half for the imaginary).
@@ -293,7 +294,7 @@ Thus, only $n$ real numbers are required to store the half-complex sequence
     \end{scope}
   \end{tikzpicture}
 \caption{The histogram on the right shows how the real numbers histogram on the
-  left is handled by the dedicated GSL functions`.}\label{fig:reorder}
+  left is handled by the dedicated GSL functions.}\label{fig:reorder}
 }
 \end{figure}
@@ -305,7 +306,7 @@ the array up to the middle and the negative backwards from the end of the array
 
 Whilst do not matters if the convolved histogram has positive or negative
 values, the kernel must be centered in zero in order to compute a correct
 convolution. This requires the kernel to be made of an ever number of bins
-in order to be possible to cut it into two same-lenght halves.
+so that it can be cut into two same-length halves.
 
 When $\hat{F}[s * k]$ and $\hat{F}[k]$ are computed, they are given in the
 half-complex GSL format and their normal format must be restored in order to
@@ -318,24 +319,24 @@ code.
 At the end, the external bins which exceed the original signal size are cut
 away in order to restore the original number of bins $n$. Results will be
-discussed in @sec:conv_Results.
+discussed in @sec:conv_results.
 
 ## Unfolding with Richardson-Lucy
 
-The Richardson–Lucy (RL) deconvolution is an iterative procedure tipically used
+The Richardson–Lucy (RL) deconvolution is an iterative procedure typically used
 for recovering an image that has been blurred by a known 'point spread
 function'.
 
-Consider the problem of estimating the frequeny distribution $f(\xi)$ of a
+Consider the problem of estimating the frequency distribution $f(\xi)$ of a
 variable $\xi$ when the available measure is a sample {$x_i$} of points
-dronwn not by $f(x)$ but by another function $\phi(x)$ such that:
+drawn not from $f(x)$ but from another function $\phi(x)$ such that:
 $$
   \phi(x) = \int d\xi \, f(\xi) P(x | \xi)
 $$ {#eq:conv}
 
 where $P(x | \xi) \, d\xi$ is the probability (presumed known) that $x$ falls
-in the interval $(x, x + dx)$ when $\xi = \xi$. If the so-colled point spread
+in the interval $(x, x + dx)$ when $\xi = \xi$. If the so-called point spread
 function $P(x | \xi)$ follows a normal distribution with variance $\sigma$,
 namely:
 $$
@@ -371,12 +372,12 @@ $$
   f(\xi) = \int dx \, \phi(x) Q(\xi | x)
 $$ {#eq:second}
 
-Since $Q (\xi | x)$ depends on $f(\xi)$, @eq:second suggests a reiterative
+Since $Q (\xi | x)$ depends on $f(\xi)$, @eq:second suggests an iterative
 procedure for generating estimates of $f(\xi)$. With a guess for $f(\xi)$ and
 a known $P(x | \xi)$, @eq:first can be used to calculate and estimate for
 $Q (\xi | x)$.
 Then, taking the hint provided by @eq:second, an improved
-estimate for $f (\xi)$ is generated, using the observed sample {$x_i$} to give
-an approximation for $\phi$.
+estimate for $f (\xi)$ can be generated, using the observed sample {$x_i$} to
+give an approximation for $\phi$.
 Thus, if $f^t$ is the $t^{\text{th}}$ estimate, the $t^{\text{th + 1}}$ is:
 $$
   f^{t + 1}(\xi) = \int dx \, \phi(x) Q^t(\xi | x)
@@ -401,15 +402,15 @@ $$
 
 where $P^{\star}$ is the flipped point spread function [@lucy74].
 
 In this special case, the Gaussian kernel which was convolved with the original
-histogram stands for the point spread function and, dealing with discrete
-values, the division and multiplication are element wise and the convolution is
-to be carried out as described in @sec:convolution.
+histogram stands for the point spread function. Dealing with discrete values,
+the division and multiplication are element wise and the convolution is to be
+carried out as described in @sec:convolution.
 When implemented, this method results in an easy step-wise routine:
 
-  - create a flipped copy of the kernel;
   - choose a zero-order estimate for {$f(\xi)$};
+  - create a flipped copy of the kernel;
   - compute the convolutions, the product and the division at each step;
-  - proceed until a given number of reiterations is achieved.
+  - proceed until a given number $r$ of iterations is reached.
 
 In this case, the zero-order was set $f(\xi) = 0.5 \, \forall \, \xi$.
 Different number of iterations where tested. Results are discussed in
@@ -450,10 +451,9 @@ $$
   W (P, Q, F) = \sum_{i = 1}^m \sum_{j = 1}^n f_{ij} d_{ij}
 $$
 
-The fact is that the $Q$ region is to be considerd empty at the beginning: the
-'dirt' present in $P$ must be moved to $Q$ in order to reach the same
-distribution as close as possible. Namely, the following constraints must be
-satisfied:
+The $Q$ region is to be considered empty at the beginning: the 'dirt' present in
+$P$ must be moved to $Q$ so as to match the $Q$ distribution as closely as
+possible. Namely, the following constraints must be satisfied:
 
 \begin{align*}
   &\text{1.} \hspace{20pt} f_{ij} \ge 0 \hspace{15pt}
@@ -474,7 +474,7 @@ second limits the amount of dirt moved by each position in $P$ in order to not
 exceed the available quantity; the third sets a limit to the dirt moved to each
 position in $Q$ in order to not exceed the required quantity and the last one
 forces to move the maximum amount of supplies possible: either all the dirt
-present in $P$ has be moved, or the $Q$ distibution is obtained.
+present in $P$ has been moved or the $Q$ distribution has been obtained.
 The total moved amount is the total flow. If the two distributions have the
 same amount of dirt, hence all the dirt present in $P$ is necessarily moved to
 $Q$ and the flow equals the total amount of available dirt.
@@ -487,9 +487,9 @@ $$
   {\sum_{i = 1}^m \sum_{j=1}^n f_{ij}}
 $$
 
-In this case, where the EMD has to be applied to two same-lenght histograms, the
-procedure simplifies a lot. By representing both histograms with two vectors $u$
-and $v$, the equation above boils down to [@ramdas17]:
+In this case, where the EMD is to be measured between two same-length
+histograms, the procedure simplifies considerably. By representing both
+histograms as two vectors $u$ and $v$, the equation above boils down to
+[@ramdas17]:
 
 $$
   \text{EMD} (u, v) = \sum_i |U_i - V_i|
@@ -497,7 +497,7 @@ $$
 
 where the sum runs over the entries of the vectors $U$ and $V$, which are the
 cumulative vectors of the histograms.
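A minimal C sketch of this cumulative-sum formula for two already-normalized, same-length histograms; the function name and the toy data are illustrative, not the project's.

```c
/* Illustrative sketch: EMD(u, v) = sum_i |U_i - V_i| between two normalized,
 * same-length histograms, where U and V are the running cumulative sums. */
#include <stdio.h>
#include <math.h>
#include <stddef.h>

double emd(const double *u, const double *v, size_t n)
{
  double U = 0, V = 0, dist = 0;
  for (size_t i = 0; i < n; i++) {
    U += u[i];            /* cumulative vectors built on the fly */
    V += v[i];
    dist += fabs(U - V);
  }
  return dist;
}

int main(void)
{
  /* two toy normalized 5-bin histograms */
  double u[] = {0.1, 0.2, 0.4, 0.2, 0.1};
  double v[] = {0.2, 0.2, 0.2, 0.2, 0.2};
  printf("EMD = %g\n", emd(u, v, 5));
  return 0;
}
```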
 In the code, the following equivalent
-recursive routine was implemented.
+iterative routine was implemented.
 $$
   \text{EMD} (u, v) = \sum_i |\text{EMD}_i|
   \with
@@ -526,35 +526,202 @@ In fact:
 \end{align*}
 
 This simple formula enabled comparisons to be made between a great number of
-results.
+results.
+In order to make the code more flexible, the data were normalized before
+computing the EMD: in doing so, it is also possible to compare samples with
+different numbers of points.
+
 
 ## Results comparison {#sec:conv_results}
 
-As can be seen, increasing the value of $\sigma$ implies a stronger smoothing of
-the curve. The FFT deconvolution process seems not to be affected by $\sigma$
-amplitude changes: it always gives the same outcome, which is exactly the
-original signal. In fact, the FFT is the analitical result of the deconvolution.
-In the real world, it is unpratical, since signals are inevitably blurred by
-noise.
-The same can't be said about the RL deconvolution, which, on the other hand,
-looks heavily influenced by the variance magnitude: the greater $\sigma$, the
-worse the deconvolved result. In fact, given the same number of steps, the
-deconvolved signal is always the same 'distance' far form the convolved one:
-if it very smooth, the deconvolved signal is very smooth too and if the
-convolved is less smooth, it is less smooth too.
-The original signal is shown below for convenience.
+### Noiseless results {#sec:noiseless}
+
+Along with the analysis of the results obtained by varying the width $\sigma$
+of the smearing Gaussian, the possibility of adding Gaussian noise to the
+convolved histogram was also implemented, in order to check whether the
+deconvolution is affected by this kind of interference. This approach is
+described in the next subsection, while the noiseless results are given in
+this one.
+
+The two methods were compared for three different values of $\sigma$:
+$$
+  \sigma = 0.1 \, \Delta \theta \et
+  \sigma = 0.5 \, \Delta \theta \et
+  \sigma = \Delta \theta
+$$
+
+Since the RL method depends on the number $r$ of rounds performed, the earth
+mover's distance between the deconvolved signal and the original one was
+measured for different values of $r$, for each of the three tested values of
+$\sigma$, in order to find out how many rounds were sufficient or necessary.
+To this end, 1000 experiments (a default, customizable value) were simulated:
+in each of them the original signal was convolved with the kernel with the
+appropriate $\sigma$, deconvolved with the RL algorithm for a given $r$, and
+the EMD was measured. Then the average and the standard deviation of the
+so-obtained EMDs were computed. This procedure was repeated for a few tens of
+different values of $r$, until a flattening or a minimum of the curve became
+evident. Results are shown in @fig:rounds-noiseless.
+
+The plots in @fig:rless-0.1 show the average (red) and standard deviation
+(grey) of the measured EMD for $\sigma = 0.1 \, \Delta \theta$. The number of
+iterations barely affects the quality of the outcome (the residual
+fluctuations are merely an effect of floating-point precision) and the best
+result is obtained for $r = 2$, meaning that the RL algorithm converges very
+quickly because the histogram was only slightly smeared.
+In @fig:rless-0.5, the curve starts to flatten at about 10 rounds, whereas in
+@fig:rless-1 a minimum occurs around \SI{5e3}{} rounds, meaning that, with such
+a large kernel, the convergence is very slow, even if the best results are
+close to the one found for $\sigma = 0.5 \, \Delta \theta$.
+The following values of $r$ were chosen as the best suited:
+\begin{align*}
+  \sigma = 0.1 \, \Delta \theta &\thus r^{\text{best}} = 2 \\
+  \sigma = 0.5 \, \Delta \theta &\thus r^{\text{best}} = 10 \\
+  \sigma = 1 \, \Delta \theta &\thus r^{\text{best}} = \SI{5e3}{}
+\end{align*}
+
+Note the difference in the order of magnitude between @fig:rless-0.1 and the
+plots obtained for $\sigma = 0.5 \, \Delta \theta$ and $\sigma = \Delta \theta$:
+the RL deconvolution is heavily influenced by the magnitude of $\sigma$, and
+the greater $\sigma$, the worse the deconvolved result.
+On the other hand, the FFT deconvolution procedure is not affected by changes
+in the $\sigma$ amplitude: it always gives the same outcome, which would be
+exactly the original signal if floating point precision did not affect the
+result. In fact, the FFT is the analytical result of the deconvolution.
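For reference, a compact, self-contained C sketch of the discrete Richardson–Lucy update described in the unfolding section above, $f \leftarrow f \cdot [(\phi / (f * P)) * P^{\star}]$, with element-wise products and a zero-padded convolution; the kernel centring, the guard against division by zero and all names are illustrative assumptions, not the project's implementation.

```c
/* Illustrative sketch (not the project's code) of the discrete
 * Richardson-Lucy update  f <- f * [ (phi / (f conv P)) conv P* ],
 * where P* is the flipped kernel and all products and divisions are
 * element-wise. The kernel is assumed to be centred on bin m/2. */
#include <stddef.h>
#include <stdlib.h>

/* zero-padded convolution: out[j] = sum_l ker[l] * in[j + m/2 - l] */
static void convolve(double *out, const double *in, size_t n,
                     const double *ker, size_t m)
{
  for (size_t j = 0; j < n; j++) {
    double s = 0;
    for (size_t l = 0; l < m; l++) {
      long idx = (long)j + (long)(m / 2) - (long)l;
      if (idx >= 0 && idx < (long)n)
        s += ker[l] * in[idx];
    }
    out[j] = s;
  }
}

/* 'rounds' iterations of RL deconvolution of the smeared histogram phi[n]
 * with kernel ker[m]; f[n] holds the current estimate and must be
 * initialized by the caller (e.g. f[i] = 0.5 for every i, as in the text). */
void richardson_lucy(double *f, const double *phi, size_t n,
                     const double *ker, size_t m, int rounds)
{
  double *blur  = malloc(n * sizeof(double));
  double *ratio = malloc(n * sizeof(double));
  double *corr  = malloc(n * sizeof(double));
  double *flip  = malloc(m * sizeof(double));
  for (size_t l = 0; l < m; l++)
    flip[l] = ker[m - 1 - l];                 /* P*: flipped kernel      */
  for (int t = 0; t < rounds; t++) {
    convolve(blur, f, n, ker, m);             /* f conv P                */
    for (size_t j = 0; j < n; j++)            /* phi / (f conv P)        */
      ratio[j] = blur[j] > 0 ? phi[j] / blur[j] : 0;
    convolve(corr, ratio, n, flip, m);        /* ... conv P*             */
    for (size_t i = 0; i < n; i++)
      f[i] *= corr[i];                        /* element-wise update     */
  }
  free(blur); free(ratio); free(corr); free(flip);
}
```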