diff --git a/notes/sections/6.md b/notes/sections/6.md
index a9c5cc0..54d0eed 100644
--- a/notes/sections/6.md
+++ b/notes/sections/6.md
@@ -4,21 +4,20 @@
 
The diffraction of a plane wave through a round slit is to be simulated by
generating $N =$ 50'000 points according to the intensity distribution
-$I(\theta)$ [@hecht02] on a screen at a great distance $L$ from the slit itself
+$I(\theta)$ [@hecht02] on a screen at a great distance from the slit itself
(see @fig:slit):
$$
  I(\theta) = \frac{E^2}{2} \left( \frac{2 \pi a^2 \cos{\theta}}{L}
  \frac{J_1(x)}{x} \right)^2 \with x = k a \sin{\theta}
$$
-
where:
-
-- $E$ is the electric field amplitude, default set $E = \SI{1e4}{V/m}$;
-- $a$ is the radius of the slit aperture, default set $a = \SI{0.01}{m}$;
-- $\theta$ is the angle specified in @fig:slit;
-- $J_1$ is the Bessel function of first kind;
-- $k$ is the wavenumber, default set $k = \SI{1e-4}{m^{-1}}$;
-- $L$ default set $L = \SI{1}{m}$.
+- $E$ is the electric field amplitude, default $E = \SI{1e4}{V/m}$;
+- $a$ is the radius of the slit aperture, default $a = \SI{0.01}{m}$;
+- $\theta$ is the diffraction angle, shown in @fig:slit;
+- $J_1$ is the Bessel function of the first kind;
+- $k$ is the wavenumber, default $k = \SI{1e-4}{m^{-1}}$;
+- $L$ is the distance from the screen, default $L = \SI{1}{m}$.
 
\begin{figure}
\hypertarget{fig:slit}{%
@@ -52,9 +51,9 @@ where:
 
Once again, the *hit-miss* method described in @sec:3 was implemented and
the same procedure about the generation of $\theta$ was applied. This time,
-though, $\theta$ must be evenly distributed on half sphere, hence:
+though, $\theta$ must be uniformly distributed on the half sphere, hence:
\begin{align*}
-  \frac{d^2 P}{d\omega^2} = const = \frac{1}{2 \pi}
+  \frac{d^2 P}{d\omega^2} = \frac{1}{2 \pi}
  &\thus d^2 P = \frac{1}{2 \pi} d\omega^2 = \frac{1}{2 \pi}
  d\phi \sin{\theta} d\theta \\
  &\thus \frac{dP}{d\theta} = \int_0^{2 \pi} d\phi \frac{1}{2 \pi} \sin{\theta}
@@ -64,13 +63,13 @@ though, $\theta$ must be evenly distributed on half sphere, hence:
\begin{align*}
  \theta = \theta (x) &\thus \frac{dP}{d\theta} = \frac{dP}{dx}
  \cdot \left| \frac{dx}{d\theta} \right|
-  = \left. \frac{dP}{dx} \middle/ \, \left| \frac{d\theta}{dx} \right| \right.
+  = \left. \frac{dP}{dx} \middle/ \, \left| \frac{d\theta}{dx} \right| \right. \\
  &\thus \sin{\theta} = \left. 1 \middle/ \, \left|
-  \frac{d\theta}{dx} \right| \right.
+  \frac{d\theta}{dx} \right| \right.
\end{align*}
-If $\theta$ is chosen to grew together with $x$, then the absolute value can be
+If $\theta$ is taken to increase with $x$, then the absolute value can be
omitted:
\begin{align*}
  \frac{d\theta}{dx} = \frac{1}{\sin{\theta}}
@@ -84,50 +83,59 @@ omitted:
\end{align*}
 
The so obtained sample was binned and stored in a histogram with a customizable
-number $n$ of bins (default set $n = 150$) ranging from $\theta = 0$ to $\theta
+number $n$ of bins (default $n = 150$) ranging from $\theta = 0$ to $\theta
= \pi/2$ because of the system symmetry. In @fig:original an example is shown.
 
![Example of intensity histogram.](images/6-original.pdf){#fig:original}
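+
+For reference, a minimal sketch of such a sampler is given below. It is not the
+code used for the project: it assumes GSL for the random number generator and
+for $J_1$, uses the default parameter values listed above, and the function
+names are purely illustrative.
+
+```c
+#include <math.h>
+#include <gsl/gsl_rng.h>
+#include <gsl/gsl_sf_bessel.h>
+
+/* Intensity I(θ) with the given parameters; only ratios matter for hit-miss. */
+double intensity(double theta, double E, double a, double k, double L) {
+  double x = k * a * sin(theta);
+  double j = (x != 0) ? gsl_sf_bessel_J1(x) / x : 0.5;  /* J1(x)/x → 1/2 for x → 0 */
+  double b = 2 * M_PI * a * a * cos(theta) / L * j;
+  return E * E / 2 * b * b;
+}
+
+int main(void) {
+  const double E = 1e4, a = 0.01, k = 1e-4, L = 1;      /* default values */
+  const double I_max = intensity(0, E, a, k, L);        /* the maximum is at θ = 0 */
+  gsl_rng *rng = gsl_rng_alloc(gsl_rng_taus2);
+  for (int accepted = 0; accepted < 50000; ) {
+    /* candidate uniform on the half sphere: θ = arccos(1 − u), u ∈ [0, 1) */
+    double theta = acos(1 - gsl_rng_uniform(rng));
+    /* hit-miss: keep θ with probability I(θ)/I_max */
+    if (gsl_rng_uniform(rng) * I_max < intensity(theta, E, a, k, L)) {
+      accepted++;              /* ...fill the θ histogram here... */
+    }
+  }
+  gsl_rng_free(rng);
+  return 0;
+}
+```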
-## Gaussian convolution {#sec:convolution}
+## Convolution {#sec:convolution}
 
-In order to simulate the instrumentation response, the sample was then smeared
-with a Gaussian function with the aim to recover the original sample afterwards,
-implementing a deconvolution routine.
+In order to simulate the instrumentation response, the sample was then
+convolved with a gaussian kernel, with the aim of recovering the original
+sample afterwards by means of a deconvolution routine.
 
For this purpose, a 'kernel' histogram with an even number $m$ of bins and the
same bin width as the previous one, but a smaller number of them ($m \sim 6\%
-\, n$), was generated according to a Gaussian distribution with mean $\mu = 0$
+\, n$), was generated according to a gaussian distribution with mean $\mu = 0$
and variance $\sigma$.
The reason why the kernel was set this way will be discussed shortly.
Then, the original histogram was convolved with the kernel in order to obtain
the smeared signal. As an example, the result obtained for $\sigma = \Delta
\theta$, where $\Delta \theta$ is the bin width, is shown in @fig:convolved.
The smeared signal looks smoother than the original one: the higher
-$\sigma$, the greater the smoothness. 
+$\sigma$, the greater the smoothness.
 
![Convolved signal.](images/6-smoothed.pdf){#fig:convolved}
 
The convolution was implemented as follows. Consider the definition of
-convolution of two functions $f(x)$ and $g(x)$:
+convolution for two integrable functions $f(x)$ and $g(x)$:
$$
-  f * g (x) = \int \limits_{- \infty}^{+ \infty} dy f(y) g(x - y)
+  (f * g)(x) = \int \limits_{- \infty}^{+ \infty} dy f(y) g(x - y)
$$
+This definition is easily recast into a form that lends itself to
+implementation for discrete arrays of numbers, such as histograms or vectors:
+\begin{align*}
+  (f * g)(x)
+  &= \int \limits_{- \infty}^{+ \infty} dy f(y) (R \, g)(y-x) \\
+  &= \int \limits_{- \infty}^{+ \infty} dy f(y) (T_x \, R \, g)(y) \\
+  &= (f, T_x \, R \, g)
+\end{align*}
 
-Since a histogram is made of discrete values, a discrete convolution of the
-signal ($s$) and the kernel ($k$) must be computed. Hence, the procedure boils
-down to an element wise product between $s$ and the flipped histogram of $k$
-(from the last bin to the first one) for each relative position of the two
-histograms. Namely, if $c_i$ is the $i^{\text{th}}$ bin of the convolved
-histogram:
-$$
-  c_i = \sum_{j = 0}^{m - 1} k_j s_{i - j}
-    = \sum_{j' = m - 1}^{0} k_{m - 1 - j'} s_{i - m + 1 + j'}
-  \with j' = m - 1 - j
-$$
+where:
 
-For a better understanding, see @fig:dot_conv: the third histogram turns out
-with $n + m - 1$ bins, a number greater than the original one.
+ - $R$ and $T_x$ are the reflection and the translation-by-$x$ operators;
+ - $(\cdot, \cdot)$ is an inner product.
+
+Given a signal $s$ of $n$ elements and a kernel $k$ of $m$ elements,
+their convolution is a vector of $n + m - 1$ elements computed
+by flipping $k$ ($R$ operator) and shifting its indices ($T_i$ operator):
+$$
+  c_i = (s, T_i \, R \, k)
+$$
+The shift is defined such that, when an index falls outside the kernel range
+($\ge m$ or $< 0$), the corresponding element is taken to be zero. This
+convention specifies the behavior at the edges and accounts for the $m - 1$
+increase in size.
+For a better understanding, see @fig:dot_conv.
 
\begin{figure}
\hypertarget{fig:dot_conv}{%
@@ -197,31 +205,31 @@
\end{figure}
 
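+Concretely, a possible implementation of this discrete convolution (an
+illustrative sketch with a made-up helper name, not the project code) is:
+
+```c
+#include <stddef.h>
+
+/* Convolve a signal s of n bins with a kernel k of m bins.
+ * The output c has n + m - 1 bins; bins of s outside [0, n) count as zero. */
+void convolve(const double *s, size_t n,
+              const double *k, size_t m,
+              double *c) {
+  for (size_t i = 0; i < n + m - 1; i++) {
+    c[i] = 0;
+    for (size_t j = 0; j < m; j++) {
+      /* c_i = Σ_j k_j s_{i-j}, i.e. the inner product (s, T_i R k) */
+      if (i >= j && i - j < n)
+        c[i] += k[j] * s[i - j];
+    }
+  }
+}
+```
+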
-## Unfolding with FFT
+## Deconvolution by Fourier transform
 
Two different unfolding routines were implemented, one of which exploits the
Fast Fourier Transform (FFT).
 
-This method is based on the convolution theorem, according to which, given two
-functions $f(x)$ and $g(x)$:
+This method is based on the convolution theorem, which states that, given two
+$L^1$ functions $f(x)$ and $g(x)$:
$$
-  \hat{F}[f * g] = \hat{F}[f] \cdot \hat{F}[g]
+  \mathcal{F}[f * g] = \mathcal{F}[f] \cdot \mathcal{F}[g]
$$
-where $\hat{F}[\quad]$ stands for the Fourier transform of its argument.
+where $\mathcal{F}[\cdot]$ stands for the Fourier transform.
Since the histogram is a discrete set of data, the Discrete Fourier Transform (DFT)
was applied. When dealing with arrays of discrete values, the theorem still
-holds if the two arrays have the same length and a cyclical convolution is
-applied. For this reason the kernel was 0-padded in order to make it the same
-length of the original signal. Besides, the 0-padding allows to avoid unpleasant
-side effects due to the cyclical convolution.
+holds if the two arrays have the same length and $(*)$ is understood
+as a cyclical convolution.
+For this reason, the kernel was 0-padded to make it the same length as the
+original signal, which at the same time avoids the unwanted side effects of
+the cyclical convolution.
In order to accomplish this procedure, both histograms were transformed into
vectors. The implementation lies in the computation of the Fourier transform of
the smeared signal and the kernel, the ratio between their transforms and the
anti-transformation of the result:
$$
-  \hat{F}[s * k] = \hat{F}[s] \cdot \hat{F}[k] \thus
-  \hat{F} [s] = \frac{\hat{F}[s * k]}{\hat{F}[k]}
+  \mathcal{F}[s * k] = \mathcal{F}[s] \cdot \mathcal{F}[k] \thus
+  \mathcal{F} [s] = \frac{\mathcal{F}[s * k]}{\mathcal{F}[k]}
$$
 
FFTs are efficient algorithms for calculating the DFT. Given a set of $n$
@@ -233,8 +241,8 @@ $$
where $i$ is the imaginary unit.
The evaluation of the DFT is a matrix-vector multiplication $W \vec{z}$. A
general matrix-vector multiplication takes $O(n^2)$ operations. FFT algorithms,
-instead, use a *divide-and-conquer* strategy to factorize the matrix into
-smaller sub-matrices. If $n$ can be factorized into a product of integers $n_1$,
+instead, use a *divide-and-conquer* strategy to factorise the matrix into
+smaller sub-matrices. If $n$ can be factorised into a product of integers $n_1$,
$n_2 \ldots n_m$, then the DFT can be computed in $O(n \sum n_i) < O(n^2)$
operations, hence the name.
The inverse Fourier transform is thereby defined as:
$$
@@ -243,14 +251,14 @@ $$
  \sum_{k=0}^{n-1} x_k \exp \left( \frac{2 \pi i j k}{n} \right)
$$
 
-In GSL, `gsl_fft_complex_forward()` and `gsl_fft_complex_inverse()` are 
+In GSL, `gsl_fft_complex_forward()` and `gsl_fft_complex_inverse()` are
functions which compute the forward and inverse transform,
respectively.
The inputs and outputs for the complex FFT routines are packed arrays of
floating point numbers. In a packed array, the real and imaginary parts of
each complex number are placed in alternate neighboring elements. In this special
-case, the sequence of values to be transformed is made of real numbers, hence
-Fourier transform is a complex sequence which satisfies:
+case where the sequence of values to be transformed is made of real numbers,
+the Fourier transform is a complex sequence which satisfies:
$$
  z_k = z^*_{n-k}
$$
@@ -303,30 +311,29 @@ If the bin width is $\Delta \theta$, then the DFT domain ranges from $-1 / (2
aforementioned GSL functions store the positive values from the beginning of
the array up to the middle and the negative ones backwards from the end of the
array (see @fig:reorder).
-Whilst do not matters if the convolved histogram has positive or negative
-values, the kernel must be centered in zero in order to compute a correct
+While the order of frequencies of the convolved histogram is immaterial,
+the kernel must be centered at zero in order to compute a correct
convolution. This requires the kernel to be made of an even number of bins
-in order to be possible to cut it into two same-length halves.
+to be divided into two equal halves.
 
-When $\hat{F}[s * k]$ and $\hat{F}[k]$ are computed, they are given in the
-half-complex GSL format and their normal format must be restored in order to
-use them as standard complex numbers and compute the ratio between them. Then,
-the result must return in the half-complex format for the inverse DFT
+When $\mathcal{F}[s * k]$ and $\mathcal{F}[k]$ are computed, they are given in the
+half-complex GSL packed format, so they must be unpacked to a complex
+GSL vector before performing the element-wise division. Then,
+the result is repacked to the half-complex format for the inverse DFT
computation. GSL provides the function `gsl_fft_halfcomplex_unpack()` which
converts the vectors from half-complex format to standard complex format, but the
-inverse procedure is not provided by GSL and was hence implemented in the
-code.
+inverse procedure is not provided by GSL and had to be implemented.
 
-At the end, the external bins which exceed the original signal size are cut
+In the end, the external bins which exceed the original signal size are cut
away in order to restore the original number of bins $n$. Results will be
-discussed in @sec:conv_results.
+discussed in @sec:conv-results.
 
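+As an illustration of these steps, the sketch below carries out the division in
+Fourier space. It is an assumption-laden example rather than the project code:
+it uses GSL's real/half-complex routines (`gsl_fft_real_transform()` and
+`gsl_fft_halfcomplex_inverse()`) instead of the complex ones named above, it
+assumes the kernel has already been 0-padded and re-centred as described, it
+omits error checking, and it does not guard against a vanishing
+$\mathcal{F}[k]$. The repacking loop is the inverse of
+`gsl_fft_halfcomplex_unpack()` that GSL does not provide.
+
+```c
+#include <stdlib.h>
+#include <gsl/gsl_fft_real.h>
+#include <gsl/gsl_fft_halfcomplex.h>
+#include <gsl/gsl_complex.h>
+#include <gsl/gsl_complex_math.h>
+
+/* Deconvolve in place: `signal` holds the convolved bins, `kernel` the
+ * 0-padded and re-centred kernel; both have length n. */
+void fft_deconvolve(double *signal, double *kernel, size_t n) {
+  gsl_fft_real_wavetable *rwt = gsl_fft_real_wavetable_alloc(n);
+  gsl_fft_halfcomplex_wavetable *hwt = gsl_fft_halfcomplex_wavetable_alloc(n);
+  gsl_fft_real_workspace *ws = gsl_fft_real_workspace_alloc(n);
+
+  /* forward DFTs: both arrays are now in half-complex packing */
+  gsl_fft_real_transform(signal, 1, n, rwt, ws);
+  gsl_fft_real_transform(kernel, 1, n, rwt, ws);
+
+  /* unpack to ordinary packed complex arrays (re, im, re, im, ...) */
+  double *s = malloc(2 * n * sizeof(double));
+  double *k = malloc(2 * n * sizeof(double));
+  gsl_fft_halfcomplex_unpack(signal, s, 1, n);
+  gsl_fft_halfcomplex_unpack(kernel, k, 1, n);
+
+  /* element-wise ratio F[s*k] / F[k], stored back into s */
+  for (size_t i = 0; i < n; i++) {
+    gsl_complex num = gsl_complex_rect(s[2*i], s[2*i + 1]);
+    gsl_complex den = gsl_complex_rect(k[2*i], k[2*i + 1]);
+    gsl_complex r   = gsl_complex_div(num, den);
+    s[2*i]     = GSL_REAL(r);
+    s[2*i + 1] = GSL_IMAG(r);
+  }
+
+  /* repack into half-complex order: z_0 first, then (Re, Im) pairs, z_{n/2} last */
+  signal[0] = s[0];
+  for (size_t i = 1; 2*i < n; i++) {
+    signal[2*i - 1] = s[2*i];
+    signal[2*i]     = s[2*i + 1];
+  }
+  if (n % 2 == 0) signal[n - 1] = s[n];
+
+  /* inverse DFT gives back the (still 0-padded) deconvolved signal */
+  gsl_fft_halfcomplex_inverse(signal, 1, n, hwt, ws);
+
+  free(s); free(k);
+  gsl_fft_real_wavetable_free(rwt);
+  gsl_fft_halfcomplex_wavetable_free(hwt);
+  gsl_fft_real_workspace_free(ws);
+}
+```
+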
-## Unfolding with Richardson-Lucy
+## Richardson-Lucy deconvolution
 
The Richardson–Lucy (RL) deconvolution is an iterative procedure typically
used for recovering an image that has been blurred by a known 'point spread
-function'.
+function', or 'kernel'.
 
Consider the problem of estimating the frequency distribution $f(\xi)$ of a
variable $\xi$ when the available measure is a sample {$x_i$} of points
@@ -337,15 +344,15 @@
$$ {#eq:conv}
where $P(x | \xi) \, dx$ is the probability (presumed known) that $x$ falls in
the interval $(x, x + dx)$ when the true value is $\xi$. If the so-called
point spread
-function $P(x | \xi)$ follows a normal distribution with variance $\sigma$,
-namely:
+function $P(x | \xi)$ is a function of $x - \xi$ only, for example a normal
+distribution with variance $\sigma$,
$$
  P(x | \xi) = \frac{1}{\sqrt{2 \pi} \sigma}
  \exp \left( - \frac{(x - \xi)^2}{2 \sigma^2} \right)
$$
-then, @eq:conv becomes a convolution and finding $f(\xi)$ turns out to be a
-deconvolution.
+then, @eq:conv becomes a convolution and finding $f(\xi)$ amounts
+to a deconvolution.
An example of this problem is precisely that of correcting an observed
distribution $\phi(x)$ for the effect of observational errors, which are
represented by the function $P (x | \xi)$.
 
@@ -376,16 +383,15 @@
Since $Q (\xi | x)$ depends on $f(\xi)$, @eq:second suggests an iterative
procedure for generating estimates of $f(\xi)$. With a guess for $f(\xi)$ and
a known $P(x | \xi)$, @eq:first can be used to calculate an estimate for
$Q (\xi | x)$. Then, taking the hint provided by @eq:second, an improved
-estimate for $f (\xi)$ can be generated, using the observed sample {$x_i$} to
+estimate for $f(\xi)$ can be generated, using the observed sample {$x_i$} to
give an approximation for $\phi$.
-Thus, if $f^t$ is the $t^{\text{th}}$ estimate, the $t^{\text{th + 1}}$ is:
+Thus, if $f^t$ is the $t^{\text{th}}$ estimate, the next one is given by:
$$
  f^{t + 1}(\xi) = \int dx \, \phi(x) Q^t(\xi | x) \with
  Q^t(\xi | x) = \frac{f^t(\xi) \cdot P(x | \xi)}
                      {\int d\xi \, f^t(\xi) P(x | \xi)}
$$
-
from which:
$$
  f^{t + 1}(\xi) = f^t(\xi)
@@ -393,18 +399,17 @@ $$
  P(x | \xi)
$$ {#eq:solution}
 
-When the spread function $P(x | \xi)$ is Gaussian, @eq:solution can be
+When the spread function is of the form $P(x | \xi) = P(x - \xi)$, @eq:solution can be
rewritten in terms of convolutions:
$$
  f^{t + 1} = f^{t}\left( \frac{\phi}{{f^{t}} * P} * P^{\star} \right)
$$
-
where $P^{\star}$ is the flipped point spread function [@lucy74].
 
-In this special case, the Gaussian kernel which was convolved with the original
-histogram stands for the point spread function. Dealing with discrete values,
-the division and multiplication are element wise and the convolution is to be
-carried out as described in @sec:convolution.
+In this particular instance, the gaussian kernel which was convolved with the
+original histogram plays the role of the point spread function. Again, dealing
+with discrete arrays of numbers, the division and multiplication are element
+wise and the convolution is to be carried out as described in @sec:convolution.
 
When implemented, this method results in an easy step-wise routine:
 
- choose a zero-order estimate for {$f(\xi)$};
@@ -412,9 +417,9 @@ When implemented, this method results in an easy step-wise routine:
- compute the convolutions, the product and the division at each step;
- proceed until a given number $r$ of iterations is reached.
 
-In this case, the zero-order was set $f(\xi) = 0.5 \, \forall \, \xi$. Different
-number of iterations where tested. Results are discussed in
-@sec:conv_results.
+In this case, the zero-order estimate was set to $f(\xi) = 0.5 \, \forall \, \xi$
+and different numbers of iterations were tested. Results are discussed in
+@sec:conv-results.
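+
+As a minimal illustration of this loop, the sketch below implements the update
+rule for binned data. It is not the project code: the helper names are made up,
+and a 'same-size' convolution with the kernel centred on its middle bin is
+assumed as the edge convention; since the gaussian kernel is symmetric,
+$P^{\star} = P$ and the same helper serves both convolutions.
+
+```c
+#include <stdlib.h>
+
+/* 'Same-size' convolution with a centred kernel:
+ * out[i] = Σ_j k[j] · in[i + m/2 − j]; out-of-range bins of `in` count as 0. */
+static void convolve_same(const double *in, size_t n,
+                          const double *k, size_t m, double *out) {
+  for (size_t i = 0; i < n; i++) {
+    out[i] = 0;
+    for (size_t j = 0; j < m; j++) {
+      long idx = (long)i + (long)(m / 2) - (long)j;
+      if (idx >= 0 && idx < (long)n)
+        out[i] += k[j] * in[idx];
+    }
+  }
+}
+
+/* Richardson-Lucy: `phi` is the smeared histogram (n bins), `P` the kernel
+ * (m bins); the deconvolved estimate is returned in `f` (n bins). */
+void richardson_lucy(const double *phi, size_t n,
+                     const double *P, size_t m,
+                     double *f, size_t rounds) {
+  double *den   = malloc(n * sizeof(double));   /* f^t * P         */
+  double *ratio = malloc(n * sizeof(double));   /* phi / (f^t * P) */
+  double *corr  = malloc(n * sizeof(double));   /* ratio * P*      */
+
+  for (size_t i = 0; i < n; i++) f[i] = 0.5;    /* zero-order estimate */
+
+  for (size_t t = 0; t < rounds; t++) {
+    convolve_same(f, n, P, m, den);
+    for (size_t i = 0; i < n; i++)
+      ratio[i] = (den[i] != 0) ? phi[i] / den[i] : 0;
+    convolve_same(ratio, n, P, m, corr);
+    for (size_t i = 0; i < n; i++)
+      f[i] *= corr[i];                          /* element-wise update */
+  }
+
+  free(den); free(ratio); free(corr);
+}
+```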
 
## The earth mover's distance
 
@@ -424,16 +429,15 @@ deconvolved outcome with the original signal was quantified using the earth
mover's distance.
 
In statistics, the earth mover's distance (EMD) is a measure of the distance
-between two distributions [@cock41]. Informally, if one imagines the two
+between two distributions [@cock41]. Informally, if one imagines the
distributions as two piles of different amounts of dirt in their respective
-regions, the EMD is the minimum cost of turning one pile into the other,
-making the first one the most possible similar to the second one, where the
-cost is the amount of dirt moved times the distance by which it is moved.
-Computing the EMD is based on a solution to the transportation problem, which
-can be formalized as follows.
+regions, the EMD is the minimum cost of turning one pile into the other, making
+the first as similar as possible to the second, where the cost is the amount
+of dirt moved times the distance by which it is moved.
 
-Consider two vectors $P$ and $Q$ which represent the two distributions whose
-EMD has to be measured:
+Computing the EMD is based on the solution to a transportation problem, which
+can be formalized as follows. Consider two vectors $P$ and $Q$ which represent
+the two distributions whose EMD has to be measured:
$$
  P = \{ (p_1, w_{p1}) \dots (p_m, w_{pm}) \} \et
  Q = \{ (q_1, w_{q1}) \dots (q_n, w_{qn}) \}
$$
where $p_i$ and $q_i$ are the 'values' (that is, the location of the dirt) and
$w_{pi}$ and $w_{qi}$ are the 'weights' (that is, the quantity of dirt). A
-ground distance matrix $D_{ij}$ is defined such as its entries $d_{ij}$ are the
-distances between $p_i$ and $q_j$. The aim is to find the flow matrix $F_{ij}$,
-where each entry $f_{ij}$ is the flow from $p_i$ to $q_j$ (which would be
-the quantity of moved dirt), which minimizes the cost $W$:
+ground distance matrix $D$ is defined such that its entries $d_{ij}$ are the
+distances between $p_i$ and $q_j$. The aim is to find the flow matrix $F$,
+where each entry $f_{ij}$ is the flow from $p_i$ to $q_j$ (which would be the
+quantity of moved dirt), which minimizes the cost $W$:
$$
  W (P, Q, F) = \sum_{i = 1}^m \sum_{j = 1}^n f_{ij} d_{ij}
$$
 
-The $Q$ region is to be considered empty at the beginning: the 'dirt' present in
-$P$ must be moved to $Q$ in order to reach the same distribution as close as
-possible. Namely, the following constraints must be satisfied:
+The $Q$ region is to be considered empty at the beginning: the 'dirt' present
+in $P$ must be moved to $Q$ in order to reproduce the same distribution as
+closely as possible. Formally, the following constraints must be satisfied:
\begin{align*}
  &\text{1.} \hspace{20pt} f_{ij} \ge 0 \hspace{15pt}
@@ -480,8 +484,7 @@ same amount of dirt, hence all the dirt present in $P$ is necessarily moved to
$Q$ and the flow equals the total amount of available dirt.
 
Once the transportation problem is solved and the optimal flow is found, the
-EMD is defined as the work normalized by the total flow:
-
+EMD is defined as the work normalized by the total flow:
$$
  \text{EMD} (P, Q) = \frac{\sum_{i = 1}^m \sum_{j = 1}^n f_{ij} d_{ij}}
                           {\sum_{i = 1}^m \sum_{j=1}^n f_{ij}}
$$
@@ -496,52 +499,52 @@ $$
$$
where the sum runs over the entries of the vectors $U$ and $V$, which are the
-cumulative vectors of the histograms. In the code, the following equivalent
+cumulative sums of the histograms. In the code, the following equivalent
iterative routine was implemented.
$$
-  \text{EMD} (u, v) = \sum_i |\text{EMD}_i| \with
+  \text{EMD} (u, v) = \sum_i |\text{d}_i| \with
  \begin{cases}
-    \text{EMD}_i = v_i - u_i + \text{EMD}_{i-1} \\
-    \text{EMD}_0 = 0
+    \text{d}_i = v_i - u_i + \text{d}_{i-1} \\
+    \text{d}_0 = 0
  \end{cases}
$$
 
-In fact:
-
+The equivalence is apparent once the definition is expanded:
\begin{align*}
-  \text{EMD} (u, v) &= \sum_i |\text{EMD}_i| = |\text{EMD}_0| + |\text{EMD}_1|
-  + |\text{EMD}_2| + |\text{EMD}_3| + \dots \\
-  &= 0 + |v_1 - u_1 + \text{EMD}_0| +
-  |v_2 - u_2 + \text{EMD}_1| +
-  |v_3 - u_3 + \text{EMD}_2| + \dots \\
-  &= |v_1 - u_1| +
-  |v_1 - u_1 + v_2 - u_2| +
-  |v_1 - u_1 + v_2 - u_2 + v_3 - u_3| + \dots \\
-  &= |v_1 - u_i| +
-  |v_1 + v_2 - (u_1 + u_2)| +
-  |v_1 + v_2 + v_3 - (u_1 + u_2 + u_3))| + \dots \\
-  &= |V_1 - U_1| + |V_2 - U_2| + |V_3 - U_3| + \dots \\
-  &= \sum_i |U_i - V_i|
+  \text{EMD} (u, v)
+  &= \sum_i |\text{d}_i| = |\text{d}_0| + |\text{d}_1|
+   + |\text{d}_2| + |\text{d}_3| + \dots \\
+  &= 0 + |v_1 - u_1 + \text{d}_0| +
+  |v_2 - u_2 + \text{d}_1| +
+  |v_3 - u_3 + \text{d}_2| + \dots \\
+  &= |v_1 - u_1| +
+  |v_1 - u_1 + v_2 - u_2| +
+  |v_1 - u_1 + v_2 - u_2 + v_3 - u_3| + \dots \\
+  &= |v_1 - u_1| +
+  |v_1 + v_2 - (u_1 + u_2)| +
+  |v_1 + v_2 + v_3 - (u_1 + u_2 + u_3)| + \dots \\
+  &= |V_1 - U_1| + |V_2 - U_2| + |V_3 - U_3| + \dots \\
+  &= \sum_i |U_i - V_i|
\end{align*}
 
-This simple formula enabled comparisons to be made between a great number of
-results.
+This simple algorithm enabled the comparisons between a great number of
+histograms to be computed efficiently.
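+
+In C, the recurrence takes only a few lines (the helper name is made up; the
+two histograms are assumed to be normalised and to share the same binning, so
+that the bin width can be taken as the unit ground distance):
+
+```c
+#include <math.h>
+#include <stddef.h>
+
+/* EMD between two normalised histograms u and v with n bins each,
+ * using the running difference d_i = v_i - u_i + d_{i-1}, d_0 = 0. */
+double emd(const double *u, const double *v, size_t n) {
+  double d = 0, sum = 0;
+  for (size_t i = 0; i < n; i++) {
+    d += v[i] - u[i];
+    sum += fabs(d);
+  }
+  return sum;
+}
+```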
In order to make the code more flexible, the data were normalized before
computing the EMD: in doing so, it is possible to compare samples even when
they contain different numbers of points.
 
-## Results comparison {#sec:conv_results}
+## Results comparison {#sec:conv-results}
 
### Noiseless results {#sec:noiseless}
 
-Along with the analysis of the results obtained varying the convolved Gaussian
-width $\sigma$, the possibility to add a Gaussian noise to the convolved
-histogram was also implemented to check weather the deconvolution is affected
-or not by this kind of interference. This approach is described in the next
-subsection, while the noiseless results are given in this one.
+In addition to the convolution with a gaussian kernel of width $\sigma$, the
+possibility of adding gaussian noise to the convolved histogram counts was also
+implemented, to check whether the deconvolution is affected by this kind
+of interference. This approach is described in the next subsection, while the
+noiseless results are given in this one.
 
The two methods were compared for three different values of $\sigma$:
$$
@@ -551,28 +554,31 @@ $$
$$
 
Since the RL method depends on the number $r$ of performed rounds, in order to
-find out how many of them it was sufficient or necessary to compute, the earth
-mover's distance between the deconvolved signal and the original one was
-measured for different $r$s for each of the three tested values of $\sigma$.
-To achieve this goal, a number of 1000 experiments (default and customizable
-value) were simulated and, for each of them, the original signal was convolved
-with the kernel, the appropriate $\sigma$ value set, and then deconvolved with
-the RL algorithm with a given $r$ and the EMD was measured. Then, an average of
-the so-obtained EMDs was computed together with the standard deviation. This
-procedure was repeated for a few tens of different $r$s till a flattening or a
-minimum of the curve became evident. Results in @fig:rounds-noiseless.
+find out how many of them are sufficient or necessary, the earth mover's
+distance between the deconvolved signal and the original one was measured for
+different $r$s for each of the three tested values of the kernel $\sigma$.
 
-The plots in @fig:rless-0.1 show the average (red) and standard deviation (grey)
-of the measured EMD for $\sigma = 0.1 \, \Delta \theta$. The number of
+To achieve this goal, 1000 experiments were simulated. Each consists of
+generating the diffraction signal, convolving it with a kernel of width
+$\sigma$, deconvolving it with the RL algorithm with a given number of rounds
+$r$, and measuring the EMD.
+The distances are used to build a histogram of the EMD distribution, from which
+the mean and standard deviation are computed.
+This procedure was repeated for a few tens of different $r$s until a flattening
+or a minimum of the curve became evident. All the results are shown in
+@fig:rounds-noiseless.
+
+The plots in @fig:rless-0.1 show the average (red) and standard deviation
+(grey) of the measured EMD for $\sigma = 0.1 \, \Delta \theta$. The number of
iterations does not affect the quality of the outcome (those fluctuations are
-merely a fact of floating-points precision) and the best result is obtained
-for $r = 2$, meaning that the convergence of the RL algorithm is really fast and
-this is due to the fact that the histogram was modified pretty poorly. In
-@fig:rless-0.5, the curve starts to flatten at about 10 rounds, whereas in
-@fig:rless-1 a minimum occurs around \num{5e3} rounds, meaning that, whit such a
-large kernel, the convergence is very slow, even if the best results are close
-to the one found for $\sigma = 0.5$.
-The following $r$s were chosen as the most fitted:
+merely an effect of floating-point precision) and the best result is obtained for
+$r = 2$, meaning that the convergence of the RL algorithm is really fast and
+this is due to the fact that the histogram was only slightly modified.
+In @fig:rless-0.5, the curve starts to flatten at about 10 rounds, whereas in
+@fig:rless-1 a minimum occurs around \num{5e3} rounds, meaning that, with such
+a large kernel, the convergence is very slow, even if the best results are
+close to the one found for $\sigma = 0.5$.
+The following $r$s were chosen as the most fitting:
\begin{align*}
  \sigma = 0.1 \, \Delta \theta &\thus n^{\text{best}} = 2 \\
  \sigma = 0.5 \, \Delta \theta &\thus n^{\text{best}} = 10 \\
@@ -590,13 +596,14 @@ fact, the FFT is the analytical result of the deconvolution.