700 lines
32 KiB
Markdown
700 lines
32 KiB
Markdown
# Exercise 6
|
||
|
||
## Generating points according to Fraunhöfer diffraction
|
||
|
||
The diffraction of a plane wave through a round slit is to be simulated by
|
||
generating $N =$ 50'000 points according to the intensity distribution
|
||
$I(\theta)$ [@hecht02] on a screen at a great distance from the slit itself
|
||
(see @fig:slit):
|
||
$$
|
||
I(\theta) = \frac{E^2}{2} \left( \frac{2 \pi a^2 \cos{\theta}}{L}
|
||
\frac{J_1(x)}{x} \right)^2 \with x = k a \sin{\theta}
|
||
$$
|
||
where:
|
||
|
||
- $E$ is the electric field amplitude, default $E = \SI{1e4}{V/m}$;
|
||
- $a$ is the radius of the slit aperture, default $a = \SI{0.01}{m}$;
|
||
- $\theta$ is the diffraction angle shown in @fig:slit;
|
||
- $J_1$ is the Bessel function of first kind;
|
||
- $k$ is the wavenumber, default $k = \SI{1e-4}{m^{-1}}$;
|
||
- $L$ is the distance from the screen, default $L = \SI{1}{m}$.
|
||
|
||
\begin{figure}
|
||
\hypertarget{fig:slit}{%
|
||
\centering
|
||
\begin{tikzpicture}
|
||
% Walls
|
||
\draw [thick] (-1,3) -- (1,3) -- (1,0.3) -- (1.2,0.3) -- (1.2,3)
|
||
-- (9,3);
|
||
\draw [thick] (-1,-3) -- (1,-3) -- (1,-0.3) -- (1.2,-0.3) -- (1.2,-3)
|
||
-- (9,-3);
|
||
\draw [thick] (10,3) -- (9.8,3) -- (9.8,-3) -- (10,-3);
|
||
% Lines
|
||
\draw [thick, gray] (0.7,0.3) -- (0.5,0.3);
|
||
\draw [thick, gray] (0.7,-0.3) -- (0.5,-0.3);
|
||
\draw [thick, gray] (0.6,0.3) -- (0.6,-0.3);
|
||
\draw [thick, gray] (1.2,0) -- (9.8,0);
|
||
\draw [thick, gray] (1.2,-0.1) -- (1.2,0.1);
|
||
\draw [thick, gray] (9.8,-0.1) -- (9.8,0.1);
|
||
\draw [thick, cyclamen] (1.2,0) -- (9.8,-2);
|
||
\draw [thick, cyclamen] (7,0) to [out=-90, in=50] (6.6,-1.23);
|
||
% Nodes
|
||
\node at (0,0) {$2a$};
|
||
\node at (5.5,0.4) {$L$};
|
||
\node [cyclamen] at (5.5,-0.4) {$\theta$};
|
||
\node [rotate=-90] at (10.2,0) {screen};
|
||
\end{tikzpicture}
|
||
\caption{Fraunhöfer diffraction.}\label{fig:slit}
|
||
}
|
||
\end{figure}
|
||
|
||
Once again, the *hit-miss* method described in @sec:3 was implemented and
|
||
the same procedure about the generation of $\theta$ was applied. This time,
|
||
though, $\theta$ must be uniformly distributed on the half sphere, hence:
|
||
\begin{align*}
|
||
\frac{d^2 P}{d\omega^2} = \frac{1}{2 \pi}
|
||
&\thus d^2 P = \frac{1}{2 \pi} d\omega^2 =
|
||
\frac{1}{2 \pi} d\phi \sin{\theta} d\theta \\
|
||
&\thus \frac{dP}{d\theta} = \int_0^{2 \pi} \!\!\! d\phi \frac{1}{2 \pi} \sin{\theta}
|
||
= \frac{1}{2 \pi} \sin{\theta} \, 2 \pi = \sin{\theta}
|
||
\end{align*}
|
||
\begin{align*}
|
||
\theta = \theta (x) &\thus
|
||
\frac{dP}{d\theta} = \frac{dP}{dx} \cdot \left| \frac{dx}{d\theta} \right|
|
||
= \left. \frac{dP}{dx} \middle/ \, \left| \frac{d\theta}{dx} \right| \right.
|
||
\\
|
||
&\thus \sin{\theta} = \left. 1 \middle/ \, \left|
|
||
\frac{d\theta}{dx} \right| \right.
|
||
\end{align*}
|
||
If $\theta$ is chosen to increase with $x$, then the absolute value can be
|
||
omitted:
|
||
\begin{align*}
|
||
\frac{d\theta}{dx} = \frac{1}{\sin{\theta}}
|
||
&\thus d\theta \sin(\theta) = dx
|
||
\\
|
||
&\thus - \cos (\theta') |_{0}^{\theta} = x(\theta) - x(0) = x - 0 = x
|
||
\\
|
||
&\thus - \cos(\theta) + 1 =x
|
||
\\
|
||
&\thus \theta = \text{acos} (1 -x)
|
||
\end{align*}
|
||
The so obtained sample was binned and stored in a histogram with a customizable
|
||
number $n$ of bins (default to $n = 150$) ranging from $\theta = 0$ to $\theta
|
||
= \pi/2$ because of the system symmetry. An example is shown in @fig:original.
|
||
|
||
![Example of intensity histogram.](images/6-original.pdf){#fig:original}
|
||
|
||
|
||
## Convolution {#sec:convolution}
|
||
|
||
In order to simulate the instrumentation response, the sample was then
|
||
convolved with a Gaussian kernel with the aim to recover the original sample
|
||
afterwards, implementing a deconvolution routine.
|
||
For this purpose, a 'kernel' histogram with an even number $m$ of bins and the
|
||
same bin width of the previous one, but a smaller number of them ($m \sim 6\%
|
||
\, n$), was generated according to a Gaussian distribution with mean $\mu = 0$
|
||
and variance $\sigma$. The reason why the kernel was set this way will be
|
||
discussed shortly.
|
||
The original histogram was then convolved with the kernel in order to obtain
|
||
the smeared signal. As an example, the result obtained for $\sigma = \Delta
|
||
\theta$, where $\Delta \theta$ is the bin width, is shown in @fig:convolved.
|
||
The smeared signal looks smoother with respect to the original one: the higher
|
||
$\sigma$, the greater the smoothness.
|
||
|
||
![Convolved signal.](images/6-smoothed.pdf){#fig:convolved}
|
||
|
||
The convolution was implemented as follow. Consider the definition of
|
||
convolution for two integrable functions $f(x)$ and $g(x)$:
|
||
$$
|
||
(f * g)(x) = \int \limits_{- \infty}^{+ \infty} dy f(y) g(x - y)
|
||
$$
|
||
This definition is easily recast into a form that lends itself to be
|
||
implemented for discrete arrays of numbers, such as histograms or vectors:
|
||
\begin{align*}
|
||
(f * g)(x)
|
||
&= \int \limits_{- \infty}^{+ \infty} dy f(y) (R \, g)(y-x) \\
|
||
&= \int \limits_{- \infty}^{+ \infty} dy f(y) (T_x \, R \, g)(y) \\
|
||
&= (f, T_x \, R \, g)(y)
|
||
\end{align*}
|
||
|
||
where:
|
||
|
||
- $R$ and $T_x$ are the reflection and translation by $x$ operators,
|
||
- $(\cdot, \cdot)$ is an inner product.
|
||
|
||
Given a signal $s$ of $n$ elements and a kernel $k$ of $m$ elements,
|
||
their convolution is a vector of $n + m + 1$ elements computed
|
||
by flipping $s$ ($R$ operator) and shifting its indices ($T_i$ operator):
|
||
$$
|
||
c_i = (s, T_i \, R \, k)
|
||
$$
|
||
The shift is defined such that when a index overflows ($\ge m$ or $\le$ 0) the
|
||
element is zero. This convention specifies the behavior at the edges
|
||
and results in the $m + 1$ increase in size.
|
||
For a better understanding, see @fig:dot_conv.
|
||
|
||
\begin{figure}
|
||
\hypertarget{fig:dot_conv}{%
|
||
\centering
|
||
\begin{tikzpicture}
|
||
% original histogram
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (0.0,0) rectangle (0.5,2.5);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (0.5,0) rectangle (1.0,2.8);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.0,0) rectangle (1.5,2.3);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.5,0) rectangle (2.0,1.8);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (2.0,0) rectangle (2.5,1.4);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (2.5,0) rectangle (3.0,1.0);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (3.0,0) rectangle (3.5,1.0);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (3.5,0) rectangle (4.0,0.6);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (4.0,0) rectangle (4.5,0.4);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (4.5,0) rectangle (5.0,0.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (5.0,0) rectangle (5.5,0.2);
|
||
\draw [thick, cyclamen] (6.0,0) -- (6.0,0.2);
|
||
\draw [thick, cyclamen] (6.5,0) -- (6.5,0.2);
|
||
\draw [thick, <->] (0,3.3) -- (0,0) -- (7,0);
|
||
% kernel histogram
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.0,-1) rectangle (1.5,-1.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.5,-1) rectangle (2.0,-1.6);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (2.0,-1) rectangle (2.5,-1.8);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (2.5,-1) rectangle (3.0,-1.6);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (3.0,-1) rectangle (3.5,-1.2);
|
||
\draw [thick, <->] (1,-2) -- (1,-1) -- (4,-1);
|
||
% arrows
|
||
\draw [thick, cyclamen, <->] (1.25,-0.2) -- (1.25,-0.8);
|
||
\draw [thick, cyclamen, <->] (1.75,-0.2) -- (1.75,-0.8);
|
||
\draw [thick, cyclamen, <->] (2.25,-0.2) -- (2.25,-0.8);
|
||
\draw [thick, cyclamen, <->] (2.75,-0.2) -- (2.75,-0.8);
|
||
\draw [thick, cyclamen, <->] (3.25,-0.2) -- (3.25,-0.8);
|
||
\draw [thick, cyclamen, ->] (2.25,-2.0) -- (2.25,-4.2);
|
||
% smeared histogram
|
||
\begin{scope}[shift={(0,-1)}]
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (-1.0,-4.5) rectangle (-0.5,-4.3);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] (-0.5,-4.5) rectangle ( 0.0,-4.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] ( 0.0,-4.5) rectangle ( 0.5,-2.0);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] ( 0.5,-4.5) rectangle ( 1.0,-1.6);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] ( 1.0,-4.5) rectangle ( 1.5,-2.3);
|
||
\draw [thick, cyclamen, fill=cyclamen!05!white] ( 1.5,-4.5) rectangle ( 2.0,-2.9);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] ( 2.0,-4.5) rectangle ( 2.5,-3.4);
|
||
\draw [thick, cyclamen] (3.0,-4.5) -- (3.0,-4.3);
|
||
\draw [thick, cyclamen] (3.5,-4.5) -- (3.5,-4.3);
|
||
\draw [thick, cyclamen] (4.0,-4.5) -- (4.0,-4.3);
|
||
\draw [thick, cyclamen] (4.5,-4.5) -- (4.5,-4.3);
|
||
\draw [thick, cyclamen] (5.0,-4.5) -- (5.0,-4.3);
|
||
\draw [thick, cyclamen] (5.5,-4.5) -- (5.5,-4.3);
|
||
\draw [thick, cyclamen] (6.0,-4.5) -- (6.0,-4.3);
|
||
\draw [thick, cyclamen] (6.5,-4.5) -- (6.5,-4.3);
|
||
\draw [thick, cyclamen] (7.0,-4.5) -- (7.0,-4.3);
|
||
\draw [thick, cyclamen] (7.5,-4.5) -- (7.5,-4.3);
|
||
\draw [thick, <->] (-1,-2.5) -- (-1,-4.5) -- (8,-4.5);
|
||
\end{scope}
|
||
% nodes
|
||
\node [above] at (2.25,-5.5) {$c_i$};
|
||
\node [above] at (3.25,0) {$s_i$};
|
||
\node [above] at (1.95,0) {$s_{i-j}$};
|
||
\node [below] at (1.75,-1) {$k_j$};
|
||
\end{tikzpicture}
|
||
\caption{Element wise product as a step of the convolution between the original signal
|
||
(above) and the kernel (center). The final result is the lower
|
||
fledging histogram.}\label{fig:dot_conv}
|
||
}
|
||
\end{figure}
|
||
|
||
|
||
## Deconvolution by Fourier transform
|
||
|
||
Two different unfolding routines were implemented, one of which exploiting the
|
||
Fast Fourier Transform (FFT).
|
||
|
||
This method is based on the convolution theorem, which states that given two
|
||
$L^1$ functions $f(x)$ and $g(x)$:
|
||
$$
|
||
\mathcal{F}[f * g] = \mathcal{F}[f] \cdot \mathcal{F}[g]
|
||
$$
|
||
where $\mathcal{F}[\cdot]$ stands for the Fourier transform.
|
||
Being the histogram a discrete set of data, the Discrete Fourier Transform (DFT)
|
||
was applied. When dealing with arrays of discrete values, the theorem still
|
||
holds if the two arrays have the same length and $(*)$ is understood
|
||
as a cyclical convolution.
|
||
For this reason, the kernel was 0-padded to make it the same length of the
|
||
original signal and, at the same time, avoiding the cyclical convolution.
|
||
In order to accomplish this procedure, both histograms were transformed into
|
||
vectors. The implementation lies in the computation of the Fourier transform of
|
||
the smeared signal and the kernel, the ratio between their transforms and the
|
||
anti-transformation of the result:
|
||
$$
|
||
\mathcal{F}[s * k] = \mathcal{F}[s] \cdot \mathcal{F}[k] \thus
|
||
\mathcal{F} [s] = \frac{\mathcal{F}[s * k]}{\mathcal{F}[k]}
|
||
$$
|
||
The FFT are efficient algorithms for calculating the DFT. Given a set of $n$
|
||
values {$z_j$}, each one is transformed into:
|
||
$$
|
||
x_j = \sum_{k=0}^{n-1} z_k \exp \left( - \frac{2 \pi i j k}{n} \right)
|
||
$$
|
||
where $i$ is the imaginary unit.
|
||
The evaluation of the DFT is a matrix-vector multiplication $W \vec{z}$. A
|
||
general matrix-vector multiplication takes $O(n^2)$ operations. FFT algorithms,
|
||
instead, use a *divide-and-conquer* strategy to factorise the matrix into
|
||
smaller sub-matrices. If $n$ can be factorised into a product of integers $n_1$,
|
||
$n_2 \ldots n_m$, then the DFT can be computed in $O(n \sum n_i) < O(n^2)$
|
||
operations, hence the name.
|
||
The inverse Fourier transform is thereby defined as:
|
||
$$
|
||
z_j = \frac{1}{n}
|
||
\sum_{k=0}^{n-1} x_k \exp \left( \frac{2 \pi i j k}{n} \right)
|
||
$$
|
||
In GSL, `gsl_fft_complex_forward()` and `gsl_fft_complex_inverse()` are
|
||
functions which allow to compute the forward and inverse transform,
|
||
respectively.
|
||
The inputs and outputs for the complex FFT routines are packed arrays of
|
||
floating point numbers. In a packed array, the real and imaginary parts of each
|
||
complex number are placed in alternate neighboring elements. In this special
|
||
case, where the sequence of values to be transformed is made of real numbers,
|
||
the Fourier transform is a complex sequence which satisfies:
|
||
$$
|
||
z_k = z^*_{n-k}
|
||
$$
|
||
where $z^*$ is the conjugate of $z$. A sequence with this symmetry is called
|
||
'half-complex'. This structure requires particular storage layouts for the
|
||
forward transform (from real to half-complex) and inverse transform (from
|
||
half-complex to real). As a consequence, the routines are divided into two sets:
|
||
`gsl_fft_real` and `gsl_fft_halfcomplex`. The symmetry of the half-complex
|
||
sequence requires only half of the complex numbers in the output to be stored.
|
||
This works for all lengths: when the length is odd, the middle value is real.
|
||
Thus, only $n$ real numbers are required to store the half-complex sequence
|
||
(half for the real part and half for the imaginary).
|
||
|
||
\begin{figure}
|
||
\hypertarget{fig:reorder}{%
|
||
\centering
|
||
\begin{tikzpicture}
|
||
% standard histogram
|
||
\draw [thick, cyclamen] (0.5,0) -- (0.5,0.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.0,0) rectangle (1.5,0.6);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.5,0) rectangle (2.0,1.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (2.0,0) rectangle (2.5,1.4);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (2.5,0) rectangle (3.0,1.4);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (3.0,0) rectangle (3.5,1.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (3.5,0) rectangle (4.0,0.6);
|
||
\draw [thick, cyclamen] (4.5,0) -- (4.5,0.2);
|
||
\draw [thick, ->] (0,0) -- (5,0);
|
||
\draw [thick, ->] (2.5,0) -- (2.5,2);
|
||
% shifted histogram
|
||
\begin{scope}[shift={(7,0)}]
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (0.5,0) rectangle (1.0,1.4);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.0,0) rectangle (1.5,1.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (1.5,0) rectangle (2.0,0.6);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (3.0,0) rectangle (3.5,0.6);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (3.5,0) rectangle (4.0,1.2);
|
||
\draw [thick, cyclamen, fill=cyclamen!25!white] (4.0,0) rectangle (4.5,1.4);
|
||
\draw [thick, ->] (0,0) -- (5,0);
|
||
\draw [thick, ->] (2.5,0) -- (2.5,2);
|
||
\end{scope}
|
||
\end{tikzpicture}
|
||
\caption{The histogram on the right shows how the real numbers histogram on the
|
||
left is handled by the dedicated GSL functions.}\label{fig:reorder}
|
||
}
|
||
\end{figure}
|
||
|
||
If the bin width is $\Delta \theta$, then the DFT domain ranges from $-1 / (2
|
||
\Delta \theta)$ to $+1 / (2 \Delta \theta$). As regards the real values, the
|
||
aforementioned GSL functions store the positive values from the beginning of
|
||
the array up to the middle and the negative backwards from the end of the array
|
||
(see @fig:reorder).
|
||
While the order of frequencies of the convolved histogram is immaterial,
|
||
the kernel must be centered at zero in order to compute a correct
|
||
convolution. This requires the kernel to be made of an ever number of bins
|
||
to be divided into two equal halves.
|
||
|
||
When $\mathcal{F}[s * k]$ and $\mathcal{F}[k]$ are computed, they are given in the
|
||
half-complex GSL packed format, so they must be unpacked to a complex
|
||
GSL vector before performing the element-wise division. Then,
|
||
the result is repacked to the half-complex format for the inverse DFT
|
||
computation. GSL provides the function `gsl_fft_halfcomplex_unpack()` which
|
||
convert the vectors from half-complex format to standard complex format but the
|
||
inverse procedure is not provided by GSL and had to be implemented.
|
||
|
||
At the end, the external bins which exceed the original signal size were cut
|
||
away in order to restore the original number of bins $n$. Results will be
|
||
discussed in @sec:conv-results.
|
||
|
||
|
||
## Richardson-Lucy deconvolution
|
||
|
||
The Richardson–Lucy (RL) deconvolution is an iterative procedure typically used
|
||
for recovering an image that has been blurred by a known 'point spread
|
||
function', or 'kernel'.
|
||
|
||
Consider the problem of estimating the frequency distribution $f(\xi)$ of a
|
||
variable $\xi$ when the available measure is a sample {$x_i$} of points
|
||
drown not by $f(x)$ but by another function $\phi(x)$ such that:
|
||
$$
|
||
\phi(x) = \int d\xi \, f(\xi) P(x | \xi)
|
||
$$ {#eq:conv}
|
||
where $P(x | \xi) \, d\xi$ is the probability (presumed known) that $x$ falls
|
||
in the interval $(x, x + dx)$ when $\xi = \xi$. If the so-called point spread
|
||
function $P(x | \xi)$ is a function of $x-\xi$ only, for example a normal
|
||
distribution with variance $\sigma$:
|
||
$$
|
||
P(x | \xi) = \frac{1}{\sqrt{2 \pi} \sigma}
|
||
\exp \left( - \frac{(x - \xi)^2}{2 \sigma^2} \right)
|
||
$$
|
||
then, @eq:conv becomes a convolution and finding $f(\xi)$ amounts
|
||
to a deconvolution.
|
||
An example of this problem is precisely that of correcting an observed
|
||
distribution $\phi(x)$ for the effect of observational errors, which are
|
||
represented by the function $P (x | \xi)$.
|
||
|
||
Let $Q(\xi | x) d\xi$ be the probability that $\xi$ comes from the interval
|
||
$(\xi, \xi + d\xi)$ when the measured quantity is $x = x$. The probability that
|
||
both $x \in (x, x + dx)$ and $(\xi, \xi + d\xi)$ is therefore given by $\phi(x)
|
||
dx \cdot Q(\xi | x) d\xi$ which is identical to $f(\xi) d\xi \cdot P(x | \xi)
|
||
dx$, hence:
|
||
$$
|
||
\phi(x) dx \cdot Q(\xi | x) d\xi = f(\xi) d\xi \cdot P(x | \xi) dx
|
||
\thus Q(\xi | x) = \frac{f(\xi) \cdot P(x | \xi)}{\phi(x)}
|
||
$$
|
||
$$
|
||
\thus Q(\xi | x) = \frac{f(\xi) \cdot P(x | \xi)}
|
||
{\int d\xi \, f(\xi) P(x | \xi)}
|
||
$$ {#eq:first}
|
||
which is the Bayes theorem for conditional probability. From the normalization
|
||
of $P(x | \xi)$, it follows also that:
|
||
$$
|
||
f(\xi) = \int dx \, \phi(x) Q(\xi | x)
|
||
$$ {#eq:second}
|
||
Since $Q (\xi | x)$ depends on $f(\xi)$, @eq:second suggests an iterative
|
||
procedure for generating estimates of $f(\xi)$. With a guess for $f(\xi)$ and
|
||
a known $P(x | \xi)$, @eq:first can be used to calculate an estimate for
|
||
$Q (\xi | x)$. Then, taking the hint provided by @eq:second, an improved
|
||
estimate for $f(\xi)$ can be generated, using the observed sample {$x_i$} to
|
||
give an approximation for $\phi$.
|
||
Thus, if $f^t$ is the $t-th$ estimate, the next is given by:
|
||
$$
|
||
f^{t + 1}(\xi) = \int dx \, \phi(x) Q^t(\xi | x)
|
||
\with
|
||
Q^t(\xi | x) = \frac{f^t(\xi) \cdot P(x | \xi)}
|
||
{\int d\xi \, f^t(\xi) P(x | \xi)}
|
||
$$
|
||
from which:
|
||
$$
|
||
f^{t + 1}(\xi) = f^t(\xi)
|
||
\int dx \, \frac{\phi(x)}{\int d\xi \, f^t(\xi) P(x | \xi)}
|
||
P(x | \xi)
|
||
$$ {#eq:solution}
|
||
When the spread function $P(x | \xi) = P(x-\xi)$, @eq:solution can be
|
||
rewritten in terms of convolutions:
|
||
$$
|
||
f^{t + 1} = f^{t}\left( \frac{\phi}{{f^{t}} * P} * P^{\star} \right)
|
||
$$
|
||
where $P^{\star}$ is the flipped point spread function [@lucy74].
|
||
|
||
In this particular instance, a Gaussian kernel was convolved with the original
|
||
histogram. Again, dealing with discrete arrays of numbers, the division and
|
||
multiplication are element wise and the convolution is to be carried out as
|
||
described in @sec:convolution.
|
||
When implemented, this method results in an easy step-wise routine:
|
||
|
||
- choose a zero-order estimate for {$f(\xi)$};
|
||
- create a flipped copy of the kernel;
|
||
- compute the convolutions, the product and the division at each step;
|
||
- proceed until a given number $r$ of iterations is reached.
|
||
|
||
In this case, the zero-order was set $f(\xi) = 0.5 \, \forall \, \xi$ and
|
||
different number of iterations where tested. Results are discussed in
|
||
@sec:conv-results.
|
||
|
||
|
||
## The earth mover's distance
|
||
|
||
With the aim of comparing the two deconvolution methods, the similarity of a
|
||
deconvolved outcome with the original signal was quantified using the earth
|
||
mover's distance.
|
||
|
||
In statistics, the earth mover's distance (EMD) is the measure of distance
|
||
between two distributions [@cock41]. Informally, if one imagines the
|
||
distributions as two piles of different amount of dirt in their respective
|
||
regions, the EMD is the minimum cost of turning one pile into the other, making
|
||
the first the most possible similar to the second, where the cost is the amount
|
||
of dirt moved times the distance by which it is moved.
|
||
|
||
Computing the EMD is based on the solution to a transportation problem which
|
||
can be formalized as follows. Consider two vectors $P$ and $Q$ which represent
|
||
the two distributions whose EMD has to be measured:
|
||
$$
|
||
P = \{ (p_1, w_{p1}) \dots (p_m, w_{pm}) \} \et
|
||
Q = \{ (q_1, w_{q1}) \dots (q_n, w_{qn}) \}
|
||
$$
|
||
where $p_i$ and $q_i$ are the 'values' (that is, the location of the dirt) and
|
||
$w_{pi}$ and $w_{qi}$ are the 'weights' (that is, the quantity of dirt). A
|
||
ground distance matrix $D$ is defined such as its entries $d_{ij}$ are the
|
||
distances between $p_i$ and $q_j$. The aim is to find the flow matrix $F$,
|
||
where each entry $f_{ij}$ is the flow from $p_i$ to $q_j$ (which would be the
|
||
quantity of moved dirt), which minimizes the cost $W$:
|
||
$$
|
||
W (P, Q, F) = \sum_{i = 1}^m \sum_{j = 1}^n f_{ij} d_{ij}
|
||
$$
|
||
The $Q$ region is to be considered empty at the beginning: the 'dirt' present
|
||
in $P$ must be moved to $Q$ in order to reproduce the same distribution as
|
||
close as possible. Formally, the following constraints must be satisfied:
|
||
\begin{align*}
|
||
&\text{1.} \hspace{20pt} f_{ij} \ge 0 \hspace{15pt}
|
||
&1 \le i \le m \wedge 1 \le j \le n
|
||
\\
|
||
&\text{2.} \hspace{20pt} \sum_{j = 1}^n f_{ij} \le w_{pi}
|
||
&1 \le i \le m
|
||
\\
|
||
&\text{3.} \hspace{20pt} \sum_{i = 1}^m f_{ij} \le w_{qj}
|
||
&1 \le j \le n
|
||
\\
|
||
&\text{4.} \hspace{20pt} \sum_{j = 1}^n f_{ij} \sum_{j = 1}^m f_{ij} \le w_{qj}
|
||
= \text{min} \left( \sum_{i = 1}^m w_{pi}, \sum_{j = 1}^n w_{qj} \right)
|
||
\end{align*}
|
||
The first constraint allows moving dirt from $P$ to $Q$ and not vice versa; the
|
||
second limits the amount of dirt moved by each position in $P$ in order to not
|
||
exceed the available quantity; the third sets a limit to the dirt moved to each
|
||
position in $Q$ in order to not exceed the required quantity and the last one
|
||
forces to move the maximum amount of supplies possible: either all the dirt
|
||
present in $P$ has been moved or the $Q$ distribution has been obtained.
|
||
The total moved amount is the total flow. If the two distributions have the
|
||
same amount of dirt, all the dirt present in $P$ is necessarily moved to $Q$ and
|
||
the total flow equals the amount of available dirt.
|
||
|
||
Once the transportation problem is solved and the optimal flow is found, the
|
||
EMD is defined as the work normalized by the total flow:
|
||
$$
|
||
\text{EMD} (P, Q) = \frac{\sum_{i = 1}^m \sum_{j = 1}^n f_{ij} d_{ij}}
|
||
{\sum_{i = 1}^m \sum_{j=1}^n f_{ij}}
|
||
$$
|
||
In this case, where the EMD is to be measured between two same-length
|
||
histograms, the procedure simplifies a lot. By representing both histograms
|
||
with two vectors $u$ and $v$, the equation above boils down to [@ramdas17]:
|
||
$$
|
||
\text{EMD} (u, v) = \sum_i |U_i - V_i|
|
||
$$
|
||
where the sum runs over the entries of the vectors $U$ and $V$, which are the
|
||
cumulative sums of the histograms. In the code, the following equivalent
|
||
iterative routine was implemented.
|
||
$$
|
||
\text{EMD} (u, v) = \sum_i |\text{d}_i| \with
|
||
\begin{cases}
|
||
\text{d}_i = v_i - u_i + \text{d}_{i-1} \\
|
||
\text{d}_0 = 0
|
||
\end{cases}
|
||
$$
|
||
The equivalence is apparent once the definition is expanded:
|
||
\begin{align*}
|
||
\text{EMD} (u, v)
|
||
&= \sum_i |\text{d}_i| = |\text{d}_0| + |\text{d}_1|
|
||
+ |\text{d}_2| + |\text{d}_3| + \dots \\
|
||
&= 0 + |v_1 - u_1 + \text{d}_0| +
|
||
|v_2 - u_2 + \text{d}_1| +
|
||
|v_3 - u_3 + \text{d}_2| + \dots \\
|
||
&= |v_1 - u_1| +
|
||
|v_1 - u_1 + v_2 - u_2| +
|
||
|v_1 - u_1 + v_2 - u_2 + v_3 - u_3| + \dots \\
|
||
&= |v_1 - u_1| +
|
||
|v_1 + v_2 - (u_1 + u_2)| +
|
||
|v_1 + v_2 + v_3 - (u_1 + u_2 + u_3))| + \dots \\
|
||
&= |V_1 - U_1| + |V_2 - U_2| + |V_3 - U_3| + \dots \\
|
||
&= \sum_i |U_i - V_i|
|
||
\end{align*}
|
||
This simple algorithm enabled the comparisons between a great number of
|
||
histograms to be computed efficiently.
|
||
In order to make the code more flexible, the data were normalized before
|
||
computing the EMD: in doing so, it is possible to compare even samples with a
|
||
different number of points.
|
||
|
||
|
||
## Results comparison {#sec:conv-results}
|
||
|
||
|
||
### Noiseless results {#sec:noiseless}
|
||
|
||
In addition to the convolution with a Gaussian kernel of width $\sigma$, the
|
||
possibility to add a Gaussian noise to the convolved histogram counts was also
|
||
implemented to check weather the deconvolution is affected or not by this kind
|
||
of interference. This approach is described in the next subsection, while the
|
||
noiseless results are given in this one.
|
||
|
||
The two methods were compared for three different values of $\sigma$:
|
||
$$
|
||
\sigma = 0.1 \, \Delta \theta \et
|
||
\sigma = 0.5 \, \Delta \theta \et
|
||
\sigma = \Delta \theta
|
||
$$
|
||
Since the RL method depends on the number $r$ of performed rounds, in order to
|
||
find out how many are sufficient or necessary to compute, the earth mover's
|
||
distance between the deconvolved signal and the original one was measured for
|
||
different $r$s for each of the three tested values of the kernel $\sigma$.
|
||
|
||
To achieve this goal, a number of 1000 experiments was simulated. Each one
|
||
consists in generating the diffraction signal, convolving it with a kernel
|
||
of width $\sigma$, deconvolving with the RL algorithm with a given number of
|
||
rounds $r$ and measuring the EMD.
|
||
The distances are used to build an histogram of EMD distribution, from which
|
||
mean and standard deviation are computed.
|
||
This procedure was repeated for a few tens of different $r$s till a flattening
|
||
or a minimum of the curve became evident. All the results are shown in
|
||
@fig:rounds-noiseless.
|
||
|
||
The plots in @fig:rless-0.1 show the average (red) and standard deviation
|
||
(grey) of the measured EMD for $\sigma = 0.1 \, \Delta \theta$. The number of
|
||
iterations does not affect the quality of the outcome (those fluctuations are
|
||
merely a fact of floating-points precision) and the best result is obtained for
|
||
$r = 2$, meaning that the convergence of the RL algorithm is really fast and
|
||
this is due to the fact that the histogram was only slightly modified.
|
||
In @fig:rless-0.5, the curve starts to flatten at about 10 rounds, whereas in
|
||
@fig:rless-1 a minimum occurs around \num{5e3} rounds, meaning that, whit such
|
||
a large kernel, the convergence is very slow, even if the best results are
|
||
close to the one found for $\sigma = 0.5$.
|
||
The following $r$s were chosen as the most fitting:
|
||
\begin{align*}
|
||
\sigma = 0.1 \, \Delta \theta &\thus n^{\text{best}} = 2 \\
|
||
\sigma = 0.5 \, \Delta \theta &\thus n^{\text{best}} = 10 \\
|
||
\sigma = 1 \, \Delta \theta &\thus n^{\text{best}} = \num{5e3}
|
||
\end{align*}
|
||
|
||
Note the difference between @fig:rless-0.1 and the plots resulting from $\sigma =
|
||
0.5 \, \Delta \theta$ and $\sigma = \, \Delta \theta$ as regards the order of
|
||
magnitude: the RL deconvolution is heavily influenced by the variance magnitude:
|
||
the greater $\sigma$, the worse the deconvolved result.
|
||
|
||
On the other hand, the FFT deconvolution procedure is not affected by $\sigma$
|
||
amplitude changes: it always gives the same outcome, which would be exactly the
|
||
original signal, if the floating point precision would not affect the result,
|
||
being the FFT the analytical result of the deconvolution.
|
||
For this reason, the EMD obtained with the FFT can be used as a reference point
|
||
against which to compare the EMDs measured with RL.
|
||
As described above, for a given $r$, a thousands of experiments were simulated:
|
||
for each of this simulations, an EMD was computed. Besides computing their
|
||
average and standard deviations, those values were used to build histograms
|
||
showing the EMD distribution.
|
||
Once the best numbers of rounds $r^{\text{best}}$ were found, their histograms
|
||
were compared to the histograms of the FFT results, started from the same
|
||
convolved signals, and the EMD of the convolved signals themselves, in order to
|
||
check if an improvement was truly achieved. Results are shown in
|
||
@fig:emd-noiseless.
|
||
|
||
::: {id=fig:rounds-noiseless}
|
||
![](images/6-nonoise-rounds-0.1.pdf){#fig:rless-0.1}
|
||
|
||
![](images/6-nonoise-rounds-0.5.pdf){#fig:rless-0.5}
|
||
|
||
![](images/6-nonoise-rounds-1.pdf){#fig:rless-1}
|
||
|
||
EMD as a function of RL rounds for different kernel $\sigma$ values. The
|
||
average is shown in red and the standard deviation in grey. Noiseless results.
|
||
:::
|
||
|
||
::: {id="fig:emd-noiseless"}
|
||
![$\sigma = 0.1 \, \Delta \theta$](images/6-nonoise-emd-0.1.pdf){#fig:eless-0.1}
|
||
|
||
![$\sigma = 0.5 \, \Delta \theta$](images/6-nonoise-emd-0.5.pdf){#fig:eless-0.5}
|
||
|
||
![$\sigma = \Delta \theta$](images/6-nonoise-emd-1.pdf){#fig:eless-1}
|
||
|
||
EMD distributions for different kernel $\sigma$ values. The plots on the left
|
||
show the results for the FFT deconvolution, the central column the results for
|
||
the RL deconvolution and the third one shows the EMD for the convolved signal.
|
||
Noiseless results.
|
||
:::
|
||
|
||
As expected, the FFT results are always of the same order of magnitude,
|
||
\num{1e-15}, independently from the kernel width, whereas the RL deconvolution
|
||
is greatly affected by it, ranging from \num{1e-16} for $\sigma = 0.1 \, \Delta
|
||
\theta$ to \num{1e-4} for $\sigma = \Delta \theta$.
|
||
On the other hand, the first result came off as a surprise: for very low values
|
||
of $\sigma$, the RL routine gives better results. Apparently the RL algorithm
|
||
is more numerically stable than a FFT, which is more affected by floating
|
||
points round-off errors.
|
||
|
||
Regarding the comparison with the convolved signal, which shows always
|
||
an EMD of \num{1e-2}, both RL and FFT always return results closer to the
|
||
original signal, meaning that the deconvolution is indeed working.
|
||
|
||
|
||
### Noisy results
|
||
|
||
In order to observe the effect of the Gaussian noise on the two deconvolution
|
||
methods, a value of $\sigma = 0.8 \, \Delta \theta$ for the kernel width was
|
||
arbitrary chosen. The noise was then applied to the convolved histogram as
|
||
follows.
|
||
|
||
![Example of noisy histogram,
|
||
$\sigma_N = 0.05$.](images/6-noisy.pdf){#fig:noisy}
|
||
|
||
For each bin, once the convolved histogram was computed, a value $v_N$ was
|
||
randomly sampled from a Gaussian distribution with standard deviation
|
||
$\sigma_N$, and the value $v_n \cdot b$ was added to the bin itself, where $b$
|
||
is the count of the bin. An example with $\sigma_N = 0.05$ of the new
|
||
histogram is shown in @fig:noisy.
|
||
The following three values of $\sigma_N$ were tested:
|
||
$$
|
||
\sigma_N = 0.005 \et
|
||
\sigma_N = 0.01 \et
|
||
\sigma_N = 0.05
|
||
$$
|
||
The same procedure followed in @sec:noiseless was then repeated for noisy
|
||
signals. Hence, in @fig:rounds-noise the EMD as a function of the RL rounds is
|
||
shown, this time varying $\sigma_N$ and keeping $\sigma = 0.8 \, \Delta \theta$
|
||
constant.
|
||
|
||
::: {id=fig:rounds-noise}
|
||
![](images/6-noise-rounds-0.005.pdf){#fig:rnoise-0.005}
|
||
|
||
![](images/6-noise-rounds-0.01.pdf){#fig:rnoise-0.01}
|
||
|
||
![](images/6-noise-rounds-0.05.pdf){#fig:rnoise-0.05}
|
||
|
||
EMD as a function of RL rounds for different noise $\sigma_N$ values with the
|
||
kernel $\sigma = 0.8 \Delta \theta$. The average is shown in red and the
|
||
standard deviation in grey. Noisy results.
|
||
:::
|
||
|
||
::: {id=fig:emd-noisy}
|
||
![$\sigma_N = 0.005$](images/6-noise-emd-0.005.pdf){#fig:enoise-0.005}
|
||
|
||
![$\sigma_N = 0.01$](images/6-noise-emd-0.01.pdf){#fig:enoise-0.01}
|
||
|
||
![$\sigma_N = 0.05$](images/6-noise-emd-0.05.pdf){#fig:enoise-0.05}
|
||
|
||
EMD distributions for different noise $\sigma_N$ values. The plots on the left
|
||
show the results for the FFT deconvolution, the central column the results for
|
||
the RL deconvolution and the third one shows the EMD for the convolved signal.
|
||
Noisy results.
|
||
:::
|
||
|
||
In @fig:rnoise-0.005, the flattening is achieved around $r = 20$ and in
|
||
@fig:rnoise-0.01 it is sufficient $\sim r = 15$. When the noise becomes too
|
||
high, on the other hand, as $r$ grows, the algorithm becomes
|
||
largely ineffective, so only a few iterations are actually needed.
|
||
The most fitting values were chosen as:
|
||
\begin{align*}
|
||
\sigma_N = 0.005 &\thus r^{\text{best}} = 20 \\
|
||
\sigma_N = 0.01 &\thus r^{\text{best}} = 15 \\
|
||
\sigma_N = 0.05 &\thus r^{\text{best}} = 1
|
||
\end{align*}
|
||
About the distance, as $\sigma_n$ increases, unsurprisingly the EMD grows
|
||
larger, ranging from $\sim$ \num{2e-4} in @fig:rnoise-0.005 to $\sim$
|
||
\num{1.5e-3} in @fig:rnoise-0.05.
|
||
Since the FFT is no more able to give the original signal as close as before,
|
||
it is no more assumed to be a reference point. In fact, as shown in @fig:noisy,
|
||
the FFT algorithm, when dealing with noisy signals whose noise shape is
|
||
unknown, is as efficient as the RL, since they share the same order of magnitude
|
||
regarding the EMD.
|
||
An increasing noise entails the shift of the convolved histogram to slightly
|
||
greater values, still remaining in the same order of magnitude of the noiseless
|
||
signals. However, the deconvolution still works, being the EMD of the convolved
|
||
histogram an order of magnitude greater than the worst obtained with both
|
||
methods.
|
||
|
||
In conclusion, when dealing with smeared signals with a known point spread
|
||
function, the FFT proved to be the best deconvolution method, restoring
|
||
perfectly the original signal to the extent permitted by the floating point
|
||
precision. When the point spread function is particularly small, this precision
|
||
problem is partially solved by the RL deconvolution process, which employs a
|
||
more stable computations.
|
||
However, in real world applications the measures are affected by (possibly
|
||
unknown) noise and the signal can only be partially reconstructed by either
|
||
method.
|