analistica/notes/sections/4.md

235 lines
8.0 KiB
Markdown
Raw Normal View History

2020-03-06 02:24:32 +01:00
# Exercise 4
2020-04-01 01:36:28 +02:00
## Kinematic dip PDF derivation
2020-03-06 02:24:32 +01:00
Consider a great number of non-interacting particles, each of which with a
2020-05-12 22:38:14 +02:00
random momentum $\vec{P}$ with module between 0 and $P_{\text{max}}$ randomly
2020-03-06 02:24:32 +01:00
angled with respect to a coordinate system {$\hat{x}$, $\hat{y}$, $\hat{z}$}.
Once the polar angle $\theta$ is defined, the momentum vertical and horizontal
2020-05-12 22:38:14 +02:00
components of a particle, which will be referred as $\vec{P_v}$ and $\vec{P_h}$
2020-04-23 23:56:53 +02:00
respectively, are the ones shown in @fig:components.
2020-03-06 02:24:32 +01:00
If $\theta$ is evenly distributed on the sphere and the same holds for the
2020-05-12 22:38:14 +02:00
module $P$, which distribution will the average value of $|P_v|$ follow as a
function of $P_h$?
2020-03-06 02:24:32 +01:00
\begin{figure}
\hypertarget{fig:components}{%
\centering
2020-04-23 23:56:53 +02:00
\begin{tikzpicture}[font=\scriptsize]
2020-03-06 02:24:32 +01:00
% Axes
\draw [thick, ->] (5,2) -- (5,8);
\draw [thick, ->] (5,2) -- (2,1);
\draw [thick, ->] (5,2) -- (8,1);
\node at (1.5,0.9) {$x$};
\node at (8.5,0.9) {$y$};
\node at (5,8.4) {$z$};
% Momentum
\definecolor{cyclamen}{RGB}{146, 24, 43}
\draw [ultra thick, ->, cyclamen] (5,2) -- (3.8,6);
\draw [thick, dashed, cyclamen] (3.8,0.8) -- (3.8,6);
\draw [thick, dashed, cyclamen] (5,7.2) -- (3.8,6);
\draw [ultra thick, ->, pink] (5,2) -- (5,7.2);
\draw [ultra thick, ->, pink] (5,2) -- (3.8,0.8);
2020-05-12 22:38:14 +02:00
\node at (4.8,1.1) {$\vec{P_h}$};
\node at (5.5,6.6) {$\vec{P_v}$};
\node at (3.3,5.5) {$\vec{P}$};
2020-03-06 02:24:32 +01:00
% Angle
\draw [thick, cyclamen] (4.4,4) to [out=80,in=210] (5,5);
\node at (4.7,4.2) {$\theta$};
\end{tikzpicture}
\caption{Momentum components.}\label{fig:components}
}
\end{figure}
2020-05-12 22:38:14 +02:00
Since the aim is to compute $\langle |P_v| \rangle (P_h)$, the conditional
distribution probability of $P_v$ given a fixed value of $P_h = x$ must first
be determined. It can be computed as the ratio between the probability of
getting a fixed value of $P_v$ given $x$ over the total probability of
getting that $x$:
2020-04-23 23:56:53 +02:00
$$
f (P_v | P_h = x) = \frac{f_{P_h , P_v} (x, P_v)}
{\int_{\{ P_v \}} d P_v f_{P_h , P_v} (x, P_v)}
= \frac{f_{P_h , P_v} (x, P_v)}{I}
$$
where $f_{P_h , P_v}$ is the joint pdf of the two variables $P_v$ and $P_h$ and
2020-05-12 22:38:14 +02:00
the integral $I$ runs over all the possible values of $P_v$ given a certain
$P_h$.
$f_{P_h , P_v}$ can simply be computed from the joint pdf of $\theta$ and $P$
with a change of variables. For the pdf of $\theta$ $f_{\theta} (\theta)$, the
same considerations done in @sec:3 lead to:
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
f_{\theta} (\theta) = \frac{1}{2} \sin{\theta} \chi_{[0, \pi]} (\theta)
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
whereas, being $P$ evenly distributed:
2020-04-23 23:56:53 +02:00
$$
2020-05-12 22:38:14 +02:00
f_P (P) = \chi_{[0, P_{\text{max}}]} (P)
2020-04-23 23:56:53 +02:00
$$
2020-03-06 02:24:32 +01:00
2020-04-23 23:56:53 +02:00
where $\chi_{[a, b]} (y)$ is the normalized characteristic function which value
2020-05-12 22:38:14 +02:00
is $1/N$ between $a$ and $b$ (where $N$ is the normalization term) and 0
2020-04-23 23:56:53 +02:00
elsewhere. Being a couple of independent variables, their joint pdf is simply
given by the product of their pdfs:
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
f_{\theta , P} (\theta, P) = f_{\theta} (\theta) f_P (P)
2020-04-23 23:56:53 +02:00
= \frac{1}{2} \sin{\theta} \chi_{[0, \pi]} (\theta)
2020-05-12 22:38:14 +02:00
\chi_{[0, P_{\text{max}}]} (P)
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
Given the new variables:
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
\begin{cases}
P_v = P \cos(\theta) \\
P_h = P \sin(\theta)
\end{cases}
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
with $\theta \in [0, \pi]$, the previous ones can be written as:
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
\begin{cases}
P = \sqrt{P_v^2 + P_h^2} \\
2020-05-12 22:38:14 +02:00
\theta = \text{atan}_2 ( P_h/P_v ) :=
2020-04-23 23:56:53 +02:00
\begin{cases}
\text{atan} ( P_h/P_v ) &\incase P_v > 0 \\
\text{atan} ( P_h/P_v ) + \pi &\incase P_v < 0
\end{cases}
\end{cases}
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
which can be shown having Jacobian:
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
J = \frac{1}{\sqrt{P_v^2 + P_h^2}}
$$
Hence:
$$
f_{P_h , P_v} (P_h, P_v) =
2020-05-12 22:38:14 +02:00
\frac{1}{2} \sin[ \text{atan}_2 ( P_h/P_v )]
\chi_{[0, \pi]} (\text{atan}_2 ( P_h/P_v )) \cdot \\
2020-04-23 23:56:53 +02:00
\frac{\chi_{[0, p_{\text{max}}]} \left( \sqrt{P_v^2 + P_h^2} \right)}
{\sqrt{P_v^2 + P_h^2}}
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
from which, the integral $I$ can now be computed. The edges of the integral
are fixed by the fact that the total momentum can not exceed $P_{\text{max}}$:
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
I = \int
\limits_{- \sqrt{P_{\text{max}}^2 - P_h}}^{\sqrt{P_{\text{max}}^2 - P_h}}
dP_v \, f_{P_h , P_v} (x, P_v)
2020-03-06 02:24:32 +01:00
$$
2020-04-23 23:56:53 +02:00
after a bit of maths, using the identity:
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
\sin[ \text{atan}_2 ( P_h/P_v )] = \frac{P_h}{\sqrt{P_h^2 + P_v^2}}
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
and the fact that both the characteristic functions play no role within the
integral limits, the following result arises:
2020-04-23 23:56:53 +02:00
$$
I = 2 \, \text{atan} \left( \sqrt{\frac{P_{\text{max}}^2}{x^2} - 1} \right)
$$
2020-03-06 02:24:32 +01:00
2020-04-23 23:56:53 +02:00
from which:
$$
f (P_v | P_h = x) = \frac{x}{P_v^2 + x^2} \cdot
\frac{1}{2 \, \text{atan}
\left( \sqrt{\frac{P_{\text{max}}^2}{x^2} - 1} \right)}
$$
2020-03-06 02:24:32 +01:00
2020-05-12 22:38:14 +02:00
Finally, putting all the pieces together, the average value of $|P_v|$ can be
computed:
$$
\langle |P_v| \rangle = \int
2020-04-23 23:56:53 +02:00
\limits_{- \sqrt{P_{\text{max}}^2 - P_h}}^{\sqrt{P_{\text{max}}^2 - P_h}}
2020-05-12 22:38:14 +02:00
f (P_v | P_h = x) = [ \dots ]
= x \, \frac{\ln \left( \frac{P_{\text{max}}}{x} \right)}
{\text{atan} \left( \sqrt{ \frac{P^2_{\text{max}}}{x^2} - 1} \right)}
2020-05-12 22:38:14 +02:00
$$ {#eq:dip}
2020-03-06 02:24:32 +01:00
Namely:
2020-04-23 23:56:53 +02:00
![Plot of the expected distribution with
$P_{\text{max}} = 10$.](images/expected.pdf){#fig:plot}
2020-03-06 02:24:32 +01:00
2020-04-01 01:36:28 +02:00
## Monte Carlo simulation
2020-03-06 02:24:32 +01:00
The same distribution should be found by generating and binning points in a
proper way. A number of $N = 50000$ points were generated as a couple of values
2020-05-12 22:38:14 +02:00
($P$, $\theta$), with $P$ evenly distributed between 0 and $P_{\text{max}}$ and
2020-03-06 02:24:32 +01:00
$\theta$ given by the same procedure described in @sec:3, namely:
$$
\theta = \text{acos}(1 - 2x)
$$
2020-05-12 22:38:14 +02:00
with $x$ uniformly distributed between 0 and 1.
The data binning turned out to be a bit challenging. Once $P$ was sampled and
$P_h$ was computed, the bin containing the latter's value must be found. If $n$
is the number of bins in which the range $[0, P_{\text{max}}]$ is divided into,
then the width $w$ of each bin is given by:
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
w = \frac{P_{\text{max}}}{n}
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
and the $i^{th}$ bin in which $P_h$ goes in is:
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
i = \text{floor} \left( \frac{P_h}{w} \right)
2020-03-06 02:24:32 +01:00
$$
2020-05-12 22:38:14 +02:00
where 'floor' is the function which gives the bigger integer smaller than its
2020-03-06 02:24:32 +01:00
argument and the bins are counted starting from zero.
2020-05-12 22:38:14 +02:00
Then, a vector in which the $j^{\text{th}}$ entry contains both the sum $S_j$
of all the $|P_v|$s relative to each $P_h$ fallen into the $j^{\text{th}}$ bin
itself and the number num$_j$ of the bin entries was reiteratively updated. At
the end, the average value of $|P_v|_j$ was computed as $S_j / \text{num}_j$.
For the sake of clarity, for each sampled couple the procedure is the
following. At first $S_j = 0 \wedge \text{num}_j = 0 \, \forall \, j$, then:
2020-03-06 02:24:32 +01:00
2020-05-12 22:38:14 +02:00
- the couple $(P, \theta)$ is generated,
- $P_h$ and $P_v$ are computed,
- the $j^{\text{th}}$ bin containing $P_h$ is found,
- num$_j$ is increased by 1,
- $S_j$ is increased by $|P_v|$.
2020-03-06 02:24:32 +01:00
2020-05-12 22:38:14 +02:00
For $P_{\text{max}} = 10$ and $n = 50$, the following result was obtained:
2020-03-06 02:24:32 +01:00
2020-05-12 22:38:14 +02:00
![Sampled points histogram.](images/dip.pdf)
2020-03-06 02:24:32 +01:00
2020-05-12 22:38:14 +02:00
In order to check whether the expected distribution (@eq:dip) properly matches
the produced histogram, a chi-squared minimization was applied. Being a simple
one-parameter fit, the $\chi^2$ was computed without a suitable GSL function
2020-05-12 22:38:14 +02:00
and the error of the so obtained estimation of $P_{\text{max}}$ was given as
the inverse of the $\chi^2$ second derivative in its minimum, according to the
Cramér-Rao bound.
The following results were obtained:
$$
2020-05-12 22:38:14 +02:00
P^{\text{oss}}_{\text{max}} = 10.005 \pm 0.018 \with \chi_r^2 = 0.071
$$
2020-05-12 22:38:14 +02:00
where $\chi_r^2$ is the reduced $\chi^2$, proving a good convergence.
In order to compare $P^{\text{oss}}_{\text{max}}$ with the expected value
$P_{\text{max}} = 10$, the following compatibility t-test was applied:
2020-04-28 22:24:09 +02:00
$$
p = 1 - \text{erf}\left(\frac{t}{\sqrt{2}}\right)\ \with
2020-05-12 22:38:14 +02:00
t = \frac{|P^{\text{oss}}_{\text{max}} - P_{\text{max}}|}
{\Delta P^{\text{oss}}_{\text{max}}}
2020-04-28 22:24:09 +02:00
$$
2020-05-12 22:38:14 +02:00
where $\Delta P^{\text{oss}}_{\text{max}}$ is the $P^{\text{oss}}_{\text{max}}$
uncertainty. At 95% confidence level, the values are compatible if $p > 0.05$.
2020-04-28 22:24:09 +02:00
In this case:
2020-04-28 23:44:24 +02:00
- t = 0.295
- p = 0.768
2020-04-28 22:24:09 +02:00
which allows to assert that the sampled points actually follow the predicted
2020-04-28 23:44:24 +02:00
distribution. In @fig:fit, the fit function superimposed on the histogram is
shown.
2020-05-12 22:38:14 +02:00
![Fitted sampled data. $P^{\text{oss}}_{\text{max}} = 10.005
\pm 0.018$, $\chi_r^2 = 0.071$.](images/fit.pdf){#fig:fit}