# Sample statistics

## Sample statistics

How can the median, mode and FWHM of a sample be estimated?

. . .

- \only<3>\strike{Binning data $\hence$ result depends strongly on the bin width}

. . .

- Alternative solutions
    - Robust estimators
    - Kernel density estimation

## Sample median

:::: {.columns}
::: {.column width=50% .c}
$$
  F(m) = \frac{1}{2}
$$

\vspace{20pt}

. . .

- Sort points in ascending order

. . .

- Middle element if $N$ is odd

  Average of the two central elements if $N$ is even
:::

::: {.column width=50%}
![](images/median.pdf)
:::
::::
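
The sort-and-pick recipe above can be sketched in a few lines of Python. The function name `sample_median` is hypothetical and only serves as an illustration; it is not necessarily how the median was computed for these slides.

```python
def sample_median(points):
    """Median of a non-empty sequence of numbers."""
    xs = sorted(points)              # sort points in ascending order
    n = len(xs)
    if n == 0:
        raise ValueError("empty sample")
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]               # odd N: middle element
    # even N: average of the two central elements
    return 0.5 * (xs[mid - 1] + xs[mid])


odd_case = sample_median([3.0, 1.0, 2.0])
even_case = sample_median([4.0, 1.0, 3.0, 2.0])
```

Python's standard library offers the same estimator as `statistics.median`.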

## Sample mode

Most probable value

. . .

Half Sample Mode

- Iteratively identify the smallest interval containing half of the points
- Once the sample is reduced to fewer than three points, take their average

. . .

\setbeamercovered{}

\begin{center}
\begin{tikzpicture}[remember picture]
  % line
  \draw [line width=3, ->, cyclamen] (-5,0) -- (5,0);
  \node [right] at (5,0) {$x$};
  % points
  \draw [blue, fill=blue] (-4.6,-0.1) rectangle (-4.8,0.1);
  \draw [blue, fill=blue] (-4,-0.1) rectangle (-4.2,0.1);
  \draw [blue, fill=blue] (-3.3,-0.1) rectangle (-3.5,0.1);
  \draw [blue, fill=blue] (-2.3,-0.1) rectangle (-2.5,0.1);
  \draw [blue, fill=blue] (-0.6,-0.1) rectangle (-0.8,0.1);
  \draw [blue, fill=blue] (-0.1,-0.1) rectangle (0.1,0.1);
  \draw [blue, fill=blue] (1.1,-0.1) rectangle (1.3,0.1);
  \draw [blue, fill=blue] (2,-0.1) rectangle (2.2,0.1);
  \draw [blue, fill=blue] (2.7,-0.1) rectangle (2.9,0.1);
  \draw [blue, fill=blue] (4,-0.1) rectangle (4.2,0.1);
  % future nodes
  \node at (-1,-0.3) (1a) {};
  \node at (3.1,0.3) (1b) {};
  \node at (0.9,-0.3) (2a) {};
  \node at (1.8,-0.3) (3a) {};
  % result nodes
  \node at (2.45,-0.7) (f1) {};
  \node at (2.45,0.7) (f2) {};
\end{tikzpicture}
\end{center}

. . .

\begin{center}
\begin{tikzpicture}[remember picture, overlay]
  % region
  \draw [orange, fill=orange, opacity=0.5] (1a) rectangle (1b);
\end{tikzpicture}
\end{center}

. . .

\begin{center}
\begin{tikzpicture}[remember picture, overlay]
  % region
  \draw [orange, fill=orange, opacity=0.5] (2a) rectangle (1b);
\end{tikzpicture}
\end{center}

. . .

\begin{center}
\begin{tikzpicture}[remember picture, overlay]
  % region
  \draw [orange, fill=orange, opacity=0.5] (3a) rectangle (1b);
\end{tikzpicture}
\end{center}

. . .

\begin{center}
\begin{tikzpicture}[remember picture, overlay]
  % result
  \draw [cyclamen, ultra thick] (f1) -- (f2);
\end{tikzpicture}
\end{center}
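
A minimal Python sketch of the Half Sample Mode iteration drawn above. Two assumptions are made here that the slide does not state: "half" is taken as $\lceil n/2 \rceil$ points, and ties between equally narrow intervals are resolved by keeping the leftmost one. The function name `half_sample_mode` and the point values (read off approximately from the centres of the squares in the drawing) are illustrative only.

```python
import math


def half_sample_mode(points):
    """Half Sample Mode of a non-empty sequence of numbers."""
    xs = sorted(points)
    if not xs:
        raise ValueError("empty sample")
    while len(xs) >= 3:
        h = math.ceil(len(xs) / 2)   # size of a half sample (assumption: rounded up)
        # start index of the narrowest window holding h consecutive points
        start = min(range(len(xs) - h + 1),
                    key=lambda i: xs[i + h - 1] - xs[i])
        xs = xs[start:start + h]     # keep only that window and iterate
    return sum(xs) / len(xs)         # fewer than three points left: average


# Approximate positions of the ten points in the drawing
points = [-4.7, -4.1, -3.4, -2.4, -0.7, 0.0, 1.2, 2.1, 2.8, 4.1]
mode = half_sample_mode(points)
```

With these values the sample shrinks from ten to five, three and finally two points, matching the three highlighted regions, and `mode` lands at about 2.45, where the result line is drawn.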

## Sample FWHM

$$
  \text{FWHM} = x_+ - x_- \with L(x_{\pm}) = \frac{L_{\text{max}}}{2}
$$

\setbeamercovered{transparent}

. . .

Kernel Density Estimation

- Empirical PDF construction:

$$
  f_\varepsilon(x) = \frac{1}{N\varepsilon} \sum_{i = 1}^N
  G \left( \frac{x-x_i}{\varepsilon} \right)
$$

The parameter $\varepsilon$ (the bandwidth) controls the strength of the smoothing
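
The estimator $f_\varepsilon$ can be sketched directly from its definition. The slide does not say which kernel $G$ is; a standard Gaussian is assumed here, and the names `gaussian_kernel` and `kde`, as well as the sample values, are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, assumed here as the kernel G."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, sample, eps):
    """f_eps(x) = 1 / (N eps) * sum_i G((x - x_i) / eps)."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / eps) for xi in sample) / (n * eps)


sample = [-1.0, 0.0, 0.5, 2.0]   # illustrative data
eps = 0.5                        # illustrative bandwidth

# Crude Riemann sum over [-10, 10]: a PDF estimate should integrate to about 1
step = 0.01
area = step * sum(kde(-10.0 + k * step, sample, eps) for k in range(2001))
```

A larger `eps` spreads each point over a wider bump and gives a smoother curve; a smaller one follows the individual points more closely.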

## Sample FWHM

Silverman's rule of thumb:

$$
  f_\varepsilon(x) = \frac{1}{N\varepsilon} \sum_{i = 1}^N
  G \left( \frac{x-x_i}{\varepsilon} \right)
  \with
  \varepsilon = 0.88 \, S_N
  \left( \frac{d + 2}{4}N \right)^{-1/(d + 4)}
$$

where:

- $S_N$ is the sample standard deviation
- $d$ is the number of dimensions (here $d = 1$)
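
For $d = 1$ the rule above reduces to $\varepsilon = 0.88 \, S_N \left( \frac{3}{4} N \right)^{-1/5}$. A small sketch follows, assuming that $S_N$ is the standard deviation with $N - 1$ in the denominator (the slide does not specify this); the function name `silverman_bandwidth` is hypothetical.

```python
import statistics


def silverman_bandwidth(sample, d=1):
    """eps = 0.88 * S_N * ((d + 2) / 4 * N) ** (-1 / (d + 4))."""
    n = len(sample)
    # Assumption: S_N uses the N - 1 denominator (statistics.stdev)
    s_n = statistics.stdev(sample)
    return 0.88 * s_n * ((d + 2) / 4 * n) ** (-1 / (d + 4))


eps = silverman_bandwidth([1.0, 2.0, 3.0, 4.0, 5.0])
```

The bandwidth shrinks slowly as the sample grows: for $d = 1$ it scales as $N^{-1/5}$.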

. . .

Numerical minimization (Brent) of $-f_\varepsilon$ for $\quad f_{\varepsilon, \text{max}}$

Numerical root finding (Brent) for $\quad f_{\varepsilon}(x_{\pm}) =
\frac{f_{\varepsilon, \text{max}}}{2}$
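
The two numerical steps can be sketched with the standard library alone. To stay dependency-free, a golden-section search and plain bisection are swapped in for Brent's methods; they play the same roles but converge more slowly. A Gaussian kernel and a single-peaked $f_\varepsilon$ are assumed, and every function name here is hypothetical.

```python
import math


def kde(x, sample, eps):
    """Gaussian-kernel density estimate f_eps(x) (Gaussian G is an assumption)."""
    norm = len(sample) * eps * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / eps) ** 2) for xi in sample) / norm


def golden_max(f, a, b, tol=1e-8):
    """Maximum of a unimodal f on [a, b] (stand-in for Brent minimization)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d                    # maximum lies in [a, d]
        else:
            a = c                    # maximum lies in [c, b]
    return 0.5 * (a + b)


def bisect_root(g, a, b, tol=1e-10):
    """Root of g on [a, b] with a sign change (stand-in for Brent root finding)."""
    ga = g(a)
    if ga * g(b) > 0:
        raise ValueError("root not bracketed")
    while b - a > tol:
        m = 0.5 * (a + b)
        gm = g(m)
        if ga * gm <= 0:
            b = m                    # sign change in [a, m]
        else:
            a, ga = m, gm            # sign change in [m, b]
    return 0.5 * (a + b)


def kde_fwhm(sample, eps, reach=10.0):
    """Return (x_minus, x_plus, FWHM) of the KDE, assuming a single peak."""
    f = lambda x: kde(x, sample, eps)
    lo, hi = min(sample), max(sample)
    x_max = golden_max(f, lo, hi)    # a Gaussian-kernel peak lies within the sample range
    half = 0.5 * f(x_max)
    g = lambda x: f(x) - half
    # f is negligible `reach` bandwidths outside the sample, so each root is bracketed
    x_minus = bisect_root(g, lo - reach * eps, x_max)
    x_plus = bisect_root(g, x_max, hi + reach * eps)
    return x_minus, x_plus, x_plus - x_minus


# One point with eps = 1 gives a standard normal curve: FWHM = 2 sqrt(2 ln 2)
x_minus, x_plus, fwhm = kde_fwhm([0.0], eps=1.0)
```

With SciPy available, `scipy.optimize.minimize_scalar(..., method="brent")` applied to $-f_\varepsilon$ and `scipy.optimize.brentq` are actual Brent implementations that could replace the two helpers. If the estimate has several peaks, the half-maximum level may be crossed more than twice and the brackets would need more care.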

## Sample FWHM

![](images/kde.pdf)