sections: fix some typos in every section
parent 4d73974190
commit b7e1857862
@@ -149,11 +149,12 @@ To obtain a better estimate of the mode and its error, the above procedure was
 bootstrapped. The original sample was treated as a population and used to build
 100 other samples of the same size, by *sampling with replacements*. For each one
 of the new samples, the above statistic was computed. By simply taking the
-mean of these statistics the following estimate was obtained:
+mean and standard deviation of these statistics the following estimate was
+obtained:
 $$
 \text{observed mode: } m_o = \num{-0.29 \pm 0.19}
 $$
-In order to compare the values $m_e$ and $m_0$, the following compatibility
+In order to compare the values $m_e$ and $m_o$, the following compatibility
 $t$-test was applied:
 $$
 p = 1 - \text{erf}\left(\frac{t}{\sqrt{2}}\right)\ \with
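The bootstrap described in the hunk above (resample with replacement, recompute the statistic, take the mean and standard deviation) can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the function name `bootstrap`, the seed, and the use of the median as a stand-in statistic are all assumptions, since the actual mode estimator is not shown in this diff.

```python
import random
import statistics


def bootstrap(sample, statistic, n_boot=100, seed=0):
    """Return (mean, standard deviation) of `statistic` over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(sample)
    values = []
    for _ in range(n_boot):
        # Treat the original sample as the population and draw a new
        # sample of the same size, with replacement.
        resample = rng.choices(sample, k=n)
        values.append(statistic(resample))
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical usage: Gaussian toy data and the median as the statistic
# (the mode estimator used in the text is not part of this hunk).
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(500)]
estimate, error = bootstrap(data, statistics.median)
print(f"{estimate:.3f} +/- {error:.3f}")
```

The standard deviation of the resampled statistics plays the role of the uncertainty quoted next to the estimate.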
@@ -184,7 +185,7 @@ middle elements otherwise.

 The expected median was derived from the quantile function (QDF) of the Landau
 distribution[^1].
-Once this is know, the median is simply given by $\text{QDF}(1/2)$. Since both
+Once this is known, the median is simply given by $\text{QDF}(1/2)$. Since both
 the CDF and QDF have no known closed form, they must be computed numerically.
 The cumulative probability was computed by quadrature-based numerical
 integration of the PDF (`gsl_integration_qagiu()` function in GSL). The function
@@ -210,13 +211,13 @@ where the absolute and relative tolerances $\varepsilon_\text{abs}$ and
 $\varepsilon_\text{rel}$ were set to \num{1e-10} and \num{1e-6},
 respectively.
 As for the QDF, this was implemented by numerically inverting the CDF. This was
-done by solving the equation;
+done by solving the equation for x:
 $$
 p(x) = p_0
 $$
-for x, given a probability value $p_0$, where $p(x)$ is the CDF. The (unique)
-root of this equation was found by a root-finding routine
-(`gsl_root_fsolver_brent` in GSL) based on the Brent-Dekker method.
+given a probability value $p_0$, where $p(x)$ is the CDF. The (unique) root of
+this equation was found by a root-finding routine (`gsl_root_fsolver_brent` in
+GSL) based on the Brent-Dekker method.
 The following condition was checked for convergence:
 $$
 |a - b| < \varepsilon_\text{abs} + \varepsilon_\text{rel} \min(|a|, |b|)
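The idea of obtaining a quantile by solving $p(x) = p_0$ on a bracketing interval can be illustrated with a short sketch. Two substitutions are made here and should be read as assumptions: plain bisection is used in place of the Brent-Dekker method, and the standard normal CDF (which has a closed form via `math.erf`) stands in for the Landau CDF. The stopping rule is the convergence condition quoted in the hunk above.

```python
import math


def normal_cdf(x):
    """CDF of the standard normal, used here only as a stand-in for the Landau CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def quantile(cdf, p0, a, b, eps_abs=1e-10, eps_rel=1e-6, max_iter=200):
    """Solve cdf(x) = p0 for x in [a, b] by bisection (not Brent-Dekker)."""
    fa = cdf(a) - p0
    fb = cdf(b) - p0
    if fa * fb > 0:
        raise ValueError("the interval [a, b] does not bracket the root")
    for _ in range(max_iter):
        # Convergence test: |a - b| < eps_abs + eps_rel * min(|a|, |b|)
        if abs(a - b) < eps_abs + eps_rel * min(abs(a), abs(b)):
            break
        mid = 0.5 * (a + b)
        fm = cdf(mid) - p0
        if fa * fm <= 0:
            b, fb = mid, fm  # root lies in [a, mid]
        else:
            a, fa = mid, fm  # root lies in [mid, b]
    return 0.5 * (a + b)


# The median is the quantile at p0 = 1/2.
median = quantile(normal_cdf, 0.5, -10.0, 10.0)
print(median)
```

Bisection needs more iterations than Brent-Dekker but relies on the same bracketing requirement, so the interval `[a, b]` must contain the root in both cases.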
@@ -10,7 +10,7 @@ $$
 \sum_{k=1}^{n} \frac{1}{k}
 - \ln(n) \right)
 $$ {#eq:gamma}
-and represents the limiting blue area in @fig:gamma. The first 30 digits of
+and represents the limiting red area in @fig:gamma. The first 30 digits of
 $\gamma$ are:
 $$
 \gamma = 0.57721\ 56649\ 01532\ 86060\ 65120\ 90082 \dots
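The definition in @eq:gamma can be evaluated directly for a finite $n$, which shows how slowly the partial sums approach $\gamma$. The sketch below is only an illustration: the function name `gamma_naive` and the chosen values of $n$ are assumptions, and the reference constant is the double-precision rounding of the digits quoted above.

```python
import math

# Double-precision rounding of the digits quoted in the text.
GAMMA = 0.5772156649015329


def gamma_naive(n):
    """Approximate gamma as the n-th harmonic number minus ln(n)."""
    # fsum keeps the rounding error of the long sum under control.
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n)


for n in (10, 1000, 100000):
    approx = gamma_naive(n)
    print(n, approx, approx - GAMMA)
```

The difference from $\gamma$ shrinks roughly as $1/(2n)$, so each additional correct digit costs about ten times more terms.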
@@ -52,7 +52,7 @@ efficiency of the methods lies on how quickly they converge to their limit.
 \draw (7.0,-0.05) -- (7.0,0.05); \node [below, scale=0.7] at (7.0,-0.05) {7};
 \end{tikzpicture}
 \caption{The area of the red region converges to the Euler–Mascheroni
-constant..}\label{fig:gamma}
+constant.}\label{fig:gamma}
 }
 \end{figure}

@@ -109,10 +109,8 @@ sign, 8 for the exponent and 55 for the mantissa, hence:
 $$
 2^{55} = 10^{d} \thus d = 55 \cdot \log(2) \sim 16.6
 $$
-Only 10 digits were correctly computed: this means that when the terms of the
-series start being smaller than the smallest representable double, the sum of
-all the remaining terms gives a number $\propto 10^{-11}$. The best result is
-shown in @tbl:naive-res.
+But only 10 digits were correctly computed. The best result is shown in
+@tbl:naive-res.

 ------- --------------------
 exact 0.57721 56649 01533
@@ -13,7 +13,7 @@ distribution function $F$:
 \end{align*}
 where $\theta$ and $\phi$ are, respectively, the polar and azimuthal angles, and
 $$
-\alpha_0 = 0.65 \et \beta_0 = 0.06 \et \gamma_0 = -0.18
+\alpha = 0.65 \et \beta = 0.06 \et \gamma = -0.18
 $$
 To generate the points, a *hit-miss* method was employed:

@@ -49,9 +49,9 @@ approximate $I$ as:
 $$
 I \approx I_N = \frac{V}{N} \sum_{i=1}^N f(x_i) = V \cdot \avg{f}
 $$
-If $x_i$ are uniformly distributed $I_N \rightarrow I$ for $N \rightarrow +
-\infty$ by the law of large numbers, whereas the integral variance can be
-estimated as:
+If $x_i$ are uniformly distributed, $I_N \rightarrow I$ for $N \rightarrow +
+\infty$ by the law of large numbers, whereas the integral variance $\sigma^2_I$
+can be estimated as:
 $$
 \sigma^2_f = \frac{1}{N - 1}
 \sum_{i = 1}^N \left( f(x_i) - \avg{f} \right)^2
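The plain Monte Carlo estimate $I_N = V \cdot \avg{f}$ and the sample variance $\sigma^2_f$ from the hunk above can be tried on a one-dimensional toy integral. The sketch below makes some assumptions: the integrand $x^2$ on $[0, 1]$, the seed, and the function name `mc_integrate` are illustrative, and the error on the integral is taken from the standard relation $\sigma_I = V \sigma_f / \sqrt{N}$, which the hunk is cut off before stating.

```python
import math
import random


def mc_integrate(f, a, b, n, seed=0):
    """Plain Monte Carlo estimate of the integral of f over [a, b]."""
    rng = random.Random(seed)
    volume = b - a
    # Uniformly distributed sample points x_i and their function values.
    values = [f(rng.uniform(a, b)) for _ in range(n)]
    mean = sum(values) / n
    # Unbiased sample variance of f, with the 1 / (N - 1) normalisation.
    var_f = sum((v - mean) ** 2 for v in values) / (n - 1)
    estimate = volume * mean
    # Standard error of the estimate: V * sigma_f / sqrt(N).
    sigma_i = volume * math.sqrt(var_f / n)
    return estimate, sigma_i


# Toy integrand: the exact value of the integral of x^2 over [0, 1] is 1/3.
estimate, sigma = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000)
print(f"I_N = {estimate:.5f} +/- {sigma:.5f}")
```

Because the error scales as $1/\sqrt{N}$, gaining one more digit of accuracy requires about a hundred times more points.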
@@ -123,7 +123,7 @@ where:
 - $(\cdot, \cdot)$ is an inner product.

 Given a signal $s$ of $n$ elements and a kernel $k$ of $m$ elements,
-their convolution is a vector of $n + m + 1$ elements computed
+their convolution $c$ is a vector of $n + m + 1$ elements computed
 by flipping $s$ ($R$ operator) and shifting its indices ($T_i$ operator):
 $$
 c_i = (s, T_i \, R \, k)
@@ -446,8 +446,8 @@ close as possible. Formally, the following constraints must be satisfied:
 &\text{3.} \hspace{20pt} \sum_{i = 1}^m f_{ij} \le w_{qj}
 &1 \le j \le n
 \\
-&\text{4.} \hspace{20pt} \sum_{j = 1}^n f_{ij} \sum_{j = 1}^m f_{ij} \le w_{qj}
-= \text{min} \left( \sum_{i = 1}^m w_{pi}, \sum_{j = 1}^n w_{qj} \right)
+&\text{4.} \hspace{20pt} \sum_{j = 1}^n \sum_{j = 1}^m f_{ij} \le
+\text{min} \left( \sum_{i = 1}^m w_{pi}, \sum_{j = 1}^n w_{qj} \right)
 \end{align*}
 The first constraint allows moving dirt from $P$ to $Q$ and not vice versa; the
 second limits the amount of dirt moved by each position in $P$ in order to not
@@ -549,9 +549,9 @@ a large kernel, the convergence is very slow, even if the best results are
 close to the one found for $\sigma = 0.5$.
 The following $r$s were chosen as the most fitting:
 \begin{align*}
-\sigma = 0.1 \, \Delta \theta &\thus n^{\text{best}} = 2 \\
-\sigma = 0.5 \, \Delta \theta &\thus n^{\text{best}} = 10 \\
-\sigma = 1 \, \Delta \theta &\thus n^{\text{best}} = \num{5e3}
+\sigma = 0.1 \, \Delta \theta &\thus r^{\text{best}} = 2 \\
+\sigma = 0.5 \, \Delta \theta &\thus r^{\text{best}} = 10 \\
+\sigma = 1 \, \Delta \theta &\thus r^{\text{best}} = \num{5e3}
 \end{align*}

 Note the difference between @fig:rless-0.1 and the plots resulting from $\sigma =
@@ -86,8 +86,8 @@ $$
 \tilde{\mu}_2 − \tilde{\mu}_1 = w^T (\mu_2 − \mu_1)
 $$
 This expression can be made arbitrarily large simply by increasing the
-magnitude of $w$, fortunately the problem is easily solved by requiring $w$
-to be normalised: $| w^2 | = 1$. Using a Lagrange multiplier to perform the
+magnitude of $w$ but, fortunately, the problem is easily solved by requiring
+$w$ to be normalised: $| w^2 | = 1$. Using a Lagrange multiplier to perform the
 constrained maximization, it can be found that $w \propto (\mu_2 − \mu_1)$,
 meaning that the line onto the points must be projected is the one joining the
 class means.
@@ -334,21 +334,21 @@ To see how it works, consider the four possible situations:
 \quad f(x) = 0 \quad \Longrightarrow \quad \Delta = 0$
 the current estimations work properly: $b$ and $w$ do not need to be updated;
 - $e = 1 \quad \wedge \quad f(x) = 0 \quad \Longrightarrow \quad
-\Delta = 1$
+\Delta \propto 1$
 the current $b$ and $w$ underestimate the correct output: they must be
 increased;
 - $e = 0 \quad \wedge \quad f(x) = 1 \quad \Longrightarrow \quad
-\Delta = -1$
+\Delta \propto -1$
 the current $b$ and $w$ overestimate the correct output: they must be
 decreased.

 Whilst the $b$ updating is obvious, as regards $w$ the following consideration
 may help clarify. Consider the case with $e = 0 \quad \wedge \quad f(x) = 1
-\quad \Longrightarrow \quad \Delta = -1$:
+\quad \Longrightarrow \quad \Delta = -r$:
 $$
 w^T \cdot x \to (w^T + \Delta x^T) \cdot x
 = w^T \cdot x + \Delta |x|^2
-= w^T \cdot x - |x|^2 \leq w^T \cdot x
+= w^T \cdot x - r|x|^2 \leq w^T \cdot x
 $$
 Similarly for the case with $e = 1$ and $f(x) = 0$.

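The update discussed in the hunk above can be checked numerically on a single point. The sketch below is an illustration under stated assumptions: $\Delta = r \, (e - f(x))$ with learning rate $r$, the updates $w \to w + \Delta x$ and $b \to b + \Delta$, and a threshold function that returns 1 when $w^T x + b > 0$. The function names `threshold` and `update` are hypothetical, and none of these definitions appear in full in this diff.

```python
def threshold(w, b, x):
    """Assumed threshold function f(x): 1 if w.x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def update(w, b, x, expected, rate):
    """One perceptron step with Delta = rate * (expected - f(x))."""
    delta = rate * (expected - threshold(w, b, x))
    new_w = [wi + delta * xi for wi, xi in zip(w, x)]
    new_b = b + delta
    return new_w, new_b, delta


# Case e = 0 and f(x) = 1: the output is overestimated, so Delta = -r.
w, b = [1.0, 1.0], 0.0
x = [2.0, 1.0]
new_w, new_b, delta = update(w, b, x, expected=0, rate=0.5)

old_dot = sum(wi * xi for wi, xi in zip(w, x))
new_dot = sum(wi * xi for wi, xi in zip(new_w, x))
norm2 = sum(xi * xi for xi in x)
# The projection decreases by exactly r * |x|^2, as in the derivation.
print(old_dot, new_dot, old_dot - 0.5 * norm2)
```

When the prediction is already correct, $\Delta = 0$ and both $w$ and $b$ are left unchanged.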
@@ -399,8 +399,8 @@ $x_n$, the threshold function $f(x_n)$ was computed, then:

 and similarly for the positive points.
 Finally, the mean and standard deviation were computed from $N_{fn}$ and
-$N_{fp}$ for every sample and used to estimate the purity $\alpha$ and
-efficiency $\beta$ of the classification:
+$N_{fp}$ for every sample and used to estimate the significance $\alpha$
+and not-purity $\beta$ of the classification:
 $$
 \alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
 \beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}