ex-7: started writing the test part

Giù Marcer 2020-04-07 23:36:59 +02:00 committed by rnhmjoj
parent 12fc0c406e
commit ea9b6cc0be
2 changed files with 66 additions and 19 deletions

@ -42,6 +42,10 @@ header-includes: |
\DeclareMathOperator*{\et}{% \DeclareMathOperator*{\et}{%
\hspace{30pt} \wedge \hspace{30pt} \hspace{30pt} \wedge \hspace{30pt}
} }
%% "if" in formulas
\DeclareMathOperator*{\incase}{%
\hspace{20pt} \text{if} \hspace{20pt}
}
\makeatletter \makeatletter
\renewcommand\maketitle{ \renewcommand\maketitle{


@ -199,11 +199,11 @@ this case were the weight vector and the position of the point to be projected.
![Gaussian of the samples on the projection ![Gaussian of the samples on the projection
line.](images/fisher-proj.pdf){height=5.7cm} line.](images/fisher-proj.pdf){height=5.7cm}
Aeral and lateral views of the projection direction, in blue, and the cut, in Aerial and lateral views of the projection direction, in blue, and the cut, in
red. red.
</div> </div>
Results obtained for the same sample in @fig:fisher_points are shown in Results obtained for the same sample in @fig:points are shown in
@fig:fisher_proj. The weight vector $w$ was found to be: @fig:fisher_proj. The weight vector $w$ was found to be:
$$ $$
@ -227,22 +227,21 @@ output value. The inferred function can be used for mapping new examples. The
algorithm will be generalized to correctly determine the class labels for unseen algorithm will be generalized to correctly determine the class labels for unseen
instances. instances.
The aim is to determine the threshold function $f(x)$ for the dot product The aim is to determine the bias $b$ such that the threshold function $f(x)$:
between the (in this case 2D) vector point $x$ and the weight vector $w$:
$$ $$
f(x) = x \cdot w + b f(x) = x \cdot w + b \hspace{20pt}
\begin{cases}
\geqslant 0 \incase x \in \text{signal} \\
< 0 \incase x \in \text{noise}
\end{cases}
$$ {#eq:perc} $$ {#eq:perc}
where $b$ is called 'bias'. If $f(x) \geqslant 0$, then the point can be The training was performed as follows. Initial values were set as $w = (0,0)$ and
assigned to the class $C_1$, to $C_2$ otherwise. $b = 0$. From these, the perceptron starts to improve their estimations. The
sample was passed point by point into a reiterative procedure a grand total of
The training was performed as follows. The idea is that the function $f(x)$ must $N_c$ calls: each time, the projection $w \cdot x$ of the point was computed
return 0 when the point $x$ belongs to the noise and 1 if it belongs to the and then the variable $\Delta$ was defined as:
signal. Initial values were set as $w = (0,0)$ and $b = 0$. From these, the
perceptron starts to improve their estimations. The sample was passed point by
point into a reiterative procedure a grand total of $N_c$ calls: each time, the
projection $w \cdot x$ of the point was computed and then the variable $\Delta$ was defined as:
$$ $$
\Delta = r * (e - \theta (f(x))) \Delta = r * (e - \theta (f(x)))
@ -254,15 +253,15 @@ where:
larger $r$, the more volatile the weight changes. In the code, it was set larger $r$, the more volatile the weight changes. In the code, it was set
$r = 0.8$; $r = 0.8$;
- $e$ is the expected value, namely 0 if $x$ is noise and 1 if it is signal; - $e$ is the expected value, namely 0 if $x$ is noise and 1 if it is signal;
- $\theta$ is the Heavyside theta function; - $\theta$ is the Heaviside theta function;
- $o$ is the observed value of $f(x)$ defined in @eq:perc. - $o$ is the observed value of $f(x)$ defined in @eq:perc.
Then $b$ and $w$ must be updated as: Then $b$ and $w$ must be updated as:
$$ $$
b \longrightarrow b + \Delta b \to b + \Delta
\et \et
w \longrightarrow w + x \Delta w \to w + x \Delta
$$ $$
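
The update rule above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code of the report: the names `heaviside` and `train_perceptron` are hypothetical, the sample is assumed to be a list of 2D points with expected values 0 (noise) or 1 (signal), and the convention $\theta(0) = 1$ is assumed so that it agrees with $f(x) \geqslant 0$ for signal.

```python
def heaviside(value):
    """Heaviside step function; theta(0) = 1 is an assumption of this sketch."""
    return 1 if value >= 0 else 0


def train_perceptron(points, labels, rate=0.8, n_calls=5):
    """Train on 2D points; labels are 0 for noise and 1 for signal."""
    w = [0.0, 0.0]  # initial weight vector w = (0, 0)
    b = 0.0         # initial bias b = 0
    for _ in range(n_calls):
        for x, e in zip(points, labels):
            f = x[0] * w[0] + x[1] * w[1] + b  # f(x) = x . w + b
            delta = rate * (e - heaviside(f))  # Delta = r (e - theta(f(x)))
            b += delta                         # b -> b + Delta
            w[0] += x[0] * delta               # w -> w + x Delta
            w[1] += x[1] * delta
    return w, b


# Toy, linearly separable sample (made up for illustration only).
toy_points = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3), (3, 4)]
toy_labels = [0, 0, 0, 1, 1, 1]
w, b = train_perceptron(toy_points, toy_labels)
```

Note that $\Delta$ is non-zero only for misclassified points, so a pass over a sample that is already classified correctly leaves $w$ and $b$ unchanged.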
<div id="fig:percep_proj"> <div id="fig:percep_proj">
@ -270,12 +269,12 @@ $$
![Gaussian of the samples on the projection ![Gaussian of the samples on the projection
line.](images/percep-proj.pdf){height=5.7cm} line.](images/percep-proj.pdf){height=5.7cm}
Aeral and lateral views of the projection direction, in blue, and the cut, in Aerial and lateral views of the projection direction, in blue, and the cut, in
red. red.
</div> </div>
It can be shown that this method converges to the coveted function. It can be shown that this method converges to the coveted function.
As stated in the previous section, the weight vector must finally be normalzied. As stated in the previous section, the weight vector must finally be normalized.
With $N_c = 5$, the values of $w$ and $t_{\text{cut}}$ level off up to the third With $N_c = 5$, the values of $w$ and $t_{\text{cut}}$ level off up to the third
digit. The following results were obtained: digit. The following results were obtained:
@ -289,3 +288,47 @@ this case, the projection line does not lie along the means of the two
samples. Plots in @fig:percep_proj. samples. Plots in @fig:percep_proj.
## Efficiency test ## Efficiency test
A program was implemented in order to check the validity of the two
aforementioned methods.
A number $N_t$ of test samples was generated and the
points were divided into the two classes according to the selected method.
At each iteration, false positives and negatives are recorded using a running
statistics method implemented in the `gsl_rstat` library, which is suitable for
handling large datasets that are inconvenient to store in memory all at
once.
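
To illustrate why a running accumulator does not need to keep the counts in memory, here is a minimal Python sketch based on Welford's update. It is a stand-in for the idea, not the `gsl_rstat` implementation, and the class name `RunningStats` and its methods are hypothetical.

```python
import math


class RunningStats:
    """Accumulate mean and standard deviation one value at a time."""

    def __init__(self):
        self.n = 0       # number of values seen so far
        self.mean = 0.0  # running mean
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def add(self, value):
        self.n += 1
        diff = value - self.mean
        self.mean += diff / self.n
        self._m2 += diff * (value - self.mean)

    def sd(self):
        """Sample standard deviation (n - 1 in the denominator)."""
        if self.n < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.n - 1))


stats = RunningStats()
for count in [1, 2, 3, 4]:  # e.g. false negatives found in four test samples
    stats.add(count)
```

Only three numbers are stored regardless of how many values are added, which is the property that makes this kind of accumulator convenient for a large $N_t$.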
For each sample, the numbers $N_{fn}$ and $N_{fp}$ of false negatives and false
positives are computed as follows.
For every noise point $x_n$, the function $f(x_n)$ was computed
with the weight vector $w$ and the $t_{\text{cut}}$ given by the employed method,
then:
- if $f(x_n) < 0 \thus$ $N_{fp} \to N_{fp}$
- if $f(x_n) > 0 \thus$ $N_{fp} \to N_{fp} + 1$
Similarly for the signal points, which update $N_{fn}$.
Finally, the mean and the standard deviation of $N_{fn}$ and $N_{fp}$ over all
the samples were computed in order to get the mean purity $\alpha$
and efficiency $\beta$ of the employed method, with $N_s$ and $N_n$ the numbers
of signal and noise points in each sample:
$$
\alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
\beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}
$$
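
The counting and the two ratios can be sketched in Python as below. This is an illustration under assumptions, not the tested program: the function names are hypothetical, the decision function is assumed to be the projection minus the cut, $f(x) = x \cdot w - t_{\text{cut}}$, and misclassified signal points are counted as false negatives and misclassified noise points as false positives, which is the reading consistent with the normalizations $N_s$ and $N_n$ in the formula above.

```python
def count_errors(signal, noise, w, t_cut):
    """Return (n_fn, n_fp) for one test sample of 2D points."""
    def f(x):
        # Assumed decision function: projection on w minus the cut.
        return x[0] * w[0] + x[1] * w[1] - t_cut

    n_fn = sum(1 for x in signal if f(x) < 0)   # signal classified as noise
    n_fp = sum(1 for x in noise if f(x) >= 0)   # noise classified as signal
    return n_fn, n_fp


def purity_efficiency(fn_counts, fp_counts, n_signal, n_noise):
    """alpha = 1 - mean(N_fn)/N_s and beta = 1 - mean(N_fp)/N_n."""
    mean_fn = sum(fn_counts) / len(fn_counts)
    mean_fp = sum(fp_counts) / len(fp_counts)
    return 1 - mean_fn / n_signal, 1 - mean_fp / n_noise


# Made-up points and counts, for illustration only.
n_fn, n_fp = count_errors(
    signal=[(3, 3), (0.5, 0.5)], noise=[(0, 0), (2, 2)], w=(1, 0), t_cut=1
)
alpha, beta = purity_efficiency([1, 3], [0, 2], n_signal=10, n_noise=10)
```

In this sketch a point with $f(x) = 0$ is treated as signal, following the $\geqslant$ sign in @eq:perc.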
Results for $N_t = 500$:
-------------------------------------------------------------------------------------------
             $\alpha$            $\sigma_{\alpha}$   $\beta$             $\sigma_{\beta}$
----------- ------------------- ------------------- ------------------- -------------------
Fisher       0.9999              0.33                0.9999              0.33

Perceptron   0.9999              0.28                0.9995              0.64
-------------------------------------------------------------------------------------------

Table: Results for the Fisher and perceptron methods. $\sigma_{\alpha}$ and
$\sigma_{\beta}$ stand for the standard deviations of the false
negatives and false positives, respectively.
\textcolor{red}{MISSING COMMENTS ON RESULTS.}