ex-7: started writing the test part
parent
12fc0c406e
commit
ea9b6cc0be
@ -42,6 +42,10 @@ header-includes: |
|
|||||||
\DeclareMathOperator*{\et}{%
|
\DeclareMathOperator*{\et}{%
|
||||||
\hspace{30pt} \wedge \hspace{30pt}
|
\hspace{30pt} \wedge \hspace{30pt}
|
||||||
}
|
}
|
||||||
|
%% "if" in formulas
|
||||||
|
\DeclareMathOperator*{\incase}{%
|
||||||
|
\hspace{20pt} \text{if} \hspace{20pt}
|
||||||
|
}
|
||||||
|
|
||||||
\makeatletter
|
\makeatletter
|
||||||
\renewcommand\maketitle{
|
\renewcommand\maketitle{
|
||||||
|
@ -199,11 +199,11 @@ this case were the weight vector and the position of the point to be projected.
|
|||||||
![Gaussian of the samples on the projection
|
![Gaussian of the samples on the projection
|
||||||
line.](images/fisher-proj.pdf){height=5.7cm}
|
line.](images/fisher-proj.pdf){height=5.7cm}
|
||||||
|
|
||||||
Aeral and lateral views of the projection direction, in blue, and the cut, in
|
Aerial and lateral views of the projection direction, in blue, and the cut, in
|
||||||
red.
|
red.
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
Results obtained for the same sample in @fig:fisher_points are shown in
|
Results obtained for the same sample in @fig:points are shown in
|
||||||
@fig:fisher_proj. The weight vector $w$ was found to be:
|
@fig:fisher_proj. The weight vector $w$ was found to be:
|
||||||
|
|
||||||
$$
|
$$
|
||||||
@ -227,22 +227,21 @@ output value. The inferred function can be used for mapping new examples. The
|
|||||||
algorithm will be generalized to correctly determine the class labels for unseen
|
algorithm will be generalized to correctly determine the class labels for unseen
|
||||||
instances.
|
instances.
|
||||||
|
|
||||||
The aim is to determine the threshold function $f(x)$ for the dot product
|
The aim is to determine the bias $b$ such that the threshold function $f(x)$:
|
||||||
between the (in this case 2D) vector point $x$ and the weight vector $w$:
|
|
||||||
|
|
||||||
$$
|
$$
|
||||||
f(x) = x \cdot w + b
|
f(x) = x \cdot w + b \hspace{20pt}
|
||||||
|
\begin{cases}
|
||||||
|
\geqslant 0 \incase x \in \text{signal} \\
|
||||||
|
< 0 \incase x \in \text{noise}
|
||||||
|
\end{cases}
|
||||||
$$ {#eq:perc}
|
$$ {#eq:perc}
|
||||||
|
|
||||||
where $b$ is called 'bias'. If $f(x) \geqslant 0$, than the point can be
|
The training was performed as follows. Initial values were set as $w = (0,0)$ and
|
||||||
assigned to the class $C_1$, to $C_2$ otherwise.
|
$b = 0$. From these, the perceptron starts to improve its estimates. The
|
||||||
|
sample was passed point by point into a reiterative procedure a grand total of
|
||||||
The training was performed as follow. The idea is that the function $f(x)$ must
|
$N_c$ calls: each time, the projection $w \cdot x$ of the point was computed
|
||||||
return 0 when the point $x$ belongs to the noise and 1 if it belongs to the
|
and then the variable $\Delta$ was defined as:
|
||||||
signal. Initial values were set as $w = (0,0)$ and $b = 0$. From these, the
|
|
||||||
perceptron starts to improve their estimations. The sample was passed point by
|
|
||||||
point into a reiterative procedure a grand total of $N_c$ calls: each time, the
|
|
||||||
projection $w \cdot x$ of the point was computed and then the variable $\Delta$ was defined as:
|
|
||||||
|
|
||||||
$$
|
$$
|
||||||
\Delta = r * (e - \theta (f(x)))
|
\Delta = r * (e - \theta (f(x)))
|
||||||
@ -254,15 +253,15 @@ where:
|
|||||||
larger $r$, the more volatile the weight changes. In the code, it was set
|
larger $r$, the more volatile the weight changes. In the code, it was set
|
||||||
$r = 0.8$;
|
$r = 0.8$;
|
||||||
- $e$ is the expected value, namely 0 if $x$ is noise and 1 if it is signal;
|
- $e$ is the expected value, namely 0 if $x$ is noise and 1 if it is signal;
|
||||||
- $\theta$ is the Heavyside theta function;
|
- $\theta$ is the Heaviside theta function;
|
||||||
- $o$ is the observed value of $f(x)$ defined in @eq:perc.
|
- $o$ is the observed value of $f(x)$ defined in @eq:perc.
|
||||||
|
|
||||||
Then $b$ and $w$ must be updated as:
|
Then $b$ and $w$ must be updated as:
|
||||||
|
|
||||||
$$
|
$$
|
||||||
b \longrightarrow b + \Delta
|
b \to b + \Delta
|
||||||
\et
|
\et
|
||||||
w \longrightarrow w + x \Delta
|
w \to w + x \Delta
|
||||||
$$
|
$$
|
||||||
|
|
||||||
<div id="fig:percep_proj">
|
<div id="fig:percep_proj">
|
||||||
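A note on the hunk above: the update rule can be sketched in a few lines. This is only a minimal illustration, not the code of the commit. It assumes that $\theta(0) = 1$ (in line with the $\geqslant 0$ convention for the signal), that $N_c$ counts full passes over the sample, and that signal points are visited before noise points; the names `train_perceptron`, `heaviside` and the toy data are hypothetical. The final normalization of $w$ is left out.

```python
def heaviside(value):
    """Heaviside step, taken here as 1 for value >= 0 and 0 otherwise."""
    return 1 if value >= 0 else 0


def f(point, w, b):
    """Threshold function f(x) = x . w + b for a 2D point."""
    return point[0] * w[0] + point[1] * w[1] + b


def train_perceptron(signal, noise, rate=0.8, n_calls=5):
    """Run n_calls passes over the sample, starting from w = (0, 0), b = 0."""
    w = [0.0, 0.0]
    b = 0.0
    # Assumption: signal points (expected value 1) come first, then noise (0).
    labelled = [(p, 1) for p in signal] + [(p, 0) for p in noise]
    for _ in range(n_calls):
        for point, expected in labelled:
            # Delta = r * (e - theta(f(x)))
            delta = rate * (expected - heaviside(f(point, w, b)))
            # b -> b + Delta and w -> w + x * Delta
            b += delta
            w[0] += point[0] * delta
            w[1] += point[1] * delta
    return w, b


# Toy, linearly separable data (hypothetical, not the sample of the report).
signal = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0)]
noise = [(-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
w, b = train_perceptron(signal, noise)
print("w =", w, "b =", b)
```

On this toy input every signal point ends up with $f(x) \geqslant 0$ and every noise point with $f(x) < 0$; for samples that are not linearly separable no such guarantee holds.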
@ -270,12 +269,12 @@ $$
|
|||||||
![Gaussian of the samples on the projection
|
![Gaussian of the samples on the projection
|
||||||
line.](images/percep-proj.pdf){height=5.7cm}
|
line.](images/percep-proj.pdf){height=5.7cm}
|
||||||
|
|
||||||
Aeral and lateral views of the projection direction, in blue, and the cut, in
|
Aerial and lateral views of the projection direction, in blue, and the cut, in
|
||||||
red.
|
red.
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
It can be shown that this method converges to the coveted function.
|
It can be shown that this method converges to the coveted function.
|
||||||
As stated in the previous section, the weight vector must finally be normalzied.
|
As stated in the previous section, the weight vector must finally be normalized.
|
||||||
|
|
||||||
With $N_c = 5$, the values of $w$ and $t_{\text{cut}}$ level off up to the third
|
With $N_c = 5$, the values of $w$ and $t_{\text{cut}}$ level off up to the third
|
||||||
digit. The following results were obtained:
|
digit. The following results were obtained:
|
||||||
@ -289,3 +288,47 @@ this case, the projection line does not lies along the mains of the two
|
|||||||
samples. Plots in @fig:percep_proj.
|
samples. Plots in @fig:percep_proj.
|
||||||
|
|
||||||
## Efficiency test
|
## Efficiency test
|
||||||
|
|
||||||
|
A program was implemented in order to check the validity of the two
|
||||||
|
aforementioned methods.
|
||||||
|
A number $N_t$ of test samples was generated and the
|
||||||
|
points were divided into the two classes according to the selected method.
|
||||||
|
At each iteration, the false positives and false negatives are recorded using the
|
||||||
|
running statistics routines of the GSL `gsl_rstat` module, which are suited to
|
||||||
|
large datasets that are inconvenient to hold in memory all at
|
||||||
|
once.
|
||||||
|
For each sample, the numbers $N_{fn}$ and $N_{fp}$ of false negatives and false
|
||||||
|
positives are computed as follows:
|
||||||
|
|
||||||
|
Every noise point $x_n$ was checked this way: the function $f(x_n)$ was computed
|
||||||
|
with the weight vector $w$ and the $t_{\text{cut}}$ given by the employed method,
|
||||||
|
then:
|
||||||
|
|
||||||
|
- if $f(x_n) < 0 \thus$ $N_{fp} \to N_{fp}$
|
||||||
|
- if $f(x_n) \geqslant 0 \thus$ $N_{fp} \to N_{fp} + 1$
|
||||||
|
|
||||||
|
Signal points are treated similarly: $N_{fn}$ is increased by one whenever $f(x) < 0$.
|
||||||
|
Finally, the mean and the standard deviation were obtained from $N_{fn}$ and
|
||||||
|
$N_{fp}$ computed for every sample in order to get the mean purity $\alpha$
|
||||||
|
and efficiency $\beta$ for the employed statistics:
|
||||||
|
|
||||||
|
$$
|
||||||
|
\alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
|
||||||
|
\beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}
|
||||||
|
$$
|
||||||
|
|
||||||
|
Results for $N_t = 500$:
|
||||||
|
|
||||||
|
-------------------------------------------------------------------------------------------
|
||||||
|
$\alpha$ $\sigma_{\alpha}$ $\beta$ $\sigma_{\beta}$
|
||||||
|
----------- ------------------- ------------------- ------------------- -------------------
|
||||||
|
Fisher 0.9999 0.33 0.9999 0.33
|
||||||
|
|
||||||
|
Perceptron 0.9999 0.28 0.9995 0.64
|
||||||
|
-------------------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
Table: Results for the Fisher and perceptron methods. $\sigma_{\alpha}$ and
|
||||||
|
$\sigma_{\beta}$ stand for the standard deviations of the false
|
||||||
|
negatives and false positives, respectively.
|
||||||
|
|
||||||
|
\textcolor{red}{MISSING COMMENTS ON RESULTS.}
|
||||||
|
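A note on the efficiency test added in the last hunk: the counting and the running statistics could look roughly as follows. This is a hedged sketch, not the program of the commit. It replaces `gsl_rstat` with a small Welford accumulator (the hypothetical `RunningStats` class), classifies with $f(x) = x \cdot w + b$ and assumes that a cut $t_{\text{cut}}$ would enter as $b = -t_{\text{cut}}$, and that every test sample holds the same numbers $N_s$ and $N_n$ of signal and noise points. All names and the toy data are illustrative.

```python
import math


class RunningStats:
    """Welford accumulator: mean and standard deviation without storing data."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value):
        self.n += 1
        diff = value - self.mean
        self.mean += diff / self.n
        self._m2 += diff * (value - self.mean)

    def sd(self):
        """Sample standard deviation (n - 1 in the denominator)."""
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def count_errors(signal, noise, w, b):
    """Return (N_fn, N_fp) for one sample, with f(x) = x . w + b."""
    def f(p):
        return p[0] * w[0] + p[1] * w[1] + b
    n_fn = sum(1 for p in signal if f(p) < 0)   # signal taken for noise
    n_fp = sum(1 for p in noise if f(p) >= 0)   # noise taken for signal
    return n_fn, n_fp


def efficiency_test(samples, w, b):
    """samples: list of (signal, noise) pairs; returns alpha, sd_fn, beta, sd_fp."""
    fn_stats, fp_stats = RunningStats(), RunningStats()
    for signal, noise in samples:
        n_fn, n_fp = count_errors(signal, noise, w, b)
        fn_stats.add(n_fn)
        fp_stats.add(n_fp)
    n_s, n_n = len(samples[0][0]), len(samples[0][1])  # assumed constant
    alpha = 1 - fn_stats.mean / n_s
    beta = 1 - fp_stats.mean / n_n
    return alpha, fn_stats.sd(), beta, fp_stats.sd()


# Two tiny hypothetical samples with N_s = N_n = 2.
samples = [
    ([(1.0, 1.0), (2.0, 2.0)], [(-1.0, -1.0), (1.0, 1.0)]),
    ([(1.0, 1.0), (-1.0, -1.0)], [(-1.0, -1.0), (-2.0, -2.0)]),
]
print(efficiency_test(samples, w=(1.0, 1.0), b=0.0))
```

The accumulator keeps only three numbers per quantity, which is the point of a running-statistics approach when $N_t$ is large.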