ex-7: revised and typo-fixed

In addition, the folder ex-7/iters was created in order to plot the results
of the Perceptron method as a function of the iterations parameter.
This commit is contained in:
Giù Marcer 2020-05-24 12:01:36 +02:00 committed by rnhmjoj
parent c8f34f1822
commit 3dcf733725
5 changed files with 52 additions and 29 deletions

ex-7/iters/iter.txt Normal file

@@ -0,0 +1,11 @@
#iters w_x w_y b
1 0.4265959958 0.9044422902 1.7487522878
2 0.5659895562 0.8244124103 1.4450529880
3 0.6540716938 0.7564325610 1.2125805084
4 0.6540716938 0.7564325610 1.2125805084
5 0.6540716938 0.7564325610 1.2125805084
6 0.6540716938 0.7564325610 1.2125805084
7 0.6540716938 0.7564325610 1.2125805084
8 0.6540716938 0.7564325610 1.2125805084
9 0.6540716938 0.7564325610 1.2125805084
10 0.6540716938 0.7564325610 1.2125805084
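
A log like this could be produced by printing a normalized copy of the weight vector and the bias at the end of every training epoch. The sketch below is only illustrative, not the repository's actual code: the function name, the `label` array (+1 for signal, -1 for noise) and the other arguments are hypothetical stand-ins, and the activation is assumed to be $w \cdot x - b$, so that after normalization $b$ plays the role of $t_{\text{cut}}$.

```c
#include <stdio.h>
#include <math.h>

/* Illustrative on-line perceptron producing an iter.txt-like log.
 * All names are hypothetical placeholders for the program's inputs. */
void perceptron_log(const double *x, const double *y, const int *label,
                    size_t n, double r, int max_iter) {
  double w[2] = {0, 0};  /* weight vector */
  double b = 0;          /* bias / cut    */
  printf("#iters w_x w_y b\n");
  for (int it = 1; it <= max_iter; it++) {
    for (size_t i = 0; i < n; i++) {
      /* predicted class from the sign of the activation w·x - b */
      int guess = (w[0]*x[i] + w[1]*y[i] - b > 0) ? 1 : -1;
      /* perceptron rule: correct by r·(expected - predicted) */
      double delta = r * (label[i] - guess);
      w[0] += delta * x[i];
      w[1] += delta * y[i];
      b    -= delta;
    }
    /* print a normalized copy, one row per iteration */
    double norm = hypot(w[0], w[1]);
    printf("%d %.10f %.10f %.10f\n", it, w[0]/norm, w[1]/norm, b/norm);
  }
}
```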

Binary file not shown.

Binary file not shown.

Binary file not shown.


@@ -377,13 +377,28 @@ As in the previous section, once found, the weight vector is to be normalized.
With $N = 5$ iterations, the values of $w$ and $t_{\text{cut}}$ level off up to the third
digit. The following results were obtained:
Different values of the learning rate were tested, all giving the same result
and converging after $N = 3$ iterations. In @fig:iterations, results are
shown for $r = 0.8$: as can be seen, for $N = 3$, the values of $w$ and
$t_{\text{cut}}$ level off.
The following results were obtained:
$$
w = (0.654, 0.756) \et t_{\text{cut}} = 1.213
$$
where, once again, $t_{\text{cut}}$ is computed from the origin of the axes. In
this case, the projection line does not lies along the mains of the two
samples. Plots in @fig:percep_proj.
In this case, the projection line is not parallel to the line joining the
means of the two samples. Plots are shown in @fig:percep_proj.
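
With this convention, classifying a point reduces to projecting it onto $w$ (the projection being measured from the origin, as noted above) and comparing the result with $t_{\text{cut}}$. A minimal sketch of such a helper, with hypothetical names; the actual program may instead fold the cut into a bias term:

```c
#include <stdbool.h>

/* Project a point onto the normalized weight vector and compare the
 * projected coordinate, measured from the origin, with the cut. */
bool is_signal(const double w[2], double t_cut, double px, double py) {
  double t = w[0]*px + w[1]*py;  /* coordinate along the projection line */
  return t > t_cut;
}
```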
<div id="fig:percep_proj">
![View of the samples in the plane.](images/7-percep-plane.pdf)
![View of the samples projections onto the projection
line.](images/7-percep-proj.pdf)
Aerial and lateral views of the samples. Projection line in blue and cut in
red.
</div>
## Efficiency test {#sec:7_results}
<div id="fig:percep_proj">
![View from above of the samples.](images/7-percep-plane.pdf){height=5.7cm}
@@ -396,40 +411,37 @@ red.
## Efficiency test {#sec:7_results}
A program was implemented to check the validity of the two
classification methods.
A number $N_t$ of test samples, with the same parameters of the training set,
is generated using an RNG and their points are divided into noise/signal by
both methods. At each iteration, false positives and negatives are recorded
using a running statistics method implemented in the `gsl_rstat` library, to
avoid storing large datasets in memory.
In each sample, the numbers $N_{fn}$ and $N_{fp}$ of false positive and false
negative are obtained in this way: for every noise point $x_n$ compute the
activation function $f(x_n)$ with the weight vector $w$ and the
$t_{\text{cut}}$, then:
Using the same parameters as the training set, a number $N_t$ of test
samples was generated and their points were divided into noise and signal by
both methods. To avoid storing large datasets in memory, at each iteration
false positives and negatives were recorded using the running statistics
routines implemented in the `gsl_rstat` library (see the sketch after the
list below). For each sample, the numbers $N_{fn}$ and $N_{fp}$ of false
negatives and false positives were obtained as follows: for every noise
point $x_n$, the threshold function $f(x_n)$ was computed, then:
- if $f(x) < 0 \thus$ $N_{fn} \to N_{fn}$
- if $f(x) > 0 \thus$ $N_{fn} \to N_{fn} + 1$
- if $f(x) = 0 \thus$ $N_{fp} \to N_{fp}$
- if $f(x) \neq 0 \thus$ $N_{fp} \to N_{fp} + 1$
and similarly for the signal points, updating $N_{fn}$.
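
The bookkeeping just described might look as follows. `score_sample`, its arguments and the `is_signal` helper (sketched earlier) are hypothetical names, while the `gsl_rstat` calls are the library's actual interface:

```c
#include <stddef.h>
#include <stdbool.h>
#include <gsl/gsl_rstat.h>

bool is_signal(const double w[2], double t_cut, double px, double py);

/* Count false positives (noise taken for signal) and false negatives
 * (signal taken for noise) in one test sample and update the running
 * statistics, so that no per-sample data needs to be stored. */
void score_sample(const double noise[][2], size_t n_n,
                  const double sig[][2], size_t n_s,
                  const double w[2], double t_cut,
                  gsl_rstat_workspace *fp_stat,
                  gsl_rstat_workspace *fn_stat) {
  double n_fp = 0, n_fn = 0;
  for (size_t i = 0; i < n_n; i++)
    if (is_signal(w, t_cut, noise[i][0], noise[i][1])) n_fp++;
  for (size_t i = 0; i < n_s; i++)
    if (!is_signal(w, t_cut, sig[i][0], sig[i][1])) n_fn++;
  gsl_rstat_add(n_fp, fp_stat);
  gsl_rstat_add(n_fn, fn_stat);
}
```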
Finally, the mean and standard deviation are computed from $N_{fn}$ and
$N_{fp}$ of every sample and used to estimate purity $\alpha$
and efficiency $\beta$ of the classification:
Finally, the mean and standard deviation of $N_{fn}$ and $N_{fp}$ were
computed over all the samples and used to estimate the purity $\alpha$ and
the efficiency $\beta$ of the classification:
$$
\alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
\beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}
$$
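
Continuing the sketch above, the estimators and the spreads quoted in @tbl:res_comp would then be read directly from the accumulators (variable names again hypothetical):

```c
/* N_s and N_n are the numbers of signal and noise points per sample */
double alpha = 1 - gsl_rstat_mean(fn_stat) / N_s;
double beta  = 1 - gsl_rstat_mean(fp_stat) / N_n;
/* standard deviations of the raw counts of false negatives/positives */
double sigma_alpha = gsl_rstat_sd(fn_stat);
double sigma_beta  = gsl_rstat_sd(fp_stat);
```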
Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the
Fisher discriminant gives a nearly perfect classification
with a symmetric distribution of false negative and false positive,
whereas the perceptron show a little more false-positive than
false-negative, being also more variable from dataset to dataset.
A possible explanation of this fact is that, for linearly separable and
normally distributed points, the Fisher linear discriminant is an exact
analytical solution, whereas the perceptron is only expected to converge to the
solution and thus more subjected to random fluctuations.
Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the Fisher
discriminant gives a nearly perfect classification, with a symmetric
distribution of false negatives and false positives, whereas the perceptron
shows slightly more false positives than false negatives and is also more
variable from dataset to dataset.
A possible explanation of this fact is that, for linearly separable and normally
distributed points, the Fisher linear discriminant is an exact analytical
solution, whereas the perceptron is only expected to converge to the solution
and is therefore more subject to random fluctuations.
-------------------------------------------------------------------------------------------
@@ -442,4 +454,4 @@ Perceptron 0.9999 0.28 0.9995 0.64
Table: Results for the Fisher and perceptron methods. $\sigma_{\alpha}$ and
$\sigma_{\beta}$ stand for the standard deviation of the false
negative and false positive respectively. {#tbl:res_comp}
negatives and false positives respectively. {#tbl:res_comp}