ex-7: revised and typo-fixed

In addition, the folder ex-7/iters was created in order to plot the results
of the Perceptron method as a function of the iterations parameter.
Giù Marcer 2020-05-24 12:01:36 +02:00 committed by rnhmjoj
parent c8f34f1822
commit 3dcf733725
5 changed files with 52 additions and 29 deletions

ex-7/iters/iter.txt Normal file

@ -0,0 +1,11 @@
#iters w_x w_y b
1 0.4265959958 0.9044422902 1.7487522878
2 0.5659895562 0.8244124103 1.4450529880
3 0.6540716938 0.7564325610 1.2125805084
4 0.6540716938 0.7564325610 1.2125805084
5 0.6540716938 0.7564325610 1.2125805084
6 0.6540716938 0.7564325610 1.2125805084
7 0.6540716938 0.7564325610 1.2125805084
8 0.6540716938 0.7564325610 1.2125805084
9 0.6540716938 0.7564325610 1.2125805084
10 0.6540716938 0.7564325610 1.2125805084
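
A table of this shape can be generated by logging the (normalised) perceptron
weights and bias once per pass over the training set. The following C sketch
only illustrates the idea: the data layout, the zero initialisation and the
learning rate are assumptions, not the actual ex-7 code.

```c
#include <stdio.h>
#include <math.h>

/* One perceptron pass over the training set per iteration, logging the
 * normalised weight vector and bias in the same format as iter.txt. */
void log_perceptron(FILE *f,
                    const double *x, const double *y, const int *class,
                    size_t n, int max_iters, double r) {
    double w[2] = {0, 0}, b = 0;
    fprintf(f, "#iters w_x w_y b\n");
    for (int it = 1; it <= max_iters; it++) {
        for (size_t i = 0; i < n; i++) {
            int out = (w[0]*x[i] + w[1]*y[i] + b > 0);  /* step activation   */
            double err = class[i] - out;  /* 0 if the point is classified ok */
            w[0] += r * err * x[i];
            w[1] += r * err * y[i];
            b    += r * err;
        }
        double norm = hypot(w[0], w[1]);  /* normalise before logging */
        if (norm == 0) norm = 1;          /* guard against degenerate data */
        fprintf(f, "%d %.10f %.10f %.10f\n", it, w[0]/norm, w[1]/norm, b/norm);
    }
}
```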

Binary file not shown.

Binary file not shown.

Binary file not shown.


@ -377,13 +377,28 @@ As in the previous section, once found, the weight vector is to be normalized.
Different values of the learning rate were tested, all giving the same result
and converging within $N = 3$ iterations. In @fig:iterations, the results are
shown for $r = 0.8$: as can be seen, for $N = 3$ the values of $w$ and
$t_{\text{cut}}$ level off.
The following results were obtained:
$$
w = (0.654, 0.756) \et t_{\text{cut}} = 1.213
$$
where, once again, $t_{\text{cut}}$ is measured from the origin of the axes.
In this case, the projection line is not parallel to the line joining the
means of the two samples. Plots are shown in @fig:percep_proj.
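
With these values, classifying a point reduces to projecting it onto $w$ and
comparing the projection, measured from the origin, with the cut. A minimal
sketch in C, assuming signal lies above the cut and using illustrative names:

```c
/* Project a point onto the normalised weight vector and compare with the cut.
 * Returns 1 for "signal", 0 for "noise"; the sign convention is an assumption. */
int classify(const double point[2], const double w[2], double t_cut) {
    double t = w[0]*point[0] + w[1]*point[1];   /* signed projection onto w */
    return t > t_cut;
}
```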
<div id="fig:percep_proj">
![View from above of the samples.](images/7-percep-plane.pdf){height=5.7cm}
![View of the sample projections onto the projection
line.](images/7-percep-proj.pdf)

Aerial and lateral views of the samples. Projection line in blue and cut in
red.
</div>
@ -396,40 +411,37 @@ red.
## Efficiency test {#sec:7_results}
A program was implemented to check the validity of the two classification
methods. Using the same parameters as the training set, a number $N_t$ of test
samples was generated and their points were divided into noise and signal by
both methods. To avoid storing large datasets in memory, at each iteration
false positives and negatives were recorded using a running statistics method
implemented in the `gsl_rstat` library. For each sample, the numbers $N_{fn}$
and $N_{fp}$ of false negatives and false positives were obtained as follows:
for every noise point $x_n$, the threshold function $f(x_n)$ was computed,
then:
- if $f(x_n) = 0 \thus$ $N_{fn} \to N_{fn}$
- if $f(x_n) \neq 0 \thus$ $N_{fn} \to N_{fn} + 1$
and similarly for the positive points.
Finally, the mean and standard deviation of $N_{fn}$ and $N_{fp}$ over all the
samples were computed and used to estimate the purity $\alpha$ and the
efficiency $\beta$ of the classification:
$$
\alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
\beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}
$$
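
A sketch of how this bookkeeping could look with the running statistics of
`gsl_rstat` is given below; the per-sample counting routine and the sample
sizes $N_s$ and $N_n$ are placeholders, not the actual ex-7 code.

```c
#include <stdio.h>
#include <gsl/gsl_rstat.h>

/* Hypothetical stand-in for the real per-sample classification: it should
 * fill in the misclassification counts of one freshly generated test sample.
 * Fixed values are used here only to keep the sketch self-contained. */
static void count_misclassified(double *n_fn, double *n_fp) {
    *n_fn = 1;
    *n_fp = 2;
}

int main(void) {
    const int n_t = 500;                 /* number of test samples           */
    const double n_s = 800, n_n = 1000;  /* signal/noise sizes, illustrative */

    gsl_rstat_workspace *stat_fn = gsl_rstat_alloc();
    gsl_rstat_workspace *stat_fp = gsl_rstat_alloc();

    for (int i = 0; i < n_t; i++) {
        double n_fn, n_fp;
        count_misclassified(&n_fn, &n_fp);
        gsl_rstat_add(n_fn, stat_fn);    /* running mean/variance update:   */
        gsl_rstat_add(n_fp, stat_fp);    /* nothing else is kept in memory  */
    }

    double alpha = 1 - gsl_rstat_mean(stat_fn) / n_s;   /* purity     */
    double beta  = 1 - gsl_rstat_mean(stat_fp) / n_n;   /* efficiency */
    printf("alpha = %.4f  sigma_alpha = %.2f\n", alpha, gsl_rstat_sd(stat_fn));
    printf("beta  = %.4f  sigma_beta  = %.2f\n", beta,  gsl_rstat_sd(stat_fp));

    gsl_rstat_free(stat_fn);
    gsl_rstat_free(stat_fp);
    return 0;
}
```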
Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the Fisher
discriminant gives a nearly perfect classification, with a symmetric
distribution of false negatives and false positives, whereas the perceptron
shows slightly more false positives than false negatives and is also more
variable from dataset to dataset.

A possible explanation of this fact is that, for linearly separable and
normally distributed points, the Fisher linear discriminant is an exact
analytical solution, whereas the perceptron is only expected to converge to
the solution and is therefore more subject to random fluctuations.
-------------------------------------------------------------------------------------------
@ -442,4 +454,4 @@ Perceptron 0.9999 0.28 0.9995 0.64
Table: Results for the Fisher and perceptron methods. $\sigma_{\alpha}$ and
$\sigma_{\beta}$ stand for the standard deviations of the false
negatives and false positives respectively. {#tbl:res_comp}