ex-7: revised and typo-fixed
In addition, the folder ex-7/iters was created in order to plot the results of the Perceptron method as a function of the number of iterations.
parent c8f34f1822
commit 3dcf733725
ex-7/iters/iter.txt (new file, 11 lines)

@@ -0,0 +1,11 @@
+#iters w_x w_y b
+1 0.4265959958 0.9044422902 1.7487522878
+2 0.5659895562 0.8244124103 1.4450529880
+3 0.6540716938 0.7564325610 1.2125805084
+4 0.6540716938 0.7564325610 1.2125805084
+5 0.6540716938 0.7564325610 1.2125805084
+6 0.6540716938 0.7564325610 1.2125805084
+7 0.6540716938 0.7564325610 1.2125805084
+8 0.6540716938 0.7564325610 1.2125805084
+9 0.6540716938 0.7564325610 1.2125805084
+10 0.6540716938 0.7564325610 1.2125805084
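For context, a minimal sketch of a loop that could produce a table like this one, assuming the textbook perceptron update rule; the learning rate $r = 0.8$ and the final normalization of $w$ come from the notes changed below, while the helper `train_pass` and the toy data points are invented for illustration:

```c
#include <stdio.h>
#include <math.h>

/* One training pass over the sample (textbook perceptron rule).
 * x, y: point coordinates; target: +1 for signal, -1 for noise. */
static void train_pass(const double *x, const double *y, const int *target,
                       size_t n, double r, double w[2], double *b)
{
    for (size_t i = 0; i < n; i++) {
        double act = w[0] * x[i] + w[1] * y[i] + *b;
        int guess = act > 0 ? 1 : -1;
        double delta = r * (target[i] - guess);   /* zero when correct */
        w[0] += delta * x[i];
        w[1] += delta * y[i];
        *b   += delta;
    }
}

int main(void)
{
    /* toy data: three signal and three noise points, for illustration */
    const double x[] = { 2.1,  2.9,  3.2, -1.0, -1.5, -0.2};
    const double y[] = { 2.5,  3.1,  2.2, -0.8, -1.9, -1.1};
    const int    t[] = { 1,    1,    1,   -1,   -1,   -1  };
    double w[2] = {0, 0}, b = 0;

    printf("#iters w_x w_y b\n");
    for (int iter = 1; iter <= 10; iter++) {
        train_pass(x, y, t, 6, 0.8, w, &b);
        double norm = hypot(w[0], w[1]);   /* normalize w, as in the notes */
        printf("%d %.10f %.10f %.10f\n",
               iter, w[0] / norm, w[1] / norm, b / norm);
    }
    return 0;   /* compile with -lm for hypot() */
}
```

Rows 3-10 of iter.txt being identical is the expected signature of convergence: once every point is classified correctly, the update term vanishes and further passes change nothing.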
notes/images/7-iterations.pdf (new binary file, not shown)
@@ -377,13 +377,28 @@ As in the previous section, once found, the weight vector is to be normalized.
 With $N = 5$ iterations, the values of $w$ and $t_{\text{cut}}$ level off up to the third
 digit. The following results were obtained:
 
+Different values of the learning rate were tested, all giving the same result,
+converging for a number $N = 3$ of iterations. In @fig:iterations, results are
+shown for $r = 0.8$: as can be seen, for $N = 3$, the values of $w$ and
+$t_{\text{cut}}$ level off.
+The following results were obtained:
 $$
 w = (0.654, 0.756) \et t_{\text{cut}} = 1.213
 $$
 
-where, once again, $t_{\text{cut}}$ is computed from the origin of the axes. In
-this case, the projection line does not lies along the mains of the two
-samples. Plots in @fig:percep_proj.
+In this case, the projection line is not parallel with the line joining the
+means of the two samples. Plots in @fig:percep_proj.
 
+<div id="fig:percep_proj">
+![View of the samples in the plane.](images/7-percep-plane.pdf)
+![View of the samples projections onto the projection
+line.](images/7-percep-proj.pdf)
+
+Aerial and lateral views of the samples. Projection line in blue and cut in
+red.
+</div>
+
+## Efficiency test {#sec:7_results}
+
 <div id="fig:percep_proj">
 ![View from above of the samples.](images/7-percep-plane.pdf){height=5.7cm}
@@ -396,40 +411,37 @@ red.
 
 ## Efficiency test {#sec:7_results}
 
-A program was implemented to check the validity of the two
-classification methods.
-A number $N_t$ of test samples, with the same parameters of the training set,
-is generated using an RNG and their points are divided into noise/signal by
-both methods. At each iteration, false positives and negatives are recorded
-using a running statistics method implemented in the `gsl_rstat` library, to
-avoid storing large datasets in memory.
-In each sample, the numbers $N_{fn}$ and $N_{fp}$ of false positive and false
-negative are obtained in this way: for every noise point $x_n$ compute the
-activation function $f(x_n)$ with the weight vector $w$ and the
-$t_{\text{cut}}$, then:
+Using the same parameters of the training set, a number $N_t$ of test
+samples was generated and the points were divided into noise and signal
+applying both methods. To avoid storing large datasets in memory, at each
+iteration, false positives and negatives were recorded using a running
+statistics method implemented in the `gsl_rstat` library. For each sample, the
+numbers $N_{fn}$ and $N_{fp}$ of false negatives and false positives were
+obtained this way: for every noise point $x_n$, the threshold function $f(x_n)$
+was computed, then:
 
-- if $f(x) < 0 \thus$ $N_{fn} \to N_{fn}$
-- if $f(x) > 0 \thus$ $N_{fn} \to N_{fn} + 1$
+- if $f(x) = 0 \thus$ $N_{fn} \to N_{fn}$
+- if $f(x) \neq 0 \thus$ $N_{fn} \to N_{fn} + 1$
 
 and similarly for the positive points.
-Finally, the mean and standard deviation are computed from $N_{fn}$ and
-$N_{fp}$ of every sample and used to estimate purity $\alpha$
-and efficiency $\beta$ of the classification:
+Finally, the mean and standard deviation were computed from $N_{fn}$ and
+$N_{fp}$ for every sample and used to estimate the purity $\alpha$ and
+efficiency $\beta$ of the classification:
 
 $$
 \alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
 \beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}
 $$
 
-Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the
-Fisher discriminant gives a nearly perfect classification
-with a symmetric distribution of false negative and false positive,
-whereas the perceptron show a little more false-positive than
-false-negative, being also more variable from dataset to dataset.
-A possible explanation of this fact is that, for linearly separable and
-normally distributed points, the Fisher linear discriminant is an exact
-analytical solution, whereas the perceptron is only expected to converge to the
-solution and thus more subjected to random fluctuations.
+Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the Fisher
+discriminant gives a nearly perfect classification with a symmetric distribution
+of false negatives and false positives, whereas the perceptron shows slightly
+more false positives than false negatives and is also more variable from dataset
+to dataset.
+A possible explanation is that, for linearly separable and normally
+distributed points, the Fisher linear discriminant is an exact analytical
+solution, whereas the perceptron is only expected to converge to the solution
+and is therefore more subject to random fluctuations.
 
 
 -------------------------------------------------------------------------------------------
@@ -442,4 +454,4 @@ Perceptron 0.9999 0.28 0.9995 0.64
 
 Table: Results for Fisher and perceptron method. $\sigma_{\alpha}$ and
 $\sigma_{\beta}$ stand for the standard deviation of the false
-negative and false positive respectively. {#tbl:res_comp}
+negatives and false positives respectively. {#tbl:res_comp}
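Since the efficiency test in the hunks above is described only in prose, here is a minimal sketch of how such a test could look in C. Everything dataset-specific is an assumption for illustration: the sample sizes `N_s` and `N_n`, the Gaussian shapes and the shift of the signal, and the placeholder `classify()`. Only the weights, $t_{\text{cut}}$ and $N_t = 500$ are taken from the text; the `gsl_rng`/`gsl_ran`/`gsl_rstat` calls are the standard GSL API, and the false counts follow the notes' own convention for $N_{fn}$ and $N_{fp}$:

```c
#include <stdio.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_rstat.h>

/* Placeholder threshold function: 1 if the point falls on the signal
 * side of the cut, 0 otherwise. w and t_cut come from the training stage. */
static int classify(double x, double y, const double w[2], double t_cut)
{
    return w[0] * x + w[1] * y > t_cut;
}

int main(void)
{
    const double w[2] = {0.654, 0.756};   /* perceptron result from the notes */
    const double t_cut = 1.213;
    const int N_t = 500;                  /* number of test samples, as in the notes */
    const int N_s = 800, N_n = 800;       /* assumed sample sizes */

    gsl_rng *rng = gsl_rng_alloc(gsl_rng_mt19937);
    gsl_rstat_workspace *stat_fn = gsl_rstat_alloc();  /* running statistics: */
    gsl_rstat_workspace *stat_fp = gsl_rstat_alloc();  /* O(1) memory         */

    for (int s = 0; s < N_t; s++) {
        int n_fn = 0, n_fp = 0;
        for (int i = 0; i < N_n; i++) {   /* noise: assumed centred Gaussian */
            double x = gsl_ran_gaussian(rng, 1.0);
            double y = gsl_ran_gaussian(rng, 1.0);
            if (classify(x, y, w, t_cut))
                n_fn++;                   /* f(x) != 0: notes' convention */
        }
        for (int i = 0; i < N_s; i++) {   /* signal: assumed shifted Gaussian */
            double x = 2.5 + gsl_ran_gaussian(rng, 1.0);
            double y = 2.5 + gsl_ran_gaussian(rng, 1.0);
            if (!classify(x, y, w, t_cut))
                n_fp++;
        }
        gsl_rstat_add(n_fn, stat_fn);     /* update running mean/sd in place */
        gsl_rstat_add(n_fp, stat_fp);
    }

    printf("alpha = %g\n", 1 - gsl_rstat_mean(stat_fn) / N_s);
    printf("beta  = %g\n", 1 - gsl_rstat_mean(stat_fp) / N_n);
    printf("sigma_alpha = %g, sigma_beta = %g\n",
           gsl_rstat_sd(stat_fn), gsl_rstat_sd(stat_fp));

    gsl_rng_free(rng);
    gsl_rstat_free(stat_fn);
    gsl_rstat_free(stat_fp);
    return 0;
}
```

The two `gsl_rstat` workspaces accumulate the per-sample counts incrementally, so none of the $N_t$ datasets ever needs to be kept in memory, which is exactly the design point made in the revised paragraph.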