ex-7: revised and typo-fixed
In addition, the folder ex-7/iters was created to plot the results of the perceptron method as a function of the number of iterations.
This commit is contained in:
parent
c8f34f1822
commit
3dcf733725
ex-7/iters/iter.txt — new file (11 lines)

@@ -0,0 +1,11 @@
+#iters w_x          w_y          b
+1      0.4265959958 0.9044422902 1.7487522878
+2      0.5659895562 0.8244124103 1.4450529880
+3      0.6540716938 0.7564325610 1.2125805084
+4      0.6540716938 0.7564325610 1.2125805084
+5      0.6540716938 0.7564325610 1.2125805084
+6      0.6540716938 0.7564325610 1.2125805084
+7      0.6540716938 0.7564325610 1.2125805084
+8      0.6540716938 0.7564325610 1.2125805084
+9      0.6540716938 0.7564325610 1.2125805084
+10     0.6540716938 0.7564325610 1.2125805084
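
The weights stop changing after the third pass, matching the $N = 3$ convergence described in the notes below; since $w$ is kept normalized, the bias column agrees numerically with the quoted $t_{\text{cut}} = 1.213$. As a minimal sketch of the kind of training loop that would write such a file — assuming the textbook perceptron update rule with learning rate `r`; the function and variable names are illustrative, since the actual source is not part of this diff:

```c
#include <stdio.h>

/* Sketch of one perceptron pass over the training points (x, y) with
 * labels +1 (signal) / -1 (noise): each misclassified point pulls the
 * weights and bias toward its own side by a step of size r. */
static void perceptron_pass(const double *x, const double *y,
                            const int *label, size_t n,
                            double r, double w[2], double *b)
{
    for (size_t i = 0; i < n; i++) {
        double f = w[0] * x[i] + w[1] * y[i] + *b;
        int pred = (f > 0.0) ? 1 : -1;
        if (pred != label[i]) {          /* misclassified: update */
            w[0] += r * label[i] * x[i];
            w[1] += r * label[i] * y[i];
            *b   += r * label[i];
        }
    }
}

/* After each pass one would normalize w and print one row of iter.txt:
 *   printf("%zu %.10f %.10f %.10f\n", iter, w[0], w[1], b);
 * once no point is misclassified, the rows repeat, as seen above. */
```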

notes/images/7-iterations.pdf — new binary file (not shown)
@@ -377,13 +377,28 @@ As in the previous section, once found, the weight vector is to be normalized.
-With $N = 5$ iterations, the values of $w$ and $t_{\text{cut}}$ level off up to the third
-digit. The following results were obtained:
+Different values of the learning rate were tested, all giving the same result,
+converging for a number $N = 3$ of iterations. In @fig:iterations, results are
+shown for $r = 0.8$: as can be seen, for $N = 3$, the values of $w$ and
+$t_{\text{cut}}$ level off.
+The following results were obtained:
 
 $$
   w = (0.654, 0.756) \et t_{\text{cut}} = 1.213
 $$
 
-where, once again, $t_{\text{cut}}$ is computed from the origin of the axes. In
-this case, the projection line does not lies along the mains of the two
-samples. Plots in @fig:percep_proj.
+where, once again, $t_{\text{cut}}$ is computed from the origin of the axes.
+In this case, the projection line is not parallel with the line joining the
+means of the two samples. Plots in @fig:percep_proj.
 
-<div id="fig:percep_proj">
-![View of the samples in the plane.](images/7-percep-plane.pdf)
+<div id="fig:percep_proj">
+![View from above of the samples.](images/7-percep-plane.pdf){height=5.7cm}
 
 ![View of the samples projections onto the projection
 line.](images/7-percep-proj.pdf)
 
 Aerial and lateral views of the samples. Projection line in blue and cut in
 red.
 </div>

@@ -396,40 +411,37 @@ red.
 
 ## Efficiency test {#sec:7_results}
 
 A program was implemented to check the validity of the two
 classification methods.
-A number $N_t$ of test samples, with the same parameters of the training set,
-is generated using an RNG and their points are divided into noise/signal by
-both methods. At each iteration, false positives and negatives are recorded
-using a running statistics method implemented in the `gsl_rstat` library, to
-avoid storing large datasets in memory.
-In each sample, the numbers $N_{fn}$ and $N_{fp}$ of false positive and false
-negative are obtained in this way: for every noise point $x_n$ compute the
-activation function $f(x_n)$ with the weight vector $w$ and the
-$t_{\text{cut}}$, then:
+Using the same parameters of the training set, a number $N_t$ of test
+samples was generated and the points were divided into noise and signal
+applying both methods. To avoid storing large datasets in memory, at each
+iteration, false positives and negatives were recorded using a running
+statistics method implemented in the `gsl_rstat` library. For each sample, the
+numbers $N_{fn}$ and $N_{fp}$ of false negatives and false positives were
+obtained this way: for every noise point $x_n$, the threshold function $f(x_n)$
+was computed, then:
 
-- if $f(x) < 0 \thus$ $N_{fn} \to N_{fn}$
-- if $f(x) > 0 \thus$ $N_{fn} \to N_{fn} + 1$
+- if $f(x) = 0 \thus$ $N_{fn} \to N_{fn}$
+- if $f(x) \neq 0 \thus$ $N_{fn} \to N_{fn} + 1$
 
 and similarly for the positive points.
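
A hedged sketch of how these counts could feed the running-statistics accumulator — the `gsl_rstat_*` calls are the GSL library's real API, while `classify`, the array names, and the noise-side convention for $f$ are placeholders for code not shown in this diff:

```c
#include <stddef.h>
#include <gsl/gsl_rstat.h>

/* Stand-in for the threshold function f: returns 0 on the noise side. */
extern int classify(double px, double py, const double w[2], double t_cut);

/* Per-sample bookkeeping for the noise points: count those that land on
 * the signal side of the cut and push the count into a running
 * accumulator, so no per-point data is kept in memory. */
static void count_noise_misses(const double *xn, const double *yn,
                               size_t n_noise,
                               const double w[2], double t_cut,
                               gsl_rstat_workspace *acc_fn)
{
    size_t n_miss = 0;
    for (size_t i = 0; i < n_noise; i++)
        if (classify(xn[i], yn[i], w, t_cut) != 0)   /* f(x) != 0 */
            n_miss++;
    gsl_rstat_add((double)n_miss, acc_fn);   /* one entry per test sample */
}
```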
-Finally, the mean and standard deviation are computed from $N_{fn}$ and
-$N_{fp}$ of every sample and used to estimate purity $\alpha$
-and efficiency $\beta$ of the classification:
+Finally, the mean and standard deviation were computed from $N_{fn}$ and
+$N_{fp}$ for every sample and used to estimate the purity $\alpha$ and
+efficiency $\beta$ of the classification:
 
 $$
   \alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
   \beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}
 $$
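
Extracting $\alpha$, $\beta$, and their spreads from the accumulators is then direct. The sketch below reuses the hypothetical `acc_fn`/`acc_fp` workspaces from the snippet above and mirrors the table's convention that $\sigma_{\alpha}$ and $\sigma_{\beta}$ are the standard deviations of the raw counts:

```c
#include <stdio.h>
#include <gsl/gsl_rstat.h>

/* Turn the accumulated false-negative / false-positive counts (one entry
 * per test sample) into the purity and efficiency defined above. */
static void report(gsl_rstat_workspace *acc_fn, gsl_rstat_workspace *acc_fp,
                   double N_s, double N_n)
{
    printf("alpha = %.4f  sigma_alpha = %.2f\n",
           1.0 - gsl_rstat_mean(acc_fn) / N_s, gsl_rstat_sd(acc_fn));
    printf("beta  = %.4f  sigma_beta  = %.2f\n",
           1.0 - gsl_rstat_mean(acc_fp) / N_n, gsl_rstat_sd(acc_fp));
}
```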
 
-Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the
-Fisher discriminant gives a nearly perfect classification
-with a symmetric distribution of false negative and false positive,
-whereas the perceptron show a little more false-positive than
-false-negative, being also more variable from dataset to dataset.
-A possible explanation of this fact is that, for linearly separable and
-normally distributed points, the Fisher linear discriminant is an exact
-analytical solution, whereas the perceptron is only expected to converge to the
-solution and thus more subjected to random fluctuations.
+Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the Fisher
+discriminant gives a nearly perfect classification with a symmetric distribution
+of false negatives and false positives, whereas the perceptron shows slightly
+more false positives than false negatives and is also more variable from
+dataset to dataset.
+A possible explanation of this fact is that, for linearly separable and normally
+distributed points, the Fisher linear discriminant is an exact analytical
+solution, whereas the perceptron is only expected to converge to the solution
+and is therefore more subject to random fluctuations.
 
 -------------------------------------------------------------------------------------------

@@ -442,4 +454,4 @@ Perceptron 0.9999 0.28 0.9995 0.64
 
 Table: Results for Fisher and perceptron method. $\sigma_{\alpha}$ and
 $\sigma_{\beta}$ stand for the standard deviation of the false
-negative and false positive respectively. {#tbl:res_comp}
+negatives and false positives respectively. {#tbl:res_comp}