diff --git a/notes/sections/7.md b/notes/sections/7.md
index bba7a20..7d0b3ba 100644
--- a/notes/sections/7.md
+++ b/notes/sections/7.md
@@ -289,41 +289,40 @@ samples. Plots in @fig:percep_proj.
 
 ## Efficiency test
 
-A program was implemented in order to check the validity of the two
-aforementioned methods.
-A number $N_t$ of test samples was generated and the
-points were divided into the two classes according to the selected method.
-At each iteration, false positives and negatives are recorded using a running
-statistics method implemented in the `gsl_rstat` library, being suitable for
-handling large datasets for which it is inconvenient to store in memory all at
-once.
-For each sample, the numbers $N_{fn}$ and $N_{fp}$ of false positive and false
-negative are computed with the following trick: every noise point $x_n$ was
-checked this way: the function $f(x_n)$ was computed with the weight vector $w$
-and the $t_{\text{cut}}$ given by the employed method, then:
-
+A program was implemented to check the validity of the two
+classification methods.
+A number $N_t$ of test samples, with the same parameters as the training set,
+is generated using an RNG and their points are classified as noise or signal
+by both methods. At each iteration, false positives and negatives are recorded
+using the running statistics methods implemented in the `gsl_rstat` library,
+which avoid storing large datasets in memory.
+In each sample, the numbers $N_{fn}$ and $N_{fp}$ of false negatives and false
+positives are obtained as follows: for every noise point $x_n$, compute the
+activation function $f(x_n)$ with the weight vector $w$ and the
+$t_{\text{cut}}$ given by the method under test, then:
+
-- if $f(x) < 0 \thus$ $N_{fn} \to N_{fn}$
-- if $f(x) > 0 \thus$ $N_{fn} \to N_{fn} + 1$
+- if $f(x_n) < 0 \thus$ $N_{fp} \to N_{fp}$
+- if $f(x_n) > 0 \thus$ $N_{fp} \to N_{fp} + 1$
 
-Similarly for the positive points.
-Finally, the mean and the standard deviation were computed from $N_{fn}$ and
-$N_{fp}$ obtained for every sample in order to get the mean purity $\alpha$
-and efficiency $\beta$ for the employed statistics:
+and similarly for the signal points, whose misclassifications increment
+$N_{fn}$. Finally, the mean and standard deviation of $N_{fn}$ and
+$N_{fp}$ over all the samples are used to estimate the purity $\alpha$
+and the efficiency $\beta$ of the classification:
 
 $$
   \alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s}
   \et
   \beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}
 $$
 
-Results for $N_t = 500$ are shown in @tbl:res_comp. As can be observed, the
-Fisher method gives a nearly perfect assignment of the points to their belonging
-class, with a symmetric distribution of false negative and false positive,
-whereas the points perceptron-divided show a little more false-positive than
-false-negative, being also more changable from dataset to dataset.
-The reason why this happened lies in the fact that the Fisher linear
-discriminant is an exact analitical result, whereas the perceptron is based on
-a convergent behaviour which cannot be exactely reached by definition.
-
+Results for $N_t = 500$ are shown in @tbl:res_comp. As can be seen, the
+Fisher discriminant gives a nearly perfect classification,
+with a symmetric distribution of false negatives and false positives,
+whereas the perceptron shows slightly more false positives than
+false negatives and is also more variable from dataset to dataset.
+A possible explanation is that, for linearly separable and
+normally distributed points, the Fisher linear discriminant is an exact
+analytical solution, whereas the perceptron is only expected to converge to
+the solution and is thus more subject to random fluctuations.
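For reference, below is a minimal C sketch of the test loop described by the new text. The weight vector `w`, the threshold `t_cut`, the per-sample sizes `N_N`/`N_S` and the 2D Gaussian class distributions are placeholder assumptions, not values from the notes; only the `gsl_rstat` calls mirror the API the section actually names.

```c
/* Sketch of the efficiency test loop, not the actual program from the
 * notes: `w`, `t_cut`, the sample sizes and the Gaussian class
 * distributions below are placeholder assumptions. */
#include <stdio.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_rstat.h>

#define N_T 500   /* number of test samples   */
#define N_N 800   /* noise points per sample  */
#define N_S 800   /* signal points per sample */

static const double w[2]  = {0.7, 0.7};  /* placeholder weight vector */
static const double t_cut = 1.4;         /* placeholder threshold     */

/* activation function: f(x) = w.x - t_cut, positive means "signal" */
static double f(const double x[2]) {
    return w[0]*x[0] + w[1]*x[1] - t_cut;
}

int main(void) {
    gsl_rng_env_setup();
    gsl_rng *r = gsl_rng_alloc(gsl_rng_default);
    gsl_rstat_workspace *fn_stat = gsl_rstat_alloc();
    gsl_rstat_workspace *fp_stat = gsl_rstat_alloc();

    for (int t = 0; t < N_T; t++) {
        int n_fp = 0, n_fn = 0;
        double x[2];

        for (int i = 0; i < N_N; i++) {        /* noise class around (0, 0)  */
            x[0] = gsl_ran_gaussian(r, 1.0);
            x[1] = gsl_ran_gaussian(r, 1.0);
            if (f(x) > 0) n_fp++;              /* noise tagged as signal */
        }
        for (int i = 0; i < N_S; i++) {        /* signal class around (2, 2) */
            x[0] = 2.0 + gsl_ran_gaussian(r, 1.0);
            x[1] = 2.0 + gsl_ran_gaussian(r, 1.0);
            if (f(x) < 0) n_fn++;              /* signal tagged as noise */
        }

        /* running statistics: only the accumulators stay in memory */
        gsl_rstat_add(n_fn, fn_stat);
        gsl_rstat_add(n_fp, fp_stat);
    }

    printf("alpha = %g +- %g\n",
           1 - gsl_rstat_mean(fn_stat) / N_S, gsl_rstat_sd(fn_stat) / N_S);
    printf("beta  = %g +- %g\n",
           1 - gsl_rstat_mean(fp_stat) / N_N, gsl_rstat_sd(fp_stat) / N_N);

    gsl_rng_free(r);
    gsl_rstat_free(fn_stat);
    gsl_rstat_free(fp_stat);
    return 0;
}
```

Compiled with something like `gcc efficiency.c -lgsl -lgslcblas -lm`, this prints purity and efficiency estimates analogous to @tbl:res_comp; the numbers themselves depend on the placeholder parameters.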