diff --git a/notes/sections/0.md b/notes/sections/0.md index cfbade3..49a4cb9 100644 --- a/notes/sections/0.md +++ b/notes/sections/0.md @@ -42,6 +42,10 @@ header-includes: | \DeclareMathOperator*{\et}{% \hspace{30pt} \wedge \hspace{30pt} } + %% "if" in formulas + \DeclareMathOperator*{\incase}{% + \hspace{20pt} \text{if} \hspace{20pt} + } \makeatletter \renewcommand\maketitle{ diff --git a/notes/sections/7.md b/notes/sections/7.md index 5ebb290..c977686 100644 --- a/notes/sections/7.md +++ b/notes/sections/7.md @@ -199,11 +199,11 @@ this case were the weight vector and the position of the point to be projected. ![Gaussian of the samples on the projection line.](images/fisher-proj.pdf){height=5.7cm} -Aeral and lateral views of the projection direction, in blue, and the cut, in +Aerial and lateral views of the projection direction, in blue, and the cut, in red. -Results obtained for the same sample in @fig:fisher_points are shown in +Results obtained for the same sample in @fig:points are shown in @fig:fisher_proj. The weight vector $w$ was found to be: $$ @@ -227,22 +227,21 @@ output value. The inferred function can be used for mapping new examples. The algorithm will be generalized to correctly determine the class labels for unseen instances. -The aim is to determine the threshold function $f(x)$ for the dot product -between the (in this case 2D) vector point $x$ and the weight vector $w$: +The aim is to determine the bias $b$ such that the threshold function $f(x)$: $$ - f(x) = x \cdot w + b + f(x) = x \cdot w + b \hspace{20pt} + \begin{cases} + \geqslant 0 \incase x \in \text{signal} \\ + < 0 \incase x \in \text{noise} + \end{cases} $$ {#eq:perc} -where $b$ is called 'bias'. If $f(x) \geqslant 0$, than the point can be -assigned to the class $C_1$, to $C_2$ otherwise. - -The training was performed as follow. The idea is that the function $f(x)$ must -return 0 when the point $x$ belongs to the noise and 1 if it belongs to the -signal. 
Initial values were set as $w = (0,0)$ and $b = 0$. From these, the -perceptron starts to improve their estimations. The sample was passed point by -point into a reiterative procedure a grand total of $N_c$ calls: each time, the -projection $w \cdot x$ of the point was computed and then the variable $\Delta$ was defined as: +The training was performed as follows. Initial values were set as $w = (0,0)$ and +$b = 0$. From these, the perceptron starts to improve its estimates. The +sample was passed point by point through an iterative procedure for a grand total of +$N_c$ calls: each time, the projection $w \cdot x$ of the point was computed +and then the variable $\Delta$ was defined as: $$ \Delta = r * (e - \theta (f(x)) @@ -254,15 +253,15 @@ where: larger $r$, the more volatile the weight changes. In the code, it was set $r = 0.8$; - $e$ is the expected value, namely 0 if $x$ is noise and 1 if it is signal; - - $\theta$ is the Heavyside theta function; + - $\theta$ is the Heaviside theta function; - $o$ is the observed value of $f(x)$ defined in @eq:perc. Then $b$ and $w$ must be updated as: $$ - b \longrightarrow b + \Delta + b \to b + \Delta \et - w \longrightarrow w + x \Delta + w \to w + x \Delta $$
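
For reviewers, the update rule described in the patched text of `notes/sections/7.md` can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code the notes refer to: the function names, the `passes` parameter (a stand-in for however the $N_c$ calls are scheduled) and the toy sample are hypothetical, and the convention $\theta(0) = 1$ is chosen to agree with $f(x) \geqslant 0$ meaning signal.

```python
def heaviside(value):
    """Step function; theta(0) = 1 is an assumed convention (f(x) >= 0 is signal)."""
    return 1.0 if value >= 0 else 0.0


def threshold(x, w, b):
    """f(x) = x . w + b, as in the equation labelled eq:perc."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def train_perceptron(points, expected, rate=0.8, passes=2):
    """Apply b -> b + delta and w -> w + x * delta, point by point.

    `expected` holds 1 for signal and 0 for noise; `passes` is a
    hypothetical parameter controlling how often the sample is reused.
    """
    w = [0.0] * len(points[0])  # initial weights w = (0, 0)
    b = 0.0                     # initial bias b = 0
    for _ in range(passes):
        for x, e in zip(points, expected):
            delta = rate * (e - heaviside(threshold(x, w, b)))
            b += delta
            w = [wi + xi * delta for wi, xi in zip(w, x)]
    return w, b


# Hypothetical, linearly separable toy sample: two signal and two noise points.
sample = [(2.0, 2.0), (3.0, 1.0), (-2.0, -2.0), (-1.0, -3.0)]
classes = [1, 1, 0, 0]
weights, bias = train_perceptron(sample, classes)
predicted = [heaviside(threshold(x, weights, bias)) for x in sample]
```

Because $\Delta$ is zero whenever a point is already classified correctly, only misclassified points move $w$ and $b$, which is why a larger $r$ makes the weights more volatile.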