diff --git a/notes/sections/1.md b/notes/sections/1.md
index 6386413..3110004 100644
--- a/notes/sections/1.md
+++ b/notes/sections/1.md
@@ -149,11 +149,12 @@ To obtain a better estimate of the mode and its error, the above procedure was
 bootstrapped. The original sample was treated as a population and used to build
 100 other samples of the same size, by *sampling with replacements*. For each
 one of the new samples, the above statistic was computed. By simply taking the
-mean of these statistics the following estimate was obtained:
+mean and standard deviation of these statistics the following estimate was
+obtained:
 $$
 \text{observed mode: } m_o = \num{-0.29 \pm 0.19}
 $$
-In order to compare the values $m_e$ and $m_0$, the following compatibility
+In order to compare the values $m_e$ and $m_o$, the following compatibility
 $t$-test was applied:
 $$
 p = 1 - \text{erf}\left(\frac{t}{\sqrt{2}}\right)\ \with
@@ -184,7 +185,7 @@ middle elements otherwise.
 The expected median was derived from the quantile function (QDF) of the Landau
 distribution[^1].
-Once this is know, the median is simply given by $\text{QDF}(1/2)$. Since both
+Once this is known, the median is simply given by $\text{QDF}(1/2)$. Since both
 the CDF and QDF have no known closed form, they must be computed numerically.
 The cumulative probability was computed by quadrature-based numerical
 integration of the PDF (`gsl_integration_qagiu()` function in GSL). The function
@@ -210,13 +211,13 @@ where the absolute and relative tolerances $\varepsilon_\text{abs}$ and
 $\varepsilon_\text{rel}$ were set to \num{1e-10} and \num{1e-6}, respectively.
 As for the QDF, this was implemented by numerically inverting the CDF. This was
-done by solving the equation;
+done by solving the equation for x:
 $$
 p(x) = p_0
 $$
-for x, given a probability value $p_0$, where $p(x)$ is the CDF. The (unique)
-root of this equation was found by a root-finding routine
-(`gsl_root_fsolver_brent` in GSL) based on the Brent-Dekker method.
+given a probability value $p_0$, where $p(x)$ is the CDF. The (unique) root of
+this equation was found by a root-finding routine (`gsl_root_fsolver_brent` in
+GSL) based on the Brent-Dekker method.
 The following condition was checked for convergence:
 $$
 |a - b| < \varepsilon_\text{abs} + \varepsilon_\text{rel} \min(|a|, |b|)
diff --git a/notes/sections/2.md b/notes/sections/2.md
index 6f7deb1..bb679d0 100644
--- a/notes/sections/2.md
+++ b/notes/sections/2.md
@@ -10,7 +10,7 @@ $$
 \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \right)
 $$ {#eq:gamma}
-and represents the limiting blue area in @fig:gamma. The first 30 digits of
+and represents the limiting red area in @fig:gamma. The first 30 digits of
 $\gamma$ are:
 $$
 \gamma = 0.57721\ 56649\ 01532\ 86060\ 65120\ 90082 \dots
@@ -52,7 +52,7 @@ efficiency of the methods lies on how quickly they converge to their limit.
 \draw (7.0,-0.05) -- (7.0,0.05); \node [below, scale=0.7] at (7.0,-0.05) {7};
 \end{tikzpicture}
 \caption{The area of the red region converges to the Euler–Mascheroni
- constant..}\label{fig:gamma}
+ constant.}\label{fig:gamma}
 }
 \end{figure}
@@ -109,10 +109,8 @@ sign, 8 for the exponent and 55 for the mantissa, hence:
 $$
 2^{55} = 10^{d} \thus d = 55 \cdot \log(2) \sim 16.6
 $$
-Only 10 digits were correctly computed: this means that when the terms of the
-series start being smaller than the smallest representable double, the sum of
-all the remaining terms gives a number $\propto 10^{-11}$. The best result is
-shown in @tbl:naive-res.
+But only 10 digits were correctly computed. The best result is shown in
+@tbl:naive-res.
 ------- --------------------
 exact 0.57721 56649 01533
diff --git a/notes/sections/3.md b/notes/sections/3.md
index 150a8b0..fd772ab 100644
--- a/notes/sections/3.md
+++ b/notes/sections/3.md
@@ -13,7 +13,7 @@ distribution function $F$:
 \end{align*}
 where $\theta$ and $\phi$ are, respectively, the polar and azimuthal angles, and
 $$
- \alpha_0 = 0.65 \et \beta_0 = 0.06 \et \gamma_0 = -0.18
+ \alpha = 0.65 \et \beta = 0.06 \et \gamma = -0.18
 $$
 To generate the points, a *hit-miss* method was employed:
diff --git a/notes/sections/5.md b/notes/sections/5.md
index 15c9122..d12a298 100644
--- a/notes/sections/5.md
+++ b/notes/sections/5.md
@@ -49,9 +49,9 @@ approximate $I$ as:
 $$
 I \approx I_N = \frac{V}{N} \sum_{i=1}^N f(x_i) = V \cdot \avg{f}
 $$
-If $x_i$ are uniformly distributed $I_N \rightarrow I$ for $N \rightarrow +
-\infty$ by the law of large numbers, whereas the integral variance can be
-estimated as:
+If $x_i$ are uniformly distributed, $I_N \rightarrow I$ for $N \rightarrow +
+\infty$ by the law of large numbers, whereas the integral variance $\sigma^2_I$
+can be estimated as:
 $$
 \sigma^2_f = \frac{1}{N - 1} \sum_{i = 1}^N \left( f(x_i) - \avg{f} \right)^2
diff --git a/notes/sections/6.md b/notes/sections/6.md
index 8ce07dc..66e1df9 100644
--- a/notes/sections/6.md
+++ b/notes/sections/6.md
@@ -123,7 +123,7 @@ where:
 - $(\cdot, \cdot)$ is an inner product.
 Given a signal $s$ of $n$ elements and a kernel $k$ of $m$ elements,
-their convolution is a vector of $n + m + 1$ elements computed
+their convolution $c$ is a vector of $n + m + 1$ elements computed
 by flipping $s$ ($R$ operator) and shifting its indices ($T_i$ operator):
 $$
 c_i = (s, T_i \, R \, k)
@@ -446,8 +446,8 @@ close as possible. Formally, the following constraints must be satisfied:
 &\text{3.} \hspace{20pt} \sum_{i = 1}^m f_{ij} \le w_{qj} &1 \le j \le n \\
- &\text{4.} \hspace{20pt} \sum_{j = 1}^n f_{ij} \sum_{j = 1}^m f_{ij} \le w_{qj}
- = \text{min} \left( \sum_{i = 1}^m w_{pi}, \sum_{j = 1}^n w_{qj} \right)
+ &\text{4.} \hspace{20pt} \sum_{j = 1}^n \sum_{j = 1}^m f_{ij} \le
+ \text{min} \left( \sum_{i = 1}^m w_{pi}, \sum_{j = 1}^n w_{qj} \right)
 \end{align*}
 The first constraint allows moving dirt from $P$ to $Q$ and not vice versa; the
 second limits the amount of dirt moved by each position in $P$ in order to not
@@ -549,9 +549,9 @@ a large kernel, the convergence is very slow, even if the best results are
 close to the one found for $\sigma = 0.5$. The following $r$s were chosen as the
 most fitting:
 \begin{align*}
- \sigma = 0.1 \, \Delta \theta &\thus n^{\text{best}} = 2 \\
- \sigma = 0.5 \, \Delta \theta &\thus n^{\text{best}} = 10 \\
- \sigma = 1 \, \Delta \theta &\thus n^{\text{best}} = \num{5e3}
+ \sigma = 0.1 \, \Delta \theta &\thus r^{\text{best}} = 2 \\
+ \sigma = 0.5 \, \Delta \theta &\thus r^{\text{best}} = 10 \\
+ \sigma = 1 \, \Delta \theta &\thus r^{\text{best}} = \num{5e3}
 \end{align*}
 Note the difference between @fig:rless-0.1 and the plots resulting from $\sigma =
diff --git a/notes/sections/7.md b/notes/sections/7.md
index 6ad4e9d..7cb6ef6 100644
--- a/notes/sections/7.md
+++ b/notes/sections/7.md
@@ -86,8 +86,8 @@ $$
 \tilde{\mu}_2 − \tilde{\mu}_1 = w^T (\mu_2 − \mu_1)
 $$
 This expression can be made arbitrarily large simply by increasing the
-magnitude of $w$, fortunately the problem is easily solved by requiring $w$
-to be normalised: $| w^2 | = 1$. Using a Lagrange multiplier to perform the
+magnitude of $w$ but, fortunately, the problem is easily solved by requiring
+$w$ to be normalised: $| w^2 | = 1$. Using a Lagrange multiplier to perform the
 constrained maximization, it can be found that $w \propto (\mu_2 − \mu_1)$,
 meaning that the line onto the points must be projected is the one joining the
 class means.
@@ -334,21 +334,21 @@ To see how it works, consider the four possible situations:
 \quad f(x) = 0 \quad \Longrightarrow \quad \Delta = 0$
 the current estimations work properly: $b$ and $w$ do not need to be updated;
 - $e = 1 \quad \wedge \quad f(x) = 0 \quad \Longrightarrow \quad
- \Delta = 1$
+ \Delta \propto 1$
 the current $b$ and $w$ underestimate the correct output: they must be
 increased;
 - $e = 0 \quad \wedge \quad f(x) = 1 \quad \Longrightarrow \quad
- \Delta = -1$
+ \Delta \propto -1$
 the current $b$ and $w$ overestimate the correct output: they must be
 decreased.
 Whilst the $b$ updating is obvious, as regards $w$ the following consideration
 may help clarify. Consider the case with $e = 0 \quad \wedge \quad f(x) = 1
-\quad \Longrightarrow \quad \Delta = -1$:
+\quad \Longrightarrow \quad \Delta = -r$:
 $$
 w^T \cdot x \to (w^T + \Delta x^T) \cdot x
 = w^T \cdot x + \Delta |x|^2
- = w^T \cdot x - |x|^2 \leq w^T \cdot x
+ = w^T \cdot x - r|x|^2 \leq w^T \cdot x
 $$
 Similarly for the case with $e = 1$ and $f(x) = 0$.
@@ -399,8 +399,8 @@ $x_n$, the threshold function $f(x_n)$ was computed, then:
 and similarly for the positive points.
 Finally, the mean and standard deviation were computed from $N_{fn}$ and
-$N_{fp}$ for every sample and used to estimate the purity $\alpha$ and
-efficiency $\beta$ of the classification:
+$N_{fp}$ for every sample and used to estimate the significance $\alpha$
+and not-purity $\beta$ of the classification:
 $$
 \alpha = 1 - \frac{\text{mean}(N_{fn})}{N_s} \et
 \beta = 1 - \frac{\text{mean}(N_{fp})}{N_n}