diff --git a/notes/sections/1.md b/notes/sections/1.md
index 3f0a6cc..7875d1e 100644
--- a/notes/sections/1.md
+++ b/notes/sections/1.md
@@ -41,7 +41,7 @@ $$
where:
  - $x$ runs over the sample,
-  - $F(x)$ is the Landau cumulative distribution and function
+  - $F(x)$ is the Landau cumulative distribution function, and
  - $F_N(x)$ is the empirical cumulative distribution function of the sample.

If $N$ numbers have been generated, for every point $x$,
@@ -92,7 +92,7 @@ cluster or use very large bins in the others, making the $\chi^2$ test
impractical.

-### Parameters comparison
+## Parameters comparison

When a sample of points is generated in a given range, different tests can be
applied in order to check whether they follow a uniform distribution or not. The
@@ -100,9 +100,11 @@ idea which lies beneath most of them is to measure how far the parameters of
the distribution are from the ones measured in the sample.
The same principle can be used to verify if the generated sample effectively
follows the Landau distribution. Since it turns out to be a very pathological
-PDF, only two parameters can be easily checked: the mode and the
+PDF, very few parameters can be easily checked: mode, median and
full-width-half-maximum (FWHM).

+### Mode
+
![Landau distribution with emphasized mode and
FWHM = ($x_+ - x_-$).](images/landau.pdf)

@@ -123,33 +125,33 @@
\node [above right] at (6.85,3.1) {$x_-$};
\node [above right] at (8.95,3.1) {$x_+$};
\end{scope}
-\end{tikzpicture}
-}
+\end{tikzpicture}}
\end{figure}

The mode of a set of data values is defined as the value that appears most
-often, namely: it is the maximum of the PDF. Since there is no way to find
-an analytic form for the mode of the Landau PDF, it was numerically estimated
-through a minimization method (found in GSL, called method 'golden section')
-with an arbitrary error of $10^{-2}$, obtaining:
+often, namely: it is the point at which the PDF attains its maximum.
Since there is no closed form
+for the mode of the Landau PDF, it was computed numerically by the *golden
+section* method (`gsl_min_fminimizer_goldensection` in GSL), applied to $-f$
+with an arbitrary error of $10^{-2}$, giving:
+$$
+  \text{expected mode: } m_e = \SI{-0.2227830 \pm 0.0000001}{}
+$$

-  - expected mode $= m_e = \SI{-0.2227830 \pm 0.0000001}{}$
-
-The minimization algorithm begins with a bounded region known to contain a
-minimum. The region is described by a lower bound $x_\text{min}$ and an upper
-bound $x_\text{max}$, with an estimate of the location of the minimum $x_e$.
-The value of the function at $x_e$ must be less than the value of the function
-at the ends of the interval, in order to guarantee that a minimum is contained
-somewhere within the interval.
+This is a minimization algorithm that begins with a bounded region known to
+contain a minimum. The region is described by a lower bound $x_\text{min}$ and
+an upper bound $x_\text{max}$, with an estimate of the location of the minimum
+$x_e$. The value of the function at $x_e$ must be less than the value of the
+function at the ends of the interval, in order to guarantee that a minimum is
+contained somewhere within the interval.

$$
  f(x_\text{min}) > f(x_e) < f(x_\text{max})
$$

On each iteration the interval is divided in a golden section (using the ratio
-($3 - \sqrt{5}/2 \approx 0.3819660$ and the value of the function at this new
+$(3 - \sqrt{5})/2 \approx 0.3819660$) and the value of the function at this new
point $x'$ is calculated. If the new point is a better estimate of the minimum,
-namely is $f(x') < f(x_e)$, then the current estimate of the minimum is
+namely if $f(x') < f(x_e)$, then the current estimate of the minimum is
updated.
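The golden-section update rule above can be sketched in a few lines. Since the Landau PDF itself has no closed form, the snippet below uses the Moyal distribution (a standard analytic approximation of the Landau, with its mode at $x = 0$) as a stand-in; `moyal_pdf`, the bracket $[-2, 2]$ and the tolerance are illustrative assumptions, not the values used in the analysis.

```python
import math

def moyal_pdf(x):
    # Stand-in for the Landau PDF (which has no closed form):
    # the Moyal distribution, whose mode is exactly at x = 0.
    return math.exp(-0.5 * (x + math.exp(-x))) / math.sqrt(2 * math.pi)

def golden_section_min(f, a, b, tol=1e-10):
    """Golden-section search for a minimum of a unimodal f bracketed by [a, b]."""
    r = (3 - math.sqrt(5)) / 2           # ~0.3819660, the golden section ratio
    x1, x2 = a + r * (b - a), b - r * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:                      # minimum lies in [a, x2]: shrink from the right
            b, x2, f2 = x2, x1, f1
            x1 = a + r * (b - a)
            f1 = f(x1)
        else:                            # minimum lies in [x1, b]: shrink from the left
            a, x1, f1 = x1, x2, f2
            x2 = b - r * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)

# The mode maximises the PDF, so minimise -f, as done with GSL in the text.
mode = golden_section_min(lambda x: -moyal_pdf(x), -2.0, 2.0)
```

Each iteration shrinks the bracket by the constant factor $1 - r \approx 0.618$, so the cost is logarithmic in the requested tolerance.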
The new point allows the size of the bounded interval to be reduced, by
choosing the most compact set of points which satisfies the constraint
$f(a) > f(x') < f(b)$.
@@ -161,7 +163,9 @@
estimated as the central value of the bin with maximum events and the error is
the half width of the bins. In this case, with 40 bins between -20 and 20, the
following result was obtained:

-  - observed mode $= m_o = \SI{0 \pm 1}{}$
+$$
+  \text{observed mode: } m_o = \SI{0 \pm 1}{}
+$$

In order to compare the values $m_e$ and $m_o$, the following compatibility
t-test was applied:
@@ -181,16 +185,90 @@ $$

Thus, the observed value is compatible with the expected one.

----
-The same approach was taken as regards the FWHM. It is defined as the distance
-between the two points at which the function assumes half times the maximum
-value. Even in this case, there is not an analytic expression for it, thus it
-was computed numerically ad follow.
+### Median
+
+The median is a measure of central tendency that, unlike the mean, is not
+very sensitive to extreme values, albeit less informative. For this reason
+it is well suited as a test statistic in a pathological case such as the
+Landau distribution.
+The median of a real probability distribution is defined as the value
+such that its cumulative probability is $1/2$. In other words, the median
+partitions the probability in two (connected) halves.
+The median of a sample, once sorted, is given by its middle element if the
+sample size is odd, or the average of the two middle elements otherwise.
+
+The expected median was derived from the quantile function (QDF) of the Landau
+distribution[^1].
+Once this is known, the median is simply given by $\text{QDF}(1/2)$.
+Since neither the CDF nor the QDF has a known closed form, they must
+be computed numerically. The cumulative probability has been computed by
+quadrature-based numerical integration of the PDF (`gsl_integration_qagiu()`
+function in GSL).
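This tail-integration strategy can be sketched as follows, again with the Moyal distribution as a closed-form stand-in for the Landau PDF. The semi-infinite integral is mapped onto $(0, 1]$ by the same change of variable used in the text and approximated with a plain fixed-grid midpoint rule instead of GSL's adaptive QAGIU routine, and the median is then recovered by simple bisection rather than the Brent solver described below; the grid size, brackets and tolerances are illustrative assumptions.

```python
import math

def moyal_pdf(t):
    # Stand-in for the Landau PDF, which has no closed form.
    return math.exp(-0.5 * (t + math.exp(-t))) / math.sqrt(2 * math.pi)

def tail_integral(x, n=5000):
    """Midpoint-rule approximation of I(x) = integral of f from x to +inf,
    after the change of variable t = x + (1 - u)/u mapping (0, 1] to [x, inf)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h                 # midpoint of the i-th subinterval
        t = x + (1 - u) / u
        total += moyal_pdf(t) / (u * u)   # dt = -du/u^2; sign absorbed by limits
    return total * h

def cdf(x):
    # Computed from the right tail, avoiding the troublesome left tail.
    return 1.0 - tail_integral(x)

# Median: the (unique) root of cdf(x) = 1/2, here found by bisection.
lo, hi = -5.0, 5.0
while hi - lo > 1e-6:
    mid = 0.5 * (lo + hi)
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = 0.5 * (lo + hi)
```

The fixed grid is far cruder than QAGIU's adaptive subdivision with error control, but it makes the two-step structure (tail quadrature, then root finding on the CDF) explicit.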
The function calculates an approximation of the integral
+
+$$
+  I(x) = \int_x^{+\infty} f(t)dt
+$$
+
+[^1]: This is neither necessary nor the easiest way: it was chosen simply
+because the quantile had already been implemented and was initially
+used for inverse transform sampling.
+
+The CDF is then given by $p(x) = 1 - I(x)$. This was done to avoid the
+left tail of the distribution, where the integration can sometimes fail.
+The integral $I$ is actually mapped beforehand onto $(0, 1]$ by
+the change of variable $t = a + (1-u)/u$, because the integration
+routine works on definite integrals. The result should satisfy the following
+accuracy requirement:
+$$
+  |\text{result} - I| \le \max(\varepsilon_\text{abs}, \varepsilon_\text{rel}I)
+$$
+
+The tolerances have been set to \SI{1e-10}{} and \SI{1e-6}{}, respectively.
+As for the QDF, this was implemented by numerically inverting the CDF. This is
+done by solving the equation
+$$
+  p(x) = p_0
+$$
+
+for $x$, given a probability value $p_0$, where $p(x)$ is again the CDF.
+The (unique) root of this equation is found by a root-finding routine
+(`gsl_root_fsolver_brent` in GSL) based on the Brent-Dekker method. This
+algorithm combines a bracketing bisection search, similar in spirit to the
+golden-section search employed in the mode optimisation, with an inverse
+parabolic interpolation of the function at each step. The following condition
+is checked for convergence:
+$$
+  |a - b| < \varepsilon_\text{abs} + \varepsilon_\text{rel} \min(|a|, |b|)
+$$
+
+where $a,b$ are the current interval bounds. The condition immediately gives an
+upper bound on the error of the root as $\varepsilon = |a-b|$. The tolerances
+here have been set to 0 and \SI{1e-3}{}.
+
+The result of the numerical computation is:
+
+$$
+  \text{expected median: } m_e = \SI{1.3557804 \pm 0.0000091}{}
+$$
+
+while the sample median was found to be
+
+$$
+  \text{observed median: } m_o = \SI{1.3479314}{}
+$$
+
+
+### FWHM
+
+The same approach was taken as regards the FWHM.
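The FWHM extraction detailed next boils down to two one-dimensional minimisations of the auxiliary function $|f(x) - f_\text{max}/2|$, one on each side of the mode. A stand-in sketch, reusing the golden-section routine from the mode computation on the Moyal approximation of the Landau PDF (the brackets $[-6, 0]$ and $[0, 8]$ are illustrative assumptions):

```python
import math

def moyal_pdf(x):
    # Moyal distribution: a closed-form stand-in for the Landau PDF,
    # with its mode exactly at x = 0.
    return math.exp(-0.5 * (x + math.exp(-x))) / math.sqrt(2 * math.pi)

def golden_min(g, a, b, tol=1e-9):
    # Golden-section search on a unimodal g bracketed by [a, b].
    r = (3 - math.sqrt(5)) / 2
    x1, x2 = a + r * (b - a), b - r * (b - a)
    g1, g2 = g(x1), g(x2)
    while b - a > tol:
        if g1 < g2:
            b, x2, g2 = x2, x1, g1
            x1 = a + r * (b - a)
            g1 = g(x1)
        else:
            a, x1, g1 = x1, x2, g2
            x2 = b - r * (b - a)
            g2 = g(x2)
    return 0.5 * (a + b)

mode = 0.0                               # Moyal mode; for Landau it was found numerically
half = moyal_pdf(mode) / 2               # f_max / 2
g = lambda x: abs(moyal_pdf(x) - half)   # the auxiliary function of the text
x_minus = golden_min(g, -6.0, mode)      # half-maximum crossing left of the mode
x_plus = golden_min(g, mode, 8.0)        # half-maximum crossing right of the mode
fwhm = x_plus - x_minus
```

Taking the absolute value turns each monotone branch of $f - f_\text{max}/2$ into a V-shaped unimodal function, which is exactly what a bracketing minimiser needs.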
This statistic is defined as
+the distance between the two points at which the function assumes half
+the maximum value. Even in this case, there is no analytic expression for
+it, thus it was computed numerically as follows.

First, some definitions must be given:

$$
-  f_{\text{max}} := f(m_e) \et \text{FWHM} = x_{+} - x_{-} \with
+  f_{\text{max}} := f(m_e) \et \text{FWHM} = x_+ - x_- \with
  f(x_{\pm}) = \frac{f_{\text{max}}}{2}
$$
@@ -201,13 +279,13 @@ in which the points have been sampled) in order to be able to find both
minima of the function:

$$
-  f'(x) = |f(x) - \frac{f_{\text{max}}}{2}|
+  f'(x) = \left|f(x) - \frac{f_{\text{max}}}{2}\right|
$$

resulting in:

$$
-  \text{expected FWHM} = w_e = \SI{4.0186457 \pm 0.0000001}{}
+  \text{expected FWHM: } w_e = \SI{4.0186457 \pm 0.0000001}{}
$$

On the other hand, the observed FWHM was computed as the difference between
@@ -215,7 +293,7 @@ the center of the bins with the values closer to $\frac{f_{\text{max}}}{2}$
and the error was taken as twice the width of the bins, obtaining:

$$
-  \text{observed FWHM} = w_o = \SI{4 \pm 2}{}
+  \text{observed FWHM: } w_o = \SI{4 \pm 2}{}
$$

These two values turn out to be compatible too, with:
diff --git a/notes/sections/3.md b/notes/sections/3.md
index 9311848..14d69f1 100644
--- a/notes/sections/3.md
+++ b/notes/sections/3.md
@@ -315,7 +315,7 @@ on a theorem that proves the existence of a number $\mu_k$ such that
\begin{align*}
  \Delta_k \mu_k = \|D_k \vec\delta_k\| &&
  (H_k + \mu_k D_k^TD_k) \vec\delta_k = -\nabla_k
\end{align*}

-Using the approximation[^1] $H\approx J^TJ$, obtained by computing the Hessian
+Using the approximation[^2] $H\approx J^TJ$, obtained by computing the Hessian
of the first-order Taylor expansion of $\chi^2$, $\vec\delta_k$ can be found by
solving the system
@@ -326,7 +326,7 @@ $$
  \end{cases}
$$

-[^1]: Here $J_{ij} = \partial f_i/\partial x_j$ is the Jacobian matrix of the
+[^2]: Here $J_{ij} = \partial f_i/\partial x_j$ is the Jacobian matrix of the
vector-valued function $\vec f(\vec x)$.

The algorithm terminates if one of the following conditions is satisfied: