ex-2: revised and typo-fixed

This commit is contained in:
Giù Marcer 2020-05-08 23:07:37 +02:00 committed by rnhmjoj
parent 1bd691a24b
commit c6939be6d0
4 changed files with 135 additions and 57 deletions

View File

@ -173,7 +173,7 @@ void mpq_log2(mpq_t rop, size_t digits) {
* where 1<a<2 and n is the number of binary * where 1<a<2 and n is the number of binary
* digits of x. * digits of x.
* *
* log(a) is the computed from * log(a) is computed from
* *
* log((1+y)/(1-y)) = 2Σ_{k=0} y^(2k+1)/(2k+1) * log((1+y)/(1-y)) = 2Σ_{k=0} y^(2k+1)/(2k+1)
* *

View File

@ -20,3 +20,72 @@
year={2017}, year={2017},
publisher={Multidisciplinary Digital Publishing Institute} publisher={Multidisciplinary Digital Publishing Institute}
} }
@article{marsaglia03,
title={Evaluating Kolmogorovs distribution},
author={Marsaglia, George and Tsang, Wai Wan and Wang, Jingbo and others},
journal={Journal of Statistical Software},
volume={8},
number={18},
pages={1--4},
year={2003}
}
@article{robertson74,
title={An iterative procedure for estimating the mode},
author={Robertson, Tim and Cryer, Jonathan D},
journal={Journal of the American Statistical Association},
volume={69},
number={348},
pages={1012--1016},
year={1974},
publisher={Taylor \& Francis Group}
}
@misc{GSL,
title={Gnu Scientific Library Reference Manual (3rd Edition)},
author={M. Galassi et al},
ISBN={0954612078},
url={http://www.gnu.org/software/gsl/}
}
@book{silverman86,
title={Density estimation for statistics and data analysis},
author={Silverman, Bernard W},
volume={26},
year={1986},
publisher={CRC press}
}
@book{davis59,
title={Leonhard Euler's Integral: A Historical Profile of the Gamma Function},
author={Davis, P. J.},
year={1959},
journal={American Mathematical Monthly},
pages={849 - 869},
doi={10.2307/2309786}
}
@book{bak91,
title={Complex analysis},
author={Bak, Joseph and Newman, Donald J and Newman, Donald J},
year={1991},
publisher={Springer},
pages={265 -- 268}
}
@misc{yee19,
title={Formulas and Algorithms},
author={Alexander Yee},
year={2019},
journal={Alex Yee Website},
url={http://www.numberworld.org/y-cruncher/internals/formulas.html}
}
@article{riddle08,
title={Approximating the Sum of a Convergent Series},
author={Riddle, Larry},
journal={AP{\textregistered} Calculus},
year={2008},
publisher={Citeseer}
}

View File

@ -1,5 +1,6 @@
# Exercise 1 {#sec:Landau} # Exercise 1 {#sec:Landau}
## Random numbers following the Landau distribution ## Random numbers following the Landau distribution
The Landau distribution is a probability density function which can be defined The Landau distribution is a probability density function which can be defined
@ -24,6 +25,7 @@ function and plotted in a 100-bins histogram ranging from -20 to
## Randomness testing of the generated sample ## Randomness testing of the generated sample
### Kolmogorov-Smirnov test ### Kolmogorov-Smirnov test
In order to compare the sample with the Landau distribution, the In order to compare the sample with the Landau distribution, the

View File

@ -1,10 +1,10 @@
# Exercise 2 # Exercise 2
## Euler-Mascheroni constant ## Euler-Mascheroni constant
The Euler-Mascheroni constant is defined as the limiting difference between the The Euler-Mascheroni constant is defined as the limiting difference between the
partial sums of the harmonic series and the natural logarithm: partial sums of the harmonic series and the natural logarithm:
$$ $$
\gamma = \lim_{n \rightarrow +\infty} \left( \sum_{k=1}^{n} \frac{1}{k} \gamma = \lim_{n \rightarrow +\infty} \left( \sum_{k=1}^{n} \frac{1}{k}
- \ln(n) \right) - \ln(n) \right)
@ -12,7 +12,6 @@ $$ {#eq:gamma}
and represents the limiting blue area in @fig:gamma. The first 30 digits of and represents the limiting blue area in @fig:gamma. The first 30 digits of
$\gamma$ are: $\gamma$ are:
$$ $$
\gamma = 0.57721\ 56649\ 01532\ 86060\ 65120\ 90082 \dots \gamma = 0.57721\ 56649\ 01532\ 86060\ 65120\ 90082 \dots
$$ {#eq:exact} $$ {#eq:exact}
@ -29,16 +28,17 @@ efficiency of the methods lies on how quickly they converge to their limit.
![The area of the blue region converges to the EulerMascheroni ![The area of the blue region converges to the EulerMascheroni
constant.](images/gamma-area.png){#fig:gamma width=7cm} constant.](images/gamma-area.png){#fig:gamma width=7cm}
## Computing the constant ## Computing the constant
### Definition ### Definition
First, in order to have a quantitative idea of how hard it is to reach a good First, in order to give a quantitative idea of how hard it is to reach a good
estimation of $\gamma$, it was naively computed using the definition given in estimation of $\gamma$, it was naively computed using the definition given in
@eq:gamma. The difference was computed for increasing value of $n$, with @eq:gamma. The difference was computed for increasing value of $n$, with
$n_{i+1} = 10 \cdot n_i$ and $n_1 = 20$, till the approximation starts getting $n_{i+1} = 10 \cdot n_i$ and $n_1 = 20$, till the approximation starts getting
worse, namely: worse, namely:
$$ $$
| \gamma(n_{i+1}) - \gamma | > | \gamma(n_i) - \gamma| | \gamma(n_{i+1}) - \gamma | > | \gamma(n_i) - \gamma|
$$ $$
@ -74,14 +74,14 @@ n sum $|\gamma(n)-\gamma|$
Table: Partial results using the definition of $\gamma$ with double Table: Partial results using the definition of $\gamma$ with double
precision. {#tbl:1_results} precision. {#tbl:1_results}
The convergence is logarithmic: to fix the first $d$ decimal places about The convergence is logarithmic: to fix the first $d$ decimal places, about
$10^d$ terms are needed. The double precision runs out at the $10^d$ terms are needed. The double precision runs out at the
10\textsuperscript{th} place, $n=\SI{2e10}{}$. 10\textsuperscript{th} place, $n=\SI{2e10}{}$.
Since all the number are given with double precision, there can be at best 15 Since all the number are given with double precision, there can be at best 15
correct digits but only 10 were correctly computed: this means that when the correct digits but only 10 were correctly computed: this means that when the
terms of the series start being smaller than the smallest representable double, terms of the series start being smaller than the smallest representable double,
the sum of all the remaining terms give a number $\propto 10^{-11}$. the sum of all the remaining terms give a number $\propto 10^{-11}$.
Best result in @tbl:first.
--------- ----------------------- --------- -----------------------
true: 0.57721\ 56649\ 01533 true: 0.57721\ 56649\ 01533
@ -91,13 +91,14 @@ approx: 0.57721\ 56648\ 77325
diff: 0.00000\ 00000\ 24207 diff: 0.00000\ 00000\ 24207
--------- ----------------------- --------- -----------------------
Table: First method results. {#tbl:first} Table: First method best result. From the top down: true value, best estimation
and difference between them. {#tbl:first}
### Alternative formula ### Alternative formula
As a first alternative, the constant was computed through the identity which As a first alternative, the constant was computed through the identity which
relates $\gamma$ to the $\Gamma$ function as follow: relates $\gamma$ to the $\Gamma$ function as follow [@davis59]:
$$ $$
\gamma = \lim_{M \rightarrow + \infty} \sum_{k = 1}^{M} \gamma = \lim_{M \rightarrow + \infty} \sum_{k = 1}^{M}
\binom{M}{k} \frac{(-1)^k}{k} \ln(\Gamma(k + 1)) \binom{M}{k} \frac{(-1)^k}{k} \ln(\Gamma(k + 1))
@ -105,7 +106,7 @@ $$
Varying $M$ from 1 to 100, the best result was obtained for $M = 41$ (see Varying $M$ from 1 to 100, the best result was obtained for $M = 41$ (see
@tbl:second). It went sour: the convergence is worse than using the definition @tbl:second). It went sour: the convergence is worse than using the definition
itself. Only two places were correctly computed. itself. Only two places were correctly computed (#@tbl:second).
--------- ----------------------- --------- -----------------------
true: 0.57721\ 56649\ 01533 true: 0.57721\ 56649\ 01533
@ -115,38 +116,35 @@ approx: 0.57225\ 72410\ 34058
diff: 0.00495\ 84238\ 67473 diff: 0.00495\ 84238\ 67473
--------- ----------------------- --------- -----------------------
Table: Second method results. {#tbl:second} Table: Best esitimation of $\gamma$ using
the alternative formula. {#tbl:second}
Here, the problem lies in the binomial term: computing the factorial of a Here, the problem lies in the binomial term: computing the factorial of a
number greater than 18 goes over 15 places and so cannot be correctly number greater than 18 goes over 15 places and so cannot be correctly
represented. Furthermore, the convergence (even if this is not a series represented. Furthermore, the convergence is slowed down by the logarithmic
and, consequently, it is not properly a "convergence") is slowed down by factor.
the logarithmic factor.
### Reciprocal $\Gamma$ function ### Reciprocal $\Gamma$ function
A better result was found using the well known reciprocal $\Gamma$ function A better result was found using the well known reciprocal $\Gamma$ function
formula: formula [@bak91]:
$$ $$
\frac{1}{\Gamma(z)} = z e^{yz} \prod_{k = 1}^{+ \infty} \frac{1}{\Gamma(z)} = z e^{yz} \prod_{k = 1}^{+ \infty}
\left( 1 + \frac{z}{k} \right) e^{-z/k} \left( 1 + \frac{z}{k} \right) e^{-z/k}
$$ $$
which gives: which gives:
$$ $$
\gamma = - \frac{1}{z} \ln \left( z \Gamma(z) \gamma = - \frac{1}{z} \ln \left( z \Gamma(z)
\prod_{k = 1}^{+ \infty} \left( 1 + \frac{z}{k} \right) e^{-z/k} \right) \prod_{k = 1}^{+ \infty} \left( 1 + \frac{z}{k} \right) e^{-z/k} \right)
$$ $$
The execution stops when there is no difference between two consecutive therms The execution stops when there is no difference between two consecutive therms
of the infinite product (it happens for $k = 456565794$, meaning that for this of the infinite product (it happens for $k = 456565794 \sim \SI{4.6e8}{}$,
value of $k$, the term of the product is equal to 1). Different values of $z$ meaning that for this value of $k$ the term of the product is equal to 1 in
were checked, with $z_{i+1} = z_i + 0.01$, ranging from 0 to 20 and the best terms of floating points). Different values of $z$ were checked, with $z_{i+1}
result was found for $z = 9$. As can be seen in @tbl:3_results, it's only by = z_i + 0.01$ ranging from 0 to 20, and the best result was found for $z = 9$.
chance, since all $|\gamma(z) - \gamma |$ are of the same order of magnitude.
The best one is compared with the exact value of $\gamma$ in @tbl:third.
--------------------------------------------------------------- ---------------------------------------------------------------
z $|\gamma(z) - \gamma |$ z $|\gamma(z) - \gamma |$ z $|\gamma(z) - \gamma |$ z $|\gamma(z) - \gamma |$
@ -172,8 +170,15 @@ The best one is compared with the exact value of $\gamma$ in @tbl:third.
19 \SI{10.084e-9}{} 9.04 \SI{9.419e-9}{} 19 \SI{10.084e-9}{} 9.04 \SI{9.419e-9}{}
--------------------------------------------------------------- ---------------------------------------------------------------
Table: Differences between the obtained values of $\gamma$ and the exact Table: Differences between some obtained values of $\gamma$ and
one. {#tbl:3_results} the exact one found with the reciprocal $\Gamma$ function formula.
The values on the left are shown to give an idea of the $z$
large-scale behaviour; on the right, the values around the best
one ($z = 9.00$) are listed. {#tbl:3_results}
As can be seen in @tbl:3_results, the best value for $z$ is only by chance,
since all $|\gamma(z) - \gamma |$ are of the same order of magnitude. The best
one is compared with the exact value of $\gamma$ in @tbl:third.
--------- ----------------------- --------- -----------------------
true: 0.57721\ 56649\ 01533 true: 0.57721\ 56649\ 01533
@ -188,17 +193,15 @@ Table: Third method results for z = 9.00. {#tbl:third}
This time, the convergence of the infinite product is fast enough to ensure the This time, the convergence of the infinite product is fast enough to ensure the
$8^{th}$ place. $8^{th}$ place.
### Fastest convergence formula ### Fastest convergence formula
The fastest known convergence belongs to the following formula: The fastest known convergence belongs to the following formula [@yee19]:
(source: http://www.numberworld.org/y-cruncher/internals/formulas.html):
$$ $$
\gamma = \frac{A(N)}{B(N)} -\frac{C(N)}{B^2(N)} - \ln(N) \gamma = \frac{A(N)}{B(N)} -\frac{C(N)}{B^2(N)} - \ln(N)
$$ {#eq:faster} $$ {#eq:faster}
with: with:
\begin{align*} \begin{align*}
&A(N) = \sum_{k=1}^{+ \infty} \frac{N^k}{k!} \cdot H(k) &A(N) = \sum_{k=1}^{+ \infty} \frac{N^k}{k!} \cdot H(k)
\with H(k) = \sum_{j=1}^{k} \frac{1}{j} \\ \with H(k) = \sum_{j=1}^{k} \frac{1}{j} \\
@ -207,11 +210,15 @@ with:
\frac{((2k)!)^3}{(k!)^4 \cdot (16k)^2k} \\ \frac{((2k)!)^3}{(k!)^4 \cdot (16k)^2k} \\
\end{align*} \end{align*}
The series $A$ and $B$ are computed till there is no difference between two The series $A$ and $B$ were computed till there is no difference between two
consecutive terms. consecutive terms. The number of desired correct decimals $D$ was given in
The number of desired correct decimals is given in input and $N$ is input and $N$ was consequently computed through the formula:
consequently computed through a formula given in the same article $$
above-mentioned. Results are shown in @tbl:fourth. N = \text{floor} \left( 2 + \frac{1}{4} \cdot \ln(10) \cdot D \right)
$$
given in [@yee19], where floor returns the highest integer smaller than its
argument. Results are shown in @tbl:fourth.
--------- ------------------------------ --------- ------------------------------
true: 0.57721\ 56649\ 01532\ 75452 true: 0.57721\ 56649\ 01532\ 75452
@ -221,11 +228,12 @@ approx: 0.57721\ 56649\ 01532\ 86554
diff: 0.00000\ 00000\ 00000\ 11102 diff: 0.00000\ 00000\ 00000\ 11102
--------- ------------------------------ --------- ------------------------------
Table: Fourth method results. {#tbl:fourth} Table: $\gamma$ estimation with the fastest
known convergence formula (@eq:faster). {#tbl:fourth}
Due to roundoff errors, the best results was obtained for $N = 10$. Up to 15
places were correctly computed.
Due to roundoff errors, the best results is obtained for $N = 10$. Since up to
15 places were correctly computed, an approximation of $\gamma$ better than the
one reached with the definition in @eq:gamma was obtained.
### Arbitrary precision ### Arbitrary precision
@ -240,18 +248,18 @@ to compute an operation based on the size of the operands.
The terms in @eq:faster can therefore be computed with arbitrarily large The terms in @eq:faster can therefore be computed with arbitrarily large
precision. Thus, a program that computes the Euler-Mascheroni constant within precision. Thus, a program that computes the Euler-Mascheroni constant within
a user controllable precision has been implemented. Unlike the previously a user controllable precision was implemented. Unlike the previously mentioned
mentioned programs, this one was more carefully optimized. programs, this one was more carefully optimized.
The $A$ and $B$ series are computed up to an arbitrary limit $k_{\text{max}}$. The $A$ and $B$ series are computed up to an arbitrary limit $k_{\text{max}}$.
Different values of $k_{\text{max}}$ were tested but, obviously, they all Different values of $k_{\text{max}}$ were tested but, obviously, they all
eventually reach a point where the approximation cannot guarantee the eventually reach a point where the approximation cannot guarantee the
requested precision; the solution turned out to be to let $k_{\text{max}}$ depends on requested precision; the solution turned out to be to let $k_{\text{max}}$
$N$. Consequently, $k_{\text{max}}$ was chosen to be $5N$, after it has been depends on $N$. Consequently, $k_{\text{max}}$ was chosen to be $5N$, after it
verified to produce the correct digits up to 500 decimal places. was verified to produce the correct digits up to 500 decimal places.
The GMP library offers functions to perform some operations such as addition, The GMP library offers functions to perform some operations such as addition,
multiplication, division, etc. however, the logarithm function is not multiplication, division, etc. However, the logarithm function is not
implemented. Thus, most of the code carries out the $\ln(N)$ computation. implemented. Thus, most of the code carries out the $\ln(N)$ computation.
First, it should be noted that the logarithm of only some special numbers can First, it should be noted that the logarithm of only some special numbers can
be computed with arbitrary precision, namely the ones of which a converging be computed with arbitrary precision, namely the ones of which a converging
@ -261,13 +269,14 @@ $$
N = N_0 \cdot b^e \thus \ln(N) = \ln(N_0) + e \cdot \ln(b) N = N_0 \cdot b^e \thus \ln(N) = \ln(N_0) + e \cdot \ln(b)
$$ $$
Since a fast converging series for $\ln(2)$ is known $b = 2$ was chosen. As Since a fast converging series for $\ln(2)$ is known, $b = 2$ was chosen. As
well as for the scientific notation, in order to get the mantissa $1 \leqslant well as for the scientific notation, in order to get the mantissa $1 \leqslant
N_0 < 2$, the number of binary digits of $N$ must be computed (conveniently, a N_0 < 2$, the number of binary digits of $N$ must be computed (conveniently, a
dedicated function `mpz_sizeinbase()` can be found in GMP). If the digits are $n$: dedicated function `mpz_sizeinbase()` can be found in GMP). If the digits are
$n$:
$$ $$
e = b - 1 \thus N_0 = \frac{N}{2^{n - 1}} e = n - 1 \thus N_0 = \frac{N}{2^{n - 1}}
$$ $$
Then, by defining: Then, by defining:
@ -276,8 +285,7 @@ $$
N_0 = \frac{1 + y}{1 - y} \thus y = \frac{N_0 - 1}{N_0 + 1} < 1 N_0 = \frac{1 + y}{1 - y} \thus y = \frac{N_0 - 1}{N_0 + 1} < 1
$$ $$
and the following series (which is convergent for $y < 1$) can therefore be the following series, which is convergent for $y < 1$, can therefore be used:
used:
$$ $$
\ln \left( \frac{1 + y}{1 - y} \right) = \ln \left( \frac{1 + y}{1 - y} \right) =
@ -286,12 +294,11 @@ $$
But when to stop computing the series? But when to stop computing the series?
Given a partial sum $S_k$ of the series, it is possible to know when a digit is Given a partial sum $S_k$ of the series, it is possible to know when a digit is
definitely correct, meaning that no matter how large $k$ can be, definitely correct. The key lies in the following concept. Letting $S$ the value
it will not affect that decimal place. The key lies in the following concept. of the series [@riddle08]:
Letting $S$ the value of the series:
$$ $$
S_k + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} < S < S_k + a_k \frac{L}{1 -L} S_k + a_k \frac{L}{1 -L} < S < S_k + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}}
$$ $$
where $L$ is the limiting ratio of the series terms, which must be $< 1$ in where $L$ is the limiting ratio of the series terms, which must be $< 1$ in
@ -308,4 +315,4 @@ $$
\log(2) = \sum_{k=1}^{+ \infty} \frac{1}{k \cdot 2^k} \log(2) = \sum_{k=1}^{+ \infty} \frac{1}{k \cdot 2^k}
$$ $$
In this case the ratio is $L = 1/2$. In this case the ratio is $L = 1/2$.