ex-2: written about the program fast.c

fancy, fancier and fast are now correctly described in the pdf.
Only the computation time is missing.
This commit is contained in:
Giù Marcer 2020-05-25 23:59:38 +02:00 committed by rnhmjoj
parent 1ca0499430
commit 391be195d0
2 changed files with 161 additions and 51 deletions

View File

@ -179,3 +179,22 @@
year={2000}, year={2000},
publisher={Springer} publisher={Springer}
} }
@article{corless96,
title={On the Lambert W function},
author={Corless, Robert M and Gonnet, Gaston H and Hare, David EG and Jeffrey,
David J and Knuth, Donald E},
journal={Advances in Computational mathematics},
volume={5},
number={1},
pages={329--359},
year={1996},
publisher={Springer}
}
@article{demailly17,
author={Jean-Pierre Demailly},
title={Precise error estimate of the Brent-McMillan algorithm for the
computation of Euler's constant},
year={2017}
}

View File

@ -196,9 +196,10 @@ $8^{th}$ place.
### Fastest convergence formula ### Fastest convergence formula
The fastest known convergence belongs to the following formula [@yee19]: The fastest known convergence belongs to the following formula, known as refined
Brent-McMillan formula[@yee19]:
$$ $$
\gamma = \frac{A(N)}{B(N)} -\frac{C(N)}{B^2(N)} - \ln(N) \gamma_N = \frac{A(N)}{B(N)} -\frac{C(N)}{B^2(N)} - \ln(N)
$$ {#eq:faster} $$ {#eq:faster}
with: with:
@ -210,36 +211,48 @@ with:
\frac{((2k)!)^3}{(k!)^4 \cdot (16k)^2k} \\ \frac{((2k)!)^3}{(k!)^4 \cdot (16k)^2k} \\
\end{align*} \end{align*}
The series $A$ and $B$ were computed till there is no difference between two and the difference between $\gamma_N$ and $\gamma$ is given by [@demailly17]:
consecutive terms. The number of desired correct decimals $D$ was given in
input and $N$ was consequently computed through the formula:
$$ $$
N = \left\lfloor 2 + \frac{1}{4} \cdot \ln(10) \cdot D \right\rfloor |\gamma_N - \gamma| < \frac{5 \sqrt{2 \pi}}{12 \sqrt{N}} e^{-8N}
$$ {#eq:NeD} $$ {#eq:NeD}
given in [@brent00]. Results are shown in @tbl:fourth. This gives the value of $N$ which is to be used into the formula in order to get
Due to roundoff errors, the best results was obtained for $N = 10$. Up to 15 $D$ correct decimal digits of $\gamma$. In fact, by imposing:
places were correctly computed. $$
\frac{5 \sqrt{2 \pi}}{12 \sqrt{N}} e^{-8N} < 10^{-D}
$$
With a bit of math, it can be found that the smallest integer which satisfies
the inequality is given by:
$$
N = 1 + \left\lfloor \frac{1}{16}
W \left( \frac{5 \pi}{9} 10^{2D + 1}\right) \right\rfloor
$$
The series $A$ and $B$ were computed till there is no difference between two
consecutive terms. Results are shown in @tbl:fourth.
--------- ------------------------------ --------- ------------------------------
true: 0.57721\ 56649\ 01532\ 75452 true: 0.57721\ 56649\ 01532\ 75452
approx: 0.57721\ 56649\ 01532\ 86554 approx: 0.57721\ 56649\ 01532\ 53248
diff: 0.00000\ 00000\ 00000\ 11102 diff: 0.00000\ 00000\ 00000\ 33062
--------- ------------------------------ --------- ------------------------------
Table: $\gamma$ estimation with the fastest Table: $\gamma$ estimation with the fastest
known convergence formula (@eq:faster). {#tbl:fourth} known convergence formula (@eq:faster). {#tbl:fourth}
Because of roundoff errors, the best result was obtained only for $D = 15$, for
which the code accurately computes 15 digits and gives an error of
\SI{3.3e16}{}. For $D > 15$, the requested can't be fulfilled.
### Arbitrary precision ### Arbitrary precision
To overcome the issues related to the double representation, one can resort to a To overcome the issues related to the double representation, one can resort to a
representation with arbitrary precision. In the GMP library (which stands for representation with arbitrary precision. In the GMP library (which stands for
GNU Multiple Precision arithmetic), for example, real numbers can be GNU Multiple Precision arithmetic), for example, real numbers can be
approximated by a ratio of two integers (fraction) with arbitrary precision: approximated by a ratio of two integers (a fraction) with arbitrary precision:
this means a check is performed on every operation and in case of an integer this means a check is performed on every operation and in case of an integer
overflow, additional memory is requested and used to represent the larger result. overflow, additional memory is requested and used to represent the larger result.
Additionally, the library automatically switches to the optimal algorithm Additionally, the library automatically switches to the optimal algorithm
@ -249,30 +262,32 @@ The terms in @eq:faster can therefore be computed with arbitrarily large
precision. Thus, a program that computes the Euler-Mascheroni constant within precision. Thus, a program that computes the Euler-Mascheroni constant within
a user controllable precision was implemented. a user controllable precision was implemented.
According to [@brent00], the $A(N)$, $B(N)$ and $C(N)$ series awere computed up According to [@brent00], the $A(N)$, $B(N)$ and $C(N)$ series were computed up
to $k_{\text{max}} = 5N$, which was verified to compute the correct digits up to to $k_{\text{max}} = 5N$ \textcolor{red}{sure?}, since it guarantees the accuracy
500 decimal places. od the result up to the $D$ decimal digit.
The GMP library offers functions to perform some operations such as addition, The GMP library offers functions to perform some operations such as addition,
multiplication, division, etc. However, the logarithm function is not multiplication, division, etc. However, the logarithm function is not
implemented. Thus, most of the code carries out the $\ln(N)$ computation. implemented. Thus, most of the code carries out the $\ln(N)$ computation.
First, it should be noted that the logarithm of only some special numbers can This is because the logarithm of only some special numbers can be computed
be computed with arbitrary precision, namely the ones of which a converging with arbitrary precision, namely the ones of which a converging series is known.
series is known. This forces $N$ to be rewritten in the following way: This forces $N$ to be rewritten in the following way:
$$ $$
N = N_0 \cdot b^e \thus \ln(N) = \ln(N_0) + e \cdot \ln(b) N = N_0 \cdot b^e \thus \ln(N) = \ln(N_0) + e \cdot \ln(b)
$$ $$
Since a fast converging series for $\ln(2)$ is known (it will be shwn shortly), Since a fast converging series for $\ln(2)$ is known (it will be shwn shortly),
$b = 2$ was chosen. As well as for the scientific notation, in order to get the $b = 2$ was chosen. In case $N$ is a power of two, only $\ln(2)$ is to be
mantissa $1 \leqslant N_0 < 2$, the number of binary digits of $N$ must be computed. If not, it must be handled as follows.
computed (conveniently, a dedicated function `mpz_sizeinbase()` can be found in As well as for the scientific notation, in order to get the mantissa $1
GMP). If the digits are \leqslant N_0 < 2$, the number of binary digits of $N$ must be computed
(conveniently, a dedicated function `mpz_sizeinbase()` is found in GMP). If the
digits are
$n$: $n$:
$$ $$
e = n - 1 \thus N = N_0 \cdot 2^{n-1} \thus N_0 = \frac{N}{2^{n - 1}} e = n - 1 \thus N = N_0 \cdot 2^{n-1} \thus N_0 = \frac{N}{2^{n - 1}}
$$ $$
The logarithm od whichever number $N_0$ can be computed using the series of The logarithm of whichever number $N_0$ can be computed using the series of
$\text{atanh}(x)$, which converges for $|x| < 1$: $\text{atanh}(x)$, which converges for $|x| < 1$:
$$ $$
\text{atanh}(x) = \sum_{k = 0}^{+ \infty} \frac{x^{2k + 1}}{2x + 1} \text{atanh}(x) = \sum_{k = 0}^{+ \infty} \frac{x^{2k + 1}}{2x + 1}
@ -290,26 +305,15 @@ $$
\thus \ln(z) = 2 \, \text{atanh} \left( \frac{z - 1}{z + 1} \right) \thus \ln(z) = 2 \, \text{atanh} \left( \frac{z - 1}{z + 1} \right)
$$ $$
Then, by defining: The idea is to set $N_0 = z$ and define:
$$ $$
y = \frac{z - 1}{z + 1} \thus z = \frac{1 + y}{1 - y} y = \frac{N_0 - 1}{N_0 + 1}
$$ $$
from which: hence:
$$ $$
\ln \left( \frac{1 + y}{1 - y} \right) = 2 \, \text{atanh}(y) \ln(N_0) = 2 \, \text{atanh} (y)
= 2 \, \sum_{k = 0}^{+ \infty} \frac{y^{2k + 1}}{2y + 1} = 2 \sum_{k = 0}^{+ \infty} \frac{y^{2k + 1}}{2k + 1}
$$
which is true if $|y| < 1$. The idea is to set:
$$
N_0 = \frac{1 + y}{1 - y} \thus y = \frac{N_0 - 1}{N_0 + 1} < 1
$$
and therefore:
$$
\ln(N_0) = \ln \left( \frac{1 + y}{1 - y} \right) =
2 \sum_{k = 0}^{+ \infty} \frac{y^{2k + 1}}{2k + 1}
$$ $$
But when to stop computing the series? But when to stop computing the series?
@ -322,7 +326,7 @@ $$
where $a_k$ is the $k^{\text{th}}$ term of the series and $L$ is the limiting where $a_k$ is the $k^{\text{th}}$ term of the series and $L$ is the limiting
ratio of the series terms, which must be $< 1$ in order for it to converge (in ratio of the series terms, which must be $< 1$ in order for it to converge (in
this case, it is easy to prove that $L = y^2$). The width $\Delta S$ of the this case, it is easy to prove that $L = y^2 < 1$). The width $\Delta S$ of the
interval containing $S$ gives the precision of the estimate $\tilde{S}$ if this interval containing $S$ gives the precision of the estimate $\tilde{S}$ if this
last is assumed to be the middle value of it, namely: last is assumed to be the middle value of it, namely:
$$ $$
@ -336,7 +340,7 @@ In order to know when to stop computing the series, $\Delta S$ must be
evaluated. With a bit of math: evaluated. With a bit of math:
$$ $$
a_k = \frac{y^{2k + 1}}{2k + 1} \thus a_k = \frac{y^{2k + 1}}{2k + 1} \thus
a_{k + 1} = \frac{y^{2k + 3}}{2k + 3} = a_k \frac{2k + 1}{2k + 3} y^2 a_{k + 1} = \frac{y^{2k + 3}}{2k + 3} = a_k \, \frac{2k + 1}{2k + 3} \, y^2
$$ $$
hence: hence:
@ -352,9 +356,10 @@ $$
\right) = \Delta S_k \with a_k = \frac{y^{2k + 1}}{2k + 1} \right) = \Delta S_k \with a_k = \frac{y^{2k + 1}}{2k + 1}
$$ $$
By imposing $\Delta S < 1/10^D$, where $D$ is the correct decimal place required, By imposing $\Delta S < 10^{-D}$, where $D$ is the last correct decimal place
$k$ at which to stop can be obtained. This is achieved by trials, checking for required, $k^{\text{max}}$ at which to stop can be obtained. This is achieved by
every $k$ whether $\Delta S_k$ is less or greater than $1/10^D$. trials, checking for every $k$ whether $\Delta S_k$ is less or greater than
$10^{-D}$.
The same considerations holds for $\ln(2)$. It could be computed as the Taylor The same considerations holds for $\ln(2)$. It could be computed as the Taylor
expansion of $\ln(1 + x)$ with $x = 1$, but it would be too slow. A better idea expansion of $\ln(1 + x)$ with $x = 1$, but it would be too slow. A better idea
@ -376,6 +381,7 @@ it, 1000 digits were computed in \SI{0.63}{s}, with a \SI{3.9}{GHz}
2666mt al secondo 2666mt al secondo
\textcolor{red}{Scrivere specifice pc che fa i conti} \textcolor{red}{Scrivere specifice pc che fa i conti}
### Faster implemented formula ### Faster implemented formula
When implemented, @eq:faster, which is the most fast known convergence formula, When implemented, @eq:faster, which is the most fast known convergence formula,
@ -397,17 +403,102 @@ $$
and: and:
$$ $$
B_j = \frac{B_{j-1} N^2}{j^2} B_j = \frac{B_{j-1} N^2}{j^2}
\with B_0 = - \ln(N) \with B_0 = 1
$$ $$
This way, the error is given by: This way, the error is given by:
$$ $$
|\gamma - \gamma_k| = \pi e^{4N} |\gamma - \gamma_k| = \pi e^{-4N}
$$ $$
Then, in order to avoid computing the logarithm of a generic number $N$, which is greater with respect to the refined formula. Then, in order to avoid
instead of using @eq:NeD, a number $N = 2^p$ of digits were computed, where $p$ computing the logarithm of a generic number $N$, instead of using @eq:NeD, a
was chosen in order to satisfy: number $N = 2^p$ of digits were computed, where $p$ was chosen to satisfy:
$$
\pi e^{-4N} = \pi e^{-42^p} < 10^{-D} \thus
p > \log_2 \left[ \ln(\pi) + D \ln(10) \right] -2
$$ $$
and the smaller integer greater than it was selected, namely:
$$ $$
p = \left\lfloor \log_2 \left[ \ln(\pi) + D \ln(10) \right] \right\rfloor + 1
$$
As regards the precision of the used numbers, this time the type `pmf_t` of GMP
was emploied. It consists of an arbitrary-precision floating point whose number
of bits is to be defined when the variable is initialized.
If a number $D$ of digits is required, the number of bits can be computed from:
$$
10^D = 2^b \thus b = D \log_2 (10)
$$
in order to the roundoff errors to not affect the final result, a security
range of 64 bits was added to $b$.
Furthermore, the computation of the logarithm was made even faster by solving
@eq:delta instead
of finding $k^{\text{max}}$ by trials.
$$
\Delta S = \frac{1}{k (k + 2) 2^{k - 1}} < 10^{-D}
$$ {#eq:delta}
This was accomplished as follow:
$$
k (k + 2) 2^{k - 1} > 10^D \thus 2^{k - 1} [(k + 1)^2 - 1] > 10^D
$$
thus, for example, the inequality is satisfied for $k = x$ such that:
$$
2^{x - 1} (x + 1)^2 = 10^D
$$
In order to lighten the notation: $q := 10^D$, from which:
\begin{align*}
2^{x - 1} (x + 1)^2 = q
&\thus \exp \left( \ln(2) \frac{x - 1}{2} \right) (x + 1)
= \sqrt{q} \\
&\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) e^{- \ln(2)} (x + 1)
= \sqrt{q} \\
&\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) \frac{1}{2} (x + 1)
= \sqrt{q} \\
&\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) \frac{\ln(2)}{2} (x + 1)
= \ln (2) \sqrt{q}
\end{align*}
Having rearranged the equation this way allows the use of the Lambert $W$
function:
$$
W(x \cdot e^x) = x
$$
Hence, by computing the Lambert W function of both sides of the equation:
$$
\frac{\ln(2)}{2} (x + 1) = W ( \ln(2) \sqrt{q} )
$$
which leads to:
$$
x = \frac{2}{\ln(2)} \, W ( \ln(2) \sqrt{q} ) - 1
$$
Finally, the smallest integer greater than $x$ can be found as:
$$
k^{\text{max}} =
\left\lfloor \frac{2}{\ln(2)} \, W ( \ln(2) \sqrt{q} ) \right\rfloor
\with 10^D
$$
When a large number $D$ of digits is to be computed (but it also works for few
digits), the Lambert W function can be approximated by the first five terms of
its asymptotic value [@corless96]:
$$
W(x) = L_1 - L_2 + \frac{L_2}{L_1} + \frac{L_2 (L_2 - 2)}{2 L_1^2}
+ \frac{L_2 (36L_2 - 22L_2^2 + 3L_2^3 -12)}{12L_1^4}
$$
where
$$
L_1 = \ln(x) \et L_2 = \ln(\ln(x))
$$
\textcolor{red}{Tempi del risultato e specifiche del pc usato}