
# Exercise 2

## Euler-Mascheroni constant

The Euler-Mascheroni constant is defined as the limiting difference between the partial sums of the harmonic series and the natural logarithm:


$$
  \gamma = \lim_{n \rightarrow +\infty} \left( \sum_{k=1}^{n} \frac{1}{k}
  - \ln(n) \right)
$$ {#eq:gamma}

and represents the limiting blue area in @fig:gamma. The first 30 digits of
$\gamma$ are:

$$
\gamma = 0.57721\ 56649\ 01532\ 86060\ 65120\ 90082 \dots
$$ {#eq:exact}

In complex analysis, this constant is related to many functions and can be
evaluated through a variety of identities. In this work, five methods were
implemented and their results are discussed. The finite precision of floating
point number representation limits the accuracy of any estimate of $\gamma$
because of roundoff errors. All the known methods involve sums, subtractions or
products of very large or very small numbers, packed into series, partial sums
or infinite products, so the efficiency of a method lies in how quickly it
converges to its limit.

![The area of the blue region converges to the Euler-Mascheroni
constant.](images/2-gamma-area.png){#fig:gamma width=7cm}


## Computing the constant


### Definition

First, in order to give a quantitative idea of how hard it is to reach a good
estimation of $\gamma$, it was naively computed using the definition given in
@eq:gamma. The difference was computed for increasing values of $n$, with
$n_{i+1} = 10 \cdot n_i$ and $n_1 = 20$, until the approximation starts getting
worse, namely until:

$$
| \gamma(n_{i+1}) - \gamma | > | \gamma(n_i) - \gamma |
$$



and $\gamma (n_i)$ was selected as the best result (see @tbl:1_results).
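
As an illustration, a minimal double-precision sketch of this first method (the
function name and the printing loop are illustrative, not the report's actual
code):

```c
#include <stdio.h>
#include <math.h>

/* Naive estimate of gamma from @eq:gamma: partial harmonic sum minus ln(n). */
double gamma_definition(long n)
{
    double sum = 0.0;
    for (long k = 1; k <= n; k++)
        sum += 1.0 / k;
    return sum - log((double)n);
}

int main(void)
{
    /* n_1 = 20, n_{i+1} = 10 n_i; stop early to keep the runtime reasonable */
    for (long n = 20; n <= 20000000L; n *= 10)
        printf("n = %ld\tgamma(n) = %.15f\n", n, gamma_definition(n));
    return 0;
}
```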

---------------------------------
n           $|\gamma(n)-\gamma|$
----------- ---------------------
\SI{2e1}{}    \SI{2.48e-02}{}

\SI{2e2}{}    \SI{2.50e-03}{}

\SI{2e3}{}    \SI{2.50e-04}{}

\SI{2e4}{}    \SI{2.50e-05}{}

\SI{2e5}{}    \SI{2.50e-06}{}

\SI{2e6}{}    \SI{2.50e-07}{}

\SI{2e7}{}    \SI{2.50e-08}{}

\SI{2e8}{}    \SI{2.50e-09}{}

\SI{2e9}{}    \SI{2.55e-10}{}

\SI{2e10}{}   \SI{2.42e-11}{}

\SI{2e11}{}   \SI{1.44e-08}{}
---------------------------------

Table: Partial results using the definition of $\gamma$ with double
precision. {#tbl:1_results}

The convergence is logarithmic: to fix the first $d$ decimal places, about
$10^d$ terms of the harmonic series are needed. The double precision runs out at
the $10^{\text{th}}$ place, at $n=\SI{2e10}{}$.  
Since all the numbers are stored in double precision, at best 16 digits can be
correct: a double occupies 64 bits in memory, 1 for the sign, 11 for the
exponent and 52 for the mantissa (53 significant bits, counting the implicit
leading one), hence:

$$
2^{53} = 10^{d} \thus d = 53 \cdot \log(2) \sim 16
$$



Only 10 digits were correctly computed: when the terms of the series become too
small to be resolved in double precision, the sum of all the remaining terms,
which is $\propto 10^{-11}$, is lost.  
The best result is given in @tbl:first.

--------- -----------------------
true:	     0.57721\ 56649\ 01533

approx:    0.57721\ 56648\ 77325

diff:	     0.00000\ 00000\ 24207
--------- -----------------------

Table: First method best result. From the top down: true value, best estimation
and difference between them. {#tbl:first}


### Alternative formula

As a first alternative, the constant was computed through the identity which
relates $\gamma$ to the $\Gamma$ function as follows [@davis59]:

$$
\gamma = \lim_{M \rightarrow + \infty} \sum_{k = 1}^{M} \binom{M}{k} \frac{(-1)^k}{k} \ln(\Gamma(k + 1))
$$



Varying $M$ from 1 to 100, the best result was obtained for $M = 41$. It went
sour: the convergence is even worse than with the definition itself, and only
two decimal places were correctly computed (see @tbl:second).

--------- -----------------------
true:	     0.57721\ 56649\ 01533

approx:    0.57225\ 72410\ 34058

diff:	     0.00495\ 84238\ 67473
--------- -----------------------

Table: Best estimation of $\gamma$ using
the alternative formula. {#tbl:second}

Here, the problem lies in the binomial term: the factorial of a number greater
than 18 exceeds 15 significant digits and so cannot be represented exactly in a
double. Furthermore, the convergence is slowed down by the logarithmic
factor.
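
For reference, a double-precision sketch of this sum at fixed $M$ (illustrative
only); `lgamma` from the C standard library provides $\ln \Gamma(k+1)$
directly, so the sketch sidesteps explicit factorials, but the binomial
coefficients themselves still lose precision as $M$ grows:

```c
#include <math.h>

/* Estimate of gamma from the binomial identity at fixed M (double precision). */
double gamma_binomial(int M)
{
    double sum = 0.0;
    double binom = 1.0;                          /* C(M, 0) = 1 */
    for (int k = 1; k <= M; k++) {
        binom *= (double)(M - k + 1) / k;        /* C(M, k) from C(M, k-1) */
        double sign = (k % 2 == 0) ? 1.0 : -1.0; /* (-1)^k */
        sum += sign * binom / k * lgamma(k + 1.0);  /* lgamma(k+1) = ln(k!) */
    }
    return sum;
}
```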


### Reciprocal $\Gamma$ function

A better result was found using the well-known formula for the reciprocal
$\Gamma$ function [@bak91]:

$$
\frac{1}{\Gamma(z)} = z e^{\gamma z} \prod_{k = 1}^{+ \infty} \left( 1 + \frac{z}{k} \right) e^{-z/k}
$$



which gives:

$$
\gamma = - \frac{1}{z} \ln \left( z \Gamma(z) \prod_{k = 1}^{+ \infty} \left( 1 + \frac{z}{k} \right) e^{-z/k} \right)
$$



The execution stops when two consecutive terms of the infinite product no
longer differ (this happens at $k = 456565794 \sim \SI{4.6e8}{}$, meaning that
for this value of $k$ the factor of the product equals 1 in floating point).
Different values of $z$ were checked, with $z_{i+1} = z_i + 0.01$ ranging from
0 to 20, and the best result was found for $z = 9$.
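
A double-precision sketch of this method with the stopping rule just described
(the function name is illustrative; `tgamma` is the standard C $\Gamma$
function):

```c
#include <math.h>

/* gamma from the reciprocal Gamma function product, for a given z. */
double gamma_product(double z)
{
    double prod = 1.0;
    for (long k = 1; ; k++) {
        double factor = (1.0 + z / k) * exp(-z / k);
        if (factor == 1.0)       /* the factor is 1 in floating point: stop */
            break;
        prod *= factor;
    }
    return -log(z * tgamma(z) * prod) / z;
}
```

For $z = 9$ the loop runs for a few hundred million iterations before the
factor rounds to exactly 1, so the sketch is slow but straightforward.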

---------------------------------------------------------------
    z  $|\gamma(z) - \gamma |$      z   $|\gamma(z) - \gamma |$
----- ------------------------ ------  ------------------------
 1     \SI{9.712e-9}{}          8.95    \SI{9.770e-9}{}
 
 3     \SI{9.320e-9}{}          8.96    \SI{9.833e-9}{}
 
 5     \SI{9.239e-9}{}          8.97    \SI{9.622e-9}{}
 
 7     \SI{9.391e-9}{}          8.98    \SI{9.300e-9}{}
 
 9     \SI{8.482e-9}{}          8.99    \SI{9.059e-9}{}
 
 11    \SI{9.185e-9}{}          9.00    \SI{8.482e-9}{}
 
 13    \SI{9.758e-9}{}          9.01    \SI{9.564e-9}{}
 
 15    \SI{9.747e-9}{}          9.02    \SI{9.260e-9}{}
 
 17    \SI{9.971e-9}{}          9.03    \SI{9.264e-9}{}
 
 19    \SI{10.084e-9}{}         9.04    \SI{9.419e-9}{}
---------------------------------------------------------------

Table: Differences between some obtained values of $\gamma$ and
the exact one found with the reciprocal $\Gamma$ function formula.
The values on the left are shown to give an idea of the $z$
large-scale behaviour; on the right, the values around the best
one ($z = 9.00$) are listed. {#tbl:3_results}

As can be seen in @tbl:3_results, the fact that $z = 9$ gives the best value is
merely accidental, since all the $|\gamma(z) - \gamma|$ are of the same order
of magnitude. The best one is compared with the exact value of $\gamma$ in
@tbl:third.

--------- -----------------------
true:	     0.57721\ 56649\ 01533

approx:    0.57721\ 56564\ 18607

diff:	     0.00000\ 00084\ 82925
--------- -----------------------

Table: Third method results for $z = 9.00$. {#tbl:third}

This time, the convergence of the infinite product is fast enough to ensure the
$8^{\text{th}}$ decimal place.


### Fastest convergence formula

The fastest known convergence belongs to the following formula, known as the
refined Brent-McMillan formula [@yee19]:

$$
\gamma_N = \frac{A(N)}{B(N)} - \frac{C(N)}{B^2(N)} - \ln(N)
$$ {#eq:faster}

with:
\begin{align*}
  &A(N) = \sum_{k=1}^{+ \infty} \left( \frac{N^k}{k!} \right)^2 H(k)
          \with H(k) = \sum_{j=1}^{k} \frac{1}{j}         \\
  &B(N) = \sum_{k=0}^{+ \infty} \left( \frac{N^k}{k!} \right)^2    \\
  &C(N) = \frac{1}{4N} \sum_{k=0}^{2N}
          \frac{((2k)!)^3}{(k!)^4 \cdot (16N)^{2k}}
\end{align*}

and the difference between $\gamma_N$ and $\gamma$ is given by [@demailly17]:

$$
|\gamma_N - \gamma| < \frac{5 \sqrt{2 \pi}}{12 \sqrt{N}} e^{-8N}
$$ {#eq:NeD}

This gives the value of $N$ to be used in the formula in order to get $D$
correct decimal digits of $\gamma$. In fact, by imposing:

$$
\frac{5 \sqrt{2 \pi}}{12 \sqrt{N}} e^{-8N} < 10^{-D}
$$



With a bit of math, it can be found that the smallest integer which satisfies
the inequality is given by:

$$
N = 1 + \left\lfloor \frac{1}{16} W \left( \frac{5 \pi}{9} 10^{2D + 1}\right) \right\rfloor
$$

where $W$ is the Lambert $W$ function.



The series $A$ and $B$ were computed until two consecutive partial sums showed
no difference. Results are shown in @tbl:fourth.

--------- ------------------------------
true:	     0.57721\ 56649\ 01532\ 75452

approx:    0.57721\ 56649\ 01532\ 53248

diff:	     0.00000\ 00000\ 00000\ 33062
--------- ------------------------------

Table: $\gamma$ estimation with the fastest
known convergence formula (@eq:faster). {#tbl:fourth}

Because of roundoff errors, the best result was obtained only up to $D = 15$,
for which the code accurately computes 15 digits and gives an error of
\SI{3.3e-16}{}. For $D > 15$, the requested precision cannot be achieved.
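
For comparison, a double-precision sketch of @eq:faster with the series
accumulated through their term-to-term recurrences (illustrative; $N$ is passed
in, whereas the report derives it from @eq:NeD):

```c
#include <math.h>

/* Refined Brent-McMillan estimate of gamma in double precision. */
double gamma_brent_mcmillan(int N)
{
    double A = 0.0, B = 0.0;
    double t = 1.0, H = 0.0;                  /* t_k = (N^k/k!)^2, H_k harmonic   */
    for (int k = 1; ; k++) {
        t *= (double)N * N / ((double)k * k); /* t_k from t_{k-1}                 */
        H += 1.0 / k;
        if (B + t == B) break;                /* terms no longer contribute       */
        A += t * H;
        B += t;
    }
    B += 1.0;                                 /* add the k = 0 term of B          */

    double C = 0.0, c = 1.0;                  /* c_k = ((2k)!)^3/((k!)^4 (16N)^2k) */
    for (int k = 0; k <= 2 * N; k++) {
        C += c;
        double r = (2.0 * k + 1.0) * (2.0 * k + 2.0);
        c *= r * r * r / (pow(k + 1.0, 4) * 256.0 * N * N);  /* c_{k+1} from c_k  */
    }
    C /= 4.0 * N;

    return A / B - C / (B * B) - log((double)N);
}
```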

### Arbitrary precision

To overcome the issues related to the double representation, one can resort to a
representation with arbitrary precision. In the GMP library (GNU Multiple
Precision arithmetic), for example, real numbers can be approximated by a ratio
of two integers (a fraction) with arbitrary precision: every operation is
checked and, in case of an integer overflow, additional memory is requested and
used to represent the larger result. Additionally, the library automatically
switches to the optimal algorithm for an operation based on the size of the
operands.

The terms in @eq:faster can therefore be computed with arbitrarily large
precision. Thus, a program that computes the Euler-Mascheroni constant to a
user-controllable precision was implemented.

According to [@brent00], the $A(N)$, $B(N)$ and $C(N)$ series were computed up
to $k_{\text{max}} = 5N$ \textcolor{red}{sure?}, since this guarantees the
accuracy of the result up to the $D^{\text{th}}$ decimal digit.  
The GMP library offers functions to perform operations such as addition,
multiplication, division, etc. However, the logarithm function is not
implemented, and thus most of the code deals with the computation of $\ln(N)$.
Only the logarithm of a few special numbers can be computed with arbitrary
precision, namely those for which a convergent series is known.
This forces $N$ to be rewritten in the following way:

$$
N = N_0 \cdot b^e \thus \ln(N) = \ln(N_0) + e \cdot \ln(b)
$$



Since a fast converging series for $\ln(2)$ is known (it will be shown shortly),
$b = 2$ was chosen. If $N$ is a power of two, only $\ln(2)$ has to be computed;
if not, it must be handled as follows.  
As in scientific notation, in order to get the mantissa $1 \leqslant N_0 < 2$,
the number $n$ of binary digits of $N$ must be computed (conveniently, GMP
provides the dedicated function `mpz_sizeinbase()` for this purpose), so that:

$$
e = n - 1 \thus N = N_0 \cdot 2^{n-1} \thus N_0 = \frac{N}{2^{n - 1}}
$$
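
A sketch of this splitting with GMP rationals; `split_mantissa` is a
hypothetical helper name, not a function of GMP or of the report's code:

```c
#include <gmp.h>

/* Write N as N0 * 2^(n-1) with 1 <= N0 < 2, returning N0 as an exact rational. */
void split_mantissa(mpq_t N0, unsigned long *exponent, const mpz_t N)
{
    size_t n = mpz_sizeinbase(N, 2);       /* number of binary digits of N    */
    *exponent = n - 1;
    mpq_set_z(N0, N);                      /* N0 = N / 1                      */
    mpz_mul_2exp(mpq_denref(N0), mpq_denref(N0), n - 1);  /* denominator 2^(n-1) */
    mpq_canonicalize(N0);                  /* reduce the fraction             */
}
```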



The logarithm of any number $N_0$ can be computed using the series of
$\text{atanh}(x)$, which converges for $|x| < 1$:

$$
\text{atanh}(x) = \sum_{k = 0}^{+ \infty} \frac{x^{2k + 1}}{2k + 1}
$$



In fact:

$$
\text{tanh}(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{e^{2x} - 1}{e^{2x} + 1}
$$



with the change of variable $z = e^{2x}$, one gets:

$$
\text{tanh} \left( \frac{\ln(z)}{2} \right) = \frac{z - 1}{z + 1} \thus \ln(z) = 2 \, \text{atanh} \left( \frac{z - 1}{z + 1} \right)
$$



The idea is to set $N_0 = z$ and define:

$$
y = \frac{N_0 - 1}{N_0 + 1}
$$



hence:

$$
\ln(N_0) = 2 \, \text{atanh} (y) = 2 \sum_{k = 0}^{+ \infty} \frac{y^{2k + 1}}{2k + 1}
$$



But when to stop computing the series?  
Given a partial sum $S_k$ of the series, it is possible to know when a digit is
definitely correct. The key lies in the following concept [@riddle08]. Letting
$S$ be the value of the series:

$$
S_k + a_k \frac{L}{1 - L} < S < S_k + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}}
$$



where $a_k$ is the $k^{\text{th}}$ term of the series and $L$ is the limiting
ratio of the series terms, which must be $< 1$ in order for it to converge (in
this case, it is easy to prove that $L = y^2 < 1$). The width $\Delta S$ of the
interval containing $S$ gives the precision of the estimate $\tilde{S}$, if the
latter is taken to be the midpoint of the interval, namely:

$$
\tilde{S} = S_k + \frac{1}{2} \left( a_k \frac{L}{1 - L} + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} \right) \et \Delta S = \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} - a_k \frac{L}{1 - L}
$$



In order to know when to stop computing the series, $\Delta S$ must be
evaluated. With a bit of math:

$$
a_k = \frac{y^{2k + 1}}{2k + 1} \thus a_{k + 1} = \frac{y^{2k + 3}}{2k + 3} = a_k \, \frac{2k + 1}{2k + 3} \, y^2
$$



hence:

$$
\frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} = \frac{a_k}{\frac{2k + 3}{2k + 1} y^{-2} - 1}
$$



from which:

$$
\Delta S = a_k \left( \frac{1}{\frac{2k + 3}{2k + 1} y^{-2} - 1} - \frac{y^2}{1 - y^2} \right) = \Delta S_k \with a_k = \frac{y^{2k + 1}}{2k + 1}
$$



By imposing $\Delta S < 10^{-D}$, where $D$ is the last correct decimal place
required, the $k^{\text{max}}$ at which to stop can be obtained. This is
achieved by trial, checking for every $k$ whether $\Delta S_k$ is smaller or
greater than $10^{-D}$.
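
To illustrate the stopping rule, a double-precision sketch of the
$\text{atanh}$ series for $\ln(N_0)$ (the actual program performs the same
steps with GMP rationals, so here $D$ is limited to what a double can resolve):

```c
#include <math.h>

/* ln(N0) via the atanh series, stopped when DeltaS_k < 10^(-D).
   Assumes 1 < N0 < 2, so that 0 < L = y^2 < 1. */
double ln_atanh_series(double N0, int D)
{
    double y   = (N0 - 1.0) / (N0 + 1.0);
    double L   = y * y;                       /* limiting ratio of the terms  */
    double tol = pow(10.0, -D);
    double S   = 0.0;
    double a   = y;                           /* a_0 = y^1 / 1                */
    for (int k = 0; ; k++) {
        S += a;                               /* S_k, partial sum up to a_k   */
        double upper = a / ((2.0 * k + 3.0) / (2.0 * k + 1.0) / L - 1.0);
        double lower = a * L / (1.0 - L);
        if (fabs(upper - lower) < tol)        /* DeltaS_k below the target    */
            return 2.0 * (S + 0.5 * (lower + upper));  /* ln(N0) = 2 * S~     */
        a *= (2.0 * k + 1.0) / (2.0 * k + 3.0) * L;    /* a_{k+1} from a_k    */
    }
}
```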

The same considerations hold for $\ln(2)$. It could be computed from the Taylor
expansion of $\ln(1 + x)$ with $x = 1$, but that converges too slowly. A better
idea is to use the same expansion with $x = -1/2$, which leads to a much faster
series:

$$
\ln(2) = \sum_{k=1}^{+ \infty} \frac{1}{k \cdot 2^k}
$$



In this case the math simplifies a lot: the limiting ratio turns out to be
$L = 1/2$ and therefore:

$$
\Delta S = \frac{1}{k (k + 2) 2^{k-1}}
$$



Once the estimates of both logarithms and of the series $A$, $B$ and $C$ have
been computed with the required precision, the program can be used to estimate
$\gamma$ up to any desired decimal digit. To give an idea of the running time,
1000 digits were computed in \SI{0.63}{s} on a \SI{3.9}{GHz} CPU with 2666 MT/s
memory.
\textcolor{red}{Write the full specs of the PC that runs the computation.}


### Faster implemented formula

When implemented, @eq:faster, despite being the formula with the fastest known
convergence, is not fast at all.  
The procedure can be accelerated by removing the $C(N)$ correction and keeping
only the two series $A(N)$ and $B(N)$. In this case, the equation can be
rewritten in a more convenient way [@brent00]:

$$
\gamma_k = \frac{U_k(N)}{V_k(N)} \with U_k(N) = \sum_{j = 0}^k A_j(N) \et V_k(N) = \sum_{j = 0}^k B_j(N)
$$



where:

$$
A_j = \frac{1}{j} \left( \frac{A_{j-1} N^2}{j} + B_j \right) \with A_0 = - \ln(N)
$$



and:

$$
B_j = \frac{B_{j-1} N^2}{j^2} \with B_0 = 1
$$
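
A double-precision sketch of these recurrences (the report's implementation
uses the GMP floating point type with the precision discussed below):

```c
#include <math.h>

/* gamma from the accelerated A/B recurrences, for a given N. */
double gamma_recurrence(double N)
{
    double A = -log(N), B = 1.0;              /* A_0 = -ln(N), B_0 = 1             */
    double U = A, V = B;
    for (int j = 1; ; j++) {
        B = B * N * N / ((double)j * j);      /* B_j = B_{j-1} N^2 / j^2           */
        A = (A * N * N / j + B) / j;          /* A_j = (A_{j-1} N^2 / j + B_j) / j */
        if (U + A == U && V + B == V)         /* terms no longer contribute        */
            return U / V;
        U += A;
        V += B;
    }
}
```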



This way, the error is given by:

$$
|\gamma - \gamma_k| = \pi e^{-4N}
$$



which is larger than for the refined formula. Then, in order to avoid computing
the logarithm of a generic number $N$, instead of using @eq:NeD, $N$ was taken
to be a power of two, $N = 2^p$, where $p$ was chosen to satisfy:

$$
\pi e^{-4N} = \pi e^{-4 \cdot 2^p} < 10^{-D} \thus p > \log_2 \left[ \ln(\pi) + D \ln(10) \right] - 2
$$



and the smallest integer greater than this was selected, namely:

$$
p = \left\lfloor \log_2 \left[ \ln(\pi) + D \ln(10) \right] \right\rfloor + 1
$$



As regards the precision of the numbers used, this time the GMP type `mpf_t`
was employed. It is an arbitrary-precision floating point number whose number
of bits is set when the variable is initialized.  
If a number $D$ of digits is required, the number of bits can be computed from:

$$
10^D = 2^b \thus b = D \log_2 (10)
$$



In order for roundoff errors not to affect the final result, a safety margin of
64 bits was added to $b$ (for example, $D = 1000$ requires $b = 3322$ bits,
namely 3386 with the margin).
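
A sketch of the corresponding initialization with GMP; `init_with_digits` is a
hypothetical helper, not part of the GMP API:

```c
#include <math.h>
#include <gmp.h>

/* Initialize x with enough bits for D decimal digits, plus a 64-bit margin. */
void init_with_digits(mpf_t x, unsigned long D)
{
    mp_bitcnt_t bits = (mp_bitcnt_t) ceil(D * log2(10.0)) + 64;
    mpf_init2(x, bits);
}
```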

Furthermore, the computation of the logarithm was made even faster by solving
@eq:delta directly instead of finding $k^{\text{max}}$ by trial:

$$
\Delta S = \frac{1}{k (k + 2) 2^{k - 1}} < 10^{-D}
$$ {#eq:delta}

This was accomplished as follows:

$$
k (k + 2) 2^{k - 1} > 10^D \thus 2^{k - 1} [(k + 1)^2 - 1] > 10^D
$$



thus, approximating $(k + 1)^2 - 1 \approx (k + 1)^2$, the inequality is
satisfied for $k = x$ such that:

$$
2^{x - 1} (x + 1)^2 = 10^D
$$



In order to lighten the notation: $q := 10^D$, from which:
\begin{align*}
  2^{x - 1} (x + 1)^2 = q
  &\thus \exp \left( \ln(2) \frac{x - 1}{2}  \right) (x + 1)
    = \sqrt{q}                 \\
  &\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) e^{- \ln(2)} (x + 1)
    = \sqrt{q} \\
  &\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) \frac{1}{2} (x + 1)
    = \sqrt{q} \\
  &\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) \frac{\ln(2)}{2} (x + 1)
        = \ln (2) \sqrt{q}
\end{align*}

Rearranging the equation this way allows the use of the Lambert $W$ function,
defined by the property:

$$
W(x \cdot e^x) = x
$$



Hence, by computing the Lambert W function of both sides of the equation:

$$
\frac{\ln(2)}{2} (x + 1) = W ( \ln(2) \sqrt{q} )
$$



which leads to:

$$
x = \frac{2}{\ln(2)} \, W ( \ln(2) \sqrt{q} ) - 1
$$



Finally, the smallest integer greater than $x$ can be found as:

$$
k^{\text{max}} = \left\lfloor \frac{2}{\ln(2)} \, W ( \ln(2) \sqrt{q} ) \right\rfloor \with q = 10^D
$$



When a large number $D$ of digits is to be computed (though the formula also
works for few digits), the Lambert $W$ function can be approximated by the
first terms of its asymptotic expansion [@corless96]:

$$
W(x) = L_1 - L_2 + \frac{L_2}{L_1} + \frac{L_2 (L_2 - 2)}{2 L_1^2} + \frac{L_2 (36 L_2 - 22 L_2^2 + 3 L_2^3 - 12)}{12 L_1^4}
$$



where

$$
L_1 = \ln(x) \et L_2 = \ln(\ln(x))
$$
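
Putting the pieces together, a sketch of the $k^{\text{max}}$ computation for
the $\ln(2)$ series (double precision; passing $\sqrt{q} = 10^{D/2}$ keeps the
argument within double range for moderate $D$):

```c
#include <math.h>

/* Asymptotic approximation of the Lambert W function, as given above. */
double lambert_w(double x)
{
    double L1 = log(x), L2 = log(L1);
    return L1 - L2 + L2 / L1
         + L2 * (L2 - 2.0) / (2.0 * L1 * L1)
         + L2 * (36.0 * L2 - 22.0 * L2 * L2 + 3.0 * L2 * L2 * L2 - 12.0)
               / (12.0 * L1 * L1 * L1 * L1);
}

/* Smallest k such that DeltaS < 10^(-D) for the ln(2) series.
   sqrt(q) = 10^(D/2) overflows a double for D beyond roughly 600. */
long kmax_ln2(int D)
{
    double sqrt_q = pow(10.0, D / 2.0);
    return (long) floor(2.0 / log(2.0) * lambert_w(log(2.0) * sqrt_q));
}
```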


 
\textcolor{red}{Result timings and specs of the PC used.}