# Exercise 2 ## Euler-Mascheroni constant The Euler-Mascheroni constant is defined as the limiting difference between the partial sums of the harmonic series and the natural logarithm: $$ \gamma = \lim_{n \rightarrow +\infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \right) $$ {#eq:gamma} and represents the limiting blue area in @fig:gamma. The first 30 digits of $\gamma$ are: $$ \gamma = 0.57721\ 56649\ 01532\ 86060\ 65120\ 90082 \dots $$ {#eq:exact} In complex analysis, this constant is related to many functions and can be evaluated through a variety of identities. In this work, five methods were implemented and their results discussed. In fact, evaluating $\gamma$ with a limited precision due to floating points number representation entails limited precision on the estimation of $\gamma$ itself due to roundoff errors. All the known methods involve sums, subtractions or products of very big or small numbers, packed in series, partial sums or infinite products. Thus, the efficiency of the methods lies on how quickly they converge to their limit. ![The area of the blue region converges to the Euler–Mascheroni constant.](images/2-gamma-area.png){#fig:gamma width=7cm} ## Computing the constant ### Definition First, in order to give a quantitative idea of how hard it is to reach a good estimation of $\gamma$, it was naively computed using the definition given in @eq:gamma. The difference was computed for increasing value of $n$, with $n_{i+1} = 10 \cdot n_i$ and $n_1 = 20$, till the approximation starts getting worse, namely: $$ | \gamma(n_{i+1}) - \gamma | > | \gamma(n_i) - \gamma| $$ and $\gamma (n_i)$ was selected as the best result (see @tbl:naive-errs). ---------------------------- n $|γ(n)-γ|$ ----------- ---------------- \num{2e1} \num{2.48e-02} \num{2e2} \num{2.50e-03} \num{2e3} \num{2.50e-04} \num{2e4} \num{2.50e-05} \num{2e5} \num{2.50e-06} \num{2e6} \num{2.50e-07} \num{2e7} \num{2.50e-08} \num{2e8} \num{2.50e-09} \num{2e9} \num{2.55e-10} \num{2e10} \num{2.42e-11} \num{2e11} \num{1.44e-08} ---------------------------- Table: Partial results using the definition of $\gamma$ with double precision. {#tbl:naive-errs} The convergence is logarithmic: to fix the first $d$ decimal places, about $10^d$ terms of the harmonic series are needed. The double precision runs out at the $10^{\text{th}}$ place, at $n=\num{2e10}$. Since all the number are given with double precision, there can be at best 16 correct digits. In fact, for a double 64 bits are allocated in memory: 1 for the sign, 8 for the exponent and 55 for the mantissa, hence: $$ 2^{55} = 10^{d} \thus d = 55 \cdot \log(2) \sim 16.6 $$ Only 10 digits were correctly computed: this means that when the terms of the series start being smaller than the smallest representable double, the sum of all the remaining terms gives a number $\propto 10^{-11}$. The best result is shown in @tbl:naive-res. ------- -------------------- exact 0.57721 56649 01533 approx 0.57721 56648 77325 diff 0.00000 00000 24207 ------- -------------------- Table: First method best result. From the top down: true value, best estimation and difference between them. {#tbl:naive-res} ### Alternative formula As a first alternative, the constant was computed through the identity [@davis59] which relates $\gamma$ to the $\Gamma$ function as follow: $$ \gamma = \lim_{M \rightarrow + \infty} \sum_{k = 1}^{M} \binom{M}{k} \frac{(-1)^k}{k} \ln(\Gamma(k + 1)) $$ Varying $M$ from 1 to 100, the best result was obtained for $M = 41$ (see @tbl:limit-res). This approximation gave an underwhelming result: the convergence is actually worse than the definition itself. Only two decimal places were correctly computed: -------- -------------------- exact 0.57721 56649 01533 approx 0.57225 72410 34058 diff 0.00495 84238 67473 -------- -------------------- Table: Best estimation of $\gamma$ using the alternative formula. {#tbl:limit-res} Here, the problem lies in the binomial term: computing the factorial of a number greater than 18 goes over 16 places and so cannot be correctly represented. Furthermore, the convergence is slowed down by the logarithmic factor. ### Reciprocal $\Gamma$ function A better result was found using the well known reciprocal $\Gamma$ function formula [@bak91]: $$ \frac{1}{\Gamma(z)} = z e^{yz} \prod_{k = 1}^{+ \infty} \left( 1 + \frac{z}{k} \right) e^{-z/k} $$ which gives: $$ \gamma = - \frac{1}{z} \ln \left( z \Gamma(z) \prod_{k = 1}^{+ \infty} \left( 1 + \frac{z}{k} \right) e^{-z/k} \right) $$ The execution stops when there is no difference between two consecutive terms of the infinite product (it happens for $k = 456565794 \sim \num{4.6e8}$, meaning that for this value of $k$ the term of the product is equal to 1 in terms of floating points). Different values of $z$ were checked, with $z_{i+1} = z_i + 0.01$ ranging from 0 to 20, and the best result was found for $z = 9$. ----------------------------------------------- z $|γ(z) - γ|$ z $|γ(z) - γ|$ ----- ---------------- ------ ---------------- 1 \num{9.712e-9} 8.95 \num{9.770e-9} 3 \num{9.320e-9} 8.96 \num{9.833e-9} 5 \num{9.239e-9} 8.97 \num{9.622e-9} 7 \num{9.391e-9} 8.98 \num{9.300e-9} 9 \num{8.482e-9} 8.99 \num{9.059e-9} 11 \num{9.185e-9} 9.00 \num{8.482e-9} 13 \num{9.758e-9} 9.01 \num{9.564e-9} 15 \num{9.747e-9} 9.02 \num{9.260e-9} 17 \num{9.971e-9} 9.03 \num{9.264e-9} 19 \num{1.008e-8} 9.04 \num{9.419e-9} ----------------------------------------------- Table: Some results found with the reciprocal $\Gamma$ function formula. The values on the left are shown to give an idea of the $z$ large-scale behaviour; on the right, the values around the best one ($z = 9.00$) are listed. {#tbl:recip-errs} As can be seen in @tbl:recip-errs, the best value for $z$ is only by chance, since all $|\gamma(z) - \gamma |$ are of the same order of magnitude. The best one is compared with the exact value of $\gamma$ in @tbl:recip-res. ------- -------------------- true 0.57721 56649 01533 approx 0.57721 56564 18607 diff 0.00000 00084 82925 ------- -------------------- Table: Third method results for z = 9.00. {#tbl:recip-res} In this approximation, the convergence of the infinite product is fast enough to reach the $8^{th}$ decimal place. With respect to the definition, this formula returns a worse estimation of $\gamma$ because the calculation of the infinite product leads to more round-off instabilities than the series. ### Fastest convergence formula The fastest known convergence to $\gamma$ belongs to the following formula, known as refined Brent-McMillan formula [@yee19]: $$ \gamma_N = \frac{A(N)}{B(N)} -\frac{C(N)}{B^2(N)} - \ln(N) $$ {#eq:faster} where \begin{align*} &A(N) = \sum_{k=1}^{+ \infty} \frac{N^k}{k!} \cdot H(k) \with H(k) = \sum_{j=1}^{k} \frac{1}{j} \\ &B(N) = \sum_{k=1}^{+ \infty} \frac{N^k}{k!} \\ &C(N) = \frac{1}{4N} \sum_{k=0}^{2N} \frac{((2k)!)^3}{(k!)^4 \cdot (16k)^2k} \\ \end{align*} The asymptotic error of this estimation, given in [@demailly17], is: $$ |\gamma_N - \gamma| \sim \frac{5 \sqrt{2 \pi}}{12 \sqrt{N}} e^{-8N} = \Delta_N $$ {#eq:NeD} The error bound gives the value of $N$ to be used to get $D$ correct decimal digits of $\gamma$. This is done by imposing: $$ \Delta_N < 10^{-D} $$ The inequality with the equal sign can be solved with the use of the Lambert $W$ function. For a real number $x$, $W(x)$ is defined as the inverse function of $x\exp(x)$, so the idea is to recast the equation into this form and take $W$ both sides. \begin{align*} \frac{5 \sqrt{2 \pi}}{12 \sqrt{x}} e^{-8x} = 10^{-D} & \thus \left(\frac{12}{5}\right)^2 \frac{x}{2 \pi} e^{16x} = 10^{2D} \\ & \thus 16x e^{16x} = \left(\frac{5 \pi}{9}\right)^2 10^{2D + 1} \\ & \thus x = \frac{1}{16} W\left(\left(\frac{5 \pi}{9}\right)^2 10^{2D + 1}\right) \end{align*} The smallest integer which satisfies the inequality is then $$ N = 1 + \left\lfloor \frac{1}{16} W \left( \frac{5 \pi}{9} 10^{2D + 1}\right) \right\rfloor $$ {#eq:refined-n} The series $A$ and $B$ were computed till there is no difference between two consecutive terms. Results are shown in @tbl:fourth. ------- -------------------------- exact 0.57721 56649 01532 75452 approx 0.57721 56649 01532 53248 diff 0.00000 00000 00000 33062 ------- -------------------------- Table: $\gamma$ estimation with the fastest known convergence formula (@eq:faster). {#tbl:fourth} Due to roundoff errors, the best result was obtained only for $D = 15$, for which the code accurately computes 15 digits and gives an error of \num{3.3e16}. For $D > 15$, the requested can't be fulfilled. ### Arbitrary precision To overcome the issues related to the limited precision of the machine floating points, one can resort to a software implementation with arbitrary precision. In the GMP [@gmp20] library (GNU Multiple Precision arithmetic), for example, reals can be approximated by a rational or floating point numbers with arbitrary precision. For integers and fractions, this means a check is performed on every operation that can overflow and additional memory is requested and used to represent the larger result. For floating points, it means the size of the mantissa can be chosen before initialising the number. Additionally, the library automatically switches to the optimal algorithm to compute an operation based on the size of the operands. The terms in @eq:faster can therefore be computed with arbitrarily large precision. Thus, a program that computes the Euler-Mascheroni constant within a user controllable precision was implemented. To compute $N$ by @eq:refined-n, a trick must be used to avoid overflows in the exponentiation, particularly when computing more than a few hundreds digits. The details are explained in @sec:optimised, below. Once the number $N$ has been fixed, the series $A(N)$ and $B(N)$ are to be evaluated. With rational numbers, a different criterion for the truncation must be considered, since two consecutive term are always different. Brent and McMillan[@brent00] prove that to reduce the partial sum to less than $\Delta_N$ it is sufficient to compute $\alpha N$ terms, where $\alpha$ is the solution to the equation: $$ \alpha\ln(\alpha) = 3 + \alpha \thus \alpha = \exp(W(3e) - 1) \approx 4.97 $$ The GMP library offers functions to perform some operations such as addition, multiplication, division, etc. However, the logarithm function is not implemented. Thus, most of the code carries out the $\ln(N)$ computation. The reason is that the logarithm of only some special numbers can be computed with arbitrary precision, namely the ones of which a converging series is known. This forces $N$ to be rewritten in the following way: $$ N = N_0 \cdot b^e \thus \ln(N) = \ln(N_0) + e \cdot \ln(b) $$ Since a fast converging series for $\ln(2)$ is known (it will be shown shortly), $b = 2$ was chosen. If $N$ is a power of two, $N_0$ is 1 and only $\ln(2)$ is to be computed. More generally, the problem lies in the calculation of $\ln(N_0)$. Regarding the scientific notation, to find the mantissa $1 \leqslant N_0 < 2$, the number of binary digits of $N$ must be computed (conveniently, a dedicated function `mpz_sizeinbase()` exists in GMP). If the digits are $n$: $$ e = n - 1 \thus N = N_0 \cdot 2^{n-1} \thus N_0 = \frac{N}{2^{n - 1}} $$ The logarithm of $N_0$ can be computed from the Taylor series of the hyperbolic tangent, which is convergent for $|x| < 1$: $$ \text{atanh}(x) = \sum_{k = 0}^{+ \infty} \frac{x^{2k + 1}}{2x + 1} $$ The relation with the logarithm follows from the definition: $$ \text{tanh}(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{e^{2x} - 1}{e^{2x} + 1} $$ and the change of variable $z = e^{2x}$: $$ \text{tanh} \left( \frac{\ln(z)}{2} \right) = \frac{z - 1}{z + 1} \thus \ln(z) = 2 \, \text{atanh} \left( \frac{z - 1}{z + 1} \right) $$ The idea is to set $N_0 = z$ and define: $$ y = \frac{N_0 - 1}{N_0 + 1} $$ hence: $$ \ln(N_0) = 2 \, \text{atanh} (y) = 2 \sum_{k = 0}^{+ \infty} \frac{y^{2k + 1}}{2k + 1} $$ To estimate a series with a given precision, some care must be taken: different techniques apply to different series and are explained in the paper [@riddle08]. In this case, letting $S$ be the value of the series and $S_k$ the $k$-th partial sum, the following bounds can be found: $$ S_k + a_k \frac{L}{1 -L} < S < S_k + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} $$ where $a_k$ is the $k^{\text{th}}$ term of the series and $L$ is the limiting ratio of the series terms, which must be $\le 1$ in order for it to converge (in this case, it is easy to prove that $L = y^2 < 1$). The width $\Delta S$ of the interval containing $S$ gives the precision of the estimate $\tilde{S}$ if this last is assumed to be its middle value, namely: $$ \tilde{S} = S_k + \frac{1}{2} \left( a_k \frac{L}{1 -L} + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} \right) \et \Delta S = \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} - a_k \frac{L}{1 -L} $$ For this series: \begin{align*} a_k = \frac{y^{2k + 1}}{2k + 1} && a_{k + 1} = \frac{y^{2k + 3}}{2k + 3} = a_k \, \frac{2k + 1}{2k + 3} \, y^2 \end{align*} so, the right bound results: $$ \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} = \frac{a_k}{\frac{2k + 3}{2k + 1}y^{-2} - 1} $$ and it finally follows: $$ \Delta S_k = a_k \left( \frac{1}{\frac{2k + 3}{2k + 1}y^{-2} - 1} - \frac{y^2}{1 - y^2} \right) $$ By imposing $\Delta S_k < 10^{-D}$, where $D$ is the number decimal places required, $k^{\text{max}}$ at which to stop the summation can be obtained. This is achieved by trials, checking for every $k$ whether $\Delta S_k$ is less or greater than $10^{-D}$. Similar considerations can be made for $\ln(2)$. The number could be computed from the MacLaurin series of $\ln(1 + x)$ with $x = 1$, but that yields a slowly convergent series with alternating sign. A trick is to use $x = -1/2$, which leads to a much faster series with constant sign: $$ \ln(2) = \sum_{k=1}^{+ \infty} \frac{1}{k \cdot 2^k} $$ In this case the series ratio is found to be $L = 1/2$ and the error: $$ \Delta S_k = \frac{1}{k(k + 2) 2^{k-1}} $$ Once the logarithms and the terms $A$, $B$ and $C$ have been computed, @eq:faster can finally be used to obtain $\gamma$ up to the given decimal place. The program was implemented with no particular care for performance but was found to be relatively fast. On a \SI{3.9}{GHz} desktop computer with \SI{2666}{MT\per\s} memory, it takes \SI{0.63}{\second} to compute the first 1000 digits. However, this makes it quite unpractical for computing more than a few thousands digits. For this reason, a more optimised algorithm was implemented. ### Optimised implementation {#sec:optimised} The refined Brent-McMillan formula (@eq:faster) is theoretically the fastest but is difficult to implement efficiently. In practice it turns out to be slower than the standard formula even if its asymptotic error is better. The standard formula [@brent00] ignores the correction $C(N)$ and can be rewritten in a more convenient way: $$ \gamma_k = \frac{\sum_{k = 0}^{k_\text{max}} A_k(N)} {\sum_{k = 0}^{k_\text{max}} B_k(N)} $$ where: \begin{align*} A_k &= \frac{1}{k} \left(\frac{A_{k-1} N^2}{k} + B_k \right) & A_0 &= - \ln(N) \\ B_k &= \frac{B_{k-1} N^2}{k^2} & B_0 &= 1 \\ \end{align*} As said, the asymptotic error of the formula decreases slower: $$ |\gamma - \gamma_k| \sim \pi e^{-4N} = \Delta_N $$ In order to avoid the expensive computation of $\ln(N)$, N was chosen to be a power of 2, $N = 2^p$. As before: $$ \Delta_p = \pi e^{-2^{p+2}} < 10^{-D} \thus p > \log_2 \left( \ln(\pi) + D \ln(10) \right) - 2 $$ and the smaller integer is found by taking the floor: $$ p = \left\lfloor \log_2 \left( \ln(\pi) + D \ln(10) \right) \right\rfloor + 1 $$ The number of terms to compute, $k_\text{max}$, is still proportional to $N$ but by a different factor $\beta$ such that: \begin{align*} \beta\ln(\beta) = 1 + \beta && \beta = \exp(W(e^{-1}) + 1) \approx 3.59 \end{align*} As regards the numbers representation, this time the GMP type `mpf_t` was employed. It consists of an arbitrary-precision floating point with a fixed exponent of 32 or 64 bits but arbitrary mantissa length. If a number $D$ of digits is required, the number of bits can be computed from: $$ 10^D = 2^b \thus b = D \log_2 (10) $$ In order to avoid roundoff errors affecting the final result, a security range of 64 bits was added to $b$. Furthermore, the computation of $\ln(2)$ was optimized by solving @eq:delta analytically, instead of finding $k^{\text{max}}$ by trials. The series estimation error was: $$ \Delta S_k = \frac{1}{k (k + 2) 2^{k - 1}} < 10^{-D} $$ {#eq:delta} and the following condition should hold: $$ k (k + 2) 2^{k - 1} > 10^D \thus 2^{k - 1} [(k + 1)^2 - 1] > 10^D $$ Asking for the equality, with $k = x$ it yields to: $$ 2^{x - 1} (x + 1)^2 = 10^D $$ which can be again solved by the Lambert $W$ function as follows: \begin{align*} 2^{x - 1} (x + 1)^2 = 10^D &\thus \exp \left( \ln(2) \frac{x - 1}{2} \right) (x + 1) = 10^{D/2} \\ &\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) e^{- \ln(2)} (x + 1) = 10^{D/2} \\ &\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) \frac{\ln(2)}{2} (x + 1) = \ln (2) 10^{D/2} \end{align*} Taking $W$ both sides of the equation: $$ \frac{\ln(2)}{2} (x + 1) = W ( \ln(2) 10^{D/2} ) $$ which leads to: $$ x = \frac{2}{\ln(2)} \, W ( \ln(2) 10^{D/2} ) - 1 $$ Finally, the smallest integer greater than $x$ can be found as: $$ k^{\text{max}} = \left\lfloor \frac{2}{\ln(2)} \, W (\ln(2) 10^{D/2}) \right\rfloor $$ When a large number $D$ of digits is to be computed, the exponentiation can easily overflow if working in double precision. However, $W$ grows like a logarithm, so intuitively this operation could be optimized out. This can be done by using the asymptotic expansion [@corless96] at large $x$ of $W(x)$ : $$ W(x) = L_1 - L_2 + \frac{L_2}{L_1} + \frac{L_2 (L_2 - 2)}{2 L_1^2} + \frac{L_2 (3L_2^3 - 22L_2^2 + 36L_2 - 12)}{12L_1^4} + O\left(\left(\frac{L_2}{L_1}\right)^5\right) $$ where: $$ L_1 = \ln(x) \et L_2 = \ln(\ln(x)) $$ Now the usual properties of the logarithm can be applied to write: \begin{align*} L_1 &= \ln(\ln(2) 10^{D/2}) = \ln(\ln(2)) + \frac{D}{2}\ln(10) \\ L_2 &= \ln(L_1) \end{align*} This way, $k^{\text{max}}$ can be easily computed with standard double precision floating points, with no risk of overflow or need of arbitrary precision arithmetic. ### 1 million digits computation On the same desktop computer described before, the optimised program computes the first 1000 digits of $\gamma$ in \SI{4.6}{\milli\second}: a $137\times$ improvement over the previous program. This makes it suitable for a more intensive computation. As a proof of concept, the program was then used to compute a million digits of $\gamma$, which took \SI{1.26}{\hour} of running time. The result was verified in \SI{6.33}{\hour} by comparing it with the `mpmath.euler()` function from the mpmath [@mpmath13] multiple-precision library with the Python integer backend. This library uses the standard Brent-McMillan algorithm and variable precision floating points too but is not based on GMP arithmetic types.