2020-03-06 02:24:32 +01:00
|
|
|
|
# Exercise 2
|
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
|
2020-03-06 02:24:32 +01:00
|
|
|
|
## Euler-Mascheroni constant
|
|
|
|
|
|
|
|
|
|
The Euler-Mascheroni constant is defined as the limiting difference between the
|
|
|
|
|
partial sums of the harmonic series and the natural logarithm:
|
|
|
|
|
$$
|
2020-05-29 14:50:19 +02:00
|
|
|
|
\gamma = \lim_{n \rightarrow +\infty} \left(
|
|
|
|
|
\sum_{k=1}^{n} \frac{1}{k}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
- \ln(n) \right)
|
|
|
|
|
$$ {#eq:gamma}
|
|
|
|
|
|
|
|
|
|
and represents the limiting blue area in @fig:gamma. The first 30 digits of
|
|
|
|
|
$\gamma$ are:
|
|
|
|
|
$$
|
|
|
|
|
\gamma = 0.57721\ 56649\ 01532\ 86060\ 65120\ 90082 \dots
|
|
|
|
|
$$ {#eq:exact}
|
|
|
|
|
|
|
|
|
|
In complex analysis, this constant is related to many functions and can be
|
|
|
|
|
evaluated through a variety of identities. In this work, five methods were
|
|
|
|
|
implemented and their results discussed. In fact, evaluating $\gamma$ with a
|
|
|
|
|
limited precision due to floating points number representation entails limited
|
|
|
|
|
precision on the estimation of $\gamma$ itself due to roundoff errors. All the
|
|
|
|
|
known methods involve sums, subtractions or products of very big or small
|
|
|
|
|
numbers, packed in series, partial sums or infinite products. Thus, the
|
|
|
|
|
efficiency of the methods lies on how quickly they converge to their limit.
|
|
|
|
|
|
|
|
|
|
![The area of the blue region converges to the Euler–Mascheroni
|
2020-05-20 18:45:27 +02:00
|
|
|
|
constant.](images/2-gamma-area.png){#fig:gamma width=7cm}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
|
2020-03-06 02:24:32 +01:00
|
|
|
|
## Computing the constant
|
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
|
2020-03-06 02:24:32 +01:00
|
|
|
|
### Definition
|
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
First, in order to give a quantitative idea of how hard it is to reach a good
|
2020-03-06 02:24:32 +01:00
|
|
|
|
estimation of $\gamma$, it was naively computed using the definition given in
|
|
|
|
|
@eq:gamma. The difference was computed for increasing value of $n$, with
|
|
|
|
|
$n_{i+1} = 10 \cdot n_i$ and $n_1 = 20$, till the approximation starts getting
|
|
|
|
|
worse, namely:
|
|
|
|
|
$$
|
|
|
|
|
| \gamma(n_{i+1}) - \gamma | > | \gamma(n_i) - \gamma|
|
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
and $\gamma (n_i)$ was selected as the best result (see @tbl:naive-errs).
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
----------------------------
|
|
|
|
|
n $|γ(n)-γ|$
|
|
|
|
|
----------- ----------------
|
|
|
|
|
\num{2e1} \num{2.48e-02}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e2} \num{2.50e-03}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e3} \num{2.50e-04}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e4} \num{2.50e-05}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e5} \num{2.50e-06}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e6} \num{2.50e-07}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e7} \num{2.50e-08}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e8} \num{2.50e-09}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e9} \num{2.55e-10}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e10} \num{2.42e-11}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\num{2e11} \num{1.44e-08}
|
|
|
|
|
----------------------------
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
|
|
|
|
Table: Partial results using the definition of $\gamma$ with double
|
2020-05-27 15:29:59 +02:00
|
|
|
|
precision. {#tbl:naive-errs}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
The convergence is logarithmic: to fix the first $d$ decimal places, about
|
2020-05-27 15:29:59 +02:00
|
|
|
|
$10^d$ terms of the harmonic series are needed. The double precision runs out
|
|
|
|
|
at the $10^{\text{th}}$ place, at $n=\num{2e10}$.
|
2020-05-26 09:58:16 +02:00
|
|
|
|
Since all the number are given with double precision, there can be at best 16
|
|
|
|
|
correct digits, since for a double 64 bits are allocated in memory: 1 for the
|
|
|
|
|
sign, 8 for the exponent and 55 for the mantissa:
|
|
|
|
|
$$
|
|
|
|
|
2^{55} = 10^{d} \thus d = 55 \cdot \log(2) \sim 16.6
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
Only 10 digits were correctly computed: this means that when the terms of the
|
|
|
|
|
series start being smaller than the smallest representable double, the sum of
|
2020-05-26 10:07:30 +02:00
|
|
|
|
all the remaining terms gives a number $\propto 10^{-11}$. The best result is
|
2020-05-27 15:29:59 +02:00
|
|
|
|
shown in @tbl:naive-res.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
------- --------------------
|
|
|
|
|
exact 0.57721 56649 01533
|
|
|
|
|
approx 0.57721 56648 77325
|
|
|
|
|
diff 0.00000 00000 24207
|
|
|
|
|
------- --------------------
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
Table: First method best result. From the top down: true value, best estimation
|
2020-05-27 15:29:59 +02:00
|
|
|
|
and difference between them. {#tbl:naive-res}
|
2020-05-08 23:07:37 +02:00
|
|
|
|
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
|
|
|
|
### Alternative formula
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
As a first alternative, the constant was computed through the identity
|
|
|
|
|
[@davis59] which relates $\gamma$ to the $\Gamma$ function as follow:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
|
|
|
|
\gamma = \lim_{M \rightarrow + \infty} \sum_{k = 1}^{M}
|
|
|
|
|
\binom{M}{k} \frac{(-1)^k}{k} \ln(\Gamma(k + 1))
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
Varying $M$ from 1 to 100, the best result was obtained for $M = 41$ (see
|
2020-05-27 15:29:59 +02:00
|
|
|
|
@tbl:limit-res). This approximation gave an underwhelming result: the
|
|
|
|
|
convergence is actually worse than the definition itself. Only two decimal
|
|
|
|
|
places were correctly computed:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
-------- --------------------
|
|
|
|
|
exact 0.57721 56649 01533
|
|
|
|
|
approx 0.57225 72410 34058
|
|
|
|
|
diff 0.00495 84238 67473
|
|
|
|
|
-------- --------------------
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-11 23:28:23 +02:00
|
|
|
|
Table: Best estimation of $\gamma$ using
|
2020-05-27 15:29:59 +02:00
|
|
|
|
the alternative formula. {#tbl:limit-res}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
|
|
|
|
Here, the problem lies in the binomial term: computing the factorial of a
|
|
|
|
|
number greater than 18 goes over 15 places and so cannot be correctly
|
2020-05-08 23:07:37 +02:00
|
|
|
|
represented. Furthermore, the convergence is slowed down by the logarithmic
|
|
|
|
|
factor.
|
|
|
|
|
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
|
|
|
|
### Reciprocal $\Gamma$ function
|
|
|
|
|
|
|
|
|
|
A better result was found using the well known reciprocal $\Gamma$ function
|
2020-05-08 23:07:37 +02:00
|
|
|
|
formula [@bak91]:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
|
|
|
|
\frac{1}{\Gamma(z)} = z e^{yz} \prod_{k = 1}^{+ \infty}
|
|
|
|
|
\left( 1 + \frac{z}{k} \right) e^{-z/k}
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
which gives:
|
|
|
|
|
$$
|
|
|
|
|
\gamma = - \frac{1}{z} \ln \left( z \Gamma(z)
|
|
|
|
|
\prod_{k = 1}^{+ \infty} \left( 1 + \frac{z}{k} \right) e^{-z/k} \right)
|
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The execution stops when there is no difference between two consecutive terms
|
2020-05-26 10:07:30 +02:00
|
|
|
|
of the infinite product (it happens for $k = 456565794 \sim \num{4.6e8}$,
|
2020-05-08 23:07:37 +02:00
|
|
|
|
meaning that for this value of $k$ the term of the product is equal to 1 in
|
|
|
|
|
terms of floating points). Different values of $z$ were checked, with $z_{i+1}
|
|
|
|
|
= z_i + 0.01$ ranging from 0 to 20, and the best result was found for $z = 9$.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
-----------------------------------------------
|
2020-05-29 14:50:19 +02:00
|
|
|
|
z $|γ(z) - γ|$ z $|γ(z) - γ|$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
----- ---------------- ------ ----------------
|
|
|
|
|
1 \num{9.712e-9} 8.95 \num{9.770e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
3 \num{9.320e-9} 8.96 \num{9.833e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
5 \num{9.239e-9} 8.97 \num{9.622e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
7 \num{9.391e-9} 8.98 \num{9.300e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
9 \num{8.482e-9} 8.99 \num{9.059e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
11 \num{9.185e-9} 9.00 \num{8.482e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
13 \num{9.758e-9} 9.01 \num{9.564e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
15 \num{9.747e-9} 9.02 \num{9.260e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
17 \num{9.971e-9} 9.03 \num{9.264e-9}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-29 14:50:19 +02:00
|
|
|
|
19 \num{1.008e-8} 9.04 \num{9.419e-9}
|
2020-05-27 15:29:59 +02:00
|
|
|
|
-----------------------------------------------
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
Table: Differences between some obtained values of $\gamma$ and
|
|
|
|
|
the exact one found with the reciprocal $\Gamma$ function formula.
|
|
|
|
|
The values on the left are shown to give an idea of the $z$
|
|
|
|
|
large-scale behaviour; on the right, the values around the best
|
2020-05-27 15:29:59 +02:00
|
|
|
|
one ($z = 9.00$) are listed. {#tbl:recip-errs}
|
2020-05-08 23:07:37 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
As can be seen in @tbl:recip-errs, the best value for $z$ is only by chance,
|
2020-05-08 23:07:37 +02:00
|
|
|
|
since all $|\gamma(z) - \gamma |$ are of the same order of magnitude. The best
|
2020-05-27 15:29:59 +02:00
|
|
|
|
one is compared with the exact value of $\gamma$ in @tbl:recip-res.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
------- --------------------
|
|
|
|
|
true 0.57721 56649 01533
|
|
|
|
|
approx 0.57721 56564 18607
|
|
|
|
|
diff 0.00000 00084 82925
|
|
|
|
|
------- --------------------
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Table: Third method results for z = 9.00. {#tbl:recip-res}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
In this approximation, the convergence of the infinite product is fast enough
|
|
|
|
|
to reach the $8^{th}$ decimal place.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
|
|
|
|
|
2020-05-08 23:07:37 +02:00
|
|
|
|
### Fastest convergence formula
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The fastest known convergence to $\gamma$ belongs to the following formula,
|
|
|
|
|
known as refined Brent-McMillan formula [@yee19]:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
2020-05-25 23:59:38 +02:00
|
|
|
|
\gamma_N = \frac{A(N)}{B(N)} -\frac{C(N)}{B^2(N)} - \ln(N)
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$ {#eq:faster}
|
2020-05-27 15:29:59 +02:00
|
|
|
|
where
|
2020-03-06 02:24:32 +01:00
|
|
|
|
\begin{align*}
|
|
|
|
|
&A(N) = \sum_{k=1}^{+ \infty} \frac{N^k}{k!} \cdot H(k)
|
2020-05-25 23:59:38 +02:00
|
|
|
|
\with H(k) = \sum_{j=1}^{k} \frac{1}{j} \\
|
2020-03-06 02:24:32 +01:00
|
|
|
|
&B(N) = \sum_{k=1}^{+ \infty} \frac{N^k}{k!} \\
|
|
|
|
|
&C(N) = \frac{1}{4N} \sum_{k=0}^{2N}
|
|
|
|
|
\frac{((2k)!)^3}{(k!)^4 \cdot (16k)^2k} \\
|
|
|
|
|
\end{align*}
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The asymptotic error of this estimation, given in [@demailly17], is:
|
2020-05-08 23:07:37 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
|\gamma_N - \gamma| \sim \frac{5 \sqrt{2 \pi}}{12 \sqrt{N}} e^{-8N} = \Delta_N
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$ {#eq:NeD}
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The error bound gives the value of $N$ to be used to get $D$ correct decimal
|
|
|
|
|
digits of $\gamma$. In fact, this is done by imposing:
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
\frac{5 \sqrt{2 \pi}}{12 \sqrt{N}} e^{-8N} < 10^{-D}
|
|
|
|
|
$$
|
2020-05-08 23:07:37 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The inequality with the equal sign can be solved with the use of the Lambert
|
|
|
|
|
$W$ function. For a real number $x$, $W(x)$ is defined as the inverse function
|
|
|
|
|
of $x\exp(x)$, so the idea is to recast the equation into this form and take
|
|
|
|
|
$W$ both sides.
|
|
|
|
|
|
|
|
|
|
\begin{align*}{3}
|
|
|
|
|
\frac{5 \sqrt{2 \pi}}{12 \sqrt{x}} e^{-8x} = 10^{-D}
|
|
|
|
|
& \thus \left(\frac{12}{5}\right)^2 \frac{x}{2 \pi} e^{16x} = 10^{2D} \\
|
|
|
|
|
& \thus 16x e^{16x} = \left(\frac{5 \pi}{9}\right)^2 10^{2D + 1} \\
|
|
|
|
|
& \thus x = \frac{1}{16} W\left(\left(\frac{5 \pi}{9}\right)^2 10^{2D + 1}\right)
|
|
|
|
|
\end{align*}
|
|
|
|
|
|
|
|
|
|
The smallest integer which satisfies the inequality is then
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
N = 1 + \left\lfloor \frac{1}{16}
|
|
|
|
|
W \left( \frac{5 \pi}{9} 10^{2D + 1}\right) \right\rfloor
|
2020-05-27 15:29:59 +02:00
|
|
|
|
$$ {#eq:refined-n}
|
2020-05-25 23:59:38 +02:00
|
|
|
|
|
|
|
|
|
The series $A$ and $B$ were computed till there is no difference between two
|
|
|
|
|
consecutive terms. Results are shown in @tbl:fourth.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
------- --------------------------
|
|
|
|
|
exact 0.57721 56649 01532 75452
|
|
|
|
|
approx 0.57721 56649 01532 53248
|
|
|
|
|
diff 0.00000 00000 00000 33062
|
|
|
|
|
------- --------------------------
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Table: $\gamma$ estimation with the fastest known convergence formula
|
|
|
|
|
(@eq:faster). {#tbl:fourth}
|
2020-05-08 23:07:37 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Due to roundoff errors, the best result was obtained only for $D = 15$, for
|
2020-05-25 23:59:38 +02:00
|
|
|
|
which the code accurately computes 15 digits and gives an error of
|
2020-05-26 10:07:30 +02:00
|
|
|
|
\num{3.3e16}. For $D > 15$, the requested can't be fulfilled.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
|
2020-03-06 02:24:32 +01:00
|
|
|
|
### Arbitrary precision
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
To overcome the issues related to the limited precision of the machine floating
|
|
|
|
|
points, one can resort to a software implementation with arbitrary precision.
|
|
|
|
|
|
|
|
|
|
In the GMP [@gmp20] library (GNU Multiple Precision arithmetic), for
|
|
|
|
|
example, reals can be approximated by a rational or floating point numbers
|
|
|
|
|
with arbitrary precision. For integer and fractions this means a check is
|
|
|
|
|
performed on every operation that can overflow and additional memory is
|
|
|
|
|
requested and used to represent the larger result. For floating points it
|
|
|
|
|
means the size of the mantissa can be chosen before initialising the number.
|
|
|
|
|
Additionally, the library automatically switches to the optimal algorithm to
|
|
|
|
|
compute an operation based on the size of the operands.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
|
|
|
|
The terms in @eq:faster can therefore be computed with arbitrarily large
|
2020-05-27 15:29:59 +02:00
|
|
|
|
precision. Thus, a program that computes the Euler-Mascheroni constant within a
|
|
|
|
|
user controllable precision was implemented.
|
|
|
|
|
|
|
|
|
|
To compute $N$ by @eq:refined-n, a trick must be used to avoid overflows in the
|
|
|
|
|
exponentiation, particularly when computing more than a few hundreds digits are
|
|
|
|
|
to be computed. The details are explained in @sec:optimised, below. Once the
|
|
|
|
|
number $N$ has been fixed, the series $A(N)$ and $B(N)$ are to be evaluated.
|
|
|
|
|
With rational numbers, a different criterion for the truncation must be
|
|
|
|
|
considered, because two consecutive term are always different. Brent and
|
|
|
|
|
McMillan[@brent00] prove that to reduce the partial sum to less than $\Delta_N$
|
|
|
|
|
it is sufficient to compute $\alpha N$ terms, where $\alpha$ is the solution to
|
|
|
|
|
the equation:
|
|
|
|
|
\begin{align*}
|
|
|
|
|
\alpha\ln(\alpha) = 3 + \alpha &&
|
|
|
|
|
\alpha = \exp(W(3e) - 1) \approx 4.97
|
|
|
|
|
\end{align*}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
|
|
|
|
The GMP library offers functions to perform some operations such as addition,
|
2020-05-08 23:07:37 +02:00
|
|
|
|
multiplication, division, etc. However, the logarithm function is not
|
2020-03-06 02:24:32 +01:00
|
|
|
|
implemented. Thus, most of the code carries out the $\ln(N)$ computation.
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The reason is that the logarithm of only some special numbers can be computed
|
2020-05-25 23:59:38 +02:00
|
|
|
|
with arbitrary precision, namely the ones of which a converging series is known.
|
|
|
|
|
This forces $N$ to be rewritten in the following way:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
|
|
|
|
N = N_0 \cdot b^e \thus \ln(N) = \ln(N_0) + e \cdot \ln(b)
|
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Since a fast converging series for $\ln(2)$ is known (it will be shown shortly),
|
|
|
|
|
$b = 2$ was chosen. If $N$ is a power of two, $N_0$ is 1 and only $\ln(2)$ is to be
|
|
|
|
|
computed. More generally, the problem reduces to the calculation of $\ln(N_0)$.
|
|
|
|
|
|
|
|
|
|
Regarding the scientific notation, to find the mantissa $1
|
2020-05-25 23:59:38 +02:00
|
|
|
|
\leqslant N_0 < 2$, the number of binary digits of $N$ must be computed
|
2020-05-27 15:29:59 +02:00
|
|
|
|
(conveniently, a dedicated function `mpz_sizeinbase()` exists in GMP). If the
|
2020-05-25 23:59:38 +02:00
|
|
|
|
digits are
|
2020-05-08 23:07:37 +02:00
|
|
|
|
$n$:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
|
|
|
|
e = n - 1 \thus N = N_0 \cdot 2^{n-1} \thus N_0 = \frac{N}{2^{n - 1}}
|
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The logarithm of $N_0$ can be computed from the Taylor series of the hyperbolic
|
|
|
|
|
tangent, which is convergent for $|x| < 1$:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
|
|
|
|
\text{atanh}(x) = \sum_{k = 0}^{+ \infty} \frac{x^{2k + 1}}{2x + 1}
|
|
|
|
|
$$
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The relation with the logarithm follows from the definition
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
2020-05-25 12:43:58 +02:00
|
|
|
|
\text{tanh}(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}
|
|
|
|
|
= \frac{e^{2x} - 1}{e^{2x} + 1}
|
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
and the change of variable $z = e^{2x}$:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
|
|
|
|
\text{tanh} \left( \frac{\ln(z)}{2} \right) = \frac{z - 1}{z + 1}
|
|
|
|
|
\thus \ln(z) = 2 \, \text{atanh} \left( \frac{z - 1}{z + 1} \right)
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-25 23:59:38 +02:00
|
|
|
|
The idea is to set $N_0 = z$ and define:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-25 23:59:38 +02:00
|
|
|
|
y = \frac{N_0 - 1}{N_0 + 1}
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-25 23:59:38 +02:00
|
|
|
|
hence:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
2020-05-25 23:59:38 +02:00
|
|
|
|
\ln(N_0) = 2 \, \text{atanh} (y)
|
|
|
|
|
= 2 \sum_{k = 0}^{+ \infty} \frac{y^{2k + 1}}{2k + 1}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
To estimate a series with a given precision some care must be taken:
|
|
|
|
|
different techniques apply to different series and are explained in
|
|
|
|
|
the paper [@riddle08]. In this case, letting $S$ be the value of the
|
|
|
|
|
series and $S_k$ the $k$-th partial sum the following bounds can be found:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
2020-05-08 23:07:37 +02:00
|
|
|
|
S_k + a_k \frac{L}{1 -L} < S < S_k + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-25 12:43:58 +02:00
|
|
|
|
where $a_k$ is the $k^{\text{th}}$ term of the series and $L$ is the limiting
|
2020-05-27 15:29:59 +02:00
|
|
|
|
ratio of the series terms, which must be $\le 1$ in order for it to converge (in
|
2020-05-25 23:59:38 +02:00
|
|
|
|
this case, it is easy to prove that $L = y^2 < 1$). The width $\Delta S$ of the
|
2020-05-25 12:43:58 +02:00
|
|
|
|
interval containing $S$ gives the precision of the estimate $\tilde{S}$ if this
|
2020-05-27 15:29:59 +02:00
|
|
|
|
last is assumed to be its middle value, namely:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
|
|
|
|
\tilde{S} = S_k + \frac{1}{2} \left(
|
|
|
|
|
a_k \frac{L}{1 -L} + \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}}
|
|
|
|
|
\right) \et
|
|
|
|
|
\Delta S = \frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}} - a_k \frac{L}{1 -L}
|
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
For this series:
|
|
|
|
|
\begin{align*}
|
|
|
|
|
a_k = \frac{y^{2k + 1}}{2k + 1} &&
|
2020-05-25 23:59:38 +02:00
|
|
|
|
a_{k + 1} = \frac{y^{2k + 3}}{2k + 3} = a_k \, \frac{2k + 1}{2k + 3} \, y^2
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\end{align*}
|
|
|
|
|
so, the right bound results:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
|
|
|
|
\frac{a_{k+1}}{1 - \frac{a_{k+1}}{a_k}}
|
|
|
|
|
= \frac{a_k}{\frac{2k + 3}{2k + 1}y^{-2} - 1}
|
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
and it finally follows:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\Delta S_k = a_k \left(
|
2020-05-25 12:43:58 +02:00
|
|
|
|
\frac{1}{\frac{2k + 3}{2k + 1}y^{-2} - 1} - \frac{y^2}{1 - y^2}
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\right)
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
By imposing $\Delta S_k < 10^{-D}$, where $D$ is the number decimal places
|
|
|
|
|
required, $k^{\text{max}}$ at which to stop the summation can be obtained. This
|
|
|
|
|
is achieved by trials, checking for every $k$ whether $\Delta S_k$ is less or
|
|
|
|
|
greater than $10^{-D}$.
|
2020-03-06 02:24:32 +01:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Similar considerations can be made for $\ln(2)$. The number could be computed
|
|
|
|
|
from the MacLaurin series of $\ln(1 + x)$ with $x = 1$, but that yields a
|
|
|
|
|
slowly convergent series with alternating sign. A trick is to use $x = -1/2$,
|
|
|
|
|
which leads to a much faster series with constant sign:
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\ln(2) = \sum_{k=1}^{+ \infty} \frac{1}{k \cdot 2^k}
|
2020-03-06 02:24:32 +01:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
In this case the series ratio is found to be $L = 1/2$ and the error
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\Delta S_k = \frac{1}{k(k + 2) 2^{k-1}}
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Once the logarithms and the terms $A$, $B$ and $C$ have been computed @eq:faster
|
|
|
|
|
can finally be used to obtain $\gamma$ up to the given decimal place.
|
2020-05-25 12:43:58 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The program was implemented with no particular care for performance but
|
|
|
|
|
was found to be relatively fast. On a \SI{3.9}{GHz} desktop computer with
|
|
|
|
|
\SI{2666}{MT\per\s} memory, it takes \SI{0.63}{\second} to compute the first
|
|
|
|
|
1000 digits. However, the quadratic complexity of the algorithm makes this
|
|
|
|
|
quite unpractical for computing more than a few thousands digits. For this
|
|
|
|
|
reason a more optimised algorithm was implemented.
|
2020-05-25 23:59:38 +02:00
|
|
|
|
|
2020-05-25 12:43:58 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
### Optimised implementation {#sec:optimised}
|
2020-05-25 12:43:58 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The refined Brent-McMillan formula (@eq:faster) is theoretically the fastest
|
|
|
|
|
but is difficult to implement efficiently. In practice it turns out to be
|
|
|
|
|
slower than the standard formula even if its asymptotic error is better.
|
2020-05-25 12:43:58 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The standard formula [@brent00] ignores the correction $C(N)$ and is rewritten
|
|
|
|
|
in a more convenient way:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\gamma_k = \frac{\sum_{k = 0}^{k_\text{max}} A_k(N)}
|
|
|
|
|
{\sum_{k = 0}^{k_\text{max}} B_k(N)}
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
where:
|
|
|
|
|
\begin{align*}
|
|
|
|
|
A_k &= \frac{1}{k} \left(\frac{A_{k-1} N^2}{k} + B_k \right) & A_0 &= - \ln(N) \\
|
|
|
|
|
B_k &= \frac{B_{k-1} N^2}{k^2} & B_0 &= 1 \\
|
|
|
|
|
\end{align*}
|
2020-05-25 12:43:58 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
|
|
|
|
|
As said, the asymptotic error of the formula decreases slower
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
|\gamma - \gamma_k| \sim \pi e^{-4N} = \Delta_N
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
In order to avoid the expensive computation of $\ln(N)$, N was chosen to be a
|
|
|
|
|
power of 2, $N = 2^p$. As before:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\Delta_p = \pi e^{-2^{p+2}} < 10^{-D} \thus
|
|
|
|
|
p > \log_2 \left( \ln(\pi) + D \ln(10) \right) - 2
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
and the smaller integer is found by taking the floor:
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
p = \left\lfloor \log_2 \left( \ln(\pi) + D \ln(10) \right) \right\rfloor + 1
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
The number of terms to compute, $k_\text{max}$ is still proportional to $N$
|
|
|
|
|
but by a different factor, $\beta$ such that:
|
|
|
|
|
\begin{align*}
|
|
|
|
|
\beta\ln(\beta) = 1 + \beta &&
|
|
|
|
|
\beta = \exp(W(e^{-1}) + 1) \approx 3.59
|
|
|
|
|
\end{align*}
|
|
|
|
|
|
|
|
|
|
As regards the numbers representation, this time the GMP type `mpf_t` was
|
|
|
|
|
employed. It consists of an arbitrary-precision floating point with a fixed
|
|
|
|
|
exponent of 32 or 64 bits but arbitrary mantissa length.
|
2020-05-25 23:59:38 +02:00
|
|
|
|
If a number $D$ of digits is required, the number of bits can be computed from:
|
|
|
|
|
$$
|
|
|
|
|
10^D = 2^b \thus b = D \log_2 (10)
|
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
in order to avoid roundoff errors affecting the final result, a security range
|
|
|
|
|
of 64 bits was added to $b$.
|
2020-05-25 23:59:38 +02:00
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Furthermore, the computation of $\ln(2)$ was optimized by solving
|
|
|
|
|
@eq:delta analytically, instead of finding $k^{\text{max}}$ by trials.
|
|
|
|
|
The series estimation error was
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\Delta S_k = \frac{1}{k (k + 2) 2^{k - 1}} < 10^{-D}
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$ {#eq:delta}
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
and the following condition should hold:
|
2020-05-25 12:43:58 +02:00
|
|
|
|
$$
|
2020-05-25 23:59:38 +02:00
|
|
|
|
k (k + 2) 2^{k - 1} > 10^D \thus 2^{k - 1} [(k + 1)^2 - 1] > 10^D
|
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Asking for the equality with $k = x$ yields
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
2^{x - 1} (x + 1)^2 = 10^D
|
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
which can be again solved by the Lambert $W$ function as follows
|
2020-05-25 23:59:38 +02:00
|
|
|
|
\begin{align*}
|
2020-05-27 15:29:59 +02:00
|
|
|
|
2^{x - 1} (x + 1)^2 = 10^D
|
2020-05-25 23:59:38 +02:00
|
|
|
|
&\thus \exp \left( \ln(2) \frac{x - 1}{2} \right) (x + 1)
|
2020-05-27 15:29:59 +02:00
|
|
|
|
= 10^{D/2} \\
|
2020-05-25 23:59:38 +02:00
|
|
|
|
&\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) e^{- \ln(2)} (x + 1)
|
2020-05-27 15:29:59 +02:00
|
|
|
|
= 10^{D/2} \\
|
2020-05-25 23:59:38 +02:00
|
|
|
|
&\thus \exp \left( \frac{\ln(2)}{2} (x + 1)\right) \frac{\ln(2)}{2} (x + 1)
|
2020-05-27 15:29:59 +02:00
|
|
|
|
= \ln (2) 10^{D/2}
|
2020-05-25 23:59:38 +02:00
|
|
|
|
\end{align*}
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
Taking $W$ both sides of the equation:
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\frac{\ln(2)}{2} (x + 1) = W ( \ln(2) 10^{D/2} )
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
which leads to:
|
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
x = \frac{2}{\ln(2)} \, W ( \ln(2) 10^{D/2} ) - 1
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
Finally, the smallest integer greater than $x$ can be found as:
|
|
|
|
|
$$
|
|
|
|
|
k^{\text{max}} =
|
2020-05-27 15:29:59 +02:00
|
|
|
|
\left\lfloor \frac{2}{\ln(2)} \, W (\ln(2) 10^{D/2}) \right\rfloor
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
|
2020-05-27 15:29:59 +02:00
|
|
|
|
When a large number $D$ of digits is to be computed, the exponentiation can
|
|
|
|
|
easily overflow if working in double precision. However, $W$ grows like a
|
|
|
|
|
logarithm, so intuitively this operation could be optimized out.
|
|
|
|
|
|
|
|
|
|
In fact this can be done by using the asymptotic expansion [@corless96]
|
|
|
|
|
at large $x$ of $W(x)$ :
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
W(x) = L_1 - L_2 + \frac{L_2}{L_1} + \frac{L_2 (L_2 - 2)}{2 L_1^2}
|
2020-05-27 15:29:59 +02:00
|
|
|
|
+ \frac{L_2 (3L_2^3 - 22L_2^2 + 36L_2 - 12)}{12L_1^4}
|
|
|
|
|
+ O\left(\left(\frac{L_2}{L_1}\right)^5\right)
|
2020-05-25 23:59:38 +02:00
|
|
|
|
$$
|
|
|
|
|
where
|
|
|
|
|
$$
|
|
|
|
|
L_1 = \ln(x) \et L_2 = \ln(\ln(x))
|
|
|
|
|
$$
|
2020-05-27 15:29:59 +02:00
|
|
|
|
|
|
|
|
|
Now the usual properties of the logarithm can be applied to write:
|
|
|
|
|
\begin{align*}
|
|
|
|
|
L_1 &= \ln(\ln(2) 10^{D/2}) = \ln(\ln(2)) + \frac{D}{2}\ln(10) \\
|
|
|
|
|
L_2 &= \ln(L_1)
|
|
|
|
|
\end{align*}
|
|
|
|
|
|
|
|
|
|
In this way $k^{\text{max}}$ can be easily computed with standard double
|
|
|
|
|
precision floating points, with no risk of overflow or need of arbitrary
|
|
|
|
|
precision arithmetic.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### 1 million digits computation
|
|
|
|
|
|
|
|
|
|
On the same desktop computer, the optimised program computes the first 1000
|
|
|
|
|
digits of $\gamma$ in \SI{4.6}{\milli\second}: a $137\times$ improvement over
|
|
|
|
|
the previous program. This make it suitable for a more intensive computation.
|
|
|
|
|
|
|
|
|
|
As a proof of concept, the program was then used to compute a million digits of
|
|
|
|
|
$\gamma$, which took \SI{1.26}{\hour} of running time.
|
|
|
|
|
The result was verified in \SI{6.33}{\hour} by comparing it with the
|
|
|
|
|
`mpmath.euler()` function from the mpmath [@mpmath13] multiple-precision
|
|
|
|
|
library with the Python integer backend. The library also uses the standard
|
|
|
|
|
Brent-McMillan algorithm and variable precision floating points but is not
|
|
|
|
|
based on GMP arithmetic types.
|