In numerical analysis, the Shanks transformation is a non-linear series acceleration method to increase the rate of convergence of a sequence. This method is named after Daniel Shanks, who rediscovered this sequence transformation in 1955. It was first derived and published by R. Schmidt in 1941.
Formulation
For a sequence { a m } m ∈ N {\displaystyle \left\{a_{m}\right\}_{m\in \mathbb {N} }} the series
A = ∑ m = 0 ∞ a m {\displaystyle A=\sum _{m=0}^{\infty }a_{m}\,}is to be determined. First, the partial sum A n {\displaystyle A_{n}} is defined as:
A n = ∑ m = 0 n a m {\displaystyle A_{n}=\sum _{m=0}^{n}a_{m}\,}and forms a new sequence { A n } n ∈ N {\displaystyle \left\{A_{n}\right\}_{n\in \mathbb {N} }} . Provided the series converges, A n {\displaystyle A_{n}} will also approach the limit A {\displaystyle A} as n → ∞ . {\displaystyle n\to \infty .} The Shanks transformation S ( A n ) {\displaystyle S(A_{n})} of the sequence A n {\displaystyle A_{n}} is the new sequence defined by23
S ( A n ) = A n + 1 A n − 1 − A n 2 A n + 1 − 2 A n + A n − 1 = A n + 1 − ( A n + 1 − A n ) 2 ( A n + 1 − A n ) − ( A n − A n − 1 ) {\displaystyle S(A_{n})={\frac {A_{n+1}\,A_{n-1}\,-\,A_{n}^{2}}{A_{n+1}-2A_{n}+A_{n-1}}}=A_{n+1}-{\frac {(A_{n+1}-A_{n})^{2}}{(A_{n+1}-A_{n})-(A_{n}-A_{n-1})}}}where this sequence S ( A n ) {\displaystyle S(A_{n})} often converges more rapidly than the sequence A n . {\displaystyle A_{n}.} Further speed-up may be obtained by repeated use of the Shanks transformation, by computing S 2 ( A n ) = S ( S ( A n ) ) , {\displaystyle S^{2}(A_{n})=S(S(A_{n})),} S 3 ( A n ) = S ( S ( S ( A n ) ) ) , {\displaystyle S^{3}(A_{n})=S(S(S(A_{n}))),} etc.
Note that the non-linear transformation as used in the Shanks transformation is essentially the same as used in Aitken's delta-squared process so that as with Aitken's method, the right-most expression in S ( A n ) {\displaystyle S(A_{n})} 's definition (i.e. S ( A n ) = A n + 1 − ( A n + 1 − A n ) 2 ( A n + 1 − A n ) − ( A n − A n − 1 ) {\displaystyle S(A_{n})=A_{n+1}-{\frac {(A_{n+1}-A_{n})^{2}}{(A_{n+1}-A_{n})-(A_{n}-A_{n-1})}}} ) is more numerically stable than the expression to its left (i.e. S ( A n ) = A n + 1 A n − 1 − A n 2 A n + 1 − 2 A n + A n − 1 {\displaystyle S(A_{n})={\frac {A_{n+1}\,A_{n-1}\,-\,A_{n}^{2}}{A_{n+1}-2A_{n}+A_{n-1}}}} ). Both Aitken's method and the Shanks transformation operate on a sequence, but the sequence the Shanks transformation operates on is usually thought of as being a sequence of partial sums, although any sequence may be viewed as a sequence of partial sums.
Example
As an example, consider the slowly convergent series4
4 ∑ k = 0 ∞ ( − 1 ) k 1 2 k + 1 = 4 ( 1 − 1 3 + 1 5 − 1 7 + ⋯ ) {\displaystyle 4\sum _{k=0}^{\infty }(-1)^{k}{\frac {1}{2k+1}}=4\left(1-{\frac {1}{3}}+{\frac {1}{5}}-{\frac {1}{7}}+\cdots \right)}which has the exact sum π ≈ 3.14159265. The partial sum A 6 {\displaystyle A_{6}} has only one digit accuracy, while six-figure accuracy requires summing about 400,000 terms.
In the table below, the partial sums A n {\displaystyle A_{n}} , the Shanks transformation S ( A n ) {\displaystyle S(A_{n})} on them, as well as the repeated Shanks transformations S 2 ( A n ) {\displaystyle S^{2}(A_{n})} and S 3 ( A n ) {\displaystyle S^{3}(A_{n})} are given for n {\displaystyle n} up to 12. The figure to the right shows the absolute error for the partial sums and Shanks transformation results, clearly showing the improved accuracy and convergence rate.
n {\displaystyle n} | A n {\displaystyle A_{n}} | S ( A n ) {\displaystyle S(A_{n})} | S 2 ( A n ) {\displaystyle S^{2}(A_{n})} | S 3 ( A n ) {\displaystyle S^{3}(A_{n})} |
---|---|---|---|---|
0 | 4.00000000 | — | — | — |
1 | 2.66666667 | 3.16666667 | — | — |
2 | 3.46666667 | 3.13333333 | 3.14210526 | — |
3 | 2.89523810 | 3.14523810 | 3.14145022 | 3.14159936 |
4 | 3.33968254 | 3.13968254 | 3.14164332 | 3.14159086 |
5 | 2.97604618 | 3.14271284 | 3.14157129 | 3.14159323 |
6 | 3.28373848 | 3.14088134 | 3.14160284 | 3.14159244 |
7 | 3.01707182 | 3.14207182 | 3.14158732 | 3.14159274 |
8 | 3.25236593 | 3.14125482 | 3.14159566 | 3.14159261 |
9 | 3.04183962 | 3.14183962 | 3.14159086 | 3.14159267 |
10 | 3.23231581 | 3.14140672 | 3.14159377 | 3.14159264 |
11 | 3.05840277 | 3.14173610 | 3.14159192 | 3.14159266 |
12 | 3.21840277 | 3.14147969 | 3.14159314 | 3.14159265 |
The Shanks transformation S ( A 1 ) {\displaystyle S(A_{1})} already has two-digit accuracy, while the original partial sums only establish the same accuracy at A 24 . {\displaystyle A_{24}.} Remarkably, S 3 ( A 3 ) {\displaystyle S^{3}(A_{3})} has six digits accuracy, obtained from repeated Shank transformations applied to the first seven terms A 0 , … , A 6 . {\displaystyle A_{0},\ldots ,A_{6}.} As mentioned before, A n {\displaystyle A_{n}} only obtains 6-digit accuracy after summing about 400,000 terms.
Motivation
The Shanks transformation is motivated by the observation that — for larger n {\displaystyle n} — the partial sum A n {\displaystyle A_{n}} quite often behaves approximately as5
A n = A + α q n , {\displaystyle A_{n}=A+\alpha q^{n},\,}with | q | < 1 {\displaystyle |q|<1} so that the sequence converges transiently to the series result A {\displaystyle A} for n → ∞ . {\displaystyle n\to \infty .} So for n − 1 , {\displaystyle n-1,} n {\displaystyle n} and n + 1 {\displaystyle n+1} the respective partial sums are:
A n − 1 = A + α q n − 1 , A n = A + α q n and A n + 1 = A + α q n + 1 . {\displaystyle A_{n-1}=A+\alpha q^{n-1}\quad ,\qquad A_{n}=A+\alpha q^{n}\qquad {\text{and}}\qquad A_{n+1}=A+\alpha q^{n+1}.}These three equations contain three unknowns: A , {\displaystyle A,} α {\displaystyle \alpha } and q . {\displaystyle q.} Solving for A {\displaystyle A} gives6
A = A n + 1 A n − 1 − A n 2 A n + 1 − 2 A n + A n − 1 . {\displaystyle A={\frac {A_{n+1}\,A_{n-1}\,-\,A_{n}^{2}}{A_{n+1}-2A_{n}+A_{n-1}}}.}In the (exceptional) case that the denominator is equal to zero: then A n = A {\displaystyle A_{n}=A} for all n . {\displaystyle n.}
Generalized Shanks transformation
The generalized kth-order Shanks transformation is given as the ratio of the determinants:7
S k ( A n ) = | A n − k ⋯ A n − 1 A n Δ A n − k ⋯ Δ A n − 1 Δ A n Δ A n − k + 1 ⋯ Δ A n Δ A n + 1 ⋮ ⋮ ⋮ Δ A n − 1 ⋯ Δ A n + k − 2 Δ A n + k − 1 | | 1 ⋯ 1 1 Δ A n − k ⋯ Δ A n − 1 Δ A n Δ A n − k + 1 ⋯ Δ A n Δ A n + 1 ⋮ ⋮ ⋮ Δ A n − 1 ⋯ Δ A n + k − 2 Δ A n + k − 1 | , {\displaystyle S_{k}(A_{n})={\frac {\begin{vmatrix}A_{n-k}&\cdots &A_{n-1}&A_{n}\\\Delta A_{n-k}&\cdots &\Delta A_{n-1}&\Delta A_{n}\\\Delta A_{n-k+1}&\cdots &\Delta A_{n}&\Delta A_{n+1}\\\vdots &&\vdots &\vdots \\\Delta A_{n-1}&\cdots &\Delta A_{n+k-2}&\Delta A_{n+k-1}\\\end{vmatrix}}{\begin{vmatrix}1&\cdots &1&1\\\Delta A_{n-k}&\cdots &\Delta A_{n-1}&\Delta A_{n}\\\Delta A_{n-k+1}&\cdots &\Delta A_{n}&\Delta A_{n+1}\\\vdots &&\vdots &\vdots \\\Delta A_{n-1}&\cdots &\Delta A_{n+k-2}&\Delta A_{n+k-1}\\\end{vmatrix}}},}with Δ A p = A p + 1 − A p . {\displaystyle \Delta A_{p}=A_{p+1}-A_{p}.} It is the solution of a model for the convergence behaviour of the partial sums A n {\displaystyle A_{n}} with k {\displaystyle k} distinct transients:
A n = A + ∑ p = 1 k α p q p n . {\displaystyle A_{n}=A+\sum _{p=1}^{k}\alpha _{p}q_{p}^{n}.}This model for the convergence behaviour contains 2 k + 1 {\displaystyle 2k+1} unknowns. By evaluating the above equation at the elements A n − k , A n − k + 1 , … , A n + k {\displaystyle A_{n-k},A_{n-k+1},\ldots ,A_{n+k}} and solving for A , {\displaystyle A,} the above expression for the kth-order Shanks transformation is obtained. The first-order generalized Shanks transformation is equal to the ordinary Shanks transformation: S 1 ( A n ) = S ( A n ) . {\displaystyle S_{1}(A_{n})=S(A_{n}).}
The generalized Shanks transformation is closely related to Padé approximants and Padé tables.8
Note: The calculation of determinants requires many arithmetic operations to make, however Peter Wynn discovered a recursive evaluation procedure called epsilon-algorithm which avoids calculating the determinants.910
See also
- Aitken's delta-squared process
- Anderson acceleration
- Rate of convergence
- Richardson extrapolation
- Sequence transformation
Notes
- Shanks, D. (1955), "Non-linear transformation of divergent and slowly convergent sequences", Journal of Mathematics and Physics, 34 (1–4): 1–42, doi:10.1002/sapm19553411
- Schmidt, R.J. (1941), "On the numerical solution of linear simultaneous equations by an iterative method", Philosophical Magazine, 32 (214): 369–383, doi:10.1080/14786444108520797
- Van Dyke, M.D. (1975), Perturbation methods in fluid mechanics (annotated ed.), Parabolic Press, ISBN 0-915760-01-0
- Bender, C.M.; Orszag, S.A. (1999), Advanced mathematical methods for scientists and engineers, Springer, ISBN 0-387-98931-5
- Weniger, E.J. (1989). "Nonlinear sequence transformations for the acceleration of convergence and the summation of divergent series". Computer Physics Reports. 10 (5–6): 189–371. arXiv:math.NA/0306302. Bibcode:1989CoPhR..10..189W. doi:10.1016/0167-7977(89)90011-7.
- Brezinski, C.; Redivo-Zaglia, M.; Saad, Y. (2018), "Shanks sequence transformations and Anderson acceleration", SIAM Review, 60 (3): 646–669, doi:10.1137/17M1120725, hdl:11577/3270110
- Senhadji, M.N. (2001), "On condition numbers of the Shanks transformation", J. Comput. Appl. Math., 135 (1): 41–61, Bibcode:2001JCoAM.135...41S, doi:10.1016/S0377-0427(00)00561-6
- Wynn, P. (1956), "On a device for computing the em(Sn) transformation", Mathematical Tables and Other Aids to Computation, 10 (54): 91–96, doi:10.2307/2002183, JSTOR 2002183
- Wynn, P. (1962), "Acceleration techniques for iterated vector and matrix problems", Math. Comp., 16 (79): 301–322, doi:10.1090/S0025-5718-1962-0145647-X
References
Weniger (2003). ↩
Bender & Orszag (1999), pp. 368–375. ↩
Van Dyke (1975), pp. 202–205. ↩
Van Dyke (1975), pp. 202–205. ↩
Bender & Orszag (1999), pp. 368–375. ↩
Bender & Orszag (1999), pp. 368–375. ↩
Bender & Orszag (1999), pp. 389–392. ↩
Bender & Orszag (1999), pp. 389–392. ↩
Wynn (1956) ↩
Wynn (1962) ↩