Shanks transformation

In numerical analysis, the Shanks transformation is a non-linear series acceleration method to increase the rate of convergence of a sequence. This method is named after Daniel Shanks, who rediscovered this sequence transformation in 1955. It was first derived and published by R. Schmidt in 1941.


Formulation

For a sequence $\{a_m\}_{m\in\mathbb{N}}$, the series

$$A = \sum_{m=0}^{\infty} a_m$$

is to be determined. First, the partial sum $A_n$ is defined as

$$A_n = \sum_{m=0}^{n} a_m$$

and forms a new sequence $\{A_n\}_{n\in\mathbb{N}}$. Provided the series converges, $A_n$ will also approach the limit $A$ as $n \to \infty$. The Shanks transformation $S(A_n)$ of the sequence $A_n$ is the new sequence defined by[2][3]

$$S(A_n) = \frac{A_{n+1} A_{n-1} - A_n^2}{A_{n+1} - 2A_n + A_{n-1}} = A_{n+1} - \frac{(A_{n+1} - A_n)^2}{(A_{n+1} - A_n) - (A_n - A_{n-1})}$$

where this sequence $S(A_n)$ often converges more rapidly than the sequence $A_n$. Further speed-up may be obtained by repeated use of the Shanks transformation, by computing $S^2(A_n) = S(S(A_n))$, $S^3(A_n) = S(S(S(A_n)))$, etc.

Note that the non-linear transformation used in the Shanks transformation is essentially the same as that used in Aitken's delta-squared process, so that, as with Aitken's method, the right-most expression in the definition of $S(A_n)$, namely

$$S(A_n) = A_{n+1} - \frac{(A_{n+1} - A_n)^2}{(A_{n+1} - A_n) - (A_n - A_{n-1})},$$

is more numerically stable than the expression to its left,

$$S(A_n) = \frac{A_{n+1} A_{n-1} - A_n^2}{A_{n+1} - 2A_n + A_{n-1}}.$$

Both Aitken's method and the Shanks transformation operate on a sequence, but the sequence the Shanks transformation operates on is usually thought of as a sequence of partial sums, although any sequence may be viewed as a sequence of partial sums.
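
As a concrete illustration of these formulas, here is a minimal Python sketch that applies the transformation, in its numerically stable form, to a list of partial sums; the function names shanks and repeated_shanks are illustrative choices, not part of any standard library.

```python
def shanks(A):
    """One Shanks transformation of a sequence of partial sums A_0, ..., A_N.

    Returns S(A_1), ..., S(A_{N-1}), using the numerically stable form above.
    """
    S = []
    for n in range(1, len(A) - 1):
        d1 = A[n + 1] - A[n]   # forward difference  A_{n+1} - A_n
        d0 = A[n] - A[n - 1]   # backward difference A_n - A_{n-1}
        if d1 == d0:
            # exceptional case of a zero denominator (see the Motivation
            # section): keep the latest partial sum unchanged
            S.append(A[n + 1])
        else:
            S.append(A[n + 1] - d1 * d1 / (d1 - d0))
    return S


def repeated_shanks(A, k):
    """Apply the transformation k times, i.e. compute S^k(A_n)."""
    for _ in range(k):
        A = shanks(A)
    return A
```

Each application shortens the sequence by two entries, since $S(A_n)$ requires both $A_{n-1}$ and $A_{n+1}$.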

Example

As an example, consider the slowly convergent series[4]

$$4 \sum_{k=0}^{\infty} (-1)^k \frac{1}{2k+1} = 4 \left( 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \right),$$

which has the exact sum $\pi \approx 3.14159265$. The partial sum $A_6$ has only one-digit accuracy, while six-figure accuracy requires summing about 400,000 terms.

In the table below, the partial sums $A_n$, the Shanks transformation $S(A_n)$ applied to them, and the repeated Shanks transformations $S^2(A_n)$ and $S^3(A_n)$ are given for $n$ up to 12. The absolute errors of the transformed sequences shrink far more rapidly than those of the partial sums, showing the improved accuracy and convergence rate.

 n   A_n          S(A_n)       S^2(A_n)     S^3(A_n)
 0   4.00000000
 1   2.66666667   3.16666667
 2   3.46666667   3.13333333   3.14210526
 3   2.89523810   3.14523810   3.14145022   3.14159936
 4   3.33968254   3.13968254   3.14164332   3.14159086
 5   2.97604618   3.14271284   3.14157129   3.14159323
 6   3.28373848   3.14088134   3.14160284   3.14159244
 7   3.01707182   3.14207182   3.14158732   3.14159274
 8   3.25236593   3.14125482   3.14159566   3.14159261
 9   3.04183962   3.14183962   3.14159086   3.14159267
10   3.23231581   3.14140672   3.14159377   3.14159264
11   3.05840277   3.14173610   3.14159192   3.14159266
12   3.21840277   3.14147969   3.14159314   3.14159265

The Shanks transformation $S(A_1)$ already has two-digit accuracy, while the original partial sums only establish the same accuracy at $A_{24}$. Remarkably, $S^3(A_3)$ has six-digit accuracy, obtained from repeated Shanks transformations applied to the first seven terms $A_0, \ldots, A_6$. As mentioned before, $A_n$ only attains six-digit accuracy after summing about 400,000 terms.
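
The table can be reproduced with a few lines of Python; the compact shanks helper below simply restates the numerically stable formula from the Formulation section (names are illustrative, and no zero denominators occur for this series).

```python
def shanks(A):
    # one Shanks step in the numerically stable form
    return [A[n + 1] - (A[n + 1] - A[n]) ** 2 / ((A[n + 1] - A[n]) - (A[n] - A[n - 1]))
            for n in range(1, len(A) - 1)]

# partial sums A_0, ..., A_12 of the Leibniz series 4(1 - 1/3 + 1/5 - ...)
A, total = [], 0.0
for k in range(13):
    total += 4 * (-1) ** k / (2 * k + 1)
    A.append(total)

S1 = shanks(A)      # S(A_1),   ..., S(A_11)
S2 = shanks(S1)     # S^2(A_2), ..., S^2(A_10)
S3 = shanks(S2)     # S^3(A_3), ..., S^3(A_9)

print(f"{A[12]:.8f}  {S1[-1]:.8f}  {S3[-1]:.8f}")
# expected to match the table above: A_12 = 3.21840277, S(A_11) = 3.14173610,
# S^3(A_9) = 3.14159267
```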

Motivation

The Shanks transformation is motivated by the observation that, for larger $n$, the partial sum $A_n$ quite often behaves approximately as[5]

$$A_n = A + \alpha q^n,$$

with $|q| < 1$, so that the transient $\alpha q^n$ decays and the sequence approaches the series result $A$ as $n \to \infty$. For $n-1$, $n$ and $n+1$ the respective partial sums are:

$$A_{n-1} = A + \alpha q^{n-1}, \qquad A_n = A + \alpha q^n \qquad \text{and} \qquad A_{n+1} = A + \alpha q^{n+1}.$$

These three equations contain three unknowns: $A$, $\alpha$ and $q$. Eliminating $\alpha$ and $q$ and solving for $A$ gives[6]

$$A = \frac{A_{n+1} A_{n-1} - A_n^2}{A_{n+1} - 2A_n + A_{n-1}}.$$

In the exceptional case that the denominator is equal to zero, $A_n = A$ for all $n$.
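
A short symbolic check of this solution is sketched below, assuming the sympy library is available: eliminating $\alpha$ and $q$ from the three model equations leaves the single condition $(A_n - A)^2 = (A_{n+1} - A)(A_{n-1} - A)$, and substituting the formula above for $A$ should satisfy it identically.

```python
# Symbolic verification of the Shanks formula against the model condition
# (A_n - A)^2 = (A_{n+1} - A)(A_{n-1} - A).  Assumes sympy is installed.
import sympy as sp

Anm1, An, Anp1 = sp.symbols("A_nm1 A_n A_np1")   # stand-ins for A_{n-1}, A_n, A_{n+1}

A = (Anp1 * Anm1 - An**2) / (Anp1 - 2 * An + Anm1)
residual = (An - A) ** 2 - (Anp1 - A) * (Anm1 - A)

print(sp.simplify(residual))   # expected output: 0
```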

Generalized Shanks transformation

The generalized $k$th-order Shanks transformation is given as the ratio of the determinants:[7]

$$S_k(A_n) = \frac{\begin{vmatrix} A_{n-k} & \cdots & A_{n-1} & A_n \\ \Delta A_{n-k} & \cdots & \Delta A_{n-1} & \Delta A_n \\ \Delta A_{n-k+1} & \cdots & \Delta A_n & \Delta A_{n+1} \\ \vdots & & \vdots & \vdots \\ \Delta A_{n-1} & \cdots & \Delta A_{n+k-2} & \Delta A_{n+k-1} \end{vmatrix}}{\begin{vmatrix} 1 & \cdots & 1 & 1 \\ \Delta A_{n-k} & \cdots & \Delta A_{n-1} & \Delta A_n \\ \Delta A_{n-k+1} & \cdots & \Delta A_n & \Delta A_{n+1} \\ \vdots & & \vdots & \vdots \\ \Delta A_{n-1} & \cdots & \Delta A_{n+k-2} & \Delta A_{n+k-1} \end{vmatrix}},$$

with $\Delta A_p = A_{p+1} - A_p$. It is the solution of a model for the convergence behaviour of the partial sums $A_n$ with $k$ distinct transients:

$$A_n = A + \sum_{p=1}^{k} \alpha_p q_p^n.$$

This model for the convergence behaviour contains $2k+1$ unknowns. By evaluating the above equation at the elements $A_{n-k}, A_{n-k+1}, \ldots, A_{n+k}$ and solving for $A$, the above expression for the $k$th-order Shanks transformation is obtained. The first-order generalized Shanks transformation is equal to the ordinary Shanks transformation: $S_1(A_n) = S(A_n)$.
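
For small $k$ the determinant ratio can be evaluated directly. The following sketch (the function name and layout are illustrative, not a standard routine) assumes numpy and builds the two $(k+1)\times(k+1)$ matrices row by row.

```python
# Direct (and, for large k, inefficient) evaluation of the determinant ratio
# above.  Requires k <= n and n + k < len(A).
import numpy as np

def generalized_shanks(A, n, k):
    """Return S_k(A_n) for a list A of partial sums, with A[0] = A_0."""
    A = np.asarray(A, dtype=float)
    dA = np.diff(A)                        # dA[p] = Delta A_p = A_{p+1} - A_p
    num = np.empty((k + 1, k + 1))
    den = np.empty((k + 1, k + 1))
    num[0, :] = A[n - k:n + 1]             # first row: A_{n-k}, ..., A_n
    den[0, :] = 1.0                        # first row: 1, ..., 1
    for i in range(1, k + 1):
        row = dA[n - k + i - 1:n + i]      # Delta A_{n-k+i-1}, ..., Delta A_{n+i-1}
        num[i, :] = row
        den[i, :] = row
    return np.linalg.det(num) / np.linalg.det(den)
```

For $k = 1$ this reduces to the ordinary Shanks transformation, which provides a quick sanity check of the indexing.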

The generalized Shanks transformation is closely related to Padé approximants and Padé tables.[8]

Note: evaluating these determinants directly requires many arithmetic operations; however, Peter Wynn discovered a recursive evaluation procedure, called the epsilon algorithm, which avoids computing the determinants.[9][10]
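
A sketch of Wynn's epsilon algorithm follows; it assumes that no two consecutive table entries coincide (otherwise a division by zero occurs). The even-numbered columns of the table reproduce the generalized Shanks transformations without any determinants, computed here from $A_n, \ldots, A_{n+2k}$ rather than the centred indexing used above.

```python
def wynn_epsilon(A):
    """Wynn's epsilon algorithm applied to a list of partial sums.

    Returns the epsilon table as a list of columns; column 2*k holds the
    k-th order Shanks transforms (each column is one entry shorter than
    the previous one).
    """
    prev = [0.0] * (len(A) + 1)      # column eps_{-1}: all zeros
    curr = list(A)                   # column eps_0: the partial sums themselves
    table = [curr]
    while len(curr) > 1:
        nxt = []
        for n in range(len(curr) - 1):
            # eps_{k+1}^{(n)} = eps_{k-1}^{(n+1)} + 1 / (eps_k^{(n+1)} - eps_k^{(n)})
            nxt.append(prev[n + 1] + 1.0 / (curr[n + 1] - curr[n]))
        prev, curr = curr, nxt
        table.append(curr)
    return table
```

Applied to the Leibniz partial sums of the example, column 6 of the returned table should reproduce the $S^3$ column of the example table, up to the index shift and rounding.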

References

  1. Weniger (2003).

  2. Bender & Orszag (1999), pp. 368–375.

  3. Van Dyke (1975), pp. 202–205.

  4. Van Dyke (1975), pp. 202–205.

  5. Bender & Orszag (1999), pp. 368–375.

  6. Bender & Orszag (1999), pp. 368–375.

  7. Bender & Orszag (1999), pp. 389–392.

  8. Bender & Orszag (1999), pp. 389–392.

  9. Wynn (1956).

  10. Wynn (1962).