Newton's method and high order iterations

The article is kindly provided by:
web: numbers.computation.free.fr

1  Introduction

It is of great importance to solve equations of the form

 f(x)=0,

in many applications in Mathematics, Physics, Chemistry, ... and of course in the computation of some important mathematical constants or functions like square roots. In this essay, we are only interested in one family of methods: Newton's methods.

1.1  Newton's approach

Around 1669, Isaac Newton (1643-1727) gave a new algorithm to solve a polynomial equation, illustrated on the example y^3 - 2y - 5 = 0. To find an accurate root of this equation, one must first guess a starting value, here y ≈ 2. Then write y = 2 + p, and after the substitution the equation becomes

 p^3 + 6p^2 + 10p - 1 = 0.

Because p is supposed to be small, we neglect p^3 + 6p^2 compared to 10p - 1, and the previous equation gives p ≈ 0.1; therefore a better approximation of the root is y ≈ 2.1. It's possible to repeat this process: writing p = 0.1 + q, the substitution gives

 q^3 + 6.3q^2 + 11.23q + 0.061 = 0,

hence q ≈ -0.061/11.23 = -0.0054..., and a new approximation for y is 2.0946... The process is repeated until the required number of digits is reached.

In his method, Newton doesn't explicitly use the notion of derivative, and he applies it only to polynomial equations.

1.2  Raphson's iteration

A few years later, in 1690, a new step was made by Joseph Raphson (c. 1648-1715), who proposed a method which avoided the substitutions of Newton's approach. He illustrated his algorithm on the equation x^3 - bx + c = 0: starting with an approximation x ≈ g of the root, a better approximation is given by

 x ≈ g + (c + g^3 - bg)/(b - 3g^2).

Observe that the denominator of the fraction is the opposite of the derivative of the numerator.

This was the historical beginning of the very important Newton's (sometimes called Newton-Raphson's) algorithm.

1.3  Later studies

The method was then studied and generalized by other mathematicians like Simpson (1710-1761), Mourraille (1720-1808), Cauchy (1789-1857), Kantorovich (1912-1986), ... The important question of the choice of the starting point was first approached by Mourraille in 1768, and the difficulty of making this choice is the main drawback of the algorithm.

2  Newton's method

Nowadays, Newton's method is a generalized process to find an accurate root of a single equation (or a system of equations) f(x)=0. We suppose that f is a C^2 function on a given interval; then, using Taylor's expansion near x,

 f(x+h) = f(x) + h f'(x) + O(h^2),

and if we stop at the first order (linearization of the equation), we are looking for a small h such that

 f(x+h) = 0 ≈ f(x) + h f'(x),

giving

 h = -f(x)/f'(x),
 x + h = x - f(x)/f'(x).

2.0.1  Newton's iteration

The Newton iteration is then given by the following procedure: start with an initial guess of the root x0, then find the limit of the recurrence:

 xn+1 = xn - f(xn)/f'(xn),

and Figure 1 gives a geometrical interpretation of a single iteration of this formula.

Figure 1: One iteration of Newton's method
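In code the iteration is just a few lines. The sketch below is an illustration, not part of the original essay (the helper name, tolerance and step limit are my choices); it is applied to Newton's historical example y^3 - 2y - 5 = 0:

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) until the step is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence from this starting point")

# Newton's historical example, starting from the guess y0 = 2
root = newton(lambda y: y**3 - 2*y - 5, lambda y: 3*y**2 - 2, 2.0)
```

The first step reproduces the historical value y = 2.1 exactly, and the iterates settle near 2.09455148..., doubling the number of correct digits at each step.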

Unfortunately, this iteration may fail to converge or may be unstable (we sometimes say chaotic) in many cases. However, we have two important theorems giving conditions that ensure the convergence of the process (see the references).

2.0.2  Convergence conditions

Theorem 1. Let x* be a root of f(x)=0, where f is a C^2 function on an interval containing x*, and suppose that |f'(x*)| > 0; then Newton's iteration will converge to x* if the starting point x0 is close enough to x*.

This theorem ensures that Newton's method will always converge if the initial point is sufficiently close to the root and if this root is not singular (that is, f'(x*) is non-zero). This is the local convergence property. A more constructive theorem was given by Kantorovich; we give it in the particular case of a function of a single variable x.

Theorem 2 (Kantorovich). Let f be a C^2 numerical function on an open interval I of the real line, and let x0 be a point in this interval. If there exist finite positive constants (m0, M0, K0) such that

 |1/f'(x0)| < m0,
 |f(x0)/f'(x0)| < M0,
 |f''(x)| < K0 for every x in I,

and if h0 = 2 m0 M0 K0 ≤ 1, then Newton's iteration will converge to a root x* of f(x)=0, and

 |xn - x*| < 2^(1-n) M0 h0^(2^n - 1).

This theorem gives sufficient conditions to ensure the existence of a root and the convergence of Newton's process. Moreover, if h0 < 1, the last inequality shows that the convergence is quadratic (the number of correct digits doubles at each iteration). Note that if the starting point x0 tends to the root x*, the constant M0 tends to zero and h0 becomes smaller than 1, so the local convergence theorem is a consequence of Kantorovich's theorem.

3  Cubic iteration

Newton's iteration may be seen as a first order method (or linearization method). It's possible to go one step further and write the Taylor expansion of f to a higher order

 f(x+h) = f(x) + h f'(x) + (h^2/2) f''(x) + O(h^3),

and we are looking for h such that

 f(x+h) = 0 ≈ f(x) + h f'(x) + (h^2/2) f''(x),

we take the smallest solution for h (we have to suppose that f'(x) and f''(x) are non-zero):

 h = -(f'(x)/f''(x)) (1 - √(1 - 2 f(x) f''(x)/f'(x)^2)).

It's not necessary to compute the square root: when f(x) is small, using the expansion

 1 - √(1-a) = a/2 + a^2/8 + O(a^3),

h becomes

 h = -(f(x)/f'(x)) (1 + f(x) f''(x)/(2 f'(x)^2) + ...).

The first attempt to use the second order expansion is due to the astronomer E. Halley (1656-1742) in 1694.

3.0.3  Householder's iteration

The previous expression for h allows us to derive the following cubic iteration (the number of correct digits triples at each iteration), starting with x0:

 xn+1 = xn - (f(xn)/f'(xn)) (1 + f(xn) f''(xn)/(2 f'(xn)^2)).

This procedure is given in the references. It can be efficiently used to compute the inverse or the square root of a number.
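For instance, taking f(x) = 1/x - a (my choice of target, not from the text), the cubic iteration collapses to a division-free update for the inverse of a, since x - (f/f')(1 + f f''/(2 f'^2)) simplifies to x(1 + e + e^2) with e = 1 - ax; a sketch:

```python
def inverse(a, x0, n_steps=5):
    """Householder's cubic iteration for 1/a, written division-free."""
    x = x0
    for _ in range(n_steps):
        e = 1.0 - a * x            # residual, zero exactly when x = 1/a
        x = x * (1.0 + e + e * e)  # the residual is cubed at each step
    return x

approx = inverse(3.0, 0.3)         # crude guess 0.3 for 1/3
```

One can check by hand that the new residual is exactly e^3, which is the cubic convergence announced above.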

Another similar cubic iteration may be given by

 xn+1 = xn - 2 f(xn) f'(xn)/(2 f'(xn)^2 - f(xn) f''(xn)),

sometimes known as Halley's method. We may also write it as

 xn+1 = xn - f(xn) · 1/(f'(xn) - f(xn) f''(xn)/(2 f'(xn)))
      = xn - (f(xn)/f'(xn)) (1 - f(xn) f''(xn)/(2 f'(xn)^2))^(-1).

Note that if we replace (1-a)^(-1) by 1 + a + O(a^2), we retrieve Householder's iteration.
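As an illustration (the target f(x) = x^2 - a is my choice), Halley's iteration gives a cubically convergent square-root scheme, since the general formula simplifies to x(x^2 + 3a)/(3x^2 + a):

```python
def halley_sqrt(a, x0, n_steps=4):
    """Halley's method x <- x - 2 f f'/(2 f'^2 - f f'') for f(x) = x^2 - a."""
    x = x0
    for _ in range(n_steps):
        x = x * (x * x + 3.0 * a) / (3.0 * x * x + a)
    return x

root2 = halley_sqrt(2.0, 1.0)      # converges to sqrt(2)
```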

4  High order iterations

4.1  Householder's methods

Under some regularity conditions on f and its derivatives, Householder gave the general iteration

 xn+1 = xn + (p+1) [ (1/f)^(p) / (1/f)^(p+1) ](xn),

where p is an integer and (1/f)^(p) is the derivative of order p of the reciprocal of the function f. This iteration has convergence of order (p+2). For example, p=0 has quadratic convergence (order 2) and the formula gives back Newton's iteration, while p=1 has cubic convergence (order 3) and gives again Halley's method. Just like Newton's method, a good starting point is required to ensure convergence.

Using the iteration with p=2 gives the following procedure, which has quartic convergence (order 4):

 xn+1 = xn - f(xn) (f'(xn)^2 - f(xn) f''(xn)/2) / (f'(xn)^3 - f(xn) f'(xn) f''(xn) + f^(3)(xn) f(xn)^2/6).
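A direct transcription of this quartic formula (a sketch; the driver and the test equation x^3 - 2x - 5 = 0 from the introduction are my choices):

```python
def householder4(f, df, d2f, d3f, x0, n_steps=4):
    """Householder's iteration with p = 2 (quartic convergence)."""
    x = x0
    for _ in range(n_steps):
        fx, d1, d2, d3 = f(x), df(x), d2f(x), d3f(x)
        num = d1 * d1 - fx * d2 / 2.0
        den = d1**3 - fx * d1 * d2 + d3 * fx * fx / 6.0
        x -= fx * num / den
    return x

r = householder4(lambda x: x**3 - 2*x - 5,   # f
                 lambda x: 3*x**2 - 2,       # f'
                 lambda x: 6*x,              # f''
                 lambda x: 6.0,              # f'''
                 2.0)
```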

4.2  Modified methods

Another idea is to write

 xn+1 = xn + hn + a2n hn^2/2! + a3n hn^3/3! + ...,

where hn = -f(xn)/f'(xn) is the simple Newton step and (a2n, a3n, ...) are real parameters which we will choose in order to minimize the value of f(xn+1):

 f(xn+1) = f(xn + hn + a2n hn^2/2! + a3n hn^3/3! + ...),

we assume that f is regular enough and that hn + a2n hn^2/2! + a3n hn^3/3! + ... is small; hence, using the expansion of f near xn,

 f(xn+1) = f(xn) + (hn + a2n hn^2/2! + a3n hn^3/3! + ...) f'(xn) + (hn + a2n hn^2/2! + a3n hn^3/3! + ...)^2 f''(xn)/2 + ...,

and because f(xn) + hn f'(xn) = 0, we have

 f(xn+1) = (a2n f'(xn) + f''(xn)) hn^2/2! + (a3n f'(xn) + 3 a2n f''(xn) + f^(3)(xn)) hn^3/3! + O(hn^4).

A good choice for the ain is clearly to cancel as many terms as possible in the previous expansion, so we impose

 a2n = -fn''/fn',
 a3n = (-fn' fn^(3) + 3 (fn'')^2)/(fn')^2,
 a4n = (-(fn')^2 fn^(4) + 10 fn' fn'' fn^(3) - 15 (fn'')^3)/(fn')^3,
 a5n = (-(fn')^3 fn^(5) + 15 (fn')^2 fn'' fn^(4) + 10 (fn')^2 (fn^(3))^2 - 105 fn' (fn'')^2 fn^(3) + 105 (fn'')^4)/(fn')^4,
 a6n = ...

with, for the sake of brevity, the notation fn^(k) = f^(k)(xn). The formal values of the ain may be computed for much larger values of i. Finally, the general iteration is

 xn+1 = xn + hn (1 + a2n hn/2! + a3n hn^2/3! + a4n hn^3/4! + ...)
      = xn - (f(xn)/f'(xn)) (1 + (f''(xn)/(2! f'(xn))) (f(xn)/f'(xn)) + ((3 f''(xn)^2 - f'(xn) f^(3)(xn))/(3! f'(xn)^2)) (f(xn)/f'(xn))^2 + ...).

For example, if we stop at a3n and set a4n = a5n = ... = 0, we have the helpful quartic modified iteration (note that this iteration is different from the previous Householder quartic method):

 xn+1 = xn - (f(xn)/f'(xn)) (1 + f(xn) f''(xn)/(2! f'(xn)^2) + f(xn)^2 (3 f''(xn)^2 - f'(xn) f^(3)(xn))/(3! f'(xn)^4)),

and if we omit a3n, we retrieve Householder's cubic iteration.

It's also possible to find the expressions for (a4n, a5n, a6n, a7n, ...) and define quintic, sextic, septic, octic, ... iterations.
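As a sanity check of the quartic modified iteration (a sketch; the test function f(x) = x^2 - 2 is my choice, convenient because f^(3) = 0):

```python
def modified_quartic_sqrt2(x0, n_steps=3):
    """Modified iteration truncated at a3n, applied to f(x) = x^2 - 2."""
    x = x0
    for _ in range(n_steps):
        f, d1, d2 = x * x - 2.0, 2.0 * x, 2.0
        t2 = f * d2 / (2.0 * d1 * d1)               # a2n correction term
        t3 = f * f * 3.0 * d2 * d2 / (6.0 * d1**4)  # a3n term (f''' = 0 here)
        x -= (f / d1) * (1.0 + t2 + t3)
    return x

s = modified_quartic_sqrt2(1.5)   # quartic convergence toward sqrt(2)
```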

5  Examples

Some applications of those methods are given in the essay on the square root of two, and also in the essay on how to compute the inverse or the square root of a number. Here, let's compare the different iterations on the computation of the Golden Ratio

 φ = (1 + √5)/2.

First we write φ = 1/2 + x, and x = √5/2 is a root of the equation

 1/x^2 - 4/5 = 0,

then it's convenient to set hn = 1 - (4/5) xn^2. We have the following 5 algorithms, deduced from the previous general methods (left as an exercise!), with respectively quadratic, cubic, quartic, sextic and octic convergence:

 xn+1 = xn + (1/2) xn hn      (Newton),
 xn+1 = xn + (1/8) xn hn (4 + 3 hn)      (Householder),
 xn+1 = xn + (1/16) xn hn (8 + 6 hn + 5 hn^2)      (Quartic),
 xn+1 = xn + (1/256) xn hn (128 + 96 hn + hn^2 (80 + 70 hn + 63 hn^2))      (Sextic),
 xn+1 = xn + (1/2048) xn hn (1024 + 768 hn + hn^2 (640 + 560 hn + hn^2 (504 + 462 hn + 429 hn^2)))      (Octic).

Starting with the initial value x0 = 1.118, the first iteration gives respectively

 φ1 (Newton)      = x1 + 1/2 = 1.61803398(720...)      8 digits,
 φ1 (Householder) = x1 + 1/2 = 1.6180339887498(163...)      13 digits,
 φ1 (Quartic)     = x1 + 1/2 = 1.61803398874989484(402...)      17 digits,
 φ1 (Sextic)      = x1 + 1/2 = 1.6180339887498948482045868(216...)      25 digits,
 φ1 (Octic)       = x1 + 1/2 = 1.618033988749894848204586834365638(076...)      33 digits,

and one more step gives for x2 respectively 17, 39, 69, 154 and 273 correct digits. Just three more steps and x5 gives respectively 139, 1049, 4406, 33321 and 140053 correct digits... Observe that hn tends to zero as n increases, which may be used to accelerate the computations; the choice of the best algorithm depends on your implementation (algorithm used for multiplication, squaring, ...).
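These schemes are easy to check numerically. The sketch below (helper names are mine) runs the first three of them in 50-digit decimal arithmetic:

```python
from decimal import Decimal, getcontext

getcontext().prec = 50            # work with 50 significant digits

def iterate(update, x0=Decimal("1.118"), n_steps=4):
    """Run one of the golden-ratio iterations; returns phi = 1/2 + x."""
    x = x0
    for _ in range(n_steps):
        h = 1 - 4 * x * x / 5     # hn = 1 - (4/5) xn^2
        x = update(x, h)
    return x + Decimal(1) / 2

newton_step    = lambda x, h: x + x * h / 2
household_step = lambda x, h: x + x * h * (4 + 3 * h) / 8
quartic_step   = lambda x, h: x + x * h * (8 + 6 * h + 5 * h * h) / 16
```

A single newton_step from x0 = 1.118 reproduces the value 1.61803398(720...) of the table above.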

6  Newton's method for several variables

Newton's method may also be used to find a root of a system of two or more non-linear equations

 f(x,y) = 0
 g(x,y) = 0,

where f and g are C^2 functions on a given domain. Using Taylor's expansion of the two functions near (x,y), we find

 f(x+h, y+k) = f(x,y) + h ∂f/∂x + k ∂f/∂y + O(h^2 + k^2)
 g(x+h, y+k) = g(x,y) + h ∂g/∂x + k ∂g/∂y + O(h^2 + k^2)

and if we keep only the first order terms, we are looking for a couple (h,k) such that

 f(x+h, y+k) = 0 ≈ f(x,y) + h ∂f/∂x + k ∂f/∂y
 g(x+h, y+k) = 0 ≈ g(x,y) + h ∂g/∂x + k ∂g/∂y,

hence it's equivalent to the linear system

 [ ∂f/∂x  ∂f/∂y ] [ h ]     [ f(x,y) ]
 [ ∂g/∂x  ∂g/∂y ] [ k ] = - [ g(x,y) ].

The (2×2) matrix is called the Jacobian matrix (or Jacobian) and is sometimes denoted

 J(x,y) = [ ∂f/∂x  ∂f/∂y ]
          [ ∂g/∂x  ∂g/∂y ]

(its generalization as an (N×N) matrix for a system of N equations in N variables is immediate). This suggests defining the new process

 [ xn+1 ]   [ xn ]                   [ f(xn,yn) ]
 [ yn+1 ] = [ yn ] - J^(-1)(xn,yn) · [ g(xn,yn) ],
starting with an initial guess (x0, y0); under certain conditions (which are not so easy to check, and this is again the main disadvantage of the method), it's possible to show that this process converges to a root of the system. The convergence remains quadratic.

Example 3. We are looking for a root near (x0, y0) = (-0.6, 0.6) of the following system:

 f(x,y) = x^3 - 3 x y^2 - 1
 g(x,y) = 3 x^2 y - y^3,

here the Jacobian and its inverse become

 J(xn,yn) = 3 [ xn^2 - yn^2   -2 xn yn     ]
              [ 2 xn yn        xn^2 - yn^2 ]

 J^(-1)(xn,yn) = 1/(3 (xn^2 + yn^2)^2) [ xn^2 - yn^2   2 xn yn      ]
                                       [ -2 xn yn      xn^2 - yn^2 ]

and the process gives

 x1 = -0.40000000000000000000,  y1 = 0.86296296296296296296
 x2 = -0.50478978186242263605,  y2 = 0.85646430512069295697
 x3 = -0.49988539803643124722,  y3 = 0.86603764032215486664
 x4 = -0.50000000406150565266,  y4 = 0.86602539113638168322
 x5 = -0.49999999999999983928,  y5 = 0.86602540378443871965
 ...

Depending on your initial guess Newton's process may converge to one of the three roots of the system:
 (-1/2, √3/2),  (-1/2, -√3/2),  (1, 0),

and for some values of (x0,y0) the convergence of the process may be tricky! The study of the influence of this initial guess leads to aesthetic fractal pictures.
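Example 3 can be reproduced in a few lines; the sketch below (names are mine) hard-codes the closed-form inverse Jacobian given above:

```python
def newton2(x, y, n_steps=6):
    """Two-variable Newton iteration for f = x^3 - 3xy^2 - 1, g = 3x^2 y - y^3."""
    for _ in range(n_steps):
        f = x**3 - 3.0 * x * y**2 - 1.0
        g = 3.0 * x**2 * y - y**3
        a, b = x * x - y * y, 2.0 * x * y
        d = 3.0 * (x * x + y * y)**2     # prefactor of the inverse Jacobian
        h = -(a * f + b * g) / d         # (h, k) = -J^(-1) (f, g)
        k = -(-b * f + a * g) / d
        x, y = x + h, y + k
    return x, y

xr, yr = newton2(-0.6, 0.6)
```

The first step gives (-0.4, 0.862962...), matching the table, and the iterates converge to the root (-1/2, √3/2).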

Cubic convergence also exists for systems of equations in several variables: Chebyshev's method.

References

A. S. Householder, The Numerical Treatment of a Single Nonlinear Equation, McGraw-Hill, New York, (1970)


L.V. Kantorovich and G.P. Akilov, Functional Analysis in Normed Spaces, Pergamon Press, Elmsford, New York, (1964)


I. Newton, Methodus fluxionum et serierum infinitarum, (1664-1671)


J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, (1970)


H. Qiu, A Robust Examination of the Newton-Raphson Method with Strong Global Convergence Properties, Master's Thesis, University of Central Florida, (1993)


J. Raphson, Analysis Aequationum universalis, London, (1690)