Newton's Method

Total Page:16

File Type:pdf, Size:1020Kb

Newton's Method Newton's method In numerical analysis, Newton's method, also known as the Newton–Raphson method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots (or zeroes) of a real-valued function. The most basic version starts with a single-variable function f defined for a real variable x, the function's derivative f ′, and an initial guess x0 for a root of f. If the function satisfies sufficient assumptions and the initial guess is close, then is a better approximation of the root than x0. Geometrically, (x1, 0) is the intersection of the x-axis and the tangent of the graph of f at (x0, f (x0)): that is, the improved guess is the unique root of the linear approximation at the initial point. The process is repeated as until a sufficiently precise value is reached. This algorithm is first in the class of Householder's methods, succeeded by Halley's method. The method can also be extended to complex functions and to systems of equations. Contents Description History Practical considerations Difficulty in calculating derivative of a function Failure of the method to converge to the root Overshoot Stationary point Poor initial estimate Mitigation of non-convergence Slow convergence for roots of multiplicity greater than 1 Analysis Proof of quadratic convergence for Newton's iterative method Basins of attraction Failure analysis Bad starting points Iteration point is stationary Starting point enters a cycle Derivative issues Derivative does not exist at root Discontinuous derivative Non-quadratic convergence Zero derivative No second derivative Generalizations Complex functions Chebyshev's third -order method Nash–Moser iteration Nonlinear systems of equations k variables, k functions k variables, m equations, with m > k Nonlinear equations in a Banach space Nonlinear equations over p-adic numbers Newton–Fourier method Quasi-Newton methods q-analog Modified Newton methods Maehly's procedure Hirano's modified Newton method Interval Newton's method Applications Minimization and maximization problems Multiplicative inverses of numbers and power series Solving transcendental equations Obtaining zeros of special functions Numerical verification for solutions of nonlinear equations Examples Square root of a number Solution of cos x = x3 Pseudocode See also Notes References Further reading External links Description The idea is to start with an initial guess which is reasonably close to the true root, then to approximate the function by its tangent line using calculus, and finally to compute the x-intercept of this tangent line by elementary algebra. This x-intercept will typically be a better approximation to the original function's root than the first guess, and the method can be iterated. More formally, suppose f : (a, b) → ℝ is a differentiable function defined on the interval (a, b) with values in the real numbers ℝ, and we have some current approximation xn. Then we can derive the formula for a better approximation, xn + 1 by referring to the diagram on the right. The equation of the tangent line to the curve y = f (x) at x = xn is where f′ denotes the derivative. The x-intercept of this line (the value of x which makes y = 0) is taken as the next approximation,xn + 1, to the root, so that the equation of the tangent line is satisfied when : The function f is shown in blue and the tangent line is in red. We see that xn + 1 is a better approximation than xn for the root x of the function f. Solving for xn + 1 gives We start the process with some arbitrary initial value x0. (The closer to the zero, the better. But, in the absence of any intuition about where the zero might lie, a "guess and check" method might narrow the possibilities to a reasonably small interval by appealing to the intermediate value theorem.) The method will usually converge, provided this initial guess is close enough to the unknown zero, and that f ′(x0) ≠ 0. Furthermore, for a zero of multiplicity 1, the convergence is at least quadratic (see rate of convergence) in a neighbourhood of the zero, which intuitively means that the number of correct digits roughly doubles in every step. More details can be found in the analysis section below. Householder's methods are similar but have higher order for even faster convergence. However, the extra computations required for each step can slow down the overall performance relative to Newton's method, particularly if f or its derivatives are computationally expensive to evaluate. History The name "Newton's method" is derived from Isaac Newton's description of a special case of the method in De analysi per aequationes numero terminorum infinitas (written in 1669, published in 1711 by William Jones) and in De metodis fluxionum et serierum infinitarum (written in 1671, translated and published as Method of Fluxions in 1736 by John Colson). However, his method differs substantially from the modern method given above: Newton applies the method only to polynomials. He does not compute the successive approximations xn, but computes a sequence of polynomials, and only at the end arrives at an approximation for the root x. Finally, Newton views the method as purely algebraic and makes no mention of the connection with calculus. Newton may have derived his method from a similar but less precise method by Vieta. The essence of Vieta's method can be found in the work of the Persian mathematician Sharaf al-Din al-Tusi, while his successor Jamshīd al-Kāshī used a form of Newton's method to solve xP − N = 0 to find roots of N (Ypma 1995). A special case of Newton's method for calculating square roots was known since ancient times and is often called the Babylonian method. Newton's method was used by 17th-century Japanese mathematician Seki Kōwa to solve single-variable equations, though the connection with calculus was missing.[1] Newton's method was first published in 1685 in A Treatise of Algebra both Historical and Practical by John Wallis.[2] In 1690, Joseph Raphson published a simplified description in Analysis aequationum universalis.[3] Raphson again viewed Newton's method purely as an algebraic method and restricted its use to polynomials, but he describes the method in terms of the successive approximations xn instead of the more complicated sequence of polynomials used by Newton. Finally, in 1740, Thomas Simpson described Newton's method as an iterative method for solving general nonlinear equations using calculus, essentially giving the description above. In the same publication, Simpson also gives the generalization to systems of two equations and notes that Newton's method can be used for solving optimization problems by setting the gradient to zero. Arthur Cayley in 1879 in The Newton–Fourier imaginary problem was the first to notice the difficulties in generalizing Newton's method to complex roots of polynomials with degree greater than 2 and complex initial values. This opened the way to the study of the theory of iterations of rational functions. Practical considerations Newton's method is an extremely powerful technique—in general the convergence is quadratic: as the method converges on the root, the difference between the root and the approximation is squared (the number of accurate digits roughly doubles) at each step. However, there are some difficulties with the method. Difficulty in calculating derivative of a function Newton's method requires that the derivative can be calculated directly. An analytical expression for the derivative may not be easily obtainable or could be expensive to evaluate. In these situations, it may be appropriate to approximate the derivative by using the slope of a line through two nearby points on the function. Using this approximation would result in something like the secant method whose convergence is slower than that of Newton's method. Failure of the method to converge to the root It is important to review the proof of quadratic convergence of Newton's Method before implementing it. Specifically, one should review the assumptions made in the proof. For situations where the method fails to converge, it is because the assumptions made in this proof are not met. Overshoot If the first derivative is not well behaved in the neighborhood of a particular root, the method may overshoot, and diverge from that root. An example of a function with one root, for which the derivative is not well behaved in the neighborhood of the root, is 1 1 for which the root will be overshot and the sequence of will diverge. For , the root will still be overshot, but the sequence will oscillate between two values. For , the root will x a = 2 2 < a < 1 still be overshot but the sequence will converge, and for a ≥ 1 the root will not be overshot at all. In some cases, Newton's method can be stabilized by using successive over-relaxation, or the speed of convergence can be increased by using the same method. Stationary point If a stationary point of the function is encountered, the derivative is zero and the method will terminate due to division by zero. Poor initial estimate A large error in the initial estimate can contribute to non-convergence of the algorithm. To overcome this problem one can often linearise the function that is being optimized using calculus, logs, differentials, or even using evolutionary algorithms, such as the stochastic funnel algorithm. Good initial estimates lie close to the final globally optimal parameter estimate. In nonlinear regression, the sum of squared errors (SSE) is only "close to" parabolic in the region of the final parameter estimates. Initial estimates found here will allow the Newton–Raphson method to quickly converge.
Recommended publications
  • A Learning-Based Iterative Method for Solving Vehicle
    Published as a conference paper at ICLR 2020 ALEARNING-BASED ITERATIVE METHOD FOR SOLVING VEHICLE ROUTING PROBLEMS Hao Lu ∗ Princeton University, Princeton, NJ 08540 [email protected] Xingwen Zhang ∗ & Shuang Yang Ant Financial Services Group, San Mateo, CA 94402 fxingwen.zhang,[email protected] ABSTRACT This paper is concerned with solving combinatorial optimization problems, in par- ticular, the capacitated vehicle routing problems (CVRP). Classical Operations Research (OR) algorithms such as LKH3 (Helsgaun, 2017) are inefficient and dif- ficult to scale to larger-size problems. Machine learning based approaches have recently shown to be promising, partly because of their efficiency (once trained, they can perform solving within minutes or even seconds). However, there is still a considerable gap between the quality of a machine learned solution and what OR methods can offer (e.g., on CVRP-100, the best result of learned solutions is between 16.10-16.80, significantly worse than LKH3’s 15.65). In this paper, we present “Learn to Improve” (L2I), the first learning based approach for CVRP that is efficient in solving speed and at the same time outperforms OR methods. Start- ing with a random initial solution, L2I learns to iteratively refine the solution with an improvement operator, selected by a reinforcement learning based controller. The improvement operator is selected from a pool of powerful operators that are customized for routing problems. By combining the strengths of the two worlds, our approach achieves the new state-of-the-art results on CVRP, e.g., an average cost of 15.57 on CVRP-100. 1 INTRODUCTION In this paper, we focus on an important class of combinatorial optimization, vehicle routing problems (VRP), which have a wide range of applications in logistics.
    [Show full text]
  • Linear Approximation of the Derivative
    Section 3.9 - Linear Approximation and the Derivative Idea: When we zoom in on a smooth function and its tangent line, they are very close to- gether. So for a function that is hard to evaluate, we can use the tangent line to approximate values of the derivative. p 3 Example Usep a tangent line to approximatep 8:1 without a calculator. Let f(x) = 3 x = x1=3. We know f(8) = 3 8 = 2. We'll use the tangent line at x = 8. 1 1 1 1 f 0(x) = x−2=3 so f 0(8) = (8)−2=3 = · 3 3 3 4 Our tangent line passes through (8; 2) and has slope 1=12: 1 y = (x − 8) + 2: 12 To approximate f(8:1) we'll find the value of the tangent line at x = 8:1. 1 y = (8:1 − 8) + 2 12 1 = (:1) + 2 12 1 = + 2 120 241 = 120 p 3 241 So 8:1 ≈ 120 : p How accurate was this approximation? We'll use a calculator now: 3 8:1 ≈ 2:00829885. 241 −5 Taking the difference, 2:00829885 − 120 ≈ −3:4483 × 10 , a pretty good approximation! General Formulas for Linear Approximation The tangent line to f(x) at x = a passes through the point (a; f(a)) and has slope f 0(a) so its equation is y = f 0(a)(x − a) + f(a) The tangent line is the best linear approximation to f(x) near x = a so f(x) ≈ f(a) + f 0(a)(x − a) this is called the local linear approximation of f(x) near x = a.
    [Show full text]
  • Generalized Finite-Difference Schemes
    Generalized Finite-Difference Schemes By Blair Swartz* and Burton Wendroff** Abstract. Finite-difference schemes for initial boundary-value problems for partial differential equations lead to systems of equations which must be solved at each time step. Other methods also lead to systems of equations. We call a method a generalized finite-difference scheme if the matrix of coefficients of the system is sparse. Galerkin's method, using a local basis, provides unconditionally stable, implicit generalized finite-difference schemes for a large class of linear and nonlinear problems. The equations can be generated by computer program. The schemes will, in general, be not more efficient than standard finite-difference schemes when such standard stable schemes exist. We exhibit a generalized finite-difference scheme for Burgers' equation and solve it with a step function for initial data. | 1. Well-Posed Problems and Semibounded Operators. We consider a system of partial differential equations of the following form: (1.1) du/dt = Au + f, where u = (iti(a;, t), ■ ■ -, umix, t)),f = ifxix, t), ■ • -,fmix, t)), and A is a matrix of partial differential operators in x — ixx, • • -, xn), A = Aix,t,D) = JjüiD', D<= id/dXx)h--Yd/dxn)in, aiix, t) = a,-,...i„(a;, t) = matrix . Equation (1.1) is assumed to hold in some n-dimensional region Í2 with boundary An initial condition is imposed in ti, (1.2) uix, 0) = uoix) , xEV. The boundary conditions are that there is a collection (B of operators Bix, D) such that (1.3) Bix,D)u = 0, xEdti,B<E(2>. We assume the operator A satisfies the following condition : There exists a scalar product {,) such that for all sufficiently smooth functions <¡>ix)which satisfy (1.3), (1.4) 2 Re {A4,,<t>) ^ C(<b,<p), 0 < t ^ T , where C is a constant independent of <p.An operator A satisfying (1.4) is called semi- bounded.
    [Show full text]
  • The Mean Value Theorem Math 120 Calculus I Fall 2015
    The Mean Value Theorem Math 120 Calculus I Fall 2015 The central theorem to much of differential calculus is the Mean Value Theorem, which we'll abbreviate MVT. It is the theoretical tool used to study the first and second derivatives. There is a nice logical sequence of connections here. It starts with the Extreme Value Theorem (EVT) that we looked at earlier when we studied the concept of continuity. It says that any function that is continuous on a closed interval takes on a maximum and a minimum value. A technical lemma. We begin our study with a technical lemma that allows us to relate 0 the derivative of a function at a point to values of the function nearby. Specifically, if f (x0) is positive, then for x nearby but smaller than x0 the values f(x) will be less than f(x0), but for x nearby but larger than x0, the values of f(x) will be larger than f(x0). This says something like f is an increasing function near x0, but not quite. An analogous statement 0 holds when f (x0) is negative. Proof. The proof of this lemma involves the definition of derivative and the definition of limits, but none of the proofs for the rest of the theorems here require that depth. 0 Suppose that f (x0) = p, some positive number. That means that f(x) − f(x ) lim 0 = p: x!x0 x − x0 f(x) − f(x0) So you can make arbitrarily close to p by taking x sufficiently close to x0.
    [Show full text]
  • AP Calculus AB Topic List 1. Limits Algebraically 2. Limits Graphically 3
    AP Calculus AB Topic List 1. Limits algebraically 2. Limits graphically 3. Limits at infinity 4. Asymptotes 5. Continuity 6. Intermediate value theorem 7. Differentiability 8. Limit definition of a derivative 9. Average rate of change (approximate slope) 10. Tangent lines 11. Derivatives rules and special functions 12. Chain Rule 13. Application of chain rule 14. Derivatives of generic functions using chain rule 15. Implicit differentiation 16. Related rates 17. Derivatives of inverses 18. Logarithmic differentiation 19. Determine function behavior (increasing, decreasing, concavity) given a function 20. Determine function behavior (increasing, decreasing, concavity) given a derivative graph 21. Interpret first and second derivative values in a table 22. Determining if tangent line approximations are over or under estimates 23. Finding critical points and determining if they are relative maximum, relative minimum, or neither 24. Second derivative test for relative maximum or minimum 25. Finding inflection points 26. Finding and justifying critical points from a derivative graph 27. Absolute maximum and minimum 28. Application of maximum and minimum 29. Motion derivatives 30. Vertical motion 31. Mean value theorem 32. Approximating area with rectangles and trapezoids given a function 33. Approximating area with rectangles and trapezoids given a table of values 34. Determining if area approximations are over or under estimates 35. Finding definite integrals graphically 36. Finding definite integrals using given integral values 37. Indefinite integrals with power rule or special derivatives 38. Integration with u-substitution 39. Evaluating definite integrals 40. Definite integrals with u-substitution 41. Solving initial value problems (separable differential equations) 42. Creating a slope field 43.
    [Show full text]
  • The Generalized Simplex Method for Minimizing a Linear Form Under Linear Inequality Restraints
    Pacific Journal of Mathematics THE GENERALIZED SIMPLEX METHOD FOR MINIMIZING A LINEAR FORM UNDER LINEAR INEQUALITY RESTRAINTS GEORGE BERNARD DANTZIG,ALEXANDER ORDEN AND PHILIP WOLFE Vol. 5, No. 2 October 1955 THE GENERALIZED SIMPLEX METHOD FOR MINIMIZING A LINEAR FORM UNDER LINEAR INEQUALITY RESTRAINTS GEORGE B. DANTZIG, ALEX ORDEN, PHILIP WOLFE 1. Background and summary. The determination of "optimum" solutions of systems of linear inequalities is assuming increasing importance as a tool for mathematical analysis of certain problems in economics, logistics, and the theory of games [l;5] The solution of large systems is becoming more feasible with the advent of high-speed digital computers; however, as in the related problem of inversion of large matrices, there are difficulties which remain to be resolved connected with rank. This paper develops a theory for avoiding as- sumptions regarding rank of underlying matrices which has import in applica- tions where little or nothing is known about the rank of the linear inequality system under consideration. The simplex procedure is a finite iterative method which deals with problems involving linear inequalities in a manner closely analogous to the solution of linear equations or matrix inversion by Gaussian elimination. Like the latter it is useful in proving fundamental theorems on linear algebraic systems. For example, one form of the fundamental duality theorem associated with linear inequalities is easily shown as a direct consequence of solving the main prob- lem. Other forms can be obtained by trivial manipulations (for a fuller discus- sion of these interrelations, see [13]); in particular, the duality theorem [8; 10; 11; 12] leads directly to the Minmax theorem for zero-sum two-person games [id] and to a computational method (pointed out informally by Herman Rubin and demonstrated by Robert Dorfman [la]) which simultaneously yields optimal strategies for both players and also the value of the game.
    [Show full text]
  • Lecture 9: Newton Method 9.1 Motivation 9.2 History
    10-725: Convex Optimization Fall 2013 Lecture 9: Newton Method Lecturer: Barnabas Poczos/Ryan Tibshirani Scribes: Wen-Sheng Chu, Shu-Hao Yu Note: LaTeX template courtesy of UC Berkeley EECS dept. 9.1 Motivation Newton method is originally developed for finding a root of a function. It is also known as Newton- Raphson method. The problem can be formulated as, given a function f : R ! R, finding the point x? such that f(x?) = 0. Figure 9.1 illustrates another motivation of Newton method. Given a function f, we want to approximate it at point x with a quadratic function fb(x). We move our the next step determined by the optimum of fb. Figure 9.1: Motivation for quadratic approximation of a function. 9.2 History Babylonian people first applied the Newton method to find the square root of a positive number S 2 R+. The problem can be posed as solving the equation f(x) = x2 − S = 0. They realized the solution can be achieved by applying the iterative update rule: 2 1 S f(xk) xk − S xn+1 = (xk + ) = xk − 0 = xk − : (9.1) 2 xk f (xk) 2xk This update rule converges to the square root of S, which turns out to be a special case of Newton method. This could be the first application of Newton method. The starting point affects the convergence of the Babylonian's method for finding the square root. Figure 9.2 shows an example of solving the square root for S = 100. x- and y-axes represent the number of iterations and the variable, respectively.
    [Show full text]
  • Occasion of Receiving the Seki-Takakazu Prize
    特集:日本数学会関孝和賞受賞 On the occasion of receiving the Seki-Takakazu Prize Jean-Pierre Bourguignon, the director of IHÉS A brief introduction to the Institut des Hautes Études Scientifiques The Institut des Hautes Études Scientifiques (IHÉS) was founded in 1958 by Léon MOTCHANE, an industrialist with a passion for mathematics, whose ambition was to create a research centre in Europe, counterpart to the renowned Institute for Advanced Study (IAS), Princeton, United States. IHÉS became a foundation acknowledged in the public interest in 1981. Like its model, IHÉS has a small number of Permanent Professors (5 presently), and hosts every year some 250 visitors coming from all around the world. A Scientific Council consisting of the Director, the Permanent Professors, the Léon Motchane professor and an equal number of external members is in charge of defining the scientific strategy of the Institute. The foundation is managed by an 18 member international Board of Directors selected for their expertise in science or in management. The French Minister of Research or their representative and the General Director of CNRS are members of the Board. IHÉS accounts are audited and certified by an international accountancy firm, Deloitte, Touche & Tomatsu. Its resources come from many different sources: half of its budget is provided by a contract with the French government, but institutions from some 10 countries, companies, foundations provide the other half, together with the income from the endowment of the Institute. Some 50 years after its creation, the high quality of its Permanent Professors and of its selected visiting researchers has established IHÉS as a research institute of world stature.
    [Show full text]
  • Multivariable and Vector Calculus
    Multivariable and Vector Calculus Lecture Notes for MATH 0200 (Spring 2015) Frederick Tsz-Ho Fong Department of Mathematics Brown University Contents 1 Three-Dimensional Space ....................................5 1.1 Rectangular Coordinates in R3 5 1.2 Dot Product7 1.3 Cross Product9 1.4 Lines and Planes 11 1.5 Parametric Curves 13 2 Partial Differentiations ....................................... 19 2.1 Functions of Several Variables 19 2.2 Partial Derivatives 22 2.3 Chain Rule 26 2.4 Directional Derivatives 30 2.5 Tangent Planes 34 2.6 Local Extrema 36 2.7 Lagrange’s Multiplier 41 2.8 Optimizations 46 3 Multiple Integrations ........................................ 49 3.1 Double Integrals in Rectangular Coordinates 49 3.2 Fubini’s Theorem for General Regions 53 3.3 Double Integrals in Polar Coordinates 57 3.4 Triple Integrals in Rectangular Coordinates 62 3.5 Triple Integrals in Cylindrical Coordinates 67 3.6 Triple Integrals in Spherical Coordinates 70 4 Vector Calculus ............................................ 75 4.1 Vector Fields on R2 and R3 75 4.2 Line Integrals of Vector Fields 83 4.3 Conservative Vector Fields 88 4.4 Green’s Theorem 98 4.5 Parametric Surfaces 105 4.6 Stokes’ Theorem 120 4.7 Divergence Theorem 127 5 Topics in Physics and Engineering .......................... 133 5.1 Coulomb’s Law 133 5.2 Introduction to Maxwell’s Equations 137 5.3 Heat Diffusion 141 5.4 Dirac Delta Functions 144 1 — Three-Dimensional Space 1.1 Rectangular Coordinates in R3 Throughout the course, we will use an ordered triple (x, y, z) to represent a point in the three dimensional space.
    [Show full text]
  • Calculus I - Lecture 15 Linear Approximation & Differentials
    Calculus I - Lecture 15 Linear Approximation & Differentials Lecture Notes: http://www.math.ksu.edu/˜gerald/math220d/ Course Syllabus: http://www.math.ksu.edu/math220/spring-2014/indexs14.html Gerald Hoehn (based on notes by T. Cochran) March 11, 2014 Equation of Tangent Line Recall the equation of the tangent line of a curve y = f (x) at the point x = a. The general equation of the tangent line is y = L (x) := f (a) + f 0(a)(x a). a − That is the point-slope form of a line through the point (a, f (a)) with slope f 0(a). Linear Approximation It follows from the geometric picture as well as the equation f (x) f (a) lim − = f 0(a) x a x a → − f (x) f (a) which means that x−a f 0(a) or − ≈ f (x) f (a) + f 0(a)(x a) = L (x) ≈ − a for x close to a. Thus La(x) is a good approximation of f (x) for x near a. If we write x = a + ∆x and let ∆x be sufficiently small this becomes f (a + ∆x) f (a) f (a)∆x. Writing also − ≈ 0 ∆y = ∆f := f (a + ∆x) f (a) this becomes − ∆y = ∆f f 0(a)∆x ≈ In words: for small ∆x the change ∆y in y if one goes from x to x + ∆x is approximately equal to f 0(a)∆x. Visualization of Linear Approximation Example: a) Find the linear approximation of f (x) = √x at x = 16. b) Use it to approximate √15.9. Solution: a) We have to compute the equation of the tangent line at x = 16.
    [Show full text]
  • Tutorial 8: Solutions
    Tutorial 8: Solutions Applications of the Derivative 1. We are asked to find the absolute maximum and minimum values of f on the given interval, and state where those values occur: (a) f(x) = 2x3 + 3x2 12x on [1; 4]. − The absolute extrema occur either at a critical point (stationary point or point of non-differentiability) or at the endpoints. The stationary points are given by f 0(x) = 0, which in this case gives 6x2 + 6x 12 = 0 = x = 1; x = 2: − ) − Checking the values at the critical points (only x = 1 is in the interval [1; 4]) and endpoints: f(1) = 7; f(4) = 128: − Therefore the absolute maximum occurs at x = 4 and is given by f(4) = 124 and the absolute minimum occurs at x = 1 and is given by f(1) = 7. − (b) f(x) = (x2 + x)2=3 on [ 2; 3] − As before, we find the critical points. The derivative is given by 2 2x + 1 f 0(x) = 3 (x2 + x)1=2 and hence we have a stationary point when 2x + 1 = 0 and points of non- differentiability whenever x2 + x = 0. Solving these, we get the three critical points, x = 1=2; and x = 0; 1: − − Checking the value of the function at these critical points and the endpoints: f( 2) = 22=3; f( 1) = 0; f( 1=2) = 2−4=3; f(0) = 0; f(3) = (12)2=3: − − − Hence the absolute minimum occurs at either x = 1 or x = 0 since in both − these cases the minimum value is 0, while the absolute maximum occurs at x = 3 and is f(3) = (12)2=3.
    [Show full text]
  • Historical Development of the Chinese Remainder Theorem
    Historical Development of the Chinese Remainder Theorem SHEN KANGSHENG Communicated by C. TRUESDELL 1. Source of the Problem Congruences of first degree were necessary to calculate calendars in ancient China as early as the 2 na century B.C. Subsequently, in making the Jingchu [a] calendar (237,A.D.), the astronomers defined shangyuan [b] 1 as the starting point of the calendar. If the Winter Solstice of a certain year occurred rl days after shangyuan and r2 days after the new moon, then that year was N years after shangyuan; hence arose the system of congruences aN ~ rl (mod 60) ~ r2 (mod b), . ~ ' where a is the number of days in a tropical year and b the number of days in a lunar month. 2. Sun Zi suanjing [c] (Master Sun's Mathematical Manual) Sun Zi suanjing (Problem 26' Volume 3) reads: "There are certain things whose number is unknown. A number is repeatedly divided by 3, the remainder is 2; divided by 5, the remainder is 3; and by 7, the remainder is 2. What will the num- ber be ?" The problem can be expressed as x --= 2 (mod 3) ~ 3 (mod 5) ~- 2 (rood 7). SUN ZI solved the problem as we do, giving x ~ 140 + 63 -k 30 ~=- 233 ~ 23 (rood 105). 1 Shangyuan is a supposed moment that occurred simultaneously with the midnight of jiazi [v] (the first day of the 60 day cycle), the Winter Solstice and the new moon. 286 SHEN KANGSHENG In speaking of the algorithm yielding these addends in the solution he continued: "In general, for G1 ~ 0(mod 5) ~ 0 (mod 7) = 1 (rood 3), take G1 = 70, G2 ~ 0(rood 3) ~ 0(mod 7) ~ 1 (mod 5), take G2 = 21, G3 ~ 0 (mod 3) ~ 0 (mod 5) ~ (1 rood 7), take G3 = 15.
    [Show full text]