DICTIONARY OF MATHEMATICAL TERMS

DR. ASHOT DJRBASHIAN in cooperation with PROF. PETER STATHIS

Abel's formula Also called Abel's theorem, for differential equations. Let $y_1$ and $y_2$ be solutions of the second order equation

$$L[y] = y'' + p(t)y' + q(t)y = 0,$$

where both p and q are continuous functions on some interval I. Then by Abel's theorem the Wronskian of the equation is given by the formula

$$W(y_1, y_2) = c \exp\left(-\int p(t)\,dt\right),$$

where c is a constant that depends on the solutions $y_1, y_2$ only.

abscissa In the Cartesian coordinate system, the name of the x-axis.

absolute maximum and minimum (1) For functions of one variable. A number M in the range of a given function f(x), such that f(x) ≤ M for all values of x in the domain of f, is the absolute maximum. Similarly, a number m in the range of a given function f(x), such that f(x) ≥ m for all values of x in the domain of f, is the absolute minimum. If the function f is given on a closed interval [a, b], then to find the absolute maximum and minimum values, the following steps are taken:
(a) Find all critical points (points where the derivative f' is either zero or does not exist) of the function on the open interval (a, b).
(b) Find the values of f at the critical points and at the endpoints a and b.
(c) Compare all these values to determine the largest and smallest. They will be the absolute maximum and minimum values respectively. See also local maximum and minimum.
(2) For functions of several variables definitions similar to the above are also valid. The critical points are also used to find these maximum and minimum points.

absolute value For a real number r, the non-negative number |r| given by the formula

$$|r| = \begin{cases} r & \text{if } r \ge 0 \\ -r & \text{if } r < 0 \end{cases}$$

The absolute value of a real number indicates the distance between the point on the number line corresponding to that number and the origin.

absolute value function The function f(x) = |x|, defined for all real values x.

absolutely convergent A series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent if the series formed with the absolute values of its terms, $\sum_{n=1}^{\infty} |a_n|$, converges. If a series is absolutely convergent, then any rearrangement of that series is also convergent and the sum is the same as that of the original series.

acceleration In physics, the rate of change of the velocity of a moving object. Mathematically, if the distance traveled by the object is given by the function s(t), then acceleration is the second derivative of this function: a(t) = s''(t).

acute angle An angle that measures between 0 and 90° in degree measure or between 0 and π/2 in radian measure.
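The closed-interval procedure in the entry "absolute maximum and minimum", steps (a) to (c), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `closed_interval_extrema` is hypothetical, and the critical points are assumed to have been found by hand and passed in.

```python
def closed_interval_extrema(f, critical_points, a, b):
    """Return ((x_max, f_max), (x_min, f_min)) for f on [a, b].

    Assumes critical_points lists every critical point of f; only those
    inside the open interval (a, b) are used, as in step (a).
    """
    # Steps (a) and (b): endpoints plus critical points inside (a, b).
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = [(f(x), x) for x in candidates]
    # Step (c): compare all values to find the largest and smallest.
    max_value, max_at = max(values)
    min_value, min_at = min(values)
    return (max_at, max_value), (min_at, min_value)


# Illustrative function: f(x) = x^3 - 3x on [0, 3].
# f'(x) = 3x^2 - 3 vanishes at x = -1 and x = 1; only x = 1 lies in (0, 3).
f = lambda x: x**3 - 3 * x
abs_max, abs_min = closed_interval_extrema(f, [-1, 1], 0, 3)
print(abs_max, abs_min)
```

Here the candidates are x = 0, 1, 3 with values 0, -2, 18, so the absolute maximum is attained at the endpoint x = 3 and the absolute minimum at the critical point x = 1.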

: the same number of rows and columns. If A and B are two m × n-matrices, then their sum is defined to be the matrix C which has as elements the sums of the elements of the given matrices in each position. Example:

$$\begin{pmatrix} 1 & 1 & -2 \\ -1 & 0 & 4 \\ 5 & -2 & 0 \end{pmatrix} + \begin{pmatrix} -2 & 2 & 3 \\ 1 & -3 & -4 \\ -4 & 0 & 1 \end{pmatrix}$$
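The entrywise sum in the example can be carried out with a short Python sketch. The helper name `add_matrices` is an assumption made for illustration, and the two 3 × 3 matrices are read row by row from the example in the entry.

```python
def add_matrices(A, B):
    """Entrywise sum of two matrices given as lists of rows."""
    same_shape = len(A) == len(B) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(A, B)
    )
    if not same_shape:
        raise ValueError("matrices must have the same number of rows and columns")
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[1, 1, -2], [-1, 0, 4], [5, -2, 0]]
B = [[-2, 2, 3], [1, -3, -4], [-4, 0, 1]]
C = add_matrices(A, B)
print(C)
```

Each element of C is the sum of the elements of A and B in the same position, for example 1 + (-2) = -1 in the first row and first column.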

(2) In the more general case when two events might have a common outcome, the addition rule is

Airy's equation The second order equation $y'' - xy = 0$, −∞ < x < ∞.

(1) Assume we know that a = 15, b = 25, A = 85°. Using the Law of sines we get

$$\sin B = \frac{b \sin A}{a} \approx 1.66 > 1,$$

which is impossible. This means that it is not possible to construct a triangle with the given values.
(2) Let a = 22, b = 12, A = 42°. Here the use of the Law of sines gives sin B ≈ 0.365 and solving this equation we get two solutions: B ≈ 21.4° and B ≈ 158.6°. The second angle is not possible because otherwise the sum of angles would have been greater than 180°, hence only the first one is valid. From here, C ≈ 116.6°. The remaining element of the triangle, the side c, is found by another use of the Law of sines: c = a sin C / sin A ≈ 29.4.

analytic function For real-valued functions of one variable: the function is called analytic at some point x = a, if it is infinitely differentiable on some neighborhood of that point and its Taylor-MacLaurin series

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$$

converges in a neighborhood of that point to the function itself. Not all infinitely differentiable functions are analytic. The function

$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

is infinitely differentiable, but all of its Taylor-MacLaurin coefficients at the origin are zero and hence the series does not represent the function f.

analytic geometry One of the branches of geometry. Analytic geometry combines the geometric and algebraic methods by introducing coordinate systems, and presenting geometric objects with the help of algebraic functions and their graphs.

angle Geometric object, which consists of a point, called the vertex, and two rays coming out of this point. In trigonometry and calculus, angles usually are placed in standard position, when the vertex is placed at the origin of the Cartesian coordinate system, one side, called initial, coincides with the positive half of the x-axis and the other side, called terminal, can have arbitrary direction. In this position angles can be both positive and negative. The angles where the terminal side is moved counterclockwise are considered positive and the clockwise direction is considered negative.
The two main measuring units for angles are the degree measure and the radian measure. Two angles with the same terminal side are called coterminal and they differ in size by an integer multiple of a full circle, i.e. by a multiple of 360° or 2π, depending on the measuring unit. Angles traditionally are classified by their measure: an angle between 0 and 90 degrees is called acute, a 90 degree angle is called right, an angle between 90 and 180 degrees is called obtuse and a 180 degree angle is called straight. Additionally, angles measuring 90, 180, 270 or 360 degrees are called quadrantal.

angle between curves is defined to be the angle between the tangent lines to these curves at the point of intersection.

angle between vectors For vectors in inner product spaces. If u, v are two vectors in the inner product space V, then the angle θ between them is determined from the relation

$$\cos\theta = \frac{\langle u, v \rangle}{\|u\|\,\|v\|},$$

where $\langle u, v \rangle$ is the inner product in V and $\|u\|$ denotes the norm (length, magnitude) of the vector.

angle between vector and plane is defined to be the complement of the angle between that vector and the normal vector to the plane.

angle bisector See bisector of an angle.

angle of depression If an object is viewed from above, then the angle between the horizontal line and the line drawn between the viewer and the object is the angle of depression.

angle of elevation If an object is viewed from below, then the angle between the horizontal line and the line drawn between the viewer and the object is the angle of elevation.

ANOVA Stands for Analysis of Variance. It is a collection of statistical methods designed to compare means of two or more groups of values and generalizes the distribution tests and t-tests.

angular speed The rate of change of the angle as an object moves along a circle. If the angle swept in time t is measured in radian measure and is denoted by θ, then the angular speed is given by the formula ω = θ/t. If the radius of the circle is r, the linear speed of the object is v = rω.

annihilator A differential operator is an annihilator for some function, if application of that operator to the function results in the zero function. Example: The operator $D^2 + 4$ is an annihilator for the function sin 2x, because

$$(D^2 + 4)\sin 2x = \frac{d^2}{dx^2}\sin 2x + 4\sin 2x = 0.$$

annual percentage rate The rate banks charge for borrowed money, or pay for the money deposited. Expressed in percentages or equivalent decimal form. For example, an annual percentage rate of 9 percent is translated into the numeric value 0.09, and an APR of 5.7 percent becomes 0.057.

annulus A region bounded by two concentric circles of different radii. Also called ring or washer.

antiderivative Another name for indefinite integral. The antiderivative of a function is a new function which has that given function as its derivative.

antidifferentiation Same as finding the indefinite integral.

antilogarithm For a number y and a base b, the

number x such that $\log_b x = y$. Same as exponential function.

approximate (1) The procedure of substituting for the exact value of a number a value close to the exact value; (2) The procedure of substituting for the values of a function the values of other, usually simpler, functions. Most often the approximations are done by polynomials or rational functions because of the simple nature of these functions.

approximate integration The procedure of calculating definite integrals, when exact integration is impossible or difficult. There are various methods of approximate integration. For specific methods see trapezoidal rule, Simpson's rule, midpoint rule, and approximate integration by Riemann sums below.

approximate integration by Riemann sums By the definition of the definite integral, it is the limit of Riemann sums, as the length of the interval partitions approaches zero. Hence, each Riemann sum represents an approximate value of the given integral. Formally, let the function f(x) be defined on some interval [a, b] and form the left Riemann sum

$$R_n = \sum_{i=1}^{n} f(x_{i-1})\,\Delta x.$$

Then the number $R_n$ is the left endpoint approximation of the integral $\int_a^b f(x)\,dx$. The right endpoint approximation is defined similarly.

approximation Collective term for different methods allowing to approximate numbers, functions, and solutions of different types of equations, when exact numbers or solutions are not available. For more details see linear approximation, quadratic approximation, Newton's method, and approximation by Taylor polynomials below.

approximation by differentials For a given differentiable function f(x) around some point x = a, the simplest approximation is by a line passing through that point and with slope equal to the derivative of the function at that point: f'(a). The equation of that line is given by

$$L(x) = f(a) + f'(a)(x - a).$$

For values not too far from a this function gives satisfactory results. Also called linear approximation.

approximation by Taylor polynomials If a function f(x) can be expanded into a Taylor series that converges to that function, then the partial sums of that series become approximations to the function. Let

$$f(x) = \sum_{j=0}^{\infty} \frac{f^{(j)}(a)}{j!}(x - a)^j$$

be that expansion around the point x = a. If $|f^{(n+1)}(x)|$ is bounded by M > 0 in the interval |x − a| ≤ d and

$$R_n(x) = \sum_{j=n+1}^{\infty} \frac{f^{(j)}(a)}{j!}(x - a)^j$$

is the remainder of the series, then

$$|R_n(x)| \le \frac{M|x - a|^{n+1}}{(n + 1)!}, \qquad |x - a| \le d.$$

This inequality, called Taylor's inequality, tells that the first n terms of the series represent a good approximation for the function f.
The graph in this article shows Taylor approximations of the function sin x by Taylor polynomials of degree 1, 2, 3, 4 and 5.

approximation problems Different problems that come down to substituting the exact value of some quantity by a value different but close to the actual value.

Archimedean spiral The polar curve r = θ, in complex form $z = \theta e^{i\theta}$. The parametric equations could be written as x(t) = t cos t, y(t) = t sin t.

arc Usually, a connected portion of a circle. In calculus, a connected portion of any smooth curve.

arc length formula (1) For the circle. If the radius is r and an arc is subtended over the angle of measure θ (in radian measure), then the length of the arc is given by the formula s = rθ.
(2) For a general curve on the plane given by the function y = f(x) on interval [a, b]. The arc length is given by the formula

$$L = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx.$$

(3) For a curve in $\mathbb{R}^3$ given by parametric equations x = x(t), y = y(t), z = z(t), where a ≤ t ≤ b, the arc length is given by a similar formula

$$L = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\,dt.$$

The formula for a parametric plane curve is similar, with the last term missing.

arc length function The expression

$$s(x) = \int_a^x \sqrt{1 + [f'(t)]^2}\,dt,$$

which comes out in calculations of arc length, is the arc length function.

arccosine function The inverse of the cos x function, denoted $\cos^{-1} x$ or arccos x. Represents the angle θ on the interval [0, π] such that cos θ = x. The domain of this function is [−1, 1] and the range is [0, π]. We have the following differentiation formula:

$$\frac{d}{dx}\cos^{-1} x = -\frac{1}{\sqrt{1 - x^2}}.$$

arcsine function The inverse of the sin x function, denoted either $\sin^{-1} x$ or arcsin x. Represents the angle θ on the interval [−π/2, π/2] such that sin θ = x. The domain of this function is [−1, 1] and the range is [−π/2, π/2]. We have the following differentiation formula:

$$\frac{d}{dx}\sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}.$$

arctangent function The inverse of the tan x function, denoted $\tan^{-1} x$ or arctan x. Represents the angle θ on the interval (−π/2, π/2) such that tan θ = x. The domain of this function is (−∞, ∞) and the range is (−π/2, π/2). The inverse functions $\cot^{-1} x$, $\sec^{-1} x$, $\csc^{-1} x$ are defined in a similar manner. We have the following differentiation formula:

$$\frac{d}{dx}\tan^{-1} x = \frac{1}{1 + x^2}.$$

area One of the basic notions of geometry used in many branches of mathematics. Indicates the "amount of plane" included in a certain plane region.

area between curves Let two curves be given by the functions f(x) and g(x). Then the area between these curves (on some interval [a, b]) is given by

$$\int_a^b |f(x) - g(x)|\,dx.$$

area function For a given non-negative function f(x) defined and continuous on some interval [a, b], the integral

$$A(x) = \int_a^x f(t)\,dt$$

is sometimes called area function, because it represents the area under the curve y = f(x) from point a to x.

area in polar coordinates If a curve is given with the polar equation r = f(θ), where a ≤ θ ≤ b, then the area under that curve is equal to

$$A = \frac{1}{2}\int_a^b [f(\theta)]^2\,d\theta.$$

area of a circle For a circle with radius r, the area is given by the formula $A = \pi r^2$.

area of a sector The area of a sector of a circle with radius r and the opening angle (in radian measure) θ is given by the formula $A = \frac{1}{2}r^2\theta$.

area of a surface See surface area.

area of a surface of revolution Assume we have a surface obtained by revolving the smooth curve y = f(x), a ≤ x ≤ b, about some line. Then, denoting by r the distance between an arbitrary point x on the interval [a, b] and the line, the surface area can be expressed as

$$S = \int_a^b 2\pi r\sqrt{1 + (f'(x))^2}\,dx.$$

area of an ellipse The area of an ellipse given by the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$

is equal to πab. Because the area does not change during translation, the same value for the area is also true for a translated ellipse.

area of geometric figures For a rectangle with width w and length ℓ, A = w · ℓ. In the particular case when the length and width are equal, we get a square (of side a, for example) and the area is $A = a^2$.
For a triangle with base b and height h, $A = \frac{1}{2} b \cdot h$.
For a parallelogram with base b and height h, A = b · h.
For a trapezoid with two parallel bases a and b and height h, $A = \frac{a + b}{2} \cdot h$.
In general, the area of any polygon could be calculated by dividing it into a number of triangles or rectangles and adding the areas of these smaller pieces. See also the corresponding definitions for pictures and notations.

area under the curve If a curve is given by the function f(x) ≥ 0, then the area under that curve from point a to point b is given by the definite integral

$$\int_a^b f(x)\,dx.$$

area under a parametric curve Suppose the curve y = F(x), F(x) ≥ 0, is written in parametric form x = f(t), y = g(t), α ≤ t ≤ β. Then the area under that curve (under the condition that this curve is traversed only once as t increases from α to β) is given by the formula

the inequality is as follows: Let $a_1, a_2, \dots, a_n$ be any non-negative numbers. Then

$$\sqrt[n]{a_1 \cdot a_2 \cdots a_n} \le \frac{a_1 + a_2 + \cdots + a_n}{n}$$

$$x_{n-1} + a_{n-1,n}\,x_n = b_{n-1}$$

$$x_n = b_n.$$

Now, the back substitution consists of the following steps. First, we have $x_n = b_n$. Substituting back into the previous equation, we get $x_{n-1} = b_{n-1} - a_{n-1,n}b_n$. Continuing this way up, we eventually find all the values of the variables $x_1, x_2, \dots, x_{n-1}, x_n$, thus completing the solution of the system.

backward substitution The second stage (or phase) of the solution of linear systems by Gaussian elimination. See back substitution above for details.
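The back substitution steps described above can be sketched in Python. This is a minimal illustration, not part of the original entry: the function name `back_substitute` and the sample system are assumptions, and the code allows any nonzero diagonal, which includes the unit-diagonal form used in the entry.

```python
def back_substitute(U, b):
    """Solve U x = b for an upper-triangular U with nonzero diagonal entries."""
    n = len(b)
    x = [0.0] * n
    # Start from the last equation and move upward.
    for i in range(n - 1, -1, -1):
        # Contribution of the variables that are already known.
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - known) / U[i][i]
    return x


# Hypothetical system already reduced by elimination:
#   x1 + 2*x2 -   x3 = 3
#         x2 + 3*x3 = 5
#                x3 = 2
U = [[1, 2, -1], [0, 1, 3], [0, 0, 1]]
b = [3, 5, 2]
x = back_substitute(U, b)
print(x)
```

The last equation gives x3 = 2 directly, the second gives x2 = 5 − 3·2 = −1, and the first gives x1 = 3 − 2·(−1) + 2 = 7.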

bar graphs Also called bar chart. Bar graphs are one of many ways of visual representation of qualitative (categorical) data. In order to distinguish them from histograms, a little space is left between bars in the graph. The picture above shows the production numbers of some hypothetical company categorized by months.

base of exponential function In the exponential function $y = a^x$, where 0 < a < ∞, a ≠ 1, and −∞ < x < ∞, the constant a is called the base and the variable x is called the exponent.

base of logarithm In the logarithmic function $y = \log_b x$, where 0 < b < ∞, b ≠ 1, and 0 < x < ∞, the constant b is called the base of the logarithm.

basis Let $S = \{v_1, v_2, \dots, v_n\}$ be a subset of the linear vector space V. S is called a basis for V, if for any given vector u in V, there exists a unique collection of scalars $c_1, c_2, \dots, c_n$, such that $u = c_1v_1 + c_2v_2 + \cdots + c_nv_n$. These numbers are called coordinates (or components) relative to the given basis. A vector space has infinitely many bases and changing the basis changes also the coordinates of any given vector. Any basis is necessarily linearly independent. The number of the vectors in the set S is called the dimension of the space V. A similar definition works also in cases when the dimension is not finite. A basis is called orthogonal, if the inner products of all possible pairs (of different vectors) in that basis are zeros:

$$v_i \cdot v_j = 0, \qquad 1 \le i, j \le n, \quad i \ne j.$$

N52°E means the angle is 52 degrees measured from the positive y-axis clockwise (to the East) and the notation S47°W means that the angle is 47 degrees measured from the negative y-axis clockwise (to the West). This method is used traditionally in marine navigation.
The second method is mostly used in aerial navigation and has only one starting direction (initial side of the angle), namely, the positive y-axis and the angles measured vary between 0 and 180 degrees.

bell-shaped (Gaussian) curve The curve describing the normal distribution. Mathematically the function having that graph is given by the formula

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right),$$

where μ is the mean and σ is the standard deviation of the distribution.
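The orthogonality condition in the entry "basis" is easy to check numerically. The following Python sketch is an illustration only: the helper names and the sample vectors are hypothetical. The coordinate formula $c_i = (u \cdot v_i)/(v_i \cdot v_i)$ it uses is a standard fact about orthogonal bases and is not stated in the entry itself.

```python
def dot(u, v):
    """Standard inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal(vectors):
    """True if every pair of different vectors has inner product zero."""
    return all(
        dot(vectors[i], vectors[j]) == 0
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    )


def coordinates(u, basis):
    """Coordinates of u relative to an ORTHOGONAL basis (assumption)."""
    return [dot(u, v) / dot(v, v) for v in basis]


basis = [[1, 1, 0], [1, -1, 0], [0, 0, 2]]
u = [3, 1, 4]
print(is_orthogonal(basis), coordinates(u, basis))
```

For these vectors u = 2·(1, 1, 0) + 1·(1, −1, 0) + 2·(0, 0, 2), which matches the coordinates returned. The exact comparison with zero is appropriate for integer entries; with floating-point entries a small tolerance would be needed.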

n where k = n!/k!(n − k)! is the binomial coeﬃ- cient (see above). Binomial distributions serve as a good approximation for normal distributions. The mean and of any binomial distri- bution could be calculated by very simple formulas: √ µ = np, σ = npq respectively. binomial expansion Also called Binomial Theo- rem. For any given real x and y (ﬁxed or block of a matrix Let A be some n × m-matrix. variable) and any n, the following We can divide this matrix into parts by separating expansion holds: certain rows and columns. Each part in this case is n n n called a block. Example: The matrix (x + y)n = xny0 + xn−1y1 + xn−2y2 0 1 2  1 −1 −2   n  n A =  −1 3 4  + ··· + x1yn−1 + x0yn n − 1 n 5 −2 0 n and k , 0 ≤ k ≤ n are the binomial coeﬃcients. could be divided into four blocks as follows For a real number α, −∞ < α <  −1  B = (1) B = (−1 − 2) B = ∞, the series 1 2 3 5 ∞ X α   (1 + x)α = xn, 3 4 n B4 = . n=0 −2 0 16

This division is not unique and depends on what we need to do with this matrix. Operations done with blocks of the matrix are called block manipulations.

Boolean algebra The system of rules of operations with logical statements. This algebra involves Boolean variables, that take True and False values (or 1 and 0), and operations of conjunction ("and"), disjunction ("or") and negation ("not"). Named after George Boole.

boundary For any set in the Euclidean space $\mathbb{R}^n$, n ≥ 1, the boundary is the set of all boundary points (see below). Examples: (1) For the interval [a, b] the boundary consists of two points a and b. (2) For the ball $x^2 + y^2 + z^2 \le 1$, the boundary is the sphere $x^2 + y^2 + z^2 = 1$.

boundary conditions (1) For ordinary differential equations. If the equation is given on some finite interval [a, b], then the boundary consists of only two points a and b and the conditions are given on both points. On the interval [0, 1], for example, the most general conditions could be written as $\alpha_1 y(0) + \alpha_2 y'(0) = 0$ and $\beta_1 y(1) + \beta_2 y'(1) = 0$.
(2) For partial differential equations. If the equation is given on some region D, then the boundary is either some curve or a surface. Accordingly, the boundary conditions would be given on that boundary.

boundary curve A curve that makes the boundary of some plane region.

boundary point For a set D in the Euclidean space $\mathbb{R}^n$, n ≥ 1, a point a in $\mathbb{R}^n$ is a boundary point, if any ball with center at that point contains both points in D and points outside of D. In case n = 1 the ball should be substituted by an interval and in case n = 2 it should be a circle. Examples: (1) For the interval [a, b] each of the endpoints is a boundary point. (2) For the disk $x^2 + y^2 \le 1$ the point (1, 0) is a boundary point.

boundary value problem For ordinary differential equations. A differential equation along with boundary conditions for solutions. To solve a boundary value problem means to find a solution to the given equation that also satisfies conditions on the boundary. The boundary conditions are chosen to assure that the solution is unique. Example: The equation

$$p(x)y'' - q(x)y + \lambda r(x)y = 0$$

on the interval [0, 1] with boundary conditions $\alpha_1 y(0) + \alpha_2 y'(0) = 0$ and $\beta_1 y(1) + \beta_2 y'(1) = 0$ is called the Sturm-Liouville boundary value problem.

bounded function A function f, defined on some interval I (finite or infinite), is bounded, if there exists a number M > 0, such that |f(x)| ≤ M for all values x ∈ I.

bounded sequence A sequence $\{a_n\}$ of real or complex numbers is bounded if there exists a number M > 0 such that $|a_n| \le M$ for all values of n ≥ 1.

bounded set A subset S of the real line $\mathbb{R}$ is bounded, if there is a positive number M > 0 such that all elements x ∈ S satisfy |x| < M. If the set S is a subset of $\mathbb{R}^n$, then a similar definition applies with |x| understood as the length of the point x ∈ $\mathbb{R}^n$.

boxplot One of the tools of visualizing quantitative data. To construct a boxplot it is necessary first to organize the values in increasing order. Next, finding the median of these values we indicate the center, which splits the data into two halves. On the last step we find the two medians of these two halves. These points are the first and third quartiles of the data set. Along with the minimum and maximum values they make up the five point summary. Now, the boxplot consists of a box where these five values are indicated. The form of the boxplot is not standard and could be made either horizontal or vertical. The picture shows one of the possible forms.

braces Or curly brackets. The symbols { }. One of the grouping symbols along with parentheses and brackets. Primarily used to separate certain numbers and variables to indicate operations to be done first.

brachistochrone For two given points on the plane, the brachistochrone is the curve of quickest descent between these points. This curve is not the shortest (which is the straight line) and turns out to be a cycloid.

branches of hyperbola Any non-degenerate hyperbola consists of two separate curves, called branches.

branches of the tangent The graph of the tangent function consists of infinitely many identical curves, called its branches. These branches are separated from each other by the vertical asymptotes $x = \frac{\pi}{2} + \pi n$, where n is an arbitrary integer.

brackets The symbols [ ]. One of the grouping symbols along with parentheses and braces. Primarily used to separate certain numbers and variables to indicate operations to be done first.

C

calculus One of the major branches of mathematics, along with algebra and geometry. Calculus itself consists of two main branches: differential and integral calculus. These two branches are related by the Fundamental Theorem of Calculus.

Cantor set Also called the Cantor ternary set. Take the closed interval [0, 1] and remove the middle open interval (1/3, 2/3) from it. The result will be the union of two closed intervals [0, 1/3] and [2/3, 1]. From each of these intervals we again remove their middle third open intervals (1/9, 2/9) and (7/9, 8/9) and get the union of four closed intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1]. If we continue this process indefinitely, the result will be a set called the Cantor set. This set has many remarkable properties. In particular, it is an infinite set of measure zero and it is a perfect set in the sense that for any point in the set there is an infinite sequence of points from that set that approach that point.

cardioid Any of the plane curves given in polar coordinates by the equations

r = a(1 ± cos θ)

or r = a(1 ± sin θ).

Cardioids are special cases of limaçons.

Cartesian coordinate system (1) For points on a plane, also called the rectangular coordinate system. A method of representing points on a plane as ordered pairs of numbers and vice versa. To do so, a pair of perpendicular lines (called axes) are drawn, that intersect at a point called the origin. The horizontal line is called the x-axis and the vertical one is called the y-axis. Now, coordinates of any point are determined by drawing perpendicular lines from the point until they reach the axes. The points of these intersections (which represent numbers on the corresponding number lines) make up a pair of numbers, called the coordinates of the point. Conversely, if we have a pair of numbers, then the point corresponding to it could be found by putting these numbers on the appropriate axes and drawing two perpendicular lines, until they meet at some point.
(2) For points in three dimensional space. To represent points in the space one additional coordinate axis (called the z-axis) is necessary. This allows to represent any point as an ordered triple, just as in the case of the plane. Similarly, any ordered triple will represent a point in the space.
(3) The system works also for spaces of any dimension n ≥ 2 and even for infinite dimensional spaces.
The importance of the Cartesian coordinate system is that it allows to connect geometry with algebra, calculus, and other branches of mathematics. Named after René Descartes.

Cartesian plane A plane, equipped with the Cartesian coordinate system.

Cartesian product For any two given sets A and B (of arbitrary nature) the set of all possible ordered pairs (x, y), where x is an element of A and y is an element of B. The notation is A × B. The Cartesian product can be defined for any number of sets and even for infinitely many sets.

Cauchy's mean value theorem Let the functions f(x) and g(x) be continuous on a closed interval [a, b] and be differentiable on the open interval (a, b). Then there exists a point c, a < c < b, such that

$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}.$$

This theorem is the generalization of the Mean value theorem.

Cauchy-Euler equation Also called Euler equation. The equation

$$x^2y'' + bxy' + cy = 0,$$

where b and c are real numbers. The solutions of this equation depend on the solutions of the algebraic equation $F(r) = r(r - 1) + br + c = 0$.

(1) If the quadratic equation has two real solutions $r_1 \ne r_2$ then the general solution of the Cauchy-Euler equation is $y = c_1|x|^{r_1} + c_2|x|^{r_2}$.
(2) If the quadratic equation has a repeated solution r then the general solution is given by the formula $y = c_1|x|^r + c_2|x|^r \ln|x|$.
(3) In the case of two complex solutions r = λ ± iμ, the solution of the differential equation comes in the form

$$y = |x|^\lambda\,[c_1\cos(\mu\ln|x|) + c_2\sin(\mu\ln|x|)].$$

Cauchy-Schwarz inequality For any two non-zero sequences of real numbers $\{a_k\}_{k=1}^n$ and $\{b_k\}_{k=1}^n$, the inequality

$$\left(\sum_{k=1}^n a_kb_k\right)^2 \le \sum_{k=1}^n a_k^2 \sum_{k=1}^n b_k^2$$

holds. The equality holds if and only if $a_k = cb_k$ for some constant c. The same inequality is true even in the case of infinite sequences.

Cayley-Hamilton theorem Let A be a square matrix and let

$$c_0 + c_1\lambda + c_2\lambda^2 + \cdots + c_{n-1}\lambda^{n-1} + \lambda^n = 0$$

be its characteristic equation. Then the matrix A satisfies the equation

$$c_0I + c_1A + c_2A^2 + \cdots + c_{n-1}A^{n-1} + A^n = 0,$$

where I is the identity matrix.

center of a circle The point that has equal distance from each point of the circle.

center of the distribution A somewhat intuitive notion that is not exactly defined and may mean different things for different distributions. Most commonly the center refers to the mean of the distribution μ.

central angle An angle that is formed by two radii of a circle.

Central Limit Theorem The most fundamental result in Statistics. Informally, it states that if we form a sampling distribution from a large population then the result will be another distribution that is approximately normally distributed and this new distribution has the same mean as the original distribution. The standard deviations are also related. More precisely:
Suppose we have some distribution with the mean μ and standard deviation σ. If we form a new distribution from the means of simple random samples of size n, then this distribution will be close to a normal distribution with mean μ and standard deviation $\sigma/\sqrt{n}$. The larger n is, the better is the approximation.

centroid Let R denote a plate with uniform density distribution ρ. The center of mass of this plate is called the centroid of R. There are different formulas for locating the coordinates of the centroid. If the region R is bounded by the function f(x) ≥ 0 and the lines y = 0, x = a, x = b, then the coordinates of the centroid are given by the formulas

$$\bar{x} = \frac{1}{A}\int_a^b xf(x)\,dx, \qquad \bar{y} = \frac{1}{2A}\int_a^b [f(x)]^2\,dx,$$

where $A = \int_a^b f(x)\,dx$ is the area of the region.
In the case when R is bounded by two functions f(x) ≥ g(x), the formulas should be adjusted and will include f(x) − g(x) in the formula for $\bar{x}$ and $[f(x)]^2 - [g(x)]^2$ in the formula for $\bar{y}$.

chain rule The formula for calculating the derivative of a composite function. (1) For functions of one variable. Let F(x) = f(g(x)) and let both f(x) and g(x) be differentiable. Then

$$F'(x) = f'(g(x)) \cdot g'(x).$$

(2) For functions of several variables: Let $z = z(x_1, \dots, x_m)$ be a differentiable function of m variables and let each of these variables be itself a differentiable function of n variables: $x_1 = x_1(t_1, \dots, t_n), \dots, x_m = x_m(t_1, \dots, t_n)$. Then the following formulas for the partial derivatives hold:

$$\frac{\partial z}{\partial t_1} = \frac{\partial z}{\partial x_1}\frac{\partial x_1}{\partial t_1} + \cdots + \frac{\partial z}{\partial x_m}\frac{\partial x_m}{\partial t_1}$$

$$\cdots$$

$$\frac{\partial z}{\partial t_n} = \frac{\partial z}{\partial x_1}\frac{\partial x_1}{\partial t_n} + \cdots + \frac{\partial z}{\partial x_m}\frac{\partial x_m}{\partial t_n}$$
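The three cases listed for the Cauchy-Euler equation can be turned into a small Python sketch. It assumes the equation is written as $x^2y'' + bxy' + cy = 0$, so that the quadratic equation is $r(r - 1) + br + c = 0$; the function name `cauchy_euler_basis` and the sample coefficients are hypothetical.

```python
import math


def cauchy_euler_basis(b, c):
    """Return (case, y1, y2): two basis solutions of x^2 y'' + b x y' + c y = 0.

    The solutions are valid for x != 0.
    """
    # Quadratic equation r(r - 1) + b r + c = r^2 + (b - 1) r + c = 0.
    p = b - 1
    disc = p * p - 4 * c
    if disc > 0:  # case (1): two different real roots
        r1 = (-p + math.sqrt(disc)) / 2
        r2 = (-p - math.sqrt(disc)) / 2
        return ("distinct real",
                lambda x: abs(x) ** r1,
                lambda x: abs(x) ** r2)
    if disc == 0:  # case (2): repeated root
        r = -p / 2
        return ("repeated",
                lambda x: abs(x) ** r,
                lambda x: abs(x) ** r * math.log(abs(x)))
    # case (3): complex roots lambda +/- i*mu
    lam, mu = -p / 2, math.sqrt(-disc) / 2
    return ("complex",
            lambda x: abs(x) ** lam * math.cos(mu * math.log(abs(x))),
            lambda x: abs(x) ** lam * math.sin(mu * math.log(abs(x))))


# Illustrative equation x^2 y'' - 2y = 0: the roots are r = 2 and r = -1.
case, y1, y2 = cauchy_euler_basis(0, -2)
print(case, y1(2.0), y2(2.0))
```

The general solution is then $c_1y_1 + c_2y_2$ with arbitrary constants. The exact test `disc == 0` is reliable for integer coefficients; for floating-point coefficients a tolerance would be a safer choice.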

change of base For logarithmic function: If a and b are both positive numbers not equal to 1, and x > 0, then

$$\log_a x = \frac{\log_b x}{\log_b a}.$$

change of variable A method used in many situations to solve certain types of equations or evaluate definite or indefinite integrals. Example: To solve the equation $x^{1/2} - 5x^{1/4} + 6 = 0$ we make a change of variable $y = x^{1/4}$ to translate it to the quadratic equation $y^2 - 5y + 6 = 0$, which has the solutions y = 2, 3. Returning to the original variable x we get $x^{1/4} = 2, 3$ or x = 16, 81.
For examples with integration see the entry substitution method.

characteristic equation (1) The equation p(λ) = 0, where p is the characteristic polynomial of a given square matrix. If the matrix has the size n × n, then the polynomial has degree n and, by the Fundamental Theorem of Algebra, has exactly n roots (if counted with multiplicities). The roots of the equation may be real or complex and some of them might be repeated. See also eigenvalues, eigenvectors, and eigenspace.
(2) For a linear homogeneous differential equation

$$a_0y^{(n)} + a_1y^{(n-1)} + \cdots + a_{n-1}y' + a_ny = 0$$

the corresponding polynomial equation

$$a_0r^n + a_1r^{n-1} + \cdots + a_{n-1}r + a_n = 0.$$

The solutions of this equation are called characteristic roots. For the importance and the use of these roots see the article linear ordinary differential equations.

characteristic polynomial (1) Let A be an n × n matrix and let I be the identity matrix of the same size. The determinant of the matrix A − λI is a polynomial, called the characteristic polynomial of the matrix A. This polynomial has degree n and, by the Fundamental Theorem of Algebra, has exactly n roots (if counted with multiplicities). These roots are called the eigenvalues of A. Example: For the matrix

$$A = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 3 & 0 \\ 0 & -4 & 1 \end{pmatrix}$$

we have

$$\det(A - \lambda I) = \begin{vmatrix} 2 - \lambda & 0 & 0 \\ 1 & 3 - \lambda & 0 \\ 0 & -4 & 1 - \lambda \end{vmatrix} = (2 - \lambda)(3 - \lambda)(1 - \lambda),$$

which has three real and distinct roots: 1, 2, 3. See also eigenvalues and eigenvectors.
(2) For a linear homogeneous differential equation

$$a_0y^{(n)} + a_1y^{(n-1)} + \cdots + a_{n-1}y' + a_ny = 0$$

the corresponding polynomial

$$p(r) = a_0r^n + a_1r^{n-1} + \cdots + a_{n-1}r + a_n$$

of degree n with respect to the variable r is its characteristic polynomial. The roots (zeros) of the polynomial are the characteristic roots of the equation.

Chebyshev equation The differential equation

$$(1 - x^2)y'' - xy' + \alpha^2y = 0.$$

Solutions of this equation when α is a non-negative integer n are called Chebyshev polynomials. These polynomials could be written in many different forms. The simplest and most used form is

$$T_n(x) = \frac{(x + \sqrt{x^2 - 1})^n + (x + \sqrt{x^2 - 1})^{-n}}{2}.$$

chi-square distribution The distribution of

variance. There are multiple definitions for this distribution and the following is one of the simplest:

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2},$$

where n is the sample size, $s^2$ is the sample variance and $\sigma^2$ is the population variance. Because variance is positive the curve of the distribution is always on the right side. As the graph shows, the distribution becomes more symmetric with the increasing size of the sample.

chord For any given curve, a straight line segment that connects any two points on the curve. Most commonly used for the circle.

circle A plane geometric figure with the property that there exists a point (called center of the circle) and a positive number r (called the radius), such that the distance from each point of that figure to the center is equal to r.

circular cylinder A cylinder with a circle base.

circumference The perimeter of a circle.

Clairaut's theorem Also known as the theorem of the change of order of partial differentiation. If the function $f(x_1, x_2, \dots, x_n)$ has continuous second partial derivatives at some point $(a_1, a_2, \dots, a_n)$ in its domain, then

$$\frac{\partial^2 f}{\partial x_i\,\partial x_j}(a_1, a_2, \dots, a_n) = \frac{\partial^2 f}{\partial x_j\,\partial x_i}(a_1, a_2, \dots, a_n)$$

for any combination of indices 1 ≤ i, j ≤ n.

closed curve A curve that has no end points. If a plane curve is given by the parametric equations x = x(t), y = y(t), t ∈ [a, b], then the curve is closed if x(a) = x(b) and y(a) = y(b).

closed interval Intervals of the form [a, b], where both end-points are finite and included.

closed set A set that contains all of its boundary points. For example, the interval [0, 1] is a closed set because it contains both of its boundary points 0 and 1, while the interval (0, 1) is not closed. The set $\{(x, y) \mid x^2 + y^2 \le 1\}$ is a closed set on the plane but the set $\{(x, y) \mid x^2 + y^2 < 1\}$ is not a closed set because it does not contain its boundary.

closed surface In the simplest form, a closed surface is the boundary of a bounded solid domain. The general definition, given in advanced calculus courses, requires notions from topology, among others.

closed under A set of numbers (or, more generally, some objects) is closed under a certain operation, if the result of that operation also belongs to that set. Examples. (1) The set of all integers is closed under both operations of addition and multiplication, because the result is also an integer. (2) The set of real numbers is closed under addition, subtraction, multiplication, and division (except by zero), but it is not closed under the operation of root extraction. (3) The set of all square matrices of a fixed size is closed under addition and also matrix multiplication.

coefficient A word most commonly used to indicate the non-variable factor of a term of a polynomial. See coefficients of a polynomial. This term is also used to describe the numeric components in functions other than polynomials.

coefficient matrix For a system of linear equations

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

the m × n matrix that consists of all coefficients of the system (but not the constant parts on the right sides):

$$\begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{pmatrix}.$$

When a column of the constant terms is attached, the matrix is called augmented.

coefficients of a polynomial In the polynomial of arbitrary degree

$$p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0,$$

the constants $a_0, a_1, \dots, a_n$.

coefficients of series In the power series

$$\sum_{n=0}^{\infty} a_n (x - c)^n$$

the numeric constants $a_n$, n = 0, 1, 2, 3, ···.

cofactor For a given n × n matrix A, the (i, j)th cofactor is the determinant of the (n−1) × (n−1) matrix obtained by deleting the ith row and jth column of A, multiplied by $(-1)^{i+j}$. Example: If

$$A = \begin{pmatrix} -4 & 4 & 1\\ 0 & -1 & 3\\ 2 & 5 & 2 \end{pmatrix}$$

then the cofactor corresponding to the element $a_{23}$ (which is 3) is

$$(-1)^{2+3}\det\begin{pmatrix} -4 & 4\\ 2 & 5 \end{pmatrix} = (-1)(-20 - 8) = 28.$$

See also minors of a matrix.

cofactor matrix Another name for the adjoint matrix.

cofunctions The common name of trigonometric functions cos x, cot x, csc x.

cofunction identities In trigonometry, the identities

$$\cos\left(\tfrac{\pi}{2} - \theta\right) = \sin\theta, \qquad \sin\left(\tfrac{\pi}{2} - \theta\right) = \cos\theta;$$
$$\cot\left(\tfrac{\pi}{2} - \theta\right) = \tan\theta, \qquad \tan\left(\tfrac{\pi}{2} - \theta\right) = \cot\theta;$$
$$\csc\left(\tfrac{\pi}{2} - \theta\right) = \sec\theta, \qquad \sec\left(\tfrac{\pi}{2} - \theta\right) = \csc\theta,$$

which are true for any real values of θ where the functions are defined.

column matrix A matrix that consists of one column, or an m × 1 matrix. It is the same as a vector, written in column form. Example:

$$\begin{pmatrix} 3\\ -4\\ 1 \end{pmatrix}.$$

column space Let A be an m × n matrix. The subspace of the space $R^m$ that is spanned by the column-vectors of that matrix is called the column space of A.

column vector A vector written in column form. The same as column matrix above.

combination For the given set S of n elements, a combination of k elements is a subset of S that contains exactly k elements. Two combinations are different if at least one element is different, and two sets with the same elements are considered the same. For example, the combinations {1, 2, 3} and {2, 1, 3} are the same. The number of combinations of k elements out of n is given by the formula

$$C^n_k = \binom{n}{k} = \frac{n!}{k!(n-k)!},$$

where 0 ≤ k ≤ n. See also binomial coefficients.

common denominator If two or more fractions have the same denominator, then that denominator is called the common denominator (of the fractions). If the denominators are different, the fractions could be changed to an equivalent form with a common denominator. See least common denominator for details.

common factors In any set of two or more natural numbers, any factors that are the same for all of these numbers. Example: The numbers $24 = 2^3 \cdot 3$ and $30 = 2 \cdot 3 \cdot 5$ have three common factors, 2, 3, and 6.

common logarithm Logarithmic function with base 10. The base of common logarithms is often not written but is understood: $\log x = \log_{10} x$.

common multiple For two or more integers, any integer that is divisible by all of them. The number 30 is a common multiple of 10 and 6, because 30 = 10 · 3 and 30 = 6 · 5.

commutative property for addition For any two given real or complex numbers a and b, a + b = b + a, or, the order of the addends does not change the sum.

commutative property for multiplication For any given real or complex numbers a and b, a · b = b · a, or, the order of the factors does not change the product.

commuting matrices Two square matrices A and B such that A · B = B · A.

comparison properties of the integral For integrable functions f(x) and g(x) of one real variable defined on [a, b]:
(1) If f(x) ≥ 0 then $\int_a^b f(x)\,dx \ge 0$.
(2) If f(x) ≥ g(x) then $\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx$.
(3) If m ≤ f(x) ≤ M, then $m(b-a) \le \int_a^b f(x)\,dx \le M(b-a)$.

comparison theorem for integrals Let f and g be two functions defined on [a, ∞), with the property f(x) ≥ g(x) ≥ 0 for x ≥ a.
(1) If $\int_a^\infty f(x)\,dx$ is convergent, then $\int_a^\infty g(x)\,dx$ is also convergent.
(2) If $\int_a^\infty g(x)\,dx$ is divergent, then $\int_a^\infty f(x)\,dx$ is also divergent.

comparison test for series Let $\{a_n\}$ and $\{b_n\}$ be two numeric sequences.
(1) If $0 \le a_n \le b_n$ for all n and $\sum_{n=1}^\infty b_n$ is convergent, then $\sum_{n=1}^\infty a_n$ is convergent.
(2) If $a_n \ge b_n \ge 0$ for all n and $\sum_{n=1}^\infty b_n$ is divergent, then $\sum_{n=1}^\infty a_n$ is divergent.

complement of a set Let S be a space of sets or a set of some elements and assume A is an element of that space. The complement of A is the set $\bar{A}$ that has no intersection with A ($A \cap \bar{A} = \emptyset$), and their union "fills" the space, $A \cup \bar{A} = S$.

complementary angles Two angles are complementary, if the sum of their measures is 90° (in degree measure) or π/2 (in radian measure).

completing the square If a quadratic trinomial is not a perfect square, it is still possible to represent it as a sum of a perfect square and some constant. This procedure is called completing the square. If the trinomial is $ax^2 + bx + c$, then we isolate the first two terms and write

$$ax^2 + bx + c = a\left(x^2 + \frac{b}{a}x\right) + c$$

and notice that in order to make the expression inside the parentheses a perfect square we are missing the square of half of the linear term's coefficient. Adding and subtracting that number we get

$$a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} - \frac{b^2}{4a^2}\right) + c = a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2}\right) - \frac{b^2}{4a} + c = a\left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a} + c.$$

Examples:
(1) $x^2 - 6x + 7 = (x^2 - 6x + 9) - 9 + 7 = (x - 3)^2 - 2$.
(2) $3x^2 + 7x - 5 = 3\left(x^2 + \tfrac{7}{3}x\right) - 5 = 3\left(x^2 + \tfrac{7}{3}x + \tfrac{49}{36} - \tfrac{49}{36}\right) - 5 = 3\left(x^2 + \tfrac{7}{3}x + \tfrac{49}{36}\right) - \tfrac{49}{12} - 5 = 3\left(x + \tfrac{7}{6}\right)^2 - \tfrac{109}{12}$.

complex conjugates Complex numbers with the same real parts and opposite imaginary parts: a + ib and a − ib. The conjugate of the number z is denoted by $\bar{z}$. A very important property of complex conjugates is that their product is always a positive number or zero (if the number is zero itself):

$$(a + ib)(a - ib) = a^2 - i^2 b^2 = a^2 + b^2 \ge 0.$$

This fact is used in the process of the division of two complex numbers.

complex exponents The complex exponent for the base e is defined by the use of Euler's formula. If z = x + iy is some complex number, then

$$e^z = e^x(\cos y + i \sin y).$$
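As an illustration of the two formulas above, the following Python sketch evaluates $e^z$ directly from Euler's formula and compares it with the standard library value, then checks that the product of a number with its conjugate equals $a^2 + b^2$. The helper name `complex_exp` and the sample values are chosen here for illustration only and are not part of the dictionary.

```python
import cmath
import math


def complex_exp(z: complex) -> complex:
    """Return e^z computed from Euler's formula e^x (cos y + i sin y)."""
    x, y = z.real, z.imag
    return math.exp(x) * complex(math.cos(y), math.sin(y))


# Sample value (arbitrary choice for the demonstration).
z = complex(1.0, math.pi / 3)
by_formula = complex_exp(z)
by_library = cmath.exp(z)
print(by_formula, by_library)

# Product of complex conjugates: (a + ib)(a - ib) = a^2 + b^2.
w = complex(3, -4)
product = w * w.conjugate()
print(product.real)
```

The two printed values of $e^z$ should agree up to floating-point rounding.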

For any other positive base a ≠ 1 the complex exponent could be defined similarly with the use of the identity $a = e^{\ln a}$.

complex fraction A fraction, where the numerator, or denominator or both are fractions themselves. Examples:

$$\frac{\;\dfrac{2}{3}\;}{\dfrac{7}{13}}, \qquad \frac{\;\dfrac{x}{2x+1}\;}{\dfrac{x^2 - 1}{x^2 - 5}}.$$

complex numbers Numbers of the form z = x + iy, where x and y are real numbers and $i = \sqrt{-1}$ is the imaginary unit. x is called the real part of the complex number and y is the imaginary part. Two complex numbers are equal, if their real and imaginary parts are identical: a + ib = c + id if and only if a = c and b = d. Each complex number can be represented graphically as a point on the plane, where the real part is identified with the x-coordinate and the imaginary part with the y-coordinate.
Complex numbers also have a trigonometric representation of the form z = r(cos θ + i sin θ), where $r = |z| = \sqrt{x^2 + y^2}$ is the modulus of the number and the angle θ (called the argument of z) is determined from the equation tan θ = y/x.
For operations with complex numbers see addition and subtraction of complex numbers, division of complex numbers, multiplication of complex numbers, DeMoivre's theorem.

complex plane A plane equivalent to the Cartesian plane with one significant difference: the points on the plane are not viewed as ordered pairs (x, y), but rather as points with real and imaginary parts coinciding with these same values x, y. Hence, arithmetic operations can be done with the points on the complex plane according to the rules of operations with complex numbers.

component function If r is a vector-function in the three dimensional space, then it could be written in the form

$$\mathbf{r}(t) = (f(t), g(t), h(t)).$$

The functions f, g, h are called component functions. The same notion is valid for two dimensional vector-functions also.

components of a vector Let v be some vector in a vector space V. Then it could be written in its component form $v = (v_1, v_2, \dots, v_n)$ and the scalars $v_1, v_2, \dots, v_n$ are the components of this vector. Components depend on the basis of the space V and vary with the choice of basis. Example: the vector v = (2, 3) in the standard basis $e_1 = (1, 0)$, $e_2 = (0, 1)$ of $R^2$, in the nonstandard basis $e_1' = (-1, 0)$, $e_2' = (0, -1)$ will have components (−2, −3).

composite function See composition of functions.

composite number A natural number that could be written as a product of two other natural numbers different from 1 or the number itself. If it is not possible, the number is called prime.

composition of functions (1) For functions of one variable. Let f(x) and g(x) be two functions and assume that the range of g is contained in the domain of f. Then the composition of f with g is defined to be the function h(x) = f(g(x)). The notation h = f ◦ g is commonly used. The operation of composition is not commutative. Example: Let $f(x) = x^2 + 1$ and g(x) = 2x − 1. Then $(f \circ g)(x) = (2x - 1)^2 + 1 = 4x^2 - 4x + 2$ and $(g \circ f)(x) = 2(x^2 + 1) - 1 = 2x^2 + 1$.
(2) For functions of several variables. If $f = f(x_1, \dots, x_m)$ and there are m functions $g_k(x_1, \dots, x_n)$, 1 ≤ k ≤ m, of n variables, then the composition function can be defined by the formula

$$h(x_1, \dots, x_n) = f(g_1(x_1, \dots, x_n), \dots, g_m(x_1, \dots, x_n)).$$

In this definition we have to assume also that the combined ranges of the functions $g_k$ are in the domain of f. In the several variable case the composition is also not commutative.

composition of linear transformations Let $T_1 : V_1 \to V_2$ be a linear transformation between two linear vector spaces $V_1$ and $V_2$. Assume also that another linear transformation $T_2$ between $V_2$ and some other linear vector space $V_3$ is defined in such a way that its domain contains the range of the transformation $T_1$. Then the composition transformation

$T(v) = T_2(T_1(v)) = (T_2 \circ T_1)(v)$ is a transformation between the spaces $V_1$ and $V_3$. The composition operation is not commutative: $T_1 \circ T_2 \ne T_2 \circ T_1$ in general.

compound interest When a given amount of money is invested and the interest is paid not only on the principal amount but also on the interest earned on the principal, it is called compound interest. If the principal amount is denoted by P, the interest rate by r and interest payments are made n times a year, then the formula for the amount A after t years is

$$A = P\left(1 + \frac{r}{n}\right)^{nt}.$$

See also continuously compounded interest.

concave function A differentiable function on some interval is concave down if its derivative is a decreasing function. Geometrically this property means that the tangent line to the graph of the function at any point is above the graph. Similarly, a differentiable function on some interval is concave up if its derivative is an increasing function. Geometrically this property means that the tangent line to the graph of the function at any point is below the graph. Concave up functions are also called convex.

concavity The property of being concave.

concavity test Let f(x) be a twice differentiable function on some interval I. (1) If f″(x) > 0 for all values of x on the interval, then the function is concave up. (2) If f″(x) < 0 for all values of x on the interval, then the function is concave down.

conditional probability The probability of some event A under the assumption that event B happens too. The notation is P(A|B). If two events A, B are not mutually exclusive, then the probability that both events happen is given by the general multiplication formula

$$P(A \cap B) = P(A)P(B|A).$$

conditional statement A logical statement of the form "If A, then B", written also in the short form A ⇒ B. Examples include: "If x is real, then $x^2$ is positive", "If a function is differentiable, then it is continuous". To each conditional statement A ⇒ B, there exist three other statements.
(1) Converse: B ⇒ A, meaning "If B, then A". If the direct statement is true, the converse may or may not be true.
(2) Inverse: not A ⇒ not B, meaning "If A is not true, then B is not true". As in the case of the converse, the inverse may or may not be true.
(3) Contrapositive: not B ⇒ not A, meaning "If B is not true, then A is not true". Direct and contrapositive statements are equivalent in the sense that either they both are true or both are false.
See also biconditional statement, logical contrapositive.

conditionally convergent series A series that converges but the series formed by the absolute values of its terms does not. The series $\sum_{n=1}^\infty \frac{(-1)^n}{n}$ is conditionally convergent because the series formed by its absolute values is the divergent harmonic series $\sum_{n=1}^\infty 1/n$. If a series is conditionally convergent, then its terms could be rearranged so that the sum equals any given number.

cone A three dimensional geometric object generated by a line (called the generator) rotating about a fixed point on the line called the apex. Both the surface and the solid generated this way are called cone.
[Figure: a right circular cone.]

confidence interval In statistics. An interval of real values where we expect to have the "majority" of values of some distribution concentrated. The two endpoints of that interval are given by the expression Estimate ± Margin of Error. For example, for the distribution of proportions, this interval could be given by the expression

$$\hat{p} \pm z^* SE(\hat{p}),$$

where $\hat{p}$ is the estimated proportion of the sample data, $SE(\hat{p})$ is the standard error of that sample data, and $z^*$ is the critical value corresponding to the given confidence level. The length of the interval depends on the size of the data set and on the desired confidence level.

confidence level A percentage we choose to be confident of the accuracy of statistical estimates that come from sample data. The most common values are 90%, 95%, and 99%. The confidence level is associated with critical values.

congruence A relation between two or more sets. Geometric figures are called congruent if one figure (plane or solid) could be moved and/or rotated and/or reflected to coincide with the other figure. Congruence is different from equality in the sense that these two figures are not the same because originally they occupied different places on the plane or in space while equal figures are supposed to have all components of the same size and occupy the same space.

conic section The result of cutting a double cone by a plane. Depending on the position of the plane the result could be a circle, ellipse, parabola, hyperbola or a degenerate section, such as a point or a pair of intersecting lines. An alternative geometric definition could be given using eccentricity.
Algebraically, any conic section could be described as a solution of a quadratic equation in two variables: $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$, where A, B, C, D, E, F are real constants. See the entries ellipse, hyperbola, parabola for the details and additional information regarding foci, directrix, axes etc.

conjugates The expressions a + b and a − b are called conjugates or conjugate pairs. The term most often refers to complex conjugates. It is also used for expressions of the form $\sqrt{a} + \sqrt{b}$ and $\sqrt{a} - \sqrt{b}$. Using these conjugates is important when rationalizing denominators containing these expressions.

conjugate axis See hyperbola.

connected region If any two points of a region can be connected by a path (curve) that lies completely inside that region, then the region is called connected.

conservative vector field A vector field F is called conservative, if there exists a scalar function f such that its gradient vector field coincides with F: $F = \nabla f$.

consistent linear system A system of equations which has at least one solution. When the system has no solution, it is called inconsistent.

constant A quantity that does not change, usually some type of number.

constant multiple rule See differentiation rules.

constraint A term used as a synonym of restriction. The constraints could be on an independent variable as well as on a function itself. Example: Find the maximum value of the function $f(x, y) = 2x^2 - 3xy + y^2$ under the constraint $x^2 + y^2 = 1$.

continued fraction An expression of the form

$$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cdots}}.$$

A continued fraction might be finite or continue indefinitely.

continued fraction expansion Representation of a number or a function as a finite or infinite continued fraction. Example:

$$\sqrt{2} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}.$$
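The expansion of $\sqrt{2}$ given above can be explored numerically. The short Python sketch below truncates the continued fraction after a chosen number of 2's and evaluates it exactly with rational arithmetic. The function name `sqrt2_convergent` and the parameter `depth` are introduced here only for illustration.

```python
from fractions import Fraction


def sqrt2_convergent(depth: int) -> Fraction:
    """Evaluate 1 + 1/(2 + 1/(2 + ...)) keeping `depth` copies of the 2."""
    tail = Fraction(0)
    # Build the fraction from the innermost level outwards.
    for _ in range(depth):
        tail = 1 / (2 + tail)
    return 1 + tail


for depth in range(1, 7):
    value = sqrt2_convergent(depth)
    print(depth, value, float(value))
```

The truncated fractions 3/2, 7/5, 17/12, ... approach $\sqrt{2} \approx 1.41421$, alternating between values slightly above and slightly below it.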

continuity of a function The function f(x), defined on some interval [a, b], is continuous at a point c ∈ (a, b), if

$$\lim_{x \to c} f(x) = f(c).$$

continuity on an interval The function is continuous on an interval, if it is continuous at each point of that interval. If the function is defined on a closed interval [a, b] then for the continuity at the end-points a and b see the entries continuity from the right and continuity from the left respectively below.

continuity from the left The function f(x) is continuous from the left at some point c, if at that point the left-hand limit exists and is equal to the value of the function at that point: $\lim_{x \to c^-} f(x) = f(c)$.

continuity from the right The function f(x) is continuous from the right at some point c, if at that point the right-hand limit exists and is equal to the value of the function at that point: $\lim_{x \to c^+} f(x) = f(c)$. See continuity of a function.

continuously compounded interest When a given amount of money is invested and the interest is paid not only on the principal amount but also on the interest earned on the principal, and the number of payments is unlimited, then it is called continuously compounded interest. If the principal amount is P and the interest rate is r, then the amount of money after t years is given by the formula $A = Pe^{rt}$. See also compound interest.

contraction operator Or contraction transformation. An operator that, applied to a vector, does not change its direction but makes its magnitude smaller. The operator is given by the formula T x = kx, where 0 ≤ k ≤ 1. In the case k ≥ 1 the transformation is called a dilation operator. For a more general case when not only the magnitude but also the direction is changed, see expansion operator.

contrapositive See conditional statement.

convenience sampling A type of sampling when the samples are taken based on the convenience of the person taking them. This method is not considered scientifically reliable or unbiased.

convergence For specific definitions see convergent integrals, convergent sequences, convergent series. See also absolutely convergent series, conditionally convergent series.

convergent integral A definite integral with finite value. For proper integrals this means that the Riemann sums have a finite limit. The same applies also to improper integrals.

convergent sequence A numeric sequence that approaches a finite value. The formal definition is: Let $\{a_n\}_{n=1}^\infty$ be a sequence of (real or complex) numbers. The sequence converges to some number A, if for any given ε > 0 there exists an N > 0, such that $|a_n - A| < \varepsilon$ whenever n > N. In an equivalent notation, $\lim_{n \to \infty} a_n = A$.
Example: The sequence {1 + 1/n} converges to 1 as n → ∞.

convergent series A numeric series is convergent if the partial sums of that series $S_n = \sum_{k=1}^n a_k$ form a convergent sequence. More precisely, if for any given ε > 0 there exists a number N such that $|S_n - L| < \varepsilon$ for any n > N, we say that the series converges to L. Example: The series

$$\sum_{n=1}^{\infty} \frac{1}{n^p}$$

is convergent for any value p > 1. The same definition is valid also for other kinds of series, such as power series.

converging solutions A term commonly used to describe approximate solutions that have the property of approaching the exact solution during a certain limiting process. As an example see Newton's method for conditions when approximate solutions approach the exact solution.

converse See conditional statement.

convex function A differentiable function on some interval is convex, if its derivative is an increasing function. Geometrically this property means that the tangent line to the graph of the function at every point is always below the graph. Convex functions also are called concave up.

convex set A set in Euclidean space with the property that for any two points in that set, the line segment connecting those points lies completely in the set.

convolution integral For two functions defined on the real axis R, the formal integral

$$(f * g)(x) = \int_{-\infty}^{\infty} f(t)g(x - t)\,dt.$$

Convolution is commutative in the sense that f ∗ g = g ∗ f. Another important property of convolution is that the Laplace transform translates convolution into multiplication: Let Lf(x) and Lg(x) be Laplace transforms of f and g respectively. Then L(f ∗ g)(x) = Lf(x) · Lg(x).

coordinate axes See Cartesian coordinate system.

coordinate matrix If $S = \{v_1, v_2, \dots, v_n\}$ is a basis in some vector space V then any vector v from V could be written as

$$v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n.$$

The n × 1 column-matrix that consists of the scalars $c_1, c_2, \dots, c_n$ is called the coordinate matrix of the vector v with respect to the basis S.

coordinate plane The plane that is determined by any pair of coordinate axes. See Cartesian coordinate system.

coordinate vector The same as coordinate matrix, only written in the form of a 1 × n row-matrix.

coordinate system Any of the methods to identify points on the plane or in space with an ordered pair or an ordered triple. See the coordinate systems Cartesian, cylindrical, polar, rectangular, spherical.

coordinates In any coordinate system the ordered pairs or triples corresponding to a point. The same point may have different coordinates in different coordinate systems. Example: The point with coordinates $(-2, 2\sqrt{3})$ in the Cartesian system has coordinates (4, 2π/3) in the polar system.

coplanar vectors Two or more vectors in the Euclidean space $R^n$ that lie on the same plane.

correlation coefficient Suppose we have data in the form of ordered pairs (x, y) and there is some kind of relationship between x and y-variables. The correlation coefficient measures the strength of that relationship in numerical terms. To calculate that coefficient, usually denoted by r, we first normalize the x and y-values by finding their z-scores denoted by $z_x$ and $z_y$ respectively. The sum

$$r = \frac{\sum z_x z_y}{n - 1}$$

is the correlation coefficient. It is defined in a way that −1 ≤ r ≤ 1. A positive r indicates positive correlation and a negative r indicates negative correlation. The closer r is to 1 or −1, the stronger is the linear relationship between the x and y-values.

corresponding angles Corresponding angles are formed when a line crosses two coplanar lines. The corresponding angles are not necessarily congruent, but they are if the coplanar lines are also parallel.
[Figure: parallel lines a and b crossed by the transversal t, with congruent angles marked.]

cosecant function One of the six trigonometric functions. Geometrically, the cosecant of an angle in a right triangle is the ratio of the hypotenuse of the triangle to the opposite side. The cosecant could also be defined as the reciprocal of the sine function. The function csc x could be extended to all real values exactly as the sin x function is extended. The domain of csc x is all real values, except x = πn, n any integer, and the range is (−∞, −1] ∪ [1, ∞). csc x is 2π-periodic.
The cosecant function is related to other trigonometric functions by many identities, the most important of these are csc x = 1/sin x, $1 + \cot^2 x = \csc^2 x$. The derivative and integral of this function are given by the formulas

$$\frac{d}{dx}\csc x = -\csc x \cot x, \qquad \int \csc x\,dx = \ln|\csc x - \cot x| + C.$$

cost function The total cost to a company to produce x units of a certain product. Usually denoted by C(x). The derivative of this function is called the marginal cost function. See also average cost function, revenue function.
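The derivative and integral formulas for the cosecant function given above can be spot-checked numerically. The following Python sketch compares a central finite-difference slope with the stated formulas at one sample point. The point x = 1.2, the step h, and the helper names are arbitrary choices made for this illustration.

```python
import math


def csc(x: float) -> float:
    return 1.0 / math.sin(x)


def cot(x: float) -> float:
    return math.cos(x) / math.sin(x)


def csc_antiderivative(x: float) -> float:
    """ln|csc x - cot x|, the antiderivative stated in the entry (C = 0)."""
    return math.log(abs(csc(x) - cot(x)))


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 1.2  # sample point where sin x is not zero
derivative_error = abs(central_difference(csc, x) - (-csc(x) * cot(x)))
integral_error = abs(central_difference(csc_antiderivative, x) - csc(x))
print(derivative_error, integral_error)
```

Both printed errors should be very small, which is consistent with the two formulas; a check at one point is of course not a proof.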

cosine function One of the six trigonometric functions. Geometrically, the cosine of an angle in a right triangle is the ratio of the adjacent side to the hypotenuse of the triangle. A more general approach to extend the cos x function for any real number x is as follows: Let P = (a, b) be any point on the plane other than the origin and let θ be the angle formed by the x-axis and the terminal side, connecting the origin and P. Then $\cos\theta = \frac{a}{\sqrt{a^2 + b^2}}$. Next, after establishing one-to-one correspondence between angles and real numbers, we can have the cosine function defined for all real numbers. The range of cos x is [−1, 1] and it is 2π-periodic.
The cosine function is related to other trigonometric functions by various identities. The most important is the Pythagorean identity $\sin^2 x + \cos^2 x = 1$. The derivative and indefinite integral of this function are:

$$\frac{d}{dx}(\cos x) = -\sin x, \qquad \int \cos x\,dx = \sin x + C.$$

cotangent function One of the six trigonometric functions. Geometrically, the cotangent of an angle in a right triangle is the ratio of the adjacent side of the triangle to the opposite side. It could also be defined as the reciprocal of the tangent function. The function cot x could be extended as a function of all real numbers except for x = πn, n any integer, and the range is all of R. The cot x function is π-periodic.
The cotangent function is related to the other trigonometric functions by various identities. The most important of these are the identities cot x = cos x/sin x, cot x = 1/tan x and a version of the Pythagorean identity $1 + \cot^2 x = \csc^2 x$. The derivative and integral of this function are given by the formulas

$$\frac{d}{dx}\cot x = -\csc^2 x, \qquad \int \cot x\,dx = \ln|\sin x| + C.$$

coterminal angles Two angles in standard position which have the same terminal side (see the entry angle for explanation of terms above). Any two coterminal angles differ in size by an integer multiple of 360° (in degree measure) or 2π (in radian measure).

counting numbers Another name for natural numbers.

Cramer's rule One of the methods of solving systems of linear algebraic equations which involves the use of determinants. For the system of n equations with n unknowns

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned}$$

the solution is given by the formulas $x_k = \det(A_k)/\det(A)$, 1 ≤ k ≤ n. Here A is the matrix of the given system of equations and the matrices $A_k$ are found from A by removing the kth column and substituting it by the column of constants $[b_1, b_2, \dots, b_n]$. This formula works if and only if the determinant det(A) ≠ 0.
Example: Solve the system

$$\begin{aligned}
x_1 + 2x_3 &= 6\\
-3x_1 + 4x_2 + 6x_3 &= 30\\
-x_1 - 2x_2 + 3x_3 &= 8
\end{aligned}$$

The four determinants for this system are:

$$|A| = \begin{vmatrix} 1 & 0 & 2\\ -3 & 4 & 6\\ -1 & -2 & 3 \end{vmatrix}, \quad
|A_1| = \begin{vmatrix} 6 & 0 & 2\\ 30 & 4 & 6\\ 8 & -2 & 3 \end{vmatrix}, \quad
|A_2| = \begin{vmatrix} 1 & 6 & 2\\ -3 & 30 & 6\\ -1 & 8 & 3 \end{vmatrix}, \quad
|A_3| = \begin{vmatrix} 1 & 0 & 6\\ -3 & 4 & 30\\ -1 & -2 & 8 \end{vmatrix}.$$

Calculating these determinants, we get $|A| = 44$, $|A_1| = -40$, $|A_2| = 72$, $|A_3| = 152$. Now, Cramer's rule gives the solution (−10/11, 18/11, 38/11).

critical point Also called critical number or value. For a continuous function f(x) on some interval I, a point c ∈ I is critical, if f′(c) = 0 or f′(c) does not exist. Critical points are important because if a function has a local extremum at some point in its domain, then that point is a critical point. The opposite is not always true. Examples: (1) For the function f(x) = x − 2 sin x on [0, 2π] the critical points are x = π/3, x = 5π/3. The first one is the local minimum and the second one the local maximum of the function on the given interval. (2) For the function

$$f(x) = \begin{cases} 2x & \text{if } x \ge 0\\ x & \text{if } x < 0 \end{cases}$$

the only critical point is x = 0 but at that point the function has neither a minimum nor a maximum value.
The notion of a critical point is also extended to functions of several variables. A point $(c_1, c_2, c_3)$ is a critical point of a function f(x, y, z) of three variables if either all first partial derivatives are equal to zero or some of the first partial derivatives do not exist at that point. As in the case of the functions of one variable, here also the local maximum and minimum values are possible at critical points only, but the opposite is not true: not all critical points are local minimum or maximum points for the function. These kinds of points are called saddle points.

critical value In statistics. A numeric value associated with any distribution that depends on the type of distribution and the confidence level desired. For the standard normal distribution the critical values corresponding to the most common confidence levels 90%, 95%, 99% are $z^* = 1.645, 1.96, 2.576$ respectively.

cross product For two vectors in three dimensional space only. Let $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$ be vectors in $R^3$. Then their cross product is defined to be another vector from the same space, given by the formula

$$u \times v = (u_2 v_3 - u_3 v_2,\; u_3 v_1 - u_1 v_3,\; u_1 v_2 - u_2 v_1).$$

This vector could be expressed as a determinant:

$$u \times v = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\ u_1 & u_2 & u_3\\ v_1 & v_2 & v_3 \end{vmatrix}.$$

The cross product is not commutative: u × v = −v × u.

cubic equation An equation of the form p(x) = 0, where p is a cubic polynomial. By the Fundamental Theorem of Algebra, any cubic equation has exactly three solutions counting multiplicities of zeros. Similar to quadratic equations, any cubic equation could be solved by a formula involving radicals, however this formula is very complicated and difficult to use. Here is a short demonstration of that formula and its deduction.
To solve the general cubic equation

$$\alpha x^3 + \beta x^2 + \gamma x + \delta = 0, \quad \alpha \ne 0, \qquad (1)$$

we divide it by α and get a monic equation (equation with leading coefficient equal to 1)

$$x^3 + ax^2 + bx + c = 0. \qquad (2)$$

Denoting x = y − a/3 we reduce this equation to a simpler equation of the form

$$y^3 + py + q = 0, \qquad (3)$$

where $p = b - a^2/3$, $q = c - ab/3 + 2a^3/27$. Let r denote one of the two complex cubic roots of unity: $r = -1/2 + \sqrt{3}\,i/2$ or $r = -1/2 - \sqrt{3}\,i/2$, and let $\Delta = \sqrt{-(27q^2 + 4p^3)}$, $A = (-27q + 3i\sqrt{3}\,\Delta)/2$, $B = (-27q - 3i\sqrt{3}\,\Delta)/2$. The choice of the two possible square roots for Δ and the three possible cubic roots of A and B must be made so that the product of the chosen cubic roots satisfies $A^{1/3}B^{1/3} = -3p$. With this notation, the three roots of equation (3) are given by the following formulas:

$$y_1 = \frac{A^{1/3} + B^{1/3}}{3}, \qquad y_2 = \frac{rA^{1/3} + r^2 B^{1/3}}{3}, \qquad y_3 = \frac{r^2 A^{1/3} + rB^{1/3}}{3}.$$

To find the solutions of equation (2), we need to make the back substitution x = y − a/3. The solutions of equation (1) are obviously the same as those of equation (2). Note also that at least one of the solutions of the equations (1)-(3) is always real.
Example: For the equation $x^3 - 8x - 3 = 0$ the solutions are 3, $\frac{-3 + \sqrt{5}}{2}$, $\frac{-3 - \sqrt{5}}{2}$.

cubic polynomial A polynomial of the third degree $p(x) = ax^3 + bx^2 + cx + d$. In the particular case, the function $f(x) = ax^3$ is called the cubic function.

cubic root Formally, the inverse of the cubic function. A given number a is the cubic root of another number, b, if $a^3 = b$. The notation is $a = \sqrt[3]{b}$.

curl of a vector field Let F = P i + Q j + R k be a vector field in $R^3$ and assume that all partial derivatives of the components P, Q, R exist. Then the vector function

$$\operatorname{curl} F = \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right)\mathbf{i} + \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right)\mathbf{j} + \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\mathbf{k}$$

is the curl of the vector field. See also divergence of a vector field and gradient vector field.

curvature Let C be a smooth curve defined by some vector function r = r(t) and let s = s(t) be that curve's arc length. If T = r′/|r′| is the unit tangent vector to the curve then the curvature of the curve is given by the formula

$$\kappa = \left|\frac{dT}{ds}\right|.$$

curve (1) A plane curve is a set of all ordered pairs (f(t), g(t)), where f and g are continuous functions defined on some interval I = [α, β]. If no points repeat (i.e., if $(f(t_1), g(t_1)) \ne (f(t_2), g(t_2))$ for two values $t_1 \ne t_2$), then the curve is called simple. If (f(α), g(α)) = (f(β), g(β)), then we have a closed curve. If the functions defining the curve are differentiable, the curve is called smooth. Accordingly, if they are piece-wise differentiable then the curve is called piece-wise smooth. A plane curve could be also given by a polar equation. See also boundary curve.
(2) A curve in $R^3$ is a set of all ordered triples (f(t), g(t), h(t)) with all functions continuous on some interval I. The definitions of closed, simple, and smooth curves remain the same as above.
For calculations of lengths of curves see length of parametric curve, length of polar curve, length of a space curve.

cylinder Geometrically, a cylinder is a three dimensional surface that consists of lines that are parallel to some given line and pass through some closed plane curve. The most commonly used cylinders are right circular cylinders that pass through a circle on the plane and are perpendicular to that plane. Algebraically, a right circular cylinder is given by the formula $x^2 + y^2 = r^2$, where the third variable z is free to assume any real values.
Another example of a cylinder is a right elliptic cylinder, given by the equation of an ellipse $x^2/a^2 + y^2/b^2 = 1$, where the variable $z$ is again free.

cycloid A plane curve that appears as a trace of a point on a circle, when the circle rolls on a straight line. The parametric equations for the cycloid are given by

$$x = a(\theta - \sin\theta), \quad y = a(1 - \cos\theta),$$

where $a$ is the radius of the circle and $\theta$ is the angle formed by two radii in the circle, one connecting the center with the tracing point on the cycloid and the other connecting the center with the point where the circle touches the line.

cylindrical coordinates A coordinate system in three dimensional space that uses polar coordinates in two dimensions and the rectangular coordinate in the third dimension. The relations are given by the formulas

$$x = r\cos\theta, \quad y = r\sin\theta, \quad z = z.$$

Here $r^2 = x^2 + y^2$ and the angle is determined from the equation $\tan\theta = y/x$.

cylindrical shell A geometric solid which is the difference of two concentric circular cylinders with the same axis but different radii. For applications of cylindrical shells see .
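The cylindrical coordinate relations and the cycloid equations above can be checked numerically. The following is a minimal sketch only; the function names are hypothetical, and `math.atan2` is used in place of solving $\tan\theta = y/x$ directly so that the quadrant of the angle is resolved.

```python
import math

def cylindrical_to_rectangular(r, theta, z):
    # x = r cos(theta), y = r sin(theta), z = z
    return (r * math.cos(theta), r * math.sin(theta), z)

def rectangular_to_cylindrical(x, y, z):
    # r^2 = x^2 + y^2; atan2 picks the angle with tan(theta) = y/x
    # in the correct quadrant (an assumption added for this sketch).
    return (math.hypot(x, y), math.atan2(y, x), z)

def cycloid_point(a, theta):
    # x = a(theta - sin(theta)), y = a(1 - cos(theta))
    return (a * (theta - math.sin(theta)), a * (1 - math.cos(theta)))

if __name__ == "__main__":
    print(rectangular_to_cylindrical(1.0, 1.0, 2.0))
    print(cycloid_point(1.0, math.pi))
```

After one full turn of the rolling circle ($\theta = 2\pi$) the tracing point is back on the line, at distance $2\pi a$ from where it started.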

D

damped vibration If an object oscillates and another force (such as friction) affects its motion by decreasing its amplitude, then this results in a damped vibration of the object. This kind of motion is described by the differential equation

$$my'' + cy' + ky = 0,$$

where $c$ is the damping constant.

data A set of values collected in the process of sampling or a census. The values of the data could be numerical (quantitative) or categorical (qualitative).

data analysis A collection of statistical methods designed to evaluate and draw conclusions from a body of data. For specific methods see, e.g., confidence interval, hypothesis testing, analysis of variance, etc.

decimal numbers The numeric system with base ten. This system uses ten digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and the place value system, which takes into consideration the position of the digit along with its value, to write all real numbers. For example, in the number 3531 the two 3's have different values because of being in different positions. The first 3 represents 3000 and the second one 30. To represent fractional and even irrational numbers in the decimal system, we use the decimal point (in many countries a comma is used), which separates the whole part of the number from the fractional part. In the number 43.587 the 43 is the whole part and the digits after the point represent the fractional part. Hence, 43.587 = 43 + 0.587. Any fraction (rational number) could be written as either a terminating or a non-terminating, repeating decimal. On the other hand, any irrational number can be written as an infinite non-repeating decimal. Examples: $3/8 = 0.375$, $1/3 = 0.333\ldots$, $\sqrt{2} = 1.414213562\ldots$ See also binary numbers.

deciphering matrix The matrix used to decipher a coded message, usually the inverse of the matrix that coded the message.

decomposition of matrices Writing a given matrix as a sum or product of two or more matrices. For a specific way of splitting a matrix see LU (lower upper)-decomposition.

decreasing function The function $f(x)$ defined on some interval $I$ is decreasing, if for any two points $x_1, x_2 \in I$, $x_1 < x_2$, we have $f(x_1) > f(x_2)$. The function $f(x) = 2^{-x}$ is an example of a decreasing function.

decreasing sequence A sequence of real numbers $\{a_n\}$, $n \geq 1$, is decreasing, if $a_m < a_k$ whenever $m > k$. The sequence $a_n = 1/n^2$ is an example of a decreasing sequence.

definite integral (1) Let the function $f(x)$ be defined and continuous on some finite interval $[a, b]$. The definite integral of a function could be defined in many different but equivalent ways and we present some of these definitions.
Let us divide the interval into $n$ equal parts of size $\Delta x = (b - a)/n$. Denote the endpoints of these parts by $x_0 (= a), x_1, x_2, \ldots, x_n (= b)$ and choose arbitrary points $x_1^*, x_2^*, \ldots, x_n^*$ in each of these smaller intervals $[x_{i-1}, x_i]$, $1 \leq i \leq n$. Then the definite integral of $f$ on the interval $[a, b]$ is the limit

$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^n f(x_i^*)\,\Delta x$$

if the limit exists. In special cases of this definition the sample points $x_i^*$ can be chosen to be the left endpoints or right endpoints, or the midpoints of intervals. They all are equivalent to the more general definition.
The sum in the definition of the integral is called a Riemann sum and the definite integral is called the Riemann integral. The calculations for definite integrals are done primarily by the use of the Fundamental Theorem of Calculus, rather than using the definition directly. Examples:

$$\int_1^4 4x^3\,dx = x^4\Big|_1^4 = 256 - 1 = 255,$$

$$\int_0^\pi \sin x\,dx = -\cos x\Big|_0^\pi = -(-1) - (-1) = 2.$$
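The Riemann-sum definition of the definite integral can be tried out numerically on the two examples above. This is a small illustrative sketch; the function name `riemann_sum` and its `rule` parameter are assumptions made for the example, not notation from the dictionary.

```python
import math

def riemann_sum(f, a, b, n, rule="midpoint"):
    """Riemann sum of f on [a, b] with n equal parts.

    rule selects the sample point x_i* in each subinterval:
    "left", "midpoint", or "right".
    """
    dx = (b - a) / n
    offset = {"left": 0.0, "midpoint": 0.5, "right": 1.0}[rule]
    return sum(f(a + (i + offset) * dx) for i in range(n)) * dx

if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The values approach 255 and 2 as n grows.
        print(n,
              riemann_sum(lambda x: 4 * x**3, 1, 4, n),
              riemann_sum(math.sin, 0, math.pi, n))
```

All three choices of sample points approach the same limit as $n$ grows; only the speed of convergence differs.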

For a positive function $f(x) \geq 0$ on $[a, b]$ the definite integral is just the area under the graph of this function. More generally, the definite integral is the difference of the areas above and below the x-axis for the graph of a given function.
(2) This definition could be used to include also functions with a finite number of jump discontinuities. The notion of the definite (Riemann) integral could be generalized to include more general functions, such as those with infinite discontinuities or defined on infinite intervals. See improper integral.
(3) Further generalizations of the definite integral resulted in the development of other types of integrals (Stieltjes, Lebesgue, Denjoy, etc.).

definite integration The process of calculating the definite integral of some function.

degenerate conic sections Special cases of conic sections, when the plane cutting the cone produces a point or a pair of intersecting lines. Algebraically, these cases correspond respectively to the equations

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 0, \quad y^2 = a, \quad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 0.$$

degree of freedom If $n$ unknowns (or variables) $x_1, x_2, \ldots, x_n$ are connected by one relation, then only $n - 1$ of them can be chosen arbitrarily, whereas the $n$th one is dependent on the other choices. For that reason we say that the degree of freedom of these variables is $n - 1$. If the same unknowns are connected by two relations, then the degree of freedom would be $n - 2$, and so on.

degree of a polynomial For a general polynomial given by the formula

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

with $a_n \neq 0$, the whole number $n \geq 0$. Example: For the polynomial $p(x) = 3x^5 - 2x^4 + 5x^2 - 6$ the degree is 5.

degree measure of an angle One of the two main units for measuring an angle (the other unit is called radian measure). In this system the circumference of any circle is divided into 360 equal arcs and the angle subtended on one of those arcs is said to have the measure $1^\circ$. Hence, the right angle would have $90^\circ$ and the straight angle $180^\circ$. See also angle.

demand function Also called price function. The price of the product when a company wants to sell $x$ units of that product, usually denoted by $p(x)$. The graph of a demand function is called a demand curve. Closely related are revenue function, profit function.

DeMoivre's theorem Let the complex number $z$ be written in trigonometric form: $z = re^{i\theta} = r(\cos\theta + i\sin\theta)$. Then for any integer $n$

$$z^n = r^n e^{in\theta} = r^n(\cos n\theta + i\sin n\theta).$$

denominator (1) For the numeric fraction $\frac{a}{b}$, the number $b$. (2) For the rational function $\frac{p(x)}{q(x)}$, the polynomial $q$. (3) For any expression of the form $\frac{f(x)}{g(x)}$, the function $g(x)$.

dense set A subset of real numbers is dense, if for any point $a$ of that set there are infinitely many other points $b$ of the same set, arbitrarily close to $a$. Formally, the set $S$ in $R$ is dense, if for any point $a \in S$ and any number $\varepsilon > 0$, there exists another point $b \in S$, such that $|a - b| < \varepsilon$. Examples: The set of all rational numbers is dense, but the set of all integers is not.

density function Also called probability density function. A function that represents some type of distribution. The graph of a density function is called a density curve. A density function must be non-negative, and the area under a density curve must be exactly 1, since it represents the total probability. Symbolically, a function $f(x)$ is a density function if $f(x) \geq 0$ and

$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$

dependent variable In an equation $y = f(x)$ the variable $y$. The variable $x$ is called the independent variable.

derivative Let the function $f(x)$ be defined on some interval $(a, b)$. The derivative of the function at some point $x = c$ is defined as the limit

$$f'(c) = \lim_{x\to c} \frac{f(x) - f(c)}{x - c},$$

if this limit exists. The derivative of a function is a function itself given by

$$f'(x) = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h},$$

where $x$ is any point in the domain of the function such that this limit exists. The set of all these points is the domain of the derivative. Some other common notations for the derivative of $y = f(x)$ are:

$$y', \quad \frac{dy}{dx}, \quad \frac{df}{dx}, \quad D_x f, \quad Df(x).$$

In practice, derivatives are calculated by the differentiation rules, rather than using the definition directly. Examples:

$$(x^n)' = nx^{n-1}, \quad (\tan^{-1} x)' = \frac{1}{1 + x^2}, \quad (e^x)' = e^x.$$

See also left-hand derivative, right-hand derivative. For derivatives of functions of several variables see partial derivatives, directional derivatives, normal derivatives.

derivative of a composite function See chain rule.

derivative of indefinite integral Let $f$ be a continuous function on some interval $[a, b]$ and $F(x) = \int_a^x f(t)\,dt$. Then $F'(x) = f(x)$ for $a < x < b$. This result is one case of the Fundamental Theorem of Calculus. This statement has far reaching generalizations for functions $f(x)$ other than continuous functions.

derivative of the inverse function Let $f(x)$ be a differentiable function on some interval $I$ and let $g(x)$ be its inverse. Then, if for some point $a$ in the domain of $g$ we have $f'(g(a)) \neq 0$, the derivative of $g$ at that point is given by

$$g'(a) = \frac{1}{f'(g(a))}.$$

This formula makes it possible to calculate derivatives of functions if we know the derivatives of their inverse. Example:

$$\frac{d}{dx}(\sin^{-1} x) = \frac{1}{\sqrt{1 - x^2}},$$

because we know that $(\sin x)' = \cos x$ and $\cos(\sin^{-1} x) = \sqrt{1 - x^2}$.

derivative of a vector function See differentiation of vector functions.

Descartes' rule of signs Let the polynomial of degree $n \geq 1$ be written in the standard form

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

and $a_0 \neq 0$. Then (1) The number of positive zeros of $p$ is either equal to the number of sign changes of coefficients of $p(x)$ or is less than that number by an even integer; (2) The number of negative zeros of $p$ is either equal to the number of sign changes of coefficients of $p(-x)$ or is less than that number by an even integer. Example: The polynomial $p(x) = 2x^3 - 5x^2 + 6x - 4$ has three sign changes, so it may only have three or one positive zeros (roots). Next, $p(-x) = -2x^3 - 5x^2 - 6x - 4$ has no sign changes, so $p$ cannot have any negative zeros.

determinant A number associated with any square matrix. For a given matrix $A$ the notations for the corresponding determinant are $|A|$ or $\det(A)$. The value of the determinant is defined recursively as follows: If the matrix is $1 \times 1$, $A = (a)$, then $|A| = a$. For the $2 \times 2$ matrix

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

$|A| = ad - cb$. In general, for the $n \times n$ matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix}$$

the determinant is equal to

$$|A| = \sum_{j=1}^n a_{1j} C_{1,j},$$

where $C_{1,j}$ are the first row cofactors. This presentation is called expansion by the first row. It is possible to expand the determinant by any row or column and the formulas are similar. Example:

$$\begin{vmatrix} -2 & 2 & 3 \\ 1 & -3 & -4 \\ -4 & 0 & 1 \end{vmatrix} = (-2)\begin{vmatrix} -3 & -4 \\ 0 & 1 \end{vmatrix} \ldots$$

... that eigenvalues be the elements on the diagonal of $D$. The method of diagonalization is also used in solving homogeneous or non-homogeneous differential equations.

diameter For a circle, the line segment passing

digits In the decimal numeric system the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9.

dilation operator Or dilation transformation. An operator that, applied to a vector, does not change its direction but makes its norm bigger. The operator is given by the formula $Tx = kx$, where $k \geq 1$. In the case $0 \leq k \leq 1$ the transformation is called contraction operator. For a more general case when not only the norm but also the direction could be changed, see expansion operator.

dimension of vector space If a vector space $V$ has a system of linearly independent vectors $\{v_1, v_2, \ldots, v_n\}$ that span $V$, then the number $n$ is the dimension of the vector space. Also, any basis of $V$ has the same number of vectors. In the case when there is no finite basis, the space is called infinite dimensional.

dimension theorem for linear transformations Similar to the dimension theorem for matrices. If $T$ is a linear transformation from the $n$-dimensional vector space $V$ to some other vector space $W$, then

$$\mathrm{rank}(T) + \mathrm{nullity}(T) = n.$$

dimension theorem for matrices If $A$ is a matrix with $n$ columns, then

$$\mathrm{rank}(A) + \mathrm{nullity}(A) = n.$$

Here $\mathrm{rank}(A)$ is the rank of the matrix and $\mathrm{nullity}(A)$ is the dimension of the kernel of the matrix.

Dirac function A generalized "function", formally defined for any real point $c$ by the conditions:

$$\delta(t - c) = 0, \ t \neq c, \qquad \int_{-\infty}^{\infty} \delta(t - c)\,dt = 1.$$

The Laplace transform of this function could be calculated in a generalized sense and is equal to $e^{-sc}$. Additionally, convolution of any function with the Dirac function results in the value at point $c$:

$$\int_{-\infty}^{\infty} \delta(t - c) f(t)\,dt = f(c).$$

This function plays a very important role in theoretical physics.

direct variation The name of many possible relations between two variables. The most common are: (1) $y$ varies directly with $x$ means that there is a real number $k \neq 0$, such that $y = kx$; (2) $y$ varies directly with $x^2$ means $y = kx^2$; (3) $y$ varies directly with $x^3$ means $y = kx^3$. There are many other possibilities but they are rarely used. Compare also with inverse variation and joint variation.

directed line segment The same as vector.

directional derivative For functions of two real variables. Let $\mathbf{u} = (a, b)$ be a unit vector in the plane and $f(x, y)$ be a function. The derivative of $f$ in the direction of the vector $\mathbf{u}$ at some point $(x_0, y_0)$ (denoted by $D_{\mathbf{u}} f(x_0, y_0)$) is the limit

$$\lim_{h\to 0} \frac{f(x_0 + ha, y_0 + hb) - f(x_0, y_0)}{h},$$

if it exists. The definition for functions of three or more variables is similar.

direction angles The angles any nonzero vector $\mathbf{v}$ makes with the coordinate axes are called direction angles. The cosines of these angles are the direction cosines. Hence, if $\alpha$ is the angle between $\mathbf{v}$ and the x-axis, then

$$\cos\alpha = \frac{\mathbf{v} \cdot \mathbf{i}}{|\mathbf{v}||\mathbf{i}|}$$

and similarly for the other two directions. If a vector is used to describe the direction of some line then the direction cosines are also called direction numbers.

direction field For differential equations such as the equation $y' = f(t, y)$. We can make a direction field by choosing several points on the coordinate plane and by drawing a small line segment from each point with the slope equal to the value of $f$ at that point. This method is a useful tool when trying to get an idea about the behavior of a solution which is difficult (even impossible) to find.

directrix In the geometric definition of the parabola, the name of the fixed line with the following property: A parabola is the set of all points on a plane that are equidistant from a fixed point (called focus) and a fixed line not containing that point, which is called directrix. In a more general view, a directrix could be defined also for ellipses and hyperbolas. See eccentricity.

discontinuity Property of a function, opposite to being continuous. A function could be discontinuous for different reasons. For example, the function

$$f(x) = \begin{cases} 1 & \text{if } x \geq 0 \\ -1 & \text{if } x < 0 \end{cases}$$

is discontinuous at the point $x = 0$, because the left-hand and right-hand limits at that point are not equal. The function $f(x) = 1/x$ is discontinuous at the point $x = 0$, because it is not defined there. See also discontinuous function.

discontinuous coefficients Usually applies to differential equations with variable coefficients. In many cases these kinds of equations can be solved (or shown to have a solution) even if the coefficients have some type of discontinuity, usually jump discontinuity.

discontinuous function A function $f(x)$ is discontinuous at some point $x = c$, if one of the following happens. (1) The limit $\lim_{x\to c} f(x)$ does not exist. (2) The limit exists, but does not equal the value of the function at that point: $\lim_{x\to c} f(x) \neq f(c)$. (3) The function is defined around that point, but not at the point.
This property is opposite to the property of being continuous.

discrete variable A variable that does not take its values continuously. Being discrete could be expressed in different ways. For example, if the variable $x$ takes whole values $0, 1, 2, 3, \ldots$ or it takes rational values, then it will be discrete in both situations. As a rule, however, we usually consider random variables taking either whole or integer values.

discriminant For the general quadratic function

$$f(x) = ax^2 + bx + c$$

the quantity $D = b^2 - 4ac$ is the discriminant. Depending on the sign of $D$, the corresponding quadratic equation $f(x) = 0$ has: (1) Two distinct real roots, if $D > 0$; (2) Two real repeated roots, if $D = 0$, or (3) Two complex conjugate roots, if $D < 0$.

disjoint events Two events that cannot happen at the same time. In symbolic form, if $A$ and $B$ are the events, then they are disjoint if $A \cap B = \emptyset$.

disk The term is used as a synonym for circle. The spelling disc is also valid.

displacement Some kind of shift or move from the given position.

distance One of the most important geometric (and also physical) notions. The distance is understood as a measure of how far or close two objects are. Different measuring methods and also different measuring units, such as metric or English measuring units, may be used. In Algebra and Calculus we measure distances by the distance formula or by many other methods, such as the arc length formula, involving integration. See corresponding entries for details.

distance between a point and a line If $(a, b)$ is some point on the plane and the equation $Ax + By + C = 0$ represents a line, then the distance of the point from the line is given by the formula

$$d = \frac{|Aa + Bb + C|}{\sqrt{A^2 + B^2}}.$$

distance between a point and a plane Let the equation of a plane be given by $Ax + By + Cz + D = 0$ and the point $P(x_0, y_0, z_0)$ be not on the plane. The distance now will be given by the formula

$$d = \frac{|Ax_0 + By_0 + Cz_0 + D|}{\sqrt{A^2 + B^2 + C^2}}.$$

distance formula Let $(x_1, y_1)$ and $(x_2, y_2)$ be two points on the plane. Then the distance between them is given by the formula

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$

Similarly, if $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ are two vectors in Euclidean space $R^n$, then the distance is given by a similar formula

$$d = \sqrt{\sum_{k=1}^n (y_k - x_k)^2}.$$

distance problem The problem of finding the distance traveled by a moving object. If the object moves with a constant speed $v$, then the distance $s$ is given by $s = v \cdot t$, where $t$ is the time of travel. In general, if the velocity of the object is variable and is given by the function $f(t)$, then the distance traveled between times $t = a$ and $t = b$ will be given by the integral

$$s = \int_a^b f(t)\,dt.$$

distribution More precisely, probability distribution. For any random event, the outcomes appear with certain frequencies. The combination of all outcomes with their relative frequencies is a distribution. Depending on the types and number of outcomes, distributions are divided into discrete and continuous. Any discrete distribution has only a finite or countable number of outcomes and continuous distributions have infinitely many, but not discrete, outcomes.
(1) Discrete distributions. Every outcome has a probability of appearing, that is either positive or zero. The sum of probabilities of all possible outcomes is 1. Formally, if there are $n$ outcomes $X_1, X_2, \ldots, X_n$ and $P(A)$ indicates the probability of an event $A$, then $P(X_i) \geq 0$, $i = 1, 2, \ldots, n$, and $\sum P(X_i) = 1$. Binomial distribution is one of the most common discrete probability distributions and histograms are their most common visual representations.
(2) Continuous distributions are represented by a density curve which is the graph of a positive function with the property that the area under that curve is 1. The individual probability of any outcome is always zero (because there are infinitely many outcomes), hence the probabilities in question are probabilities of groups of events. This means that if $X$ is any random outcome, then $P(X) = 0$ but, for example, $P(a \leq X \leq b) \geq 0$. Additionally, as in the case of discrete distributions, the probability of the whole space of outcomes is 1. For specific examples of continuous distributions see normal distribution, t-distribution, F-distribution, uniform distribution.

distributive property For any real or complex numbers $a, b, c$, the property

$$a(b + c) = ab + ac,$$

that allows to factor, or to split, algebraic or numeric expressions. Similar properties are valid also for other mathematical objects, such as functions, matrices, etc.

divergence of a vector field For a vector field $\mathbf{F} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$ in $R^3$ the divergence is a scalar function, defined by the equation

$$\mathrm{div}\,\mathbf{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z},$$

if the partial derivatives exist. See also curl of a vector field and gradient vector field.

divergence test for series Let $\{a_n\}$ be a sequence of numbers. If $\lim_{n\to\infty} a_n$ does not exist or if $\lim_{n\to\infty} a_n \neq 0$, then the series $\sum_{n=1}^\infty a_n$ is divergent.

divergence theorem Let $\mathbf{F} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$ be a vector field with continuously differentiable components in some open region of $R^3$. Let $D$ be a region, contained in the domain of $\mathbf{F}$, with the boundary $S$. Then, if $\mathbf{n}$ is the outward normal to $S$, we have the relationship

$$\iint_S \mathbf{F} \cdot \mathbf{n}\,dS = \iiint_D \mathrm{div}\,\mathbf{F}\,dV.$$

Also called Gauss' theorem.

divergent improper integral (1) The improper integral $\int_a^\infty f(x)\,dx$ is divergent, if the limit

$$\lim_{A\to\infty} \int_a^A f(x)\,dx$$

does not exist or is infinite. The definitions for integrals $\int_{-\infty}^b f(x)\,dx$ and $\int_{-\infty}^\infty f(x)\,dx$ are similar.
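To see the limit in part (1) of the divergent improper integral entry at work, one can compare two integrals over $[1, A]$ whose values are known in closed form. The sketch below is only an illustration: the two functions and their names are chosen for this example and do not come from the dictionary. It uses the standard antiderivatives $\ln x$ for $1/x$ and $-1/x$ for $1/x^2$.

```python
import math

def integral_of_reciprocal(A):
    # Integral of 1/x from 1 to A equals ln(A); it grows without bound,
    # so the improper integral over [1, infinity) is divergent.
    return math.log(A)

def integral_of_reciprocal_square(A):
    # Integral of 1/x^2 from 1 to A equals 1 - 1/A; it tends to 1,
    # so the improper integral over [1, infinity) is convergent.
    return 1.0 - 1.0 / A

if __name__ == "__main__":
    for A in (10, 1_000, 1_000_000):
        print(A, integral_of_reciprocal(A), integral_of_reciprocal_square(A))
```

The first column of values keeps growing as $A$ increases, while the second settles near 1.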

(2) For improper integrals of unbounded functions on bounded intervals, $\int_a^b f(x)\,dx$, where $f$ has an infinite limit at the point $x = b$. We call that integral divergent, if the limit

$$\lim_{\delta\to 0^+} \int_a^{b-\delta} f(x)\,dx$$

does not exist or is infinite. The definitions for the cases when the point of unboundedness of the function is the left endpoint or some inside point are similar.

divergent power series Each power series

$$\sum_{n=0}^\infty c_n (x - a)^n$$

has an interval of convergence given by $|x - a| < R$, where $R \geq 0$. For any values of $x$ with $|x - a| > R$, the series will be divergent. In the case when $R = 0$, the series will be convergent at one point only, $x = a$. See also convergent series.

divergent sequence A sequence where the terms do not go to any particular finite limit. The opposite of the convergent sequence. Examples: The sequences $1, -1, 1, -1, \ldots$ and $1, 4, 9, \ldots, n^2, \ldots$ are divergent.

divergent series A numeric series $\sum_{n=1}^\infty a_n$ is divergent, if the sequence of its partial sums $S_n = \sum_{i=1}^n a_i$ does not have a finite limit, i.e. is a divergent sequence. The opposite of convergent series. Examples: The harmonic series is divergent, because its partial sums grow indefinitely large. The series $1 - 1 + 1 - 1 + 1 - \cdots$ is divergent, because its partial sums do not approach any particular limit.
The notion of divergence applies also to functional series such as power series or Fourier series. A functional series is divergent at some point if the numeric series formed by the values of the functional series at that particular point is divergent.

dividend In the division process of two numbers, the number that is getting divided. The other number is called divisor.

divisible polynomial A polynomial that can be divided by another polynomial. By the Fundamental Theorem of Algebra, any polynomial of degree two or higher is divisible by another polynomial, with possibly complex coefficients. If we restrict division to be by real polynomials only, then not all the polynomials will be divisible. Examples: The polynomial $p(x) = x^2 - 5x + 6$ is divisible by the real polynomials $x - 2$ and $x - 3$. The polynomial $q(x) = x^2 + 4$ is not divisible by any real polynomial but it is divisible by the complex polynomials $x + 2i$ and $x - 2i$.

division One of the four basic operations in algebra and arithmetic. By dividing two numbers we get their ratio. Division is possible by any real (or complex) number, except zero. Division of any number by zero is not defined.

division of complex numbers (1) To divide two complex numbers $\frac{a + ib}{c + id}$, written in the standard form, we multiply both numerator and denominator by the conjugate of the denominator, $c - id$. This allows to get a real number $c^2 + d^2$ in the denominator and divide the resulting complex number in the numerator by that real number:

$$\frac{a + ib}{c + id} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i.$$

(2) If the complex numbers $z = r(\cos\theta + i\sin\theta)$ and $w = \rho(\cos\varphi + i\sin\varphi)$ are given in trigonometric form, then

$$\frac{z}{w} = \frac{r}{\rho}[\cos(\theta - \varphi) + i\sin(\theta - \varphi)].$$

division of fractions To divide the fraction $a/b$ by another fraction $c/d$, we just multiply the first one by the reciprocal of the second:

$$\frac{a}{b} \div \frac{c}{d} = \frac{a}{b} \cdot \frac{d}{c} = \frac{ad}{bc}.$$

See also multiplication of fractions.

division of functions For two functions $f(x)$ and $g(x)$ their quotient (division function) is defined to be the quotient of their values: $\left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}$. This function is defined where $f$ and $g$ are defined, except where $g(x) = 0$.

division of polynomials Let $P(x)$ and $Q(x)$ be

two polynomials and assume that the degree of $Q$ is less than or equal to the degree of $P$. To divide $P$ by $Q$ means to find out "how many times $Q$ fits into $P$". As in the case of division of integers, the result is not always a specific polynomial, but rather there is some remainder. By the division algorithm, in this case we have

$$\frac{P(x)}{Q(x)} = D(x) + \frac{R(x)}{Q(x)},$$

where $R(x)$ is the remainder of this division. Example:

$$\frac{5x^3 - 4x^2 + 7x - 2}{x^2 + 1} = 5x - 4 + \frac{2x + 2}{x^2 + 1}.$$

In practice, the division process is done either by the long division algorithm, or by synthetic division. The second one is possible only if we need to divide by a binomial of the form $x - c$. See corresponding entries for the details of the algorithms.

divisor In the division process of two numbers, the number that divides the other one, called dividend.

domain of a function For a function of one real variable $f(x)$, the set of all real values $x$, such that the function is defined and is finite. More generally, for a function $f(x_1, x_2, \ldots, x_n)$ of $n$ independent variables, the region $G$ in the Euclidean space $R^n$, such that $f$ is defined for all values of variables that belong to the region $G$.
Examples: For the function of one variable

$$f(x) = \frac{1}{\sqrt{x - 1}}$$

the domain is the set $\{x \mid x > 1\}$ because the square root function is defined for non-negative values of the argument and, additionally, division by zero is not defined.
For the function of two variables

$$f(x, y) = \ln(x^2 + y^2)$$

the domain is all values of $x$ and $y$ except $(0, 0)$ because the logarithmic function is not defined at the origin.

dot product For any two vectors $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ in Euclidean space $R^n$, the number

$$\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$

Dot product is a special case of inner product. See also scalar product.

double angle formulas For trigonometric functions the formulas

$$\sin 2x = 2\sin x\cos x,$$

$$\cos 2x = \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x,$$

$$\tan 2x = \frac{2\tan x}{1 - \tan^2 x}.$$

Similar formulas are valid for other trigonometric functions but are rarely used.

double integral (1) Let a function $f(x, y)$ of two variables be defined on some rectangular region $D = [a, b] \times [c, d] = \{(x, y) \mid a \leq x \leq b, c \leq y \leq d\}$. Then the double integral of the function in the domain $D$ is the limit of its Riemann sums, as the number of partitions of the sides of the rectangle goes to infinity:

$$\iint_D f(x, y)\,dA = \iint_D f(x, y)\,dx\,dy = \lim_{n,m\to\infty} \sum_{i=1}^n \sum_{j=1}^m f(x_i^*, y_j^*)\,\Delta x\,\Delta y.$$

(2) In the case the function is given on a more general region than a rectangle, the definition requires some modifications. Assume that the region has the form

$$D = \{(x, y) \mid a \leq x \leq b, \ g_1(x) \leq y \leq g_2(x)\},$$

where $g_1, g_2$ are continuous on $[a, b]$. Then the double integral on this type of region is defined to be the iterated integral

$$\iint_D f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx.$$

(3) Integrals over the regions of the type

$$D = \{(x, y) \mid c \leq y \leq d, \ h_1(y) \leq x \leq h_2(y)\}$$

are defined similarly. See also definite integral, triple integral.

double Riemann sum Let a function $f(x, y)$ of two variables be defined on some rectangular region $D = [a, b] \times [c, d] = \{(x, y) \mid a \leq x \leq b, c \leq y \leq d\}$. To form Riemann sums, we divide the interval $[a, b]$ into $n$ equal parts of size $\Delta x = (b - a)/n$ and the interval $[c, d]$ into $m$ equal parts of size $\Delta y = (d - c)/m$ by choosing points $x_0 (= a), x_1, \ldots, x_n (= b)$ and $y_0 (= c), y_1, \ldots, y_m (= d)$. In the resulting smaller rectangles $D_{i,j} = [x_{i-1}, x_i] \times [y_{j-1}, y_j]$, $1 \leq i \leq n$, $1 \leq j \leq m$, we choose some points $(x_i^*, y_j^*)$ and form the double sum

$$\sum_{i=1}^n \sum_{j=1}^m f(x_i^*, y_j^*)\,\Delta x\,\Delta y,$$

which is called double Riemann sum. The choice of points $(x_i^*, y_j^*)$ is arbitrary, and (under some restrictions on the function) always results in the same limit. See also double integral.

E

e One of the most important universal constants in mathematics. This is a non-algebraic, transcendental number, for which there are many approximation formulas. Among them, the most familiar are:

$$e = \sum_{n=0}^\infty \frac{1}{n!}$$

and

$$e = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n.$$

The decimal approximation of $e$ is 2.7182818...

eccentricity For conic sections. Let $L$ be a fixed line on the plane (called directrix) and $P$ be some point (called focus), outside of that line. A positive real number $e$ is now fixed. The set of all points $Q$ on the plane, satisfying the relation

dist(Q, P ) = e dist(Q, L)

is called conic section and $e$ is its eccentricity. In case $e < 1$ the result is an ellipse, for $e = 1$ we get a parabola and when $e > 1$, the result is a hyperbola. For ellipses and hyperbolas the eccentricity has a simpler description. See respective definitions.

echelon form of a matrix Also called row echelon form. The result of performing the Gaussian elimination process on the matrix. In echelon form each row should start with zero or the leading one. Further, in each next row the leading one should be to the right of the previous row's one. Finally, the rows with all zero elements should appear in the bottom rows only. Example: The matrix

$$\begin{pmatrix} 1 & 4 & 0 & -3 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$

where the constants $a, b$ have special meaning: $c = \sqrt{a^2 - b^2}$ is the distance from the origin to a focus.

Moreover, the intersection points of the ellipse with the coordinate axes are called vertices of the ellipse and their distances from the origin are equal to $a$ and $b$. The line connecting the two foci is called the major axis and the perpendicular line (passing through the origin) is the minor axis.
In the more general case, when the center of the ellipse is located at some point $(h, k)$ but the major and minor axes are still parallel to the coordinate axes, the standard equation becomes

$$\frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1.$$

In these cases eccentricity is given by the formula $e = c/a$. Both of the above cases happen when in the general equation the term $Bxy$ is missing. In the case $B \neq 0$ the result is still an ellipse, which is the result of rotation of one of the previous simpler cases. The ellipse could also be given by its polar equation:

$$r = \frac{de}{1 \pm e\cos\theta} \quad \text{or} \quad r = \frac{de}{1 \pm e\sin\theta},$$

where $d > 0$ and $0 < e < 1$ is the eccentricity of the ellipse.

ellipsoid The three dimensional surface given by the formula

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$

elliptic cone The three dimensional surface given by the formula

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0.$$

This surface represents a cone whose perpendicular cuts are ellipses instead of circles. In the particular case $a = b$ we get a circular cone.

elliptic paraboloid The three dimensional surface given by the formula

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}.$$

An example is the paraboloid given by the equation $z = x^2 + y^2$.

empty set The set that contains no elements. The notation is $\emptyset$.

endpoints For the closed or open intervals $[a, b]$, $(a, b)$, the points $a$ and $b$. The same is true for half-open intervals $[a, b)$ and $(a, b]$.

endpoint extreme value Minimum or maximum value of a function that occurs at an endpoint of the closed interval $[a, b]$.

English system of measurements The system where the distances (also areas and volumes) are measured by inches (in), feet (ft), yards (y), and miles (mi) and the weights are measured using ounces (oz) and pounds (lb). Unlike the metric system of measurements, which is based on the decimal numeric system, relations in the English system are more or less arbitrary. Here are some of these relations: 1 ft = 12 in; 1 y = 3 ft; 1 mi = 1760 y; 1 lb = 16 oz.

entries of matrix Same as elements of a matrix.

epicycloid When a circle rolls on the outside of a fixed circle, the path of any point is called epicycloid. Let $a$ be the radius of the fixed circle with center at the origin and $b$ be the radius of the moving circle. When we connect the origin (center of the first circle) with the center of the second circle and continue, the resulting radius in the moving circle will form some angle $\theta$ with the horizontal line. In this setting, the parametric equations of the epicycloid are given by the formulas

$$x = (a + b)\cos\theta - b\cos\left(\frac{a + b}{b}\theta\right),$$

$$y = (a + b)\sin\theta - b\sin\left(\frac{a + b}{b}\theta\right).$$

equality The property of being equal to. Mathematically, equality means that a certain quantity is equal to another. These quantities may contain numbers, variables, functions, and other mathematical objects. If the equality holds for all values of the given parameters, then we have the case of identity. If the equality is true for some values of those parameters only, then we have the case of equation. Equations are also called conditional equality. See corresponding entries.

equation A statement that one quantity equals another. Unlike identities, equations are not true for all the values of variables that are involved in that relationship and for that reason they are also called conditional equality. To solve an equation means to find all values of the variables or unknowns that make that statement an identity. Depending on the type of variables, the equations can be polynomial, rational, trigonometric, radical, exponential, logarithmic, and so on. In the case when the unknown quantity is a function, we can have a differential, integral or integro-differential equation. There are also equations that contain matrices, vectors and other mathematical objects. The equations could be given both in rectangular coordinates and also in polar, spherical and other variables. Examples:
Polynomial: $3x^3 - 2x^2 + 4x - 5 = 0$;
Exponential: $2^{x-1} - 4 = 0$;
Trigonometric: $\sin x + \cos x = 1$;
Logarithmic: $\ln(x^2 - 1) + \ln(x^2 + 1) = 0$;
Polar: $r^2 + \sin\theta = 0$;
Differential: $2xy'' + 3x^2y' - y = 0$.
Some equations cannot be categorized because they contain different kinds of functions, such as the equation $2x^2 - 2^{x+1} + \sin x = 0$. These kinds of equations are called mixed. See also corresponding entries for all the mentioned types of equations.

equilateral triangle A triangle that has all sides equal in size. In equilateral triangles all three angles are also equal in size and measure $60^\circ$ ($\pi/3$ in radian measure).

equilibrium point or position In Physics, the point or position where the system reaches equilibrium. This means that after reaching that point the system no longer moves unless some outside force is applied.

equilibrium solution For autonomous differential equations. When solving the equation $y' = f(y)$, some solutions turn out to be constants, hence do not depend on time, and are called equilibrium. This happens exactly at the roots of the equation $f(y) = 0$ and these roots are called critical points.

equivalence A notion that has many manifestations in mathematics. In the most general and abstract form it could be defined as some kind of relation between elements of a set $S$ that satisfies three axioms for any elements $a, b, c$ of that set. If we denote that relation by "$\circ$", then the axioms are:
1. (Reflexivity) $a \circ a$
2. (Symmetry) if $a \circ b$ then $b \circ a$
3. (Transitivity) if $a \circ b$ and $b \circ c$ then $a \circ c$.
This way the equality relation becomes an equivalence, similarity of triangles is an equivalence, as well as others.

equivalent equations Two equations that have the same solution sets. Equations $x^2 - 5x - 6 = 0$ and $(x + 1)(x - 6) = 0$ are equivalent, because they have the same solutions $x = -1, 6$. The term is usually used for equations of the same type, but equations of different nature can be equivalent too. The equation $2x - 1 = 1$ is algebraic and $\log_2 x = 0$ is logarithmic, and they are equivalent because they have the same solution $x = 1$.

gent. Denote by $S$ its sum and by $S_N$ its $N$th partial sum. Then $|S - S_N| < |a_{N+1}|$. See also alternating series test and alternating series estimation theorem.

error estimate for the Midpoint rule Assume that we wish to calculate numerically the integral $\int_a^b f(x)\,dx$ using the Midpoint rule and denote by $M_n$ the $n$th approximation using that rule. Then, if $|f''(x)| \leq M$ for all $a \leq x \leq b$, then

$$\left|\int_a^b f(x)\,dx - M_n\right| \leq \frac{M(b - a)^3}{24n^2}.$$

This estimate is very similar to the estimate for the Trapezoidal rule.

error estimate for Simpson's rule Assume that we wish to calculate numerically the integral $\int_a^b f(x)\,dx$ using Simpson's rule and denote by $S_n$ the $n$th approximation using that rule. Then, if $|f^{(4)}(x)| \leq M$ for all $a \leq x \leq b$, then

equivalent inequalities Two inequalities that have the same solution sets. Inequalities $x^2 - 5x - 6 >$

0 and (x + 1)(x − 6) > 0 are equivalent, because they Z b M(b − a)5 f(x)dx − S ≤ . have the same . n 4 a 180n equivalent matrices Or row-equivalent matrices. If a matrix could be transformed to another matrix error estimate for the Trapezoidal rule As- by a ﬁnite number of elementary row operations, then sume that we wish to calculate numerically the inte- the matrices are called equivalent. R b gral a f(x)dx using the Trapezoidal rule and denote equivalent system Two systems of equations that by Tn the nth approximation using that rule. Then, have the same solution sets. See also equivalent equa- if |f 00(x)| ≤ M for all a ≤ x ≤ b, then tions

Z b M(b − a)3 equivalent vectors Two vectors that have the f(x)dx − T ≤ . n 2 same direction and length. a 12n error Any time an exact value (of a number, func- This estimate is very similar to the estimate for the tion, series, integral, etc.) is substituted by an ap- Midpoint rule. proximate value, an error occurs. Errors could be estimate of the sum of a series The sums of estimated in absolute terms (such as the diﬀerence most of the series are diﬃcult or even impossible to between exact and approximate values) or in relative calculate. Estimates serve as a substitute for exact terms (as a proportion of error to the exact value). value and are good enough if the error is small. See For diﬀerent speciﬁc error estimates see the entries also error estimate for alternating series. following this one on error estimates of series and in- tegrals. Distance, generated by the Euclidean norm. For two vectors error estimate for alternating series Assume x = (x , x , ··· , x ), y = (y , y , ··· , y ) in Eu- that the series P∞ a is alternating and is conver- 1 2 n 1 2 n n=1 n clidean space, their distance is deﬁned to be the 48 norm of their diﬀerence: dist(x, y) = ||x − y||. θ = π, we get the famous formula eiπ = −1 which connects all the fundamental numbers of mathemat- Euclidean inner product For two vectors in ics. the Euclidean space x = (x1, x2, ··· , xn), and y = (y1, y2, ··· , yn), the quantity For approximation of solutions of initial value problems. The idea behind this method x · y = x1y1 + x2y2 + ··· + xnyn. is to substitute the solutions of the problem (which usually are diﬃcult or even impossible to ﬁnd) by a It is the same as dot product. See also scalar product. , which is the tangent line to this so- Euclidean norm For a vector x = (x1, x2, ··· , xn) lution at some point. By this reason the method is 2 2 in Euclidean space the number ||x|| = (x1 +x2 +···+ also called tangent line method. In detail, assume we 2 1/2 xn) . 
This number is non-negative and it is zero if are solving the problem and only if the vector is the zero vector. Also called 0 magnitude or length of the vector. y = f(t, y) y(t0) = y0 Euclidean space The vector space of all vectors in some interval. As a ﬁrst approximation we tale the with n coordinates, where the addition and scalar linear function multiplication operations are deﬁned in the usual y = y + f(t , y )(t − t ) way: 0 0 0 0

which passes through the point (t0, y0) and hence, u + v = (u1 + v1, u2 + v2, ··· , un + vn), coincides with the solution at that particular point. Next, we choose a point t1 not too far from t0 to αu = (αu1, αun, ··· , αun) assure that the error is small and construct another for any two vectors u, v and any real number α. Ad- line ditionally, the inner product is deﬁned in this space y = y1 + f(t1, y1)(t − t1), as the dot product (or, which is the same, Euclidean where y = y + f(t , y )(t − t ), is the value found inner product). This, on its turn, induces the norm 1 0 0 0 1 0 from the ﬁrst line. Proceeding this way, we will have in the space, which deﬁnes Euclidean distance. This a series of equations special vector space is now called Euclidean space and is commonly denoted by Rn, where n is any positive y = yn + f(tn, yn)(t − tn), integer. each of which is given on the interval [t , t ]. These Euler’s constant The limit n n+1 approximate solutions converge to the exact solution 1 1 1 when the lengths of intervals approach zero and un- γ = lim (1 + + + ··· + − ln n), n→∞ 2 3 n der certain conditions on function f. which exists and is ﬁnite. The value of γ is approxi- Euler-Fourier formulas Also called Fourier for- mately 0.577212 but it is not known if the number is mulas. These formulas represent Fourier coeﬃcients rational or not. See also harmonic series. of the function by certain integral formulas involving the function itself. Let f(x) be deﬁned on some in- Euler equation See Cauchy-Euler equation. terval [−L, L]. Then its Fourier coeﬃcients are given Euler’s formula The formula by

eiθ = cos θ + i sin θ, 1 Z L nπx an = f(x) sin dx, n = 0, 1, 2... √ L −L L where i = −1 is the imaginary unit. This formula is true also when the real number θ is substituted by 1 Z L nπx any complex number z. In the particular case when bn = f(x) cos dx, n = 1, 2, 3... L −L L 49
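The tangent line construction described under Euler's method can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `euler`, the fixed step size `h`, and the test problem $y' = y$, $y(0) = 1$ are assumptions chosen for the example.

```python
import math

def euler(f, t0, y0, h, steps):
    """Approximate the solution of y' = f(t, y), y(t0) = y0.

    Uses a fixed step h (an assumption of this sketch) and returns
    the list of points (t_n, y_n) produced by the tangent-line steps.
    """
    t, y = t0, y0
    points = [(t, y)]
    for _ in range(steps):
        # Follow the tangent line y = y_n + f(t_n, y_n)(t - t_n) up to t_n + h.
        y = y + f(t, y) * h
        t = t + h
        points.append((t, y))
    return points

# Illustrative problem: y' = y, y(0) = 1, whose exact solution is e^t.
approx = euler(lambda t, y: y, 0.0, 1.0, 0.001, 1000)
t_end, y_end = approx[-1]
print(t_end, y_end, abs(y_end - math.e))
```

Shrinking `h` (and increasing `steps` accordingly) brings the final value closer to $e$, in line with the convergence statement in the entry.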

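The error estimates for the Midpoint, Trapezoidal and Simpson's rules can be checked on a concrete integral. The sketch below assumes the test integral $\int_0^{\pi}\sin x\,dx = 2$, for which $|f''(x)| \le 1$ and $|f^{(4)}(x)| \le 1$, so the constant in each bound is 1; the function names are hypothetical.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint rule with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid(f, a, b, n):
    """Trapezoidal rule with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def simpson(f, a, b, n):
    """Simpson's rule with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

a, b, n, exact, bound_const = 0.0, math.pi, 10, 2.0, 1.0
rules = {
    "midpoint": (midpoint, bound_const * (b - a) ** 3 / (24 * n ** 2)),
    "trapezoid": (trapezoid, bound_const * (b - a) ** 3 / (12 * n ** 2)),
    "simpson": (simpson, bound_const * (b - a) ** 5 / (180 * n ** 4)),
}
for name, (rule, bound) in rules.items():
    error = abs(rule(math.sin, a, b, n) - exact)
    print(f"{name}: error = {error:.6f}, bound = {bound:.6f}")
```

In each case the observed error should not exceed the corresponding bound.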
evaluation The term is used in mathematics in different situations, most commonly when we want to find the value of a function. In this case, to evaluate a function means to find its value for a certain value of the independent variable(s). Example: To evaluate the function $f(x) = 3x^4 - 2x^2 + x - 4$ at the point $x = 2$ means to calculate the value of $f$ at that point: $f(2) = 3\cdot 2^4 - 2\cdot 2^2 + 2 - 4 = 38$.

even function A function $f(x)$ of a real variable that satisfies the condition $f(-x) = f(x)$. This condition means that the graph of the function is symmetric with respect to the $y$-axis. The functions $f(x) = \cos x$ and $f(x) = x^2$ are examples of even functions.

even permutations A permutation that is the result of an even number of transpositions.

event A term used in probability and statistics to indicate some kind of outcome for the (usually random) variable. Tossing a coin and getting Tail or Head are two different events. If an event cannot be split into any smaller events, then it is called a simple event. The collection of all simple events forms the event space. Events can be dependent, independent, mutually exclusive. See corresponding entries for more information.

exact equations Let $M(x, y)$ and $N(x, y)$ be functions on some rectangle $R = [a, b] \times [c, d]$. The first order ordinary differential equation
$$M(x, y) + N(x, y)y' = 0$$
is called exact, if there exists a differentiable function $F(x, y)$ on $R$, such that
$$\frac{\partial F}{\partial x} = M, \qquad \frac{\partial F}{\partial y} = N. \tag{1}$$
Every solution of this equation is given by the implicit function $F(x, y) = c$, $c = \text{constant}$. The necessary and sufficient condition for the equation being exact is
$$M_y(x, y) = N_x(x, y). \tag{2}$$
Example: To solve the equation $(3x^2 - 2xy + 2)\,dx + (6y^2 - x^2 + 3)\,dy = 0$ we first check that condition (2) is satisfied: $M_y = -2x = N_x$. Next, to find the function $F$ we integrate the first of equations (1) with respect to the variable $x$ and get
$$F(x, y) = \int (3x^2 - 2xy + 2)\,dx = x^3 - x^2y + 2x + g(y),$$
where $g(y)$ is an arbitrary function of the variable $y$. To satisfy the second of the equations (1) we differentiate this function with respect to $y$ and equate it to the function $N$: $-x^2 + g'(y) = 6y^2 - x^2 + 3$, from where $g'(y) = 6y^2 + 3$, or $g(y) = 2y^3 + 3y$ (no need of an arbitrary constant). Finally, the solution of the equation is given by the implicit equation
$$F(x, y) = x^3 - x^2y + 2x + 2y^3 + 3y = C.$$

existence and uniqueness theorems For differential equations. Statements that contain conditions under which the given differential equation has a solution and that solution is unique. Depending on the type of equation (order, coefficients, homogeneous or not, boundary or initial conditions, etc.) the existence and uniqueness conditions differ from each other. For example, if the equation is of the first order, then one boundary or initial condition is usually enough, while for second order equations at least two conditions are necessary. Similar observations are valid for systems of equations too. Example:
Theorem. Assume we have the equation
$$y'' + p(t)y' + q(t)y = g(t)$$
with initial conditions $y(t_0) = y_0$, $y'(t_0) = y_0'$, where all the functions $p$, $q$ and $g$ are continuous on some interval $I$ containing the point $t_0$. Then there exists exactly one solution $y = \phi(t)$ of this problem and the solution exists on the whole interval $I$.

expanded form A term used in various situations. For a polynomial $p(x) = (x - 2)(x + 5)$ the expanded form will be $p(x) = x^2 + 3x - 10$. The term is used also for determinants, when expanded form means the form in which the determinant is expanded as a sum of cofactors or minors.

expansion by minors One of the methods of calculating the value of a determinant. See corresponding entry.

expansion operator A linear operator that "expands" a given vector. In the most general setting, an expansion operator with factor $k > 0$ is an operator that makes the norm of a vector $x$ equal to $k\|x\|$. Example: The operator given by the matrix
$$\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}$$
transforms any vector $(x, y)$ into the vector $k(x, y)$. See also contraction operator, dilation operator.

expected value For a distribution given by the function $f(x)$, the quantity
$$\mu = \int_{-\infty}^{\infty} x f(x)\,dx,$$
if it is finite. Expected value is the same as the mean of the distribution. This quantity could be defined also for distributions of several variables.

experimental study In statistics, a study where the researcher controls, modifies, or changes the conditions in order to collect data on the results and effects of these changes. Most medical studies are experiments to assess the results of a treatment or medication.

explanatory variable In statistics, another name for the independent variable.

exponential decay A functional model that is expressed with the use of an exponential function with negative exponent:
$$f(t) = Ae^{kt}, \quad k < 0.$$
Many economic, biological and other phenomena have an exponential decay model.

exponential function The function $f(x) = a^x$, where $a > 0$, $a \neq 1$, which is defined for all real values of $x$. The range of the function is $(0, \infty)$.

exponential growth A functional model that is expressed with the use of an exponential function with positive exponent:
$$f(t) = Ae^{kt}, \quad k > 0.$$
Many economic, biological and other phenomena have exponential growth.

exponential order Functions that have growth or decay comparable with the growth or decay of the exponential function. The $\sinh x$ and $\cosh x$ functions both have exponential order. See hyperbolic functions.

exponentiation One of the main mathematical operations. The power of a number $b$, $b > 0$ (called the base), with exponent $x$ is denoted by $b^x$. In the case when $x$ is a natural number, this expression is understood as the number $b$ multiplied by itself $x$ times: $b^x = b \cdot b \cdots b$. For exponents other than natural numbers, this definition is extended in several steps.
(1) By definition $b^0 = 1$ for any $b > 0$.
(2) If $n$ is a positive integer, then $b^{-n} = 1/b^n$ by definition.
(3) For positive integers $n$, we define $b^{1/n}$ to be the $n$th root of the number $b$, i.e. the number that raised to the $n$th power (which is natural) results in $b$: $(\sqrt[n]{b})^n = b$.
(4) For any integers $n$ and $m$ with $m > 0$, the power $b^{n/m}$ is defined to be the number $(\sqrt[m]{b})^n = \sqrt[m]{b^n}$.
(5) To define the power $b^x$ for an irrational number $x$ we use a limiting process: this power is defined as the limit
$$b^x = \lim_{n\to\infty} b^{c_n},$$
where $\{c_n\}$ is some sequence of rational numbers that approaches $x$.
In the definition of exponents we usually avoid the case $b = 1$ because it results in a trivial case. From this definition the following important properties of exponentiation follow:
(a) $a^n \cdot a^m = a^{n+m}$
(b) $(a^n)^m = a^{n\cdot m}$
(c) $(ab)^n = a^n \cdot b^n$
(d) $a^n/a^m = a^{n-m}$.
See also exponential function.

exponents at the singularity In differential equations. When solving an equation by the power series method and the point is a regular singular point, the roots of the corresponding indicial equation are called exponents at the singularity.

expression In mathematics, any combination of mathematical objects (numbers, functions, matrices, transformations, etc.) and operations on them. See also algebraic expression and arithmetic expression.

extraneous root or solution In solving certain types of equations, some roots that turn out to be invalid. This happens in solving rational, radical, logarithmic, and some other types of equations. Example: For the equation
$$\sqrt{2x + 7} - x = 2,$$
isolating the radical and squaring both sides produces two candidates, $x = 1$ and $x = -3$. Only $x = 1$ satisfies the original equation, so the second one is extraneous.

extrapolation A procedure of finding values of functions or data based on some known values of the function or given data. Similar to the notion of interpolation. The least square regression line is an example of extrapolation when we predict the values of the data based on sample data.

extreme value Maximum or minimum values of a function.

extreme value theorem If $f(x)$ is a continuous function on the closed interval $[a, b]$, then there exist points $c, d \in [a, b]$ such that $f(c)$ is the minimum and $f(d)$ is the maximum value of $f$ on $[a, b]$. A very similar statement is true also for functions of two or more variables.

extremum Collective name for minimum and maximum.
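The worked example in the exact equations entry can be checked numerically. The sketch below compares central-difference approximations of the partial derivatives of the potential function $F(x, y) = x^3 - x^2y + 2x + 2y^3 + 3y$ with the coefficients $M$ and $N$ of that example; the helper names `partial_x` and `partial_y`, the step `h`, and the sample point are assumptions made for illustration.

```python
def M(x, y):
    return 3 * x**2 - 2 * x * y + 2

def N(x, y):
    return 6 * y**2 - x**2 + 3

def F(x, y):
    # Potential function found in the worked example.
    return x**3 - x**2 * y + 2 * x + 2 * y**3 + 3 * y

def partial_x(g, x, y, h=1e-5):
    """Central-difference approximation of dg/dx at (x, y)."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-5):
    """Central-difference approximation of dg/dy at (x, y)."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.5, -0.7  # arbitrary sample point
# Exactness condition M_y = N_x:
print(partial_y(M, x0, y0), partial_x(N, x0, y0))
# F_x should match M and F_y should match N:
print(partial_x(F, x0, y0), M(x0, y0))
print(partial_y(F, x0, y0), N(x0, y0))
```

Agreement at a few sample points is only a sanity check, not a proof; the symbolic verification is the one given in the entry.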

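Extraneous roots, as in the radical equation $\sqrt{2x + 7} - x = 2$ discussed above, are easy to detect by substituting each candidate back into the original equation. A minimal sketch, assuming the candidates come from the squared equation $x^2 + 2x - 3 = 0$ (the variable names are illustrative):

```python
import math

def lhs(x):
    """Left side of the original equation sqrt(2x + 7) - x = 2."""
    return math.sqrt(2 * x + 7) - x

# Candidates: roots of x^2 + 2x - 3 = 0, obtained after isolating
# the radical and squaring both sides.
disc = 2**2 - 4 * 1 * (-3)
candidates = [(-2 + math.sqrt(disc)) / 2, (-2 - math.sqrt(disc)) / 2]

# Keep only the candidates that satisfy the original equation.
valid = [x for x in candidates if math.isclose(lhs(x), 2)]
extraneous = [x for x in candidates if x not in valid]
print(valid, extraneous)
```

Squaring is not a reversible step, which is why the substitution check is needed.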
force One of the most important notions in physics. By Newton's second law, if an object of mass $m$ moves along a line with the position function $s(t)$, then the force on the object is the product of the mass and its acceleration:
$$F = m\frac{d^2s}{dt^2}.$$

forced vibration When a vibrating system is affected by some additional force $F(t)$, the vibration obeys the following differential equation:
$$m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = F(t).$$

formula An equation that contains two or more variables. As a rule, a formula is deduced to calculate values of one variable if the values of the other variable(s) are known. Elementary formulas in most cases are interchangeable in the sense that they could be solved for any other variable. This is not true for more complicated formulas. Here are a few familiar formulas from geometry and algebra. See corresponding entries for more details.
Area of a rectangle: $A = \ell \times w$.
Compound interest: $P = A(1 + r/n)^{nt}$.
Distance formula: $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.

four-leafed rose The plane curve given by the polar equation $r = a\sin 2\theta$, $a > 0$. The curve is called so because it consists of four loops. It is a special case of a more general family of polar curves given by the equations $r = a\sin n\theta$, $r = a\cos n\theta$, where $n$ is an arbitrary integer. (Figure: the graph of the equation $r = \sin 2\theta$.)

Fourier coefficients (1) In the Fourier series
$$a_0/2 + \sum_{n=1}^{\infty}(a_n\cos n\theta + b_n\sin n\theta),$$
the coefficients $a_n$, $n \ge 0$ and $b_n$, $n \ge 1$.
(2) In a more general setting, if the function $f(x)$ is defined on the interval $[0, 1]$, $r(x)$ is some weight function, and $\phi_1(x), \phi_2(x), \cdots, \phi_n(x)$ is an orthonormal set, then the numbers
$$a_i = \int_0^1 f(x)\phi_i(x)r(x)\,dx, \quad i = 1, 2, \ldots, n,$$
are the Fourier coefficients of $f$.

Fourier series For the given integrable function $f(x)$ on the interval $[-\pi, \pi]$, the formal series
$$f(x) = a_0/2 + \sum_{n=1}^{\infty}(a_n\cos nx + b_n\sin nx),$$
where the coefficients are defined by the formulas
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \quad n = 0, 1, 2, \ldots$$
$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx, \quad n = 1, 2, 3, \ldots$$
The Fourier series of a function converges to that function if it satisfies certain conditions, e.g., if it is differentiable.

fraction Any real number that could be written in the form $\frac{n}{m}$, where $n$ and $m$ are arbitrary real numbers with the only condition that $m \neq 0$. In the particular case when $n$ and $m$ are integers, fractions are called rational numbers. For operations with fractions see addition and subtraction of fractions, multiplication of fractions, division of fractions. See also complex fractions and partial fractions for special types of fractions.

frequency For the trigonometric functions $f(t) = a\sin\omega t$ and $g(t) = a\cos\omega t$, the number $\omega/2\pi$ is the frequency. This number shows how many full cycles the functions complete per unit of $t$. The reciprocal of the frequency is the period of the function.

Fresnel functions The functions
$$S(x) = \int_0^x \sin t^2\,dt$$
and
$$C(x) = \int_0^x \cos t^2\,dt.$$
The integrals on the interval $[0, \infty)$ are both convergent and are equal to $\sqrt{\pi/8}$.

Frobenius method The method of solving linear differential equations near a regular singular point. See the entry series solution for the details.

Fubini's theorem In its most general form this theorem gives conditions on a function of several variables which allow multiple integrals to be calculated as iterated integrals. In the simplest form the theorem for double integrals is formulated as follows: Let $f$ be a continuous function on the rectangle $R = \{(x, y) \mid a \le x \le b,\ c \le y \le d\}$. Then
$$\iint_R f(x, y)\,dA = \int_a^b\left[\int_c^d f(x, y)\,dy\right]dx = \int_c^d\left[\int_a^b f(x, y)\,dx\right]dy.$$
Similar formulas hold also for triple or multiple integrals.

function One of the most important mathematical objects. In order to define a function it is necessary first of all to have two sets associated with the function, called the domain and the range. Then we can formally define the function as some relation (mapping, transformation) that translates each element of the domain to exactly one element of the range. If the requirement of having exactly one corresponding element in the range is removed, then instead of a function we have a relation. Functions can be represented in many different ways.
1) The first, least common form of representation is the verbal representation. As an example we can describe a function as follows: The function translates each value of the variable to its double.
2) The function can be represented also numerically, by a table, matrix, chart, etc. For example, if the domain of the function is the set $\{1, 2, 3, 4, 5\}$ and the range is $\{6, 9, 13, 17\}$, then the function can be given as a list of ordered pairs: $\{(1, 6), (2, 9), (3, 17), (4, 9), (5, 13)\}$, meaning that the function translates the point 1 to the point 6, the point 2 to the point 9, etc.
3) The function can have a graphical representation, when instead of numeric values just the graph is given, from where the numeric values may be found. This method of representing functions is not very accurate.
4) The most important form of representation of functions is the algebraic (often also called analytic) form, when an explicit formula is given from which the values of the function can be calculated. Functions represented algebraically are usually written with function notation, which simplifies the process of finding the values at a given point. For example, if a function is given with the notation $f(x) = 2x^2 - 3$, then the value of the function at the point $x = -2.7$ will be found by substituting this value for $x$: $f(-2.7) = 2(-2.7)^2 - 3 = 11.58$.
Functions given algebraically are further categorized as algebraic and transcendental. In turn, algebraic functions have subcategories of polynomial, rational, radical functions, and the transcendental functions can be exponential, logarithmic, trigonometric and others. For specific examples see the corresponding entries.
Further, functions could be increasing, decreasing, even, odd, one-to-one, periodic, continuous, differentiable, integrable and of many other types.
5) Functions can be represented as series, such as power series or Fourier series. For example, the rational function $f(x) = 1/(1 - x)$ on the interval $(-1, 1)$ could be represented as
$$\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n.$$
Similarly, we have a Fourier series representation on the interval $(0, \pi)$:
$$\frac{\pi - x}{2} = \sum_{n=1}^{\infty}\frac{\sin nx}{n}.$$
6) Functions could be represented also with integrals. The Laplace transform is the most common example of integral representations of functions.
The notion of the function could be extended also to the case of two, three or, more generally, $n$ variables. Also, functions can be vectors themselves, or be vector-valued. For specific types of function see also Bessel functions, harmonic functions, implicit functions.

fundamental matrix spaces For a matrix $A$, the collective name of the following four spaces: the row space of $A$, the column space of $A$, the nullspace of $A$, and the nullspace of the transpose $A^T$.

fundamental solutions In the most general setting, a (generalized) function $\phi$ is called the fundamental solution of the partial differential operator $L$, if it satisfies the equation $L\phi = \delta$, where $\delta$ is the Dirac delta function. For specific partial differential equations we have the following expressions for fundamental solutions:
Heat conduction equation:
$$a^2u_{xx} = u_t, \quad 0 < x < M,\ t > 0,$$
$$u_n(x, t) = e^{-n^2\pi^2a^2t/M^2}\sin\frac{n\pi x}{M}, \quad n \ge 1.$$
The wave equation:
$$a^2u_{xx} = u_{tt}, \quad 0 < x < M,\ t > 0,$$
$$u_n(x, t) = \sin\frac{n\pi x}{M}\cos\frac{n\pi at}{M}, \quad n \ge 1.$$
Laplace's equation:
$$u_{xx} + u_{yy} = 0, \quad 0 < x < a,\ 0 < y < b,$$
$$u_n(x, y) = \sinh\frac{n\pi x}{b}\sin\frac{n\pi y}{b}, \quad n \ge 1.$$

fundamental theorem of algebra If $p(x)$ is a polynomial of degree $n > 0$, then the corresponding polynomial equation has at least one solution in the field of complex numbers. As a result of this theorem it follows that any polynomial of degree $n > 0$ has exactly $n$ roots (real or complex), if we count the roots with their multiplicities.

fundamental theorem of arithmetic Any integer $n > 1$ could be factored into a product of powers of prime numbers: $n = p_1^{k_1}\cdot p_2^{k_2}\cdots p_m^{k_m}$. This presentation is unique except for the possible order of the prime factors. Example: $2520 = 2^3\cdot 3^2\cdot 5\cdot 7$.

fundamental theorem of calculus Let $f$ be a Riemann integrable function on the interval $[a, b]$ and let $F$ be the indefinite integral (antiderivative) of the function $f$ on $(a, b)$: $F'(x) = f(x)$. Then
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
This theorem allows definite integrals to be calculated without the elaborate process of partitioning the intervals, constructing Riemann sums and passing to the limit. It also combines the two main branches of calculus: the differential calculus and the integral calculus. Example: Since $(\sin x)' = \cos x$,
$$\int_0^{\pi}\cos x\,dx = \sin\pi - \sin 0 = 0.$$

fundamental trigonometric identities The identities
$$\cot x = \frac{1}{\tan x}, \quad \sec x = \frac{1}{\cos x}, \quad \csc x = \frac{1}{\sin x}$$
and
$$\tan x = \frac{\sin x}{\cos x}, \quad \cot x = \frac{\cos x}{\sin x}.$$
See also Pythagorean identities, cofunction identities, addition and subtraction formulas, double-angle, half-angle formulas, sum-to-product, product-to-sum formulas.

G

gamma function For the real positive variable $x$, the function defined by the formula
$$\Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}\,dt.$$
This function is the generalization of the factorial in the sense that $\Gamma(n + 1) = n!$. The gamma function can also be defined for complex values of the argument.

Gaussian elimination One of the most effective methods of solving systems of linear algebraic equations. The idea of the method consists of eliminating more and more variables in successive equations, to arrive at the situation when the last equation has only one unknown. This allows us to find the value of that variable and substitute it in the previous equation, which has only two variables, hence finding the value of the second unknown. Continuing this back substitution, we will be able to solve the system. During this process, only elementary row operations are performed, which guarantees that the resulting system is equivalent to the original one, i.e., has the same solution set. Also, the procedure may be applied directly to the system of equations or to the corresponding augmented matrix. We will demonstrate the second approach. Let

$$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1$$
$$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2$$
$$\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots$$
$$a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n$$
be our system of $n$ equations with $n$ variables $x_1, x_2, \cdots, x_n$ and let
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & b_n \end{pmatrix}$$
be the corresponding augmented matrix. First, we divide the first row by $a_{11}$ and get 1 in the first position of the first row. Next, we multiply the first row by $-a_{21}$ and add it to the second row, to get zero in the first position of the second row. After dividing the second row by $a_{22} - a_{12}a_{21}/a_{11}$, we will have the following format:
$$\begin{pmatrix} 1 & a'_{12} & \cdots & a'_{1n} & b'_1 \\ 0 & 1 & \cdots & a'_{2n} & b'_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & b_n \end{pmatrix}$$
Now, continuing, we can eliminate the first element of the third row by using the first row and the second element of the third row by using the second row, and then, dividing by the value of the resulting third element, we will make it one. As a final result we will find the matrix in upper triangular form, with all entries on the diagonal being 1's:
$$\begin{pmatrix} 1 & a'_{12} & \cdots & a'_{1n} & b'_1 \\ 0 & 1 & \cdots & a'_{2n} & b'_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & b'_n \end{pmatrix}$$
This triangular matrix corresponds to a triangular system of equations, which is solved by the back substitution method. This same method works also for the more general case of $m \times n$ systems, but then it may not result in a unique solution.
Example: To solve the system of equations
$$2x + y - 4z = 3$$
$$x - 2y + 3z = 4$$
$$-3x + 4y - z = -2$$
we form its augmented matrix
$$\begin{pmatrix} 2 & 1 & -4 & 3 \\ 1 & -2 & 3 & 4 \\ -3 & 4 & -1 & -2 \end{pmatrix}.$$
Interchanging the first two rows puts 1 in the first position of the first row, and the operation $-2R_1 + R_2 \to R_2$ then produces zero in the first column of the second row. Similarly, doing the operation $3R_1 + R_3 \to R_3$ will eliminate the first entry of the third row. Now, the matrix has the form
$$\begin{pmatrix} 1 & -2 & 3 & 4 \\ 0 & 5 & -10 & -5 \\ 0 & -2 & 8 & 10 \end{pmatrix}.$$
On the next two steps we multiply the second row by $1/5$ and the third row by $-1/2$ to get 1's in the middle column. Finally, after doing the operation $-R_2 + R_3 \to R_3$ and multiplying the last row by $-1/2$, we arrive at
$$\begin{pmatrix} 1 & -2 & 3 & 4 \\ 0 & 1 & -2 & -1 \\ 0 & 0 & 1 & 2 \end{pmatrix}.$$
This matrix corresponds to the system of equations
$$x - 2y + 3z = 4$$
$$y - 2z = -1$$
$$z = 2,$$
which could be solved by the back substitution method and gives the solution $(4, 3, 2)$.

Gauss-Jordan elimination The development of the Gaussian elimination method. Here, the method does not stop at the triangular matrix but makes it diagonal. Assume that, as a result of Gaussian elimination, our augmented matrix is reduced to the following one:
$$\begin{pmatrix} 1 & a_{12} & \cdots & a_{1n} & b_1 \\ 0 & 1 & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & a_{n-1,n} & b_{n-1} \\ 0 & 0 & \cdots & 1 & b_n \end{pmatrix}$$
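The elimination and back substitution steps of Gaussian elimination can be sketched in Python as follows. This is an illustrative version under stated assumptions: it uses exact rational arithmetic from the standard `fractions` module, picks the first nonzero entry in each column as pivot (so the row operations may differ from those in the worked example), and the function name `gaussian_solve` is hypothetical.

```python
from fractions import Fraction

def gaussian_solve(aug):
    """Solve an n x n linear system given by its augmented matrix.

    aug is a list of n rows, each with n coefficients followed by the
    right-hand side. Raises ValueError when no unique solution exists.
    """
    n = len(aug)
    m = [[Fraction(v) for v in row] for row in aug]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry and swap it up.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the system has no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the diagonal entry becomes 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col]
            m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    # Back substitution on the upper triangular system.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
    return x

# The example system: 2x + y - 4z = 3, x - 2y + 3z = 4, -3x + 4y - z = -2.
solution = gaussian_solve([[2, 1, -4, 3], [1, -2, 3, 4], [-3, 4, -1, -2]])
print([int(v) for v in solution])  # [4, 3, 2]
```

Exact fractions are used here only to keep the illustration free of rounding; floating-point implementations normally add partial pivoting for numerical stability.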

s(t) = a cos ωt, s(t) = a sin ωt.

Here s is the displacement of the object at the time t, a is the amplitude of motion and ω/2π is the fre- quency of . harmonic series The numeric series ∞ X 1 . n n=1 This series is divergent and according to Euler’s for- mula, its partial sums go to inﬁnity as the natural logarithmic function. heat conduction equation In Partial Diﬀerential Equations: for a function of two spacial variables helix A curve in 3-dimensional space given by the x, y, and the time variable t, the equation equations ∂u(x, y, t) ∂2u(x, y, t) ∂2u(x, y, t) = + . x = cos t, y = sin t, z = t. ∂t ∂x2 ∂y2 Here t is a real parameter. As it increases, the point In the simpler case of one spacial variable x and time (x, y, z) traces a helix about the z-axis. 2 variable t the equation has the form a uxx = ut. Heron’s formula For the area of a triangle with Heaviside function The function the sides, equal to a, b and c. Denote s = (a+b+c)/2. n 1 if x ≥ 0 Then the area A is given by the formula f(x) = 0 if x < 0 A = ps(s − a)(s − b)(s − c). This function plays important role in physics and electrical . See also Laplace transforms of . higher order derivatives Since the derivative of a function is also a function, we can calculate the helicoid A 3-dimensional surface given by the derivative of that function. If it exists, then that will equations be the of the function: x = r cos t, y = r sin t, z = αt, d2f f 00(x) = (f 0(x))0 = . dx2 where both r and α range from −∞ to ∞. For any 000 d3f ﬁxed values of that parameters, helicoid becomes a Similarly, the , f (x) = dx3 , the 64

fourth, $f^{(4)}(x) = f^{iv}(x) = \frac{d^4 f}{dx^4}$, and the $n$th derivative $f^{(n)}(x) = \frac{d^n f}{dx^n}$ are defined, if they exist.

Hilbert space A linear vector space, equipped with an inner product and the notion of distance associated with that inner product. The term almost exclusively applies to infinite dimensional vector spaces.

histogram In statistics, a histogram is a graphical display of frequencies. A histogram is the graphical version of a table which shows what proportion of cases fall into each of several specified categories. The histogram differs from a bar chart in that it is the area of the bar that denotes the value, not the height. The categories are usually specified as non-overlapping intervals of some variable. Another difference between histograms and bar graphs is that the bars in histograms are adjacent, not separated as for bar graphs.

homogeneous algebraic equations Systems of linear algebraic equations, where all of the right sides are zero. Homogeneous systems are always consistent, because they always have the trivial solution (i.e. the all zero solution). Homogeneous systems may also have non-trivial solutions, if any of the equations is a linear combination of other equations.

homogeneous differential equation A differential equation where only the unknown function and its derivatives are present, possibly multiplied by some (variable or constant) coefficients. Examples: The equation
$$y''(t) + p(t)y'(t) + q(t)y(t) = 0$$
is homogeneous, but the equation
$$y'(t) - 2y(t) = 3t^2$$
is not. In this last case the equation is called non-homogeneous.
Homogeneous equations are also classified by their order (first, second, etc.), by the type of coefficients (constant or variable), and so on. In cases when we have several equations with several unknown functions, we get a system of homogeneous equations.

Hooke's law In physics, this law states that the force necessary to hold a spring stretched $x$ units beyond the equilibrium point is proportional to the displacement $x$, if it is small. Symbolically, $f(x) = kx$, where $k$ is a positive number called the spring constant.

horizontal asymptote See asymptote.

horizontal axis The $x$-axis in the Cartesian coordinate system. It is given by the equation $y = 0$.

horizontal line A line that is parallel to the $x$-axis in the Cartesian coordinate system. Any horizontal line is given by the equation $y = a$, where $a$ is any real number.

horizontal line test If a function $f(x)$ is given graphically, and any horizontal line intersects that graph only once, then the function is one-to-one. If even at one point this condition is violated, then the function cannot be one-to-one. This test allows one to determine if the given function has an inverse or not, but is not accurate.

hyperbola One of the three main conic sections. Geometrically, a hyperbola is the location (locus) of all points in a plane, the difference of whose distances from two fixed points in the plane is a positive constant. The two points, from which we measure distances, are called foci (plural for focus). Equivalently, the hyperbola could be described as the result of cutting a double cone by a plane parallel to its axis. A hyperbola is a smooth curve with two branches. An alternative geometric definition could be given with the use of eccentricity. See the corresponding definition.
Algebraically, the general equation of a hyperbola is given by the equation
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0,$$
where $A, B, C, D, E, F$ are real constants and $A \cdot C < 0$. In the case when the foci are located on one of the coordinate axes and the center coincides with the origin, this equation could be transformed into the standard form

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$
where the constants $a, b$ have special meaning: $c = \sqrt{a^2 + b^2}$ is the distance from the origin to a focus. Moreover, the intersection points of the hyperbola with one of the axes are called vertices of the hyperbola and their distance from the origin is equal to $a$. The line connecting the two foci is called the transverse axis and the perpendicular line (passing through the origin) is the conjugate axis. Each hyperbola of this form has two slant asymptotes given by the formulas $y = \pm \frac{b}{a} x$.
In the more general case, when the center of the hyperbola is located at some point $(h, k)$ but the transverse and conjugate axes are still parallel to the coordinate axes, the standard equation becomes
$$\frac{(x-h)^2}{a^2} - \frac{(y-k)^2}{b^2} = 1,$$
and the equations of the slant asymptotes are $y = k \pm \frac{b}{a}(x - h)$. In all cases the eccentricity is given by the formula $e = c/a$.
Both of the above cases happen when in the general equation the term $Bxy$ is missing. In the case $B \neq 0$ the result is still a hyperbola, which is the result of rotation of one of the previous simpler cases about the center of the hyperbola.
The hyperbola could also be given by its polar equation:
$$r = \frac{de}{1 \pm e\cos\theta} \quad \text{or} \quad r = \frac{de}{1 \pm e\sin\theta},$$
where $d > 0$ and $e > 1$ is the eccentricity of the hyperbola.

hyperbolic functions The functions
$$\sinh x = \frac{e^x - e^{-x}}{2}, \quad \cosh x = \frac{e^x + e^{-x}}{2},$$
$$\tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}}, \quad \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}}.$$
Hyperbolic secant and cosecant functions are defined as reciprocals of the hyperbolic cosine and sine functions, respectively. The $\sinh x$ and $\cosh x$ functions are the most used, with very limited use for the other four hyperbolic functions.

hyperbolic identities Identities for hyperbolic functions, analogous to the corresponding trigonometric identities. Here the equivalent of the Pythagorean identity is the following:
$$\cosh^2 x - \sinh^2 x = 1.$$
Another example is the formula
$$\sinh(x + y) = \sinh x \cosh y + \cosh x \sinh y.$$

hyperbolic paraboloid The three dimensional surface given by the formula
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}.$$

hyperbolic substitution A method of evaluation of indefinite integrals, when the independent variable is substituted by one of the hyperbolic functions. Substitutions are used to evaluate indefinite integrals when ready to use formulas are not available. See also trigonometric substitutions and substitution method.
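The two hyperbolic identities quoted above can be spot-checked numerically. The following is only a minimal sketch using Python's standard `math` module; the helper names `pythagorean_gap` and `sinh_sum` and the sample points are illustrative assumptions, not part of the dictionary.

```python
import math

def pythagorean_gap(x):
    """Return cosh^2(x) - sinh^2(x), which the identity says equals 1."""
    return math.cosh(x) ** 2 - math.sinh(x) ** 2

def sinh_sum(x, y):
    """Right-hand side of sinh(x + y) = sinh x cosh y + cosh x sinh y."""
    return math.sinh(x) * math.cosh(y) + math.cosh(x) * math.sinh(y)

# Check both identities at a few arbitrarily chosen sample points.
for x, y in [(0.0, 0.0), (0.5, -1.25), (2.0, 1.0)]:
    assert math.isclose(pythagorean_gap(x), 1.0, rel_tol=1e-9)
    assert math.isclose(sinh_sum(x, y), math.sinh(x + y),
                        rel_tol=1e-9, abs_tol=1e-12)
```

A numerical check of this kind only illustrates the identities at the chosen points; it is not a proof.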

hyperboloid (1) Hyperboloid of one sheet is given by the formula
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$$
and represents a surface that is connected. (2) Hyperboloid of two sheets is given by
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$$
and consists of two pieces.

hypersphere Usually, the name of the $n$-dimensional sphere given by the equation $x_1^2 + x_2^2 + \cdots + x_n^2 = r^2$, where $r$ is the radius.

hypervolume Usually, volume in $n$-dimensional Euclidean space.

hypocycloid Geometrically, a curve that is traced by a point on a circle that rolls inside a bigger circle. Depending on the ratio of the radii of the circles, the result is a different curve. If that ratio is 1/4, the result is called astroid (see picture above). In general, a hypocycloid with $n + 1$ cusps is given by the parametric equations
$$x = \cos\theta + \frac{1}{n}\cos n\theta, \quad y = \sin\theta - \frac{1}{n}\sin n\theta.$$

hypotenuse In a right triangle, the side opposite to the right angle. The other two sides are called legs.

hypothesis A statement or proposition whose validity is not known but which is very likely to be true based on experimental or theoretical observations. One of the best known is the so-called Riemann hypothesis.

hypothesis testing In Statistics, the procedure of testing a working assumption (hypothesis) using a set of values from a sample. For details see null hypothesis and alternative hypothesis.

I

identity (1) An equality, containing constants and/or variables, that is true for all values of the variables, whenever the expression makes sense. Examples are:
$$(a + b)^2 = a^2 + 2ab + b^2, \quad \sin^2\theta + \cos^2\theta = 1, \quad 1 + \tan^2 x = \sec^2 x.$$
The first two identities are valid for all values of $a$, $b$ or $\theta$, and the last identity is valid for all values of $x \neq \pi/2 + \pi k$, $k = 0, \pm 1, \pm 2, \ldots$, because the functions involved in the identity are not defined for those values of the argument. See also trigonometric identities.
(2) A number that, added to other numbers or multiplied by other numbers, does not change them. See additive identity, multiplicative identity.

identities, trigonometric See trigonometric identities.

identity matrix The square matrix where all the entries are zero except on the main diagonal, where all the entries are ones. Example:
$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
When we multiply the identity matrix by any other matrix from the right or left, the given matrix does not change: $IA = AI = A$.
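The property $IA = AI = A$ can be illustrated with a short computation. This is a minimal pure-Python sketch; the helper names `identity` and `matmul` and the sample matrix `A` are hypothetical choices made for illustration.

```python
def identity(n):
    """Build the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# An arbitrary 3 x 3 matrix chosen only for the demonstration.
A = [[2, -1, 0], [4, 3, 7], [0, 5, 1]]
I = identity(3)

# Multiplying by I from the left or from the right leaves A unchanged.
assert matmul(I, A) == A
assert matmul(A, I) == A
```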
identity function The function $f(x) = x$ which leaves each point in its place.

identity transformation A transformation $T$ of a vector space $V$ that leaves all elements of the space unchanged: For any $v \in V$, $Tv = v$. Also called identity operator.

image of a linear transformation. If $T : V \to W$ is a linear transformation from one vector space to another, then for any vector $u \in V$ the vector $T(u) \in W$ is its image. The set of all images is called the image of the transformation $T$.

imaginary axis The vertical axis in the complex plane. When the complex plane is identified with the

Cartesian plane, the imaginary axis is the same as the $y$-axis.

imaginary number A number of the form $ai$, where $a$ is a real number and $i = \sqrt{-1}$ is the imaginary unit.

imaginary unit The number $i = \sqrt{-1}$. This number is not real, because $i^2 = -1$. See also complex numbers.

implicit differentiation When a function is given implicitly (see implicit function), then it is still possible to calculate the derivative of that function. Let the function be given by the equation $F(x, y) = 0$ and assume that $F$ has partial derivatives with respect to both variables. In that case, we will be able to calculate the derivative of $y$ with respect to the variable $x$ if we view $y$ as a function of $x$ and use the chain rule. Example: If the implicit function is given by the equation
$$\arctan(y/x) = x,$$
then differentiation by $x$ gives
$$\frac{x\frac{dy}{dx} - y}{x^2 + y^2} = 1$$
and, after the simplification,
$$\frac{dy}{dx} = \frac{x^2 + y^2 + y}{x},$$
which itself is an implicit function.

implicit function A function of two variables that is not explicitly solved for any of the variables. This relation can be expressed as an equation $F(x, y) = 0$ with some function $F$. Examples are $x^2 + y^2 - 1 = 0$, $2\sin^3 x - 3xy = 0$. A similar definition holds for functions of several variables too. Let $x_1, \cdots, x_n, y$ be $(n + 1)$ variables and let $F$ represent some function of those variables. Then the equation $F(x_1, \cdots, x_n, y) = 0$ is an implicit function if the equation is not solved for any of the variables.

Implicit Function Theorem gives conditions under which a function defined implicitly can be solved for the dependent variable. We give the simplest version for the case of two variables. Let $F(x, y) = 0$ be an implicit function defined on some disk that contains a point $(a, b)$. Assume that $F(a, b) = 0$, $\frac{\partial F}{\partial y}(a, b) \neq 0$, and both partial derivatives of $F$ are continuous on the disk. Then the equation $F(x, y) = 0$ defines $y$ as an explicit function of $x$ near the point $(a, b)$ and the derivative of that function is given by the formula
$$\frac{dy}{dx} = -\frac{\partial F}{\partial x} \Big/ \frac{\partial F}{\partial y}.$$
Similar statements are valid for functions of three or more variables also.

implicit solution A solution of a (usually differential) equation which is given as an implicit function $F(x, y) = 0$.

implied domain When the domain of some function is not stated explicitly but is clear from the definition of the function, then it is said that the domain is implied.

improper fraction A fraction where the numerator is greater than or equal to the denominator. Examples: $\frac{7}{5}$, $\frac{25}{25}$.

improper integral An integral where either one or both integration limits are infinite, or the function is unbounded on the bounded interval.
(1) Let the function $f(x)$ be defined on the interval $[a, \infty)$ and assume that the integral $\int_a^A f(x)\,dx$ exists for any given number $A \geq a$. Then the limit of that integral (finite or infinite) is the integral of $f$ over the interval $[a, \infty)$:
$$\int_a^\infty f(x)\,dx = \lim_{A \to \infty} \int_a^A f(x)\,dx.$$
In case this limit is finite, the function is called integrable on the given interval. The improper integrals over the intervals $(-\infty, b]$ and $(-\infty, \infty)$ are defined similarly.
(2) Let $f(x)$ be defined on a bounded interval $[a, b]$ but be unbounded there, for example, as we approach the right endpoint $b$. Assume also that on each interval $[a, b - \delta]$ the function is bounded and the integral $\int_a^{b-\delta} f(x)\,dx$ exists. Then the limit of that integral (finite or infinite) is the integral of $f$ over the interval $[a, b]$:
$$\int_a^b f(x)\,dx = \lim_{\delta \to 0^+} \int_a^{b-\delta} f(x)\,dx.$$

is inconsistent because it has no solutions.

integral One of the most important notions of calculus. Has two main meanings. See definite integral and indefinite integral.

integrand In the integral
$$\int \frac{e^{-x^2}}{2 + x^2}\,dx$$
the function $f(x) = \frac{e^{-x^2}}{2 + x^2}$ is the integrand.

integrating factor Assume we have a general first order linear differential equation
$$y' + p(t)y = g(t)$$
with some given functions $p$ and $g$. In order to find the general solution of this equation, we multiply all terms by some positive function $\mu(t)$, called integrating factor, which makes the left side equal to the derivative of the function $y(t)\mu(t)$. This function can be chosen to be equal to $\mu(t) = \exp \int p(t)\,dt$ and the general solution of the equation is given now in the form
$$y = \frac{1}{\mu(t)} \left[ \int_{t_0}^{t} \mu(s)g(s)\,ds + c \right].$$

integration The procedure of calculating definite or indefinite integrals. Unlike differentiation, where the derivative of any analytically given function could be found by applying differentiation rules, the integration process is much harder and for many functions there are no simple indefinite integrals. For different methods of finding indefinite integrals see integration by partial fractions, integration by parts, integration by substitution, by trigonometric substitution, term-by-term integration. For approximate integration of definite integrals see approximate integration, numerical integration, Simpson's rule, trapezoidal rule.

integration by partial fractions A method of integration of rational functions. Let $f(x) = P(x)/Q(x)$ be a proper rational function. Then it could be decomposed into the sum of partial fractions of one of the following forms:
$$\frac{A}{(ax + b)^j}, \quad \frac{Ax + B}{(ax^2 + bx + c)^k},$$
where in the second case the denominator is irreducible over real numbers. Now, integration of the integral $\int f(x)\,dx$ is reduced to integration of simpler fractions. Examples:
$$\int \frac{x - 3}{x^2 - 3x + 2}\,dx = \int \left( \frac{2}{x - 1} - \frac{1}{x - 2} \right) dx = 2\ln|x - 1| - \ln|x - 2| + C.$$
$$\int \frac{12}{x^4 - x^3 - 2x^2}\,dx = \int \left( \frac{3}{x} - \frac{6}{x^2} + \frac{1}{x - 2} - \frac{4}{x + 1} \right) dx = 3\ln|x| + \frac{6}{x} + \ln|x - 2| - 4\ln|x + 1| + C.$$
$$2\int \frac{x^3 + 1}{(x^2 + 1)^2}\,dx = 2\int \left( \frac{x}{x^2 + 1} - \frac{x}{(x^2 + 1)^2} + \frac{1}{(x^2 + 1)^2} \right) dx = \ln(x^2 + 1) + \frac{1}{x^2 + 1} + \frac{x}{x^2 + 1} + \tan^{-1} x + C.$$

integration by parts One of the integration methods. Let $u$ and $v$ be differentiable functions. Then
$$\int u\,dv = uv - \int v\,du$$
and the identical formula holds for the definite integrals:
$$\int_a^b u\,dv = uv\Big|_a^b - \int_a^b v\,du.$$
This formula allows one to calculate many integrals that otherwise would be impossible to handle. Example: With $u = \ln x$, $v = x$,
$$\int \ln x\,dx = x\ln x - \int dx = x\ln x - x + C.$$

integration by substitution One of the integration methods. Let $f(x)$ and $u = g(x)$ be two functions and let $u$ also be differentiable. Then
$$\int f(g(x))g'(x)\,dx = \int f(u)\,du.$$
This formula allows one to calculate certain integrals that have a special form. Example: To calculate the

integral $\int \sqrt{2x + 1}\,dx$, we notice that with the substitution $u = 2x + 1$, $du = 2dx$, and
$$\int \sqrt{2x + 1}\,dx = \frac{1}{2}\int \sqrt{u}\,du = \frac{1}{3}u^{3/2} + C = \frac{1}{3}(2x + 1)^{3/2} + C.$$

integration by trigonometric substitution A method of evaluating integrals containing expressions of the type $\sqrt{a^2 \pm x^2}$. In that case the substitutions $x = a\sin\theta$, $x = a\tan\theta$ or $x = a\sec\theta$ often result in simpler integrals to evaluate. Examples: (1) To evaluate the integral
$$\int \frac{dx}{(\sqrt{1 - x^2})^3},$$
we substitute $x = \sin\theta$, $dx = \cos\theta\,d\theta$ and get
$$\int \frac{dx}{(\sqrt{1 - x^2})^3} = \int \frac{\cos\theta\,d\theta}{(\cos\theta)^3} = \tan\theta + C,$$
and, returning to the original variable, the answer becomes $x/\sqrt{1 - x^2} + C$.
(2) To evaluate the integral $\int \frac{1}{\sqrt{4 + x^2}}\,dx$, we use the substitution $x = 2\tan\theta$, $dx = 2\sec^2\theta\,d\theta$ and see that the integral is transformed to
$$\int \sec\theta\,d\theta = \ln|\sec\theta + \tan\theta| + C = \ln\left( \frac{\sqrt{4 + x^2}}{2} + \frac{x}{2} \right) + C.$$

integro-differential equation An equation where the unknown function is both integrated and differentiated. Example:
$$\varphi''(x) = f(x) + \int_a^b K(x, t)\varphi(t)\,dt,$$
where $f$ and $K$ are given functions and $\varphi$ is to be determined from this equation.

intercepts The intersections of the graph of a function with the coordinate axes. For $y = f(x)$, the $y$-intercept is the point where $x = 0$ and the $x$-intercept(s) are the points where $y = 0$. Example: To find the $y$-intercept of the function $y = x^2 - 4x + 3$, we plug in $x = 0$ and find $y = 3$, and the point is $(0, 3)$. To find the $x$-intercepts, we plug in $y = 0$ and solve the equation $x^2 - 4x + 3 = 0$, which has solutions $x = 1, 3$, and the intercepts are $(1, 0)$, $(3, 0)$.

intermediate value theorem Let $f(x)$ be a continuous function on the closed interval $[a, b]$. Assume that $L$ is a number between the values $f(a)$ and $f(b)$ of the function at the endpoints. Then there exists at least one point $c$ such that $a \leq c \leq b$ and $f(c) = L$. This theorem sometimes helps to find the $x$-intercepts of functions. For example, if $f(a) < 0$ and $f(b) > 0$ then the theorem means that there is at least one $x$-intercept inside the interval $[a, b]$.

interpolation If some points are given on the plane, then any curve passing through those points is called interpolating curve. If that curve represents the graph of some polynomial, then it is called interpolating polynomial. In the most general case we may fix some points of the given function $f$, creating a set of ordered pairs $\{(x_1, f(x_1)), (x_2, f(x_2)), \ldots, (x_n, f(x_n))\}$, and the interpolating polynomial $p(x)$ will represent an approximation to the function $f$ which additionally coincides with the given function at all prescribed points. Depending on the degree of the polynomial the interpolation will be linear, quadratic, cubic, etc. See also extrapolation.

intersection A point where two curves coincide. If the curves are given by equations $y = f(x)$ and $y = g(x)$, then they intersect if for some point $x = c$, $f(c) = g(c)$. When a curve intersects with one of the coordinate axes, then the intersection point is called intercept.

intersection of sets If $A$ and $B$ are two sets of some objects, then their intersection $A \cap B$ is defined to be the set of all elements that belong to both $A$ and $B$. Example: If $A = \{1, 2, 3, 4, 5, 7\}$ and $B = \{2, 4, 6, 8\}$ then $A \cap B = \{2, 4\}$. In case two sets have no common elements we say that their intersection is the empty set.

interval A subset of the real line with the property that for any two points in that subset, all the

points between those two are also included. Intervals could be closed (include both endpoints), open (do not include endpoints), or half-open (include only one of the endpoints). The four possible configurations are: $[a, b]$, $(a, b)$, $[a, b)$, $(a, b]$. In this definition one or both endpoints could be infinite. See infinite interval.

interval of convergence For a power series
$$\sum_{n=0}^{\infty} a_n (x - c)^n$$
the interval $(c - r, c + r)$, where the series converges. The number $r$ is called radius of convergence and may be zero, some positive number, or infinity. Some series also converge at the endpoints, hence the interval becomes closed. Example: The series
$$\sum_{n=1}^{\infty} \frac{x^n}{n}$$
has the half-open interval $[-1, 1)$ as its interval of convergence, because when $x = 1$, it results in the diverging harmonic series and for $x = -1$ it is the converging alternating harmonic series.

invariant Something that does not change when a certain transformation is performed. Examples: For the function $f(x) = x^2$ the point $x = 1$ is invariant, because $f(1) = 1$. For any linear transformation $T$ in any vector space $V$, the origin is invariant, because $T(0) = 0$. In a more general setting, the area of a plane geometric figure is invariant when we transform this figure using only translation, rotation and reflection operations.

inverse function For a function $f(x)$ of one real variable with domain $D$, the function $g(x)$ is its inverse, if for all appropriate values of the variable
$$g(f(x)) = f(g(x)) = x.$$
The inverse of the function $f$ is denoted by $f^{-1}$. The usual sufficient condition for existence of the inverse is the condition of being one-to-one. Familiar examples of pairs of inverses are: $f(x) = 2x$, $f^{-1}(x) = \frac{1}{2}x$; $f(x) = x^3$, $f^{-1}(x) = \sqrt[3]{x}$; $f(x) = 2^x$, $f^{-1}(x) = \log_2 x$.

inverse Laplace transform The transformation that translates the Laplace transform of some function back to the same function. Symbolically,
$$\mathcal{L}^{-1}\mathcal{L}f(t) = f(t).$$
The inverse Laplace transform exists under certain conditions on the original function and is unique. Analytically it is given by a formula involving complex variable integration. A list of inverse Laplace formulas could be deduced from direct Laplace transform formulas working backwards. See the corresponding entry.

inverse matrix For a given square matrix $A$ the matrix $B$ that satisfies the conditions $AB = BA = I$, where $I$ is the identity matrix. The inverse matrix is denoted by $A^{-1}$ and it exists if and only if the determinant of the matrix $A$ is not zero. There is a simple formula for calculation of inverses of $2 \times 2$ matrices. If
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
is the given matrix, then
$$A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
For general $n \times n$ matrices the inversion formula uses cofactors and is given by
$$A^{-1} = \frac{1}{\det(A)} (C_{ij})^T,$$
where $(C_{ij})^T$ is the transpose of the matrix where the element in the $ij$-th position is the corresponding cofactor. This formula is very hard to use in practice, so instead a version of the Gauss-Jordan elimination method is used. Example: If
$$A = \begin{pmatrix} 1 & 2 & 2 \\ 2 & 3 & 3 \\ 3 & 4 & 5 \end{pmatrix},$$
we form a $3 \times 6$ matrix by adjoining the identity matrix to $A$:
$$\begin{pmatrix} 1 & 2 & 2 & 1 & 0 & 0 \\ 2 & 3 & 3 & 0 & 1 & 0 \\ 3 & 4 & 5 & 0 & 0 & 1 \end{pmatrix}$$
and apply the Gauss-Jordan method. The result will be another $3 \times 6$ matrix
$$\begin{pmatrix} 1 & 0 & 0 & -3 & 2 & 0 \\ 0 & 1 & 0 & 1 & 1 & -1 \\ 0 & 0 & 1 & 1 & -2 & 1 \end{pmatrix}$$
and the right half of this matrix is the inverse of the given matrix $A$:
$$A^{-1} = \begin{pmatrix} -3 & 2 & 0 \\ 1 & 1 & -1 \\ 1 & -2 & 1 \end{pmatrix}.$$

inverse transformation Let $T : V \to W$ be a linear transformation from one linear vector space to another. The inverse of this transformation, denoted by $T^{-1}$, is another linear transformation (if it exists) that transforms $W$ to $V$ and satisfies the conditions $T \circ T^{-1} = T^{-1} \circ T = I$, where $I$ is the identity transformation and "$\circ$" denotes composition of transformations. A sufficient condition of existence of the inverse transformation is that $T$ be a one-to-one transformation.

inverse trigonometric functions See arcsine, arccosine and arctangent functions.

inverse variation The name of many possible relations between two variables. The most common are: (1) $y$ varies inversely with $x$ means that there is a real number $k \neq 0$ such that $y = k/x$; (2) $y$ varies inversely with $x^2$ means $y = k/x^2$; (3) $y$ varies inversely with $x^3$ means $y = k/x^3$. There are many other possibilities but they are rarely used. Compare also with direct variation and joint variation.

inversion (1) The procedure of finding the inverse of a given object: number, function, matrix, etc.
(2) In a permutation, whenever a larger number precedes a smaller one, we call it an inversion. See also transposition.

invertible matrix A matrix such that the inverse matrix exists.

involute A parametric curve given by the equations
$$x = r(\cos\theta + \theta\sin\theta), \quad y = r(\sin\theta - \theta\cos\theta).$$

irrational number Any real number that is not rational. Irrational numbers cannot be written as fractions, as terminating decimals or as non-terminating, repeating decimals. Common examples of irrational numbers are $\sqrt{2}$, $\pi$, $e$.

irreducible polynomial The polynomial is irreducible over a number field, if it cannot be factored into linear factors with coefficients from that field. A polynomial which is irreducible over some number field may be reducible over another number field.
(1) The polynomial $x^2 - 2$ is irreducible over the field of rational numbers, but it is reducible over the field of real numbers because $x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2})$.
(2) The polynomial $x^2 + 4$ is irreducible over rationals and reals but it is reducible over complex numbers because $x^2 + 4 = (x + 2i)(x - 2i)$. By the Fundamental Theorem of Algebra, any polynomial of degree at least two with rational coefficients is reducible over the field of complex numbers.

irregular singular point For differential equations. Consider the equation
$$P(x)y'' + Q(x)y' + R(x)y = 0$$
and its power series solutions. If for the point $x = x_0$, $P(x_0) \neq 0$, then that point is called ordinary and a series solution is possible near that point. If $P(x_0) = 0$, then the point is singular. Additionally, the series solution is still possible if the singular point is regular, which means that the limits
$$\lim_{x \to x_0} (x - x_0)\frac{Q(x)}{P(x)} \quad \text{and} \quad \lim_{x \to x_0} (x - x_0)^2 \frac{R(x)}{P(x)}$$
are finite. If any of these conditions is violated, then the point $x_0$ is called irregular singular point.

irrotational vector field A vector field $\mathbf{F}$ is called irrotational at some point $P$, if at that point $\operatorname{curl}\mathbf{F} = 0$.

isobars A contour line of equal or constant pressure on a graph.

isosceles triangle A triangle that has two equal sides. Isosceles triangles necessarily also have two equal angles.

iterate In the process of successive approximation, the result on each intermediate step is called iterate.
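The Gauss-Jordan inversion described in the inverse matrix entry can be sketched in code. This is only an illustrative sketch, not the dictionary's own procedure in detail: the function name `gauss_jordan_inverse` is hypothetical, and exact rational arithmetic from Python's standard `fractions` module is an assumption made here to avoid rounding.

```python
from fractions import Fraction

def gauss_jordan_inverse(matrix):
    """Invert a square matrix by row-reducing the augmented matrix [A | I]."""
    n = len(matrix)
    # Adjoin the identity matrix to A, using exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half of the reduced matrix is the inverse.
    return [row[n:] for row in aug]

# The 3 x 3 matrix from the inverse matrix entry.
A = [[1, 2, 2], [2, 3, 3], [3, 4, 5]]
A_inv = gauss_jordan_inverse(A)
assert A_inv == [[-3, 2, 0], [1, 1, -1], [1, -2, 1]]
```

The final assertion compares the result with the inverse stated in the entry; a matrix with zero determinant raises `ValueError` instead.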
iterated integral Let $f(x, y)$ be a function defined on some rectangle $R = [a, b] \times [c, d]$. We may integrate this function with respect to the second variable and get a new function of one variable
$$F(x) = \int_c^d f(x, y)\,dy.$$
Now, if we integrate this resulting function with respect to the first variable, then we will get
$$\int_a^b F(x)\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx.$$
The integral on the right side is called iterated integral. Similar definitions hold for functions of three or more variables. Fubini's theorem uses iterated integrals in calculations of multiple integrals.

J

Jacobian Assume that the variables $x, y, z$ are expressed with the help of three other variables $u, v, w$ and the change of variable functions $x = x(u, v, w)$, $y = y(u, v, w)$, $z = z(u, v, w)$ are continuously differentiable. Then the matrix
$$J = \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w} \end{pmatrix}$$
is called the Jacobian matrix of the transformation. The importance of this matrix is that the determinant of $J$ plays an important role in calculating integrals by change of variable. The following theorem holds: Let the function $f(x, y, z)$ be defined and integrable in some region $G$ and let the change of variable formulas $x = x(u, v, w)$, $y = y(u, v, w)$, $z = z(u, v, w)$ transform $G$ to some other region $G'$. Then
$$\iiint_G f(x, y, z)\,dx\,dy\,dz = \iiint_{G'} f(x(u, v, w), y(u, v, w), z(u, v, w))\,|\det(J)|\,du\,dv\,dw.$$
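The iterated integral entry above can be illustrated numerically: first integrate over $y$ to obtain $F(x)$, then integrate $F$ over $x$. This is a rough sketch that assumes a simple midpoint rule; the helper names `midpoint` and `iterated_integral`, the sample function $f(x, y) = xy$ and the rectangle $[0, 1] \times [0, 2]$ are all illustrative choices.

```python
def midpoint(g, lo, hi, n=200):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def iterated_integral(f, a, b, c, d, n=200):
    """Integrate f(x, y) over y in [c, d] first, then over x in [a, b]."""
    # Inner integral: F(x) is the integral of f(x, y) dy over [c, d].
    F = lambda x: midpoint(lambda y: f(x, y), c, d, n)
    # Outer integral: integrate F(x) dx over [a, b].
    return midpoint(F, a, b, n)

# For f(x, y) = x*y on [0, 1] x [0, 2]: F(x) = 2x, and the outer integral is 1.
value = iterated_integral(lambda x, y: x * y, 0.0, 1.0, 0.0, 2.0)
assert abs(value - 1.0) < 1e-9
```

The midpoint rule happens to be exact for this integrand because it is linear in each variable; for general functions the result is only an approximation.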

joint variation The name of many possible relations between three or more variables. The most common are: (1) $z$ varies directly with $x$ and $y$ means that there is a real number $k \neq 0$ such that $z = kxy$; (2) $z$ varies directly with $x$ and inversely with $y$ means $z = k\frac{x}{y}$; (3) $z$ varies directly with $x^2$ and inversely with $y$ means $z = k\frac{x^2}{y}$. There are many other possibilities including more variables. See also inverse variation and direct variation.

Jordan curve Let $I = [a, b]$ be an interval and let the functions $x = f(t)$ and $y = g(t)$ be defined on that interval. These equations define a plane parametric curve. This curve is called Jordan curve if: (1) the functions $f, g$ are continuous; (2) the curve is closed, meaning that $f(a) = f(b)$, $g(a) = g(b)$; (3) the curve is simple, i.e. the points $(f(t_1), g(t_1))$ and $(f(t_2), g(t_2))$ are different for $t_1 \neq t_2$, except at the endpoints. Jordan curves in space are defined in the same way.

Jordan decomposition See Jordan form of matrix.

Jordan domain A plane region that is bounded by a Jordan curve.

Jordan form of matrix A block-diagonal square matrix that could be written in the form
$$\begin{pmatrix} J_1 & 0 & \ldots & 0 \\ 0 & J_2 & \ldots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \ldots & J_n \end{pmatrix},$$
where each $J_k$ is itself a square matrix. The diagonal elements of this block-matrix are all the same, the sub-diagonal consists of all 1's, and all the other elements of the matrix are zeros:
$$J_k = \begin{pmatrix} \lambda_k & 0 & \ldots & 0 & 0 \\ 1 & \lambda_k & \ldots & 0 & 0 \\ 0 & 1 & \ldots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \ldots & 1 & \lambda_k \end{pmatrix}$$
The entries on the diagonal are the eigenvalues of the matrix. Matrices of this type are called simple Jordan matrices. Every square matrix is similar to a matrix of the Jordan form.

jump discontinuity A function $f(x)$ is said to have a jump discontinuity at some point $x = c$, if both left-hand and right-hand limits of the function at that point exist, but are not equal:
$$\lim_{x \to c^+} f(x) = a, \quad \lim_{x \to c^-} f(x) = b, \quad a \neq b.$$
The Heaviside function is an example of a function with a jump discontinuity at the point 0.

K

Kepler's laws of planetary motion:
(1) Every planet revolves around the Sun in an elliptical orbit with the Sun at one focus.
(2) The line joining the Sun and a planet sweeps out equal areas in equal time.
(3) The square of the period of revolution of a planet is proportional to the cube of the length of the major axis of its orbit.

kernel of convolution In the convolution integral
$$(K * f)(x) = \int_{-\infty}^{\infty} f(t)K(x - t)\,dt$$
the function $K$, if we consider $f$ as a parameter.

kernel of integral transform In the integral transform
$$Tf(x) = \int_a^b K(x, t)f(t)\,dt,$$
the function $K(x, t)$. Transforms vary, depending on the nature of the kernel. We get the most common Fourier and Laplace transforms when the kernel is equal to $e^{ixt}$ and $e^{-xt}$ respectively.

kernel of linear transformation Let $T$ be a linear transformation from one vector space $V$ to another, $W$. The set of all vectors $x \in V$ such that $Tx = 0$ is the kernel of the transformation. The kernel is in fact a subspace of $V$: if $u, v$ belong to the kernel then for any constants $\alpha$ and $\beta$ the vector $\alpha u + \beta v$ also belongs to the kernel. The dimension of the kernel is called nullity.

Kronecker's symbol Also called Kronecker delta. A function of two discrete parameters $i, j$, given by the formula
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$
The parameters $i$ and $j$ are chosen, as a rule, to be integers or whole numbers.

L

Lagrange's identity Consider the Sturm-Liouville boundary value problem
$$[p(x)y']' - q(x)y + \lambda r(x)y = 0$$
with boundary conditions
$$\alpha_1 y(0) + \alpha_2 y'(0) = 0, \quad \beta_1 y(1) + \beta_2 y'(1) = 0$$
and denote $L[y] = -[p(x)y']' + q(x)y$. Then Lagrange's identity is true:
$$\int_0^1 \{L[u]v - uL[v]\}\,dx = 0.$$

Lagrange multiplier Assume that the function $f(x, y, z)$ is given in some domain $D$ and is differentiable there. If we want to find the extreme values of this function subject to the constraint $g(x, y, z) = k$ (or, otherwise, on the level surface $g(x, y, z) = k$), then the gradients of both functions $f$ and $g$ are parallel at all extreme points, and hence, there is a real number $\lambda$ such that $\nabla f = \lambda \nabla g$. The constant $\lambda$ is the Lagrange multiplier. In cases when we wish to find the extreme values of a function subject to two or more constraints, we will have two or more Lagrange multipliers. For the procedure of finding extreme values with the use of Lagrange multipliers see method of Lagrange multipliers.

Laguerre equation The second order differential equation
$$xy'' + (1 - x)y' + \lambda y = 0.$$
In the case when $\lambda = m$, a positive integer, the solutions are polynomials, known as Laguerre polynomials. See also Chebyshev equation.

lamina A 2 dimensional planar closed surface with mass and density.

Laplace operator For a function $f(x, y, z)$ of three variables, which is two times differentiable by all variables, the expression
$$\Delta f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$
The equivalent notation $\nabla^2 f$ is also used. See also harmonic functions and Laplace's equation.

Laplace transform A special type of integral transform, where the kernel is the function $K(s, t) = e^{-st}$. Formally, the Laplace transform of a function $f$ on the interval $[0, \infty)$ is given by
$$\mathcal{L}f(s) = \int_0^\infty e^{-st} f(t)\,dt.$$
The Laplace transform of a function can exist even if the function itself is not integrable on $[0, \infty)$. In fact, for the existence of the Laplace transform of a function $f$ it is enough that this function be piecewise continuous on any finite interval $[0, N]$ and have growth no more than that of an exponential function: $|f(t)| \leq ke^{at}$ for sufficiently large values of its argument.
Here is a list of Laplace transforms of some common functions, where on the left is the function and on the right is its transform:

$1$: $\frac{1}{s}$, $s > 0$
$e^{at}$: $\frac{1}{s - a}$, $s > a$
$t^p$, $p > -1$: $\frac{\Gamma(p + 1)}{s^{p+1}}$, $s > 0$
$\sin at$: $\frac{a}{s^2 + a^2}$, $s > 0$
$\cos at$: $\frac{s}{s^2 + a^2}$, $s > 0$
$\sinh at$: $\frac{a}{s^2 - a^2}$, $s > |a|$
$\cosh at$: $\frac{s}{s^2 - a^2}$, $s > |a|$

One of the most important properties of the Laplace transform (and some other integral transforms) is that it allows one to substitute certain complicated operations by simpler ones. For example, the convolution of two functions is substituted by multiplication (see corresponding article) and the operation of differentiation could be substituted by the operation of multiplication by the independent variable. More precisely, let $f(t)$ be a piecewise differentiable function on any interval $0 \leq t \leq A$ and satisfy the inequality $|f(t)| \leq Ke^{at}$ for $t \geq M$, where $K, a, M$ are some positive constants. Then the Laplace transform of $f'$ exists for $s > a$ and is given by
$$\mathcal{L}f'(s) = s\mathcal{L}f(s) - f(0).$$
Under similar conditions, we have also a formula for the Laplace transform of higher degree derivatives of the function:
$$\mathcal{L}f^{(n)}(s) = s^n \mathcal{L}f(s) - s^{n-1}f(0) - \cdots - sf^{(n-2)}(0) - f^{(n-1)}(0).$$
These properties are crucial in applications of the Laplace transform in solving initial value problems.

Laplace transform method For solving initial value problems for linear non-homogeneous differential equations. The method is especially important in cases when the right side of the equation is discontinuous because the other methods (undetermined coefficients, variation of parameters, series solutions) are almost never successful in that case. Example: Solve the equation
$$y'' + 4y = g(t)$$
with the initial conditions $y(0) = 0$, $y'(0) = 0$ and the function $g(t)$ defined to be 0 on the interval $[0, 5)$, $(t - 5)/5$ on $[5, 10)$ and 1 on $[10, \infty)$. This function could be written as
$$g(t) = [u_5(t)(t - 5) - u_{10}(t)(t - 10)]/5,$$
where the function $u_c(t)$ is defined below in the article Laplace transforms of special functions. Applying the Laplace transform to both sides of the equation, using the initial conditions, and denoting $\mathcal{L}y = Y(s)$, we get
$$(s^2 + 4)Y(s) = \frac{e^{-5s} - e^{-10s}}{5s^2}.$$
With the notation $K(s) = 1/(s^2(s^2 + 4))$ we will get
$$Y(s) = \frac{e^{-5s} - e^{-10s}}{5} K(s).$$
Now, using the partial fraction decomposition of the function $K$ we find its inverse Laplace transform to be the function $k(t) = t/4 - \frac{1}{8}\sin 2t$. Finally, the use of theorems about inversions of Laplace transforms of step functions (see the next article again) gives us the final solution
$$y = \frac{1}{5}[u_5(t)k(t - 5) - u_{10}(t)k(t - 10)].$$

Laplace transforms of special functions (1) Let $u_c(t)$ be the unit step function (generalization of the Heaviside function) defined by
$$u_c(t) = \begin{cases} 1 & \text{if } t \geq c \\ 0 & \text{if } t < c \end{cases}$$
Then
$$\mathcal{L}[u_c(t)] = \frac{e^{-cs}}{s}, \quad s > 0.$$
In the special case $c = 0$ (Heaviside function) the transform is just $1/s$.
(2) If $F(s) = \mathcal{L}[f(t)]$ exists for some $s > a \geq 0$, then
$$\mathcal{L}[u_c(t)f(t - c)] = e^{-cs}F(s), \quad s > a.$$
(3) If the function $f$ is periodic with period $T$, then
$$\mathcal{L}[f(t)] = \frac{\int_0^T e^{-st}f(t)\,dt}{1 - e^{-sT}}.$$

Laplace's equation In partial differential equations. In the simplest case of two variables, the equation
$$\frac{\partial^2 u(x, y)}{\partial x^2} + \frac{\partial^2 u(x, y)}{\partial y^2} = 0.$$
Solutions to this equation are called harmonic functions.

latent roots The same as roots of the characteristic equation or eigenvalues.

Law of Cosines Let $\alpha, \beta, \gamma$ denote the (measures of) angles of some triangle $ABC$ and denote by $a$ (the size of) the side opposite to $\alpha$, by $b$ the side opposite to angle $\beta$ and by $c$ the third side (opposite to $\gamma$). Then the following three relationships connect these six quantities:

$$a^2 = b^2 + c^2 - 2bc\cos\alpha,$$
$$b^2 = a^2 + c^2 - 2ac\cos\beta,$$
$$c^2 = a^2 + b^2 - 2ab\cos\gamma.$$
The Law of Cosines generalizes the Pythagorean theorem, because when one of the angles is a right angle, the corresponding cosine is zero and the formula reduces to the Pythagorean one. This law is used to solve oblique (not right) triangles.

Law of Large Numbers In probability and statistics, this law states that if we choose a large number of samples from any given distribution, then their mean will be close to the mean of the distribution itself. More precisely, as the sample size increases, approaching the distribution size (or infinity, if the distribution is continuous), the sample mean approaches the distribution mean.

Law of Sines Let $\alpha, \beta, \gamma$ denote the (measures of) angles of some triangle $ABC$ and denote by $a$ (the size of) the side opposite to $\alpha$, by $b$ the side opposite to angle $\beta$ and by $c$ the third side (opposite to $\gamma$). Then the following relations are true:
$$\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c},$$
or, equivalently,
$$\frac{a}{\sin\alpha} = \frac{b}{\sin\beta} = \frac{c}{\sin\gamma}.$$
The Law of Sines is used to solve oblique triangles. For an illustration see Law of Cosines.

leading 1's After the process of Gaussian elimination is applied to an $m \times n$ matrix, the first non-zero element of each row (except the rows consisting of all zeroes) is 1 and is called a leading 1.

leading coefficient For a polynomial of degree $n$,
$$p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0,$$
the coefficient of the highest degree term, $a_n$.

learning curve The same as logistic curve. See logistic differential equation.

least common denominator Abbreviated LCD, it is the least common multiple of two or more denominators of fractions. To find the LCD of the fractions $\frac{8}{15}$ and $\frac{5}{8}$ we just find the LCM of 15 and 8.

least common multiple (LCM) For two given natural numbers $n$ and $m$, the least common multiple is the smallest whole number that is divisible by both of these numbers. Example: for 12 and 22, LCM(12, 22) = 132. There are methods of finding the LCM of two or more numbers, and the most common is described as follows. To find the LCM, find the prime factorizations of the numbers and form the product of all prime factors, each repeated as many times as its largest multiplicity in any of the numbers. Example: To find the LCM of 25, 90 and 120, we write $25 = 5^2$, $90 = 2 \cdot 3^2 \cdot 5$, $120 = 2^3 \cdot 3 \cdot 5$ and LCM(25, 90, 120) $= 2^3 \cdot 3^2 \cdot 5^2 = 1800$. There is a formula relating the LCM with the greatest common factor (GCF):
$$\mathrm{LCM}(n, m) = \frac{n \cdot m}{\mathrm{GCF}(n, m)}.$$

least squares regression line Suppose we have gathered data that came in the form of ordered pairs. Then each pair geometrically represents a point on the plane. The problem is to find a line that represents these values the best. The measure of "closeness" of the line to the points from the data is the sum of the squares of the differences between the actual values and the corresponding values on that line. Among all the possible lines, the one that has the smallest such sum is called the least squares regression line. Like any other line, this line has an equation of the form $y = mx + b$, where $m$ is the slope and $b$ is the y-intercept. Traditionally, in statistics, this equation is written as
$$\hat{y} = b_0 + b_1 x.$$
The notation $\hat{y}$ is used to indicate that these are "predicted", not the actual, values from the data. The parameters $b_0, b_1$ can be found from the data. The slope is $b_1 = r s_y / s_x$, where $r$ is the correlation coefficient of the data, $s_y$ is the standard deviation of the second coordinates from the data (y-values) and $s_x$ is the standard deviation of the first coordinates from the data (x-values). After the slope is found, the y-intercept $b_0$ is determined by the formula $b_0 = \bar{y} - b_1\bar{x}$, where the bar on top means the mean of the corresponding y- or x-values.

least upper bound Also called supremum. If a function $f$ or a sequence $a_n$ is bounded above, then the smallest of all possible upper bounds is the least upper bound. Example: The sequence $a_n = 1 - 1/n$ is bounded by any number $M \ge 1$ but not by any number less than 1. The number 1 is the least upper bound. See also greatest lower bound.

left-hand derivative If a function is not differentiable at some point $x = c$, then the limit in the definition of the derivative does not exist, i.e.
$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h}$$
does not exist. In some cases, however, a one-sided limit might exist. If the left-hand limit
$$\lim_{h \to 0^-} \frac{f(c + h) - f(c)}{h}$$
exists, then it is called the left-hand derivative of $f$ at the point $x = c$. See also right-hand derivative.

left-hand limit Let $f(x)$ be a function defined on some interval $[a, b]$ (it could also be an open interval). The left-hand limit of $f$ at some point $c$ is the limit of $f(x)$ as the point $x$ approaches $c$ from the left, or, which is the same, with $x < c$. The notation is $\lim_{x \to c^-} f(x)$. The precise definition of the left-hand limit (the $\varepsilon$-$\delta$ definition) is the following: The function $f$ has a left-hand limit at the point $c$ and that limit is $L$, if for any real number $\varepsilon > 0$ there exists $\delta > 0$ such that $|f(x) - L| < \varepsilon$ as soon as $|x - c| < \delta$, $x < c$. See also limit and right-hand limit.

Legendre equation The differential equation of second order
$$(1 - x^2)y'' - 2xy' + \alpha(\alpha + 1)y = 0.$$
In case $\alpha = n$, a non-negative integer, this equation has polynomial solutions. They are called Legendre polynomials, $P_n(x)$, if, additionally, $P_n(1) = 1$.

Legendre polynomials Solutions of the Legendre equation for $\alpha = 0, 1, 2, \ldots$ with the condition $P_n(1) = 1$. These polynomials can be expressed in the form
$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n.$$

legs of a right triangle The two sides of a right triangle that form the right angle. The other side is called the hypotenuse.

length of a line segment If the line segment is on the x-axis and has endpoints $a$ and $b$ with $a \le b$, then the length is given by $b - a$. If the line segment connects two points on the plane, then the length is just the distance between these points. See distance formula.

length of parametric curve Let the curve $C$ be given by parametric equations $x = f(t)$, $y = g(t)$, $a \le t \le b$, where $f'$ and $g'$ are assumed to be continuous on $[a, b]$. Additionally, we assume that when $t$ increases from $a$ to $b$, the curve $C$ is traversed exactly once. Then the length of the curve $C$ is given by the formula
$$L = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt.$$
Example: For a circle of radius 1, we have the parametric equations
$$x = \cos t, \quad y = \sin t, \quad 0 \le t \le 2\pi$$
and
$$L = \int_0^{2\pi} \sqrt{\sin^2 t + \cos^2 t}\,dt = \int_0^{2\pi} 1\,dt = 2\pi.$$

length of polar curve Let the curve $C$ be given by the polar equation $r = f(\theta)$, $a \le \theta \le b$. Then the length of the curve is given by the formula
$$L = \int_a^b \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta.$$
Example: For the cardioid $r = 1 + \sin\theta$,
$$L = \int_0^{2\pi} \sqrt{(1 + \sin\theta)^2 + \cos^2\theta}\,d\theta = 8.$$

length of a space curve Suppose the curve $C$ is given by a vector function $\mathbf{r}(t) = \langle f(t), g(t), h(t)\rangle$, $a \le t \le b$. Then the length of $C$ is given by a formula generalizing the one for the plane:
$$L = \int_a^b |\mathbf{r}'(t)|\,dt.$$

length of a vector See norm of a vector.

less than An inequality stating that one quantity is smaller than the other, with notation $A < B$. If the quantities are real numbers, then this statement means that $A$ is to the left of $B$ on the number line. See also greater than.

less than or equal to An inequality stating that one quantity is smaller than the other, or equal to it, with notation $A \le B$. If the quantities are real numbers, then this statement means that $A$ is to the left of $B$ on the number line, or the two points coincide. See also greater than or equal to.

level curve For a function $f(x, y)$ of two variables, the equations $f(x, y) = k$, with any real $k$ in the range of $f$, define the level curves of the function.

level surface For a function $f(x, y, z)$ of three variables, the equations $f(x, y, z) = k$, with any real $k$ in the range of $f$, define the level surfaces of the function. Level sets of functions of $n$ variables are defined similarly.

L'Hospital's rule Suppose the functions $f$ and $g$ are defined on some interval containing some point $c$, with the possible exception of that point. Assume also that the limit of $f(x)/g(x)$ as $x$ approaches $c$ is an indeterminate form $0/0$ or $\infty/\infty$. Then
$$\lim_{x \to c}\frac{f'(x)}{g'(x)} = L$$
implies that also
$$\lim_{x \to c}\frac{f(x)}{g(x)} = L.$$
This rule is the major tool for computing limits when we have any kind of indeterminate expression. Also, if the application of the derivative results in another indeterminate form, then the repeated use of this rule still might produce a finite limit. Examples:
(1) $$\lim_{x \to 0}\frac{\sin x}{x} = \lim_{x \to 0}\frac{\cos x}{1} = 1.$$
(2) $$\lim_{x \to \infty}\frac{2x^2 - 5}{5x^2 + 3} = \lim_{x \to \infty}\frac{4x}{10x} = \frac{2}{5}.$$
(3) $$\lim_{x \to 0} x\cot 2x = \lim_{x \to 0}\frac{x\cos 2x}{\sin 2x} = \lim_{x \to 0}\frac{\cos 2x - 2x\sin 2x}{2\cos 2x} = \frac{1}{2}.$$
L'Hospital's rule is specifically formulated for two types of indeterminate forms but can also be used for the indeterminate forms $0 \cdot \infty$, $0^0$, $1^\infty$ and others. For the first of these cases, if we need to calculate the limit $\lim_{x \to c} f(x)g(x)$ and one of the functions approaches zero and the other one infinity, then we write the product $fg$ as one of the quotients $\frac{f}{1/g}$ or $\frac{g}{1/f}$, get an indeterminate form $0/0$ or $\infty/\infty$, and proceed as before. For the other types, first taking the logarithm of the expression results in an indeterminate form of one of the above types, and after calculating that limit we can find the original limit by exponentiating the answer. Example: To calculate the limit
$$\lim_{x \to \infty}\left(1 + \frac{a}{x}\right)^x$$
we take the logarithm of the expression under the limit (denote it by $y$) and get
$$\ln y = x\ln(1 + a/x) = \frac{\ln(1 + a/x)}{1/x}.$$
Then, by l'Hospital's rule,
$$\lim_{x \to \infty}\frac{\ln(1 + a/x)}{1/x} = \lim_{x \to \infty}\frac{-a/x^2}{(1 + a/x)(-1/x^2)} = \lim_{x \to \infty}\frac{a}{1 + a/x} = a.$$
Hence, the limit in question is equal to $e^a$.

like terms In an algebraic expression with one or more variables, the terms where the variable parts are identical (constant factors may be different). For example, in the expression $2x^2y^3 - 3xy^2 + x^3y + 4xy^2$ only the terms $-3xy^2$ and $4xy^2$ are similar and can be combined. The resulting expression is $2x^2y^3 + xy^2 + x^3y$. Also called similar terms.

limacon, or more precisely, limaçon The family of curves given in polar coordinates by one of the equations
$$r = a + b\sin\theta, \quad r = a + b\cos\theta.$$
In the special case $|a| = |b|$ we have a cardioid.

limit One of the most important notions of calculus. On the intuitive level, the limit of a function at a point, or the limit of a sequence when the index grows indefinitely, can be viewed as the eventual value of the function or sequence. In the definitions that follow we describe these notions precisely for functions and sequences.
(1) Let the function $f$ be defined on some interval $I$ with the possible exception of some point $c$ inside that interval. We say that $f(x)$ has limit $L$ as $x$ approaches $c$ and write
$$\lim_{x \to c} f(x) = L,$$
if for any real number $\varepsilon > 0$ there exists another number $\delta > 0$ such that $|f(x) - L| < \varepsilon$ as soon as $|x - c| < \delta$, $x \ne c$.
(2) There are a few special cases that should be treated separately. First, if the point $c$ is one of the endpoints of the interval $I$, then the limit should be replaced by one-sided limits. For definitions see left-hand limit, right-hand limit. Second, when the point $c$ is $\infty$ or $-\infty$, the definition requires modification. Here is the case $c = \infty$. We say that the function $f(x)$ has a finite limit $L$ as $x$ approaches infinity and write
$$\lim_{x \to \infty} f(x) = L,$$
if for any real number $\varepsilon > 0$ there exists a number $N > 0$ such that $|f(x) - L| < \varepsilon$ as soon as $x > N$. Finally, there is a case when the limit of a function

does not exist in a very specific way: the function becomes indefinitely large or indefinitely small. In these cases we say that the function has an infinite limit, understanding that the limit does not exist as a finite number. One of the possible cases is the following: We say that $\lim_{x \to c} f(x) = \infty$ if for any real number $M > 0$ there exists a real $\delta > 0$ such that $f(x) > M$ as soon as $|x - c| < \delta$, $x \ne c$.
(3) Limits of sequences can be defined exactly as the limits of functions at infinity, if we view a sequence $a_n$ as a function defined on positive integers only: $a_n = f(n)$. With this view, the exact definition of the limit of a sequence is this: We say that a numeric sequence $\{a_n\}$, $n = 1, 2, 3, \ldots$ has a limit $A$ and write
$$\lim_{n \to \infty} a_n = A,$$
if for any $\varepsilon > 0$ there exists a positive integer $N$ such that $|a_n - A| < \varepsilon$ as soon as $n \ge N$.

limit comparison test Assume that $\sum a_n$ and $\sum b_n$ are both series with positive terms. If
$$\lim_{n \to \infty}\frac{a_n}{b_n} = c$$
and $c > 0$ is a finite number, then either both series converge or both diverge.

limit laws Let $f(x)$ and $g(x)$ be functions such that the limits
$$\lim_{x \to c} f(x) \quad \text{and} \quad \lim_{x \to c} g(x)$$
exist. Then the limits of their sum, difference, product and quotient also exist and can be calculated by the rules
$$\lim_{x \to c}[f(x) \pm g(x)] = \lim_{x \to c} f(x) \pm \lim_{x \to c} g(x),$$
$$\lim_{x \to c} f(x) \cdot g(x) = \lim_{x \to c} f(x) \cdot \lim_{x \to c} g(x),$$
$$\lim_{x \to c}\frac{f(x)}{g(x)} = \frac{\lim_{x \to c} f(x)}{\lim_{x \to c} g(x)}.$$
In the last law we assume that $\lim_{x \to c} g(x) \ne 0$.

limit of integration In the notation of the definite integral $\int_a^b f(x)\,dx$ the interval endpoints $a$ and $b$ are called limits of integration or integration limits.

line One of the basic objects of geometry, along with points and planes. Lines are not defined but instead assumed to be understood as having no width and infinite length in both directions. In the Cartesian coordinate system lines can be given by equations. The most general equation of a line on the plane is written in the form $ax + by = c$, where $a, b, c$ are real numbers and $a$, $b$ are not both zero. Any line is uniquely determined by two points (according to one of the Euclidean postulates). If the two points have coordinates $(x_1, y_1)$ and $(x_2, y_2)$ with $x_1 \ne x_2$, then we define the slope of the line by the relation
$$m = \frac{y_2 - y_1}{x_2 - x_1}.$$
This quantity does not depend on the choice of the points on the line or on their order. Now, the equation of the line passing through those two points can be written using the point-slope form of the line
$$y - y_1 = m(x - x_1),$$
or, equivalently, we could use the coordinates $(x_2, y_2)$. In the case when the slope of the line and its intersection $b$ with the y-axis (y-intercept) are known, we can use the slope-intercept equation of the line
$$y = mx + b.$$
Each of these forms can be translated into the other, hence they are equivalent.
A line parallel to the x-axis (horizontal line) has zero slope and is given by the equation $y = b$. The slope of a vertical line (parallel to the y-axis) is not defined, and such a line can be written in the form $x = a$.
Using the notion of the slope we can see that two lines on the plane are parallel if and only if their slopes are equal. If two lines are given by the equations $y = m_1 x + b_1$ and $y = m_2 x + b_2$, then they are perpendicular if and only if $m_1 \cdot m_2 = -1$.
In three-dimensional space any plane is given by the equation $ax + by + cz = d$, and lines can be presented as intersections of two planes in the space. This approach can be extended also to lines in higher-dimensional spaces.
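The slope, point-slope and slope-intercept relations from the line entry can be checked numerically. The following is a minimal sketch, assuming points are given as pairs of integers; the helper names `line_through` and `are_perpendicular` are illustrative only and do not come from the dictionary.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through p1 and p2.

    Returns None for a vertical line x = a, whose slope is not defined.
    """
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        return None
    m = Fraction(y2 - y1, x2 - x1)   # slope m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1                  # from y - y1 = m(x - x1) evaluated at x = 0
    return m, b


def are_perpendicular(m1, m2):
    """Two non-vertical lines are perpendicular iff m1 * m2 = -1."""
    return m1 * m2 == -1


m, b = line_through((1, 2), (3, 8))
print(m, b)  # slope 3 and y-intercept -1, i.e. y = 3x - 1
```

Exact rational arithmetic (`Fraction`) is used so that the perpendicularity test $m_1 \cdot m_2 = -1$ is not affected by floating-point rounding.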

logical contrapositive If a logical statement is of the form "If A, then B" (symbolically, $A \Rightarrow B$), then the statement "not B $\Rightarrow$ not A", meaning "If B is not true, then A is not true", is called the logical contrapositive. Direct and contrapositive statements are equivalent in the sense that either they both are true or both are false. See also conditional statement.

logistic differential equation The first order non-linear differential equation of the form
$$y' = r\left(1 - \frac{y}{K}\right)y,$$
where $r$ and $K$ are constants. The solution of this equation with the initial condition $y(0) = y_0$ is given by
$$y = \frac{y_0 K}{y_0 + (K - y_0)e^{-rt}}$$
and is called the logistic curve. The mathematical models described by the logistic equation are called logistic models and the growth of the function is called logistic growth.

long division of numbers A procedure (algorithm) for dividing two numbers, usually integers or decimals. This method is used in cases when simple division is difficult. For example, to divide 72 by 9 we need only to remember the multiplication table and find the answer to be 8, because we know that $8 \cdot 9 = 72$. On the other hand, if we want to divide 462 by 3, the multiplication table is difficult to apply and we use long division. The procedure is illustrated below.

long division of polynomials When dividing a polynomial $P(x)$ by another polynomial $Q(x)$, where the degree of $Q$ is less than or equal to the degree of $P$, the result is another polynomial plus, as a rule, a remainder. This is expressed as
$$\frac{P(x)}{Q(x)} = D(x) + \frac{R(x)}{Q(x)},$$
where $D$ is the resulting polynomial, whose degree never exceeds the degree of $P$, and $R$ is the remainder polynomial, which always has degree less than the degree of $Q$. The practical procedure of finding the polynomials $D(x)$ and $R(x)$ is called long division (of polynomials) and is similar to the long division of numbers. An example below illustrates this process.
In this example $D(x) = x^3 + 2x - 3$ and the remainder is 0.
When dividing by a binomial of the form $x - c$, synthetic division is usually simpler and easier to perform. See the corresponding article for details.

lower bound For a function $f(x)$ a number $M$ is a lower bound if $f(x) \ge M$ for all values of $x$ in the domain of $f$.

lower triangular matrix A square matrix where all the entries above the main diagonal are zeros. Example:
$$\begin{pmatrix} 2 & 0 & 0 \\ 3 & -1 & 0 \\ -2 & 5 & 3 \end{pmatrix}.$$
The determinant of such a matrix is just the product of all diagonal elements. See also upper triangular matrix.

lower-upper decomposition of a matrix Or LU-decomposition. If a square matrix $A$ can be reduced to row-echelon form $U$ by Gaussian elimination without interchange of rows, then $A$ can be factored in the form $A = LU$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix.

lurking variable In statistics. Suppose there are two random variables related to each other in such a way that the change of one of them affects the other. In other words, one of the variables is the explanatory (or independent) variable and the other one is the response (or dependent) variable. In the majority of real-life situations, however, there are many other aspects (variables) that affect the change of the response variable. Sometimes they are disregarded intentionally for some reason (such as to simplify the situation) and sometimes by omission or mistake. Variables of this type are lurking variables.

M

MacLaurin series The special case of the Taylor series for the representation of functions as a power series expanded at the point $x = 0$. If the function $f(x)$ is infinitely differentiable (and also analytic) in some neighborhood of the point zero, then the power series expansion holds:
$$f(x) = \sum_{n=0}^{\infty}\frac{f^{(n)}(0)}{n!}x^n.$$
Most of the elementary functions have Maclaurin series expansions in specific intervals:
$$\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n, \quad (-1, 1)$$
$$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!}, \quad (-\infty, \infty)$$
$$\ln(1 + x) = \sum_{n=1}^{\infty}(-1)^{n+1}\frac{x^n}{n}, \quad (-1, 1)$$
$$\sin x = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n+1}}{(2n + 1)!}, \quad (-\infty, \infty)$$
$$\cos x = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n}}{(2n)!}, \quad (-\infty, \infty)$$
$$\tan^{-1} x = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n+1}}{2n + 1}, \quad (-1, 1).$$
See also binomial series.

magnitude of a vector See norm of a vector.

main diagonal of a matrix For a square matrix $A$, the imaginary line connecting the element in the first row and first column with the element in the $n$th row and $n$th column. Example: In the matrix

$90 + x$ grams and the amount of gold there will be $0.75(90 + x)$ grams. On the other hand, this new mixture still contains the same amount of gold as before, which is 90% of 90 grams, equal to 81 grams. Hence, the equation will be $0.75(90 + x) = 81$. The solution is $x = 18$ grams.

mode For a set of numeric values, the number that appears most frequently. A set may have multiple modes or have none at all. Examples: The set $S = \{1, 2, 3, 4, 2, 5, 2, 6\}$ has one mode, which is 2. The set $S = \{1, 2, 3, 4, 3, 5, 2, 6\}$ has two modes: 2 and 3, and the set $S = \{1, 2, 3, 4, 5, 6\}$ has no mode, because no number appears more than once. In this last case it could also be said that all the elements of the set are modes.

model, mathematical See mathematical model.

modulus of a complex number For a complex number $z = x + iy$, the non-negative number $|z| = \sqrt{x^2 + y^2}$. It shows the distance from the point $z$ on the complex plane to the origin. Also sometimes called the absolute value of the complex number.

monomial An algebraic expression that contains constants multiplied by natural power(s) of one or more variables. Examples: $2x^2$, $-3x^3y^2$, $xyz$.

monotonic function A function that is either increasing or decreasing. See corresponding entries.

monotonic sequence A sequence that is either increasing or decreasing. See corresponding entries.

monotonic sequence theorem If a monotonic sequence is bounded, then it has a finite limit.

multiple-angle formulas In trigonometry, formulas relating trigonometric functions of multiple angles with functions of a single angle. The most common are the double angle formulas, but triple or quadruple angle formulas also exist. Here is one of the possible versions of the triple angle formula for the cosine function:
$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta.$$
Other versions of this formula and formulas for other functions also exist but are very rarely used.

multiple integral For functions of several real variables. If a function $f$ depends on more than one variable, then the generalization of Riemann sums and Riemann integrals allows this function to be integrated over a domain in the $n$-dimensional Euclidean space $\mathbb{R}^n$. The most common cases are the double integral and the triple integral. See the corresponding entries for details of the definitions.

multiplication of complex numbers (1) Let $z = x + iy$ and $w = u + iv$ be two complex numbers written in the standard form. Then, to multiply these two numbers we multiply all terms with each other and combine the real and imaginary parts: $z \cdot w = (x + iy)(u + iv) = (xu - yv) + i(xv + yu)$, because $i^2 = -1$. Example:
$$(3 - 2i)(-1 + 4i) = -3 + 2i + 12i - 8i^2 = -3 + 8 + (2 + 12)i = 5 + 14i.$$
(2) If the complex numbers $z = r(\cos\theta + i\sin\theta)$ and $w = \rho(\cos\varphi + i\sin\varphi)$ are given in trigonometric form, then
$$z \cdot w = r\rho[\cos(\theta + \varphi) + i\sin(\theta + \varphi)].$$

multiplication of fractions The product of two fractions is defined to be another fraction with the numerator being the product of the two numerators and the denominator the product of the denominators. Formally, if $\frac{a}{b}$ and $\frac{c}{d}$ are the given fractions, then
$$\frac{a}{b} \cdot \frac{c}{d} = \frac{a \cdot c}{b \cdot d}.$$
In practice, instead of multiplying directly, we simplify first and then multiply, to avoid possible big numbers. Example:
$$\frac{12}{25} \cdot \frac{15}{32} = \frac{12 \cdot 15}{25 \cdot 32} = \frac{3 \cdot 3}{5 \cdot 8} = \frac{9}{40}.$$

multiplication of functions For two functions $f(x)$ and $g(x)$ their product (result of multiplication) is the function defined to be the product of their values: $(fg)(x) = f(x)g(x)$. This function is defined where $f$ and $g$ are both defined.

multiplication of matrices Let $A$ be an $m \times r$ matrix and $B$ an $r \times n$ matrix. Then the product $AB = A \cdot B$ is an $m \times n$ matrix whose entries are determined by the following rule: The entry in row $i$ and column $j$ is the sum of the products of all pairs from the $i$th row of $A$ and the $j$th column of $B$. In more detail: let $A = \{a_{ij}\}$ and $B = \{b_{ij}\}$. Then the product matrix $C = AB$ has entries $c_{ij} = \sum_{k=1}^{r} a_{ik}b_{kj}$. As seen from the definition, the matrix product is possible only when the number of columns in the first matrix is equal to the number of rows in the second one. In particular, matrix multiplication is not commutative, because in most cases $BA$ is not even defined when $AB$ is defined. For square matrices $A$, $B$ of the same size both products $AB$ and $BA$ are defined but are not equal in general. Examples: (1) Let
$$A = \begin{pmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{pmatrix}.$$
Then
$$AB = \begin{pmatrix} 12 & 27 & 30 & 13 \\ 8 & -4 & 26 & 12 \end{pmatrix}.$$
(2) If
$$A = \begin{pmatrix} 2 & -3 \\ 1 & 4 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 5 \\ 2 & -1 \end{pmatrix},$$
then
$$AB = \begin{pmatrix} -6 & 13 \\ 8 & 1 \end{pmatrix}, \quad BA = \begin{pmatrix} 5 & 20 \\ 3 & -10 \end{pmatrix}$$
and hence $AB \ne BA$.

multiplication of polynomials To multiply two polynomials we just multiply all terms of the first one by all terms of the second one and combine all like terms. Formally, if
$$p(x) = \sum_{i=0}^{n} a_i x^i, \quad q(x) = \sum_{j=0}^{m} b_j x^j,$$
then the product is a polynomial of degree $n + m$ given by
$$p(x) \cdot q(x) = \sum_{k=0}^{n+m} c_k x^k,$$
where $c_k = \sum_{i=0}^{k} a_i b_{k-i}$. Example:
$$(x^2 + x - 2)(x^3 - x + 1) = x^5 + x^4 - 3x^3 + 3x - 2.$$

multiplication of power series Assume the functions $f(x)$ and $g(x)$ are represented by power series. Then the product of these functions can be represented as a power series (with the interval of convergence being the intersection of the two intervals of convergence), and that series is the product of the power series of the functions $f$ and $g$. Example: To multiply the functions $e^x$ and $1/(1 + x^2)$ we write their power series representations and multiply as we would multiply two polynomials:
$$\left(1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots\right)\left(1 - x^2 + x^4 - x^6 + \cdots\right)$$
$$= 1 + x + \frac{x^2}{2} - x^2 - x^3 + \frac{x^3}{6} + x^4 - \frac{x^4}{2} + \frac{x^4}{24} + \cdots$$
$$= 1 + x - \frac{x^2}{2} - \frac{5x^3}{6} + \frac{13x^4}{24} + \cdots.$$

multiplication property of equality Let $A, B, C$ be any algebraic expressions and assume also that $C \ne 0$. Then, if $A = B$, then $A \cdot C = B \cdot C$.

multiplication property for inequalities Let $A, B, C$ be any algebraic expressions and assume also that $C \ne 0$. (1) If $A \le B$ and $C > 0$, then $A \cdot C \le B \cdot C$; (2) If $A \le B$ and $C < 0$, then $A \cdot C \ge B \cdot C$. Similar statements are true also in the case of the other types of inequalities: $<$, $>$, $\ge$. Example: $2 < 5$ but $2 \cdot (-3) = -6 > 5 \cdot (-3) = -15$.

multiplication rule for probabilities Let $A$ and $B$ be two independent events. Then the probability that both of these events will happen is given by the formula
$$P(A \cap B) = P(A) \cdot P(B).$$
This formula extends to any number of independent events. For the case when the events are not independent, the general multiplication rule for probabilities works. Another notation for the multiplication rule is $P(A \text{ and } B) = P(A) \cdot P(B)$.

multiplicative identity An element that, multiplied by any other element of the given set, does not change it. For the sets of real or complex numbers the real number 1 serves as the multiplicative identity. For the set of square matrices, the identity matrix is the multiplicative identity.

multiplicative inverse An element $b$ of some set is the multiplicative inverse of some other element $a$, if $a \cdot b = b \cdot a = 1$, where 1 denotes the multiplicative identity. For the set of real numbers the inverse of $a$ is denoted by $a^{-1}$ or $1/a$ and it exists for all numbers except $a = 0$. For matrices, the inverse is denoted by $A^{-1}$.

multiplicity of eigenvalue See eigenvalues of matrix.

multiplicity of a zero Let $p(x)$ be a polynomial of degree $n \ge 1$. Then, by the Fundamental Theorem of Algebra, it can be factored as a product of exactly $n$ linear binomials: $p(x) = a(x - c_1)(x - c_2)\cdots(x - c_n)$, where $c_1, c_2, \ldots, c_n$ are some complex numbers and $p(c_k) = 0$, $k = 1, 2, \ldots, n$. If any of these factors repeat, then the corresponding $c_j$ is called a multiple zero of the polynomial. The number of times this factor appears in the factorization is the multiplicity of this zero. Example: In the polynomial $p(x) = (x - 1)(x + 1)^2(x - 2)^3$ the zeros are $x = 1$, $x = -1$, $x = 2$ and they have multiplicities 1, 2 and 3 respectively.

multiplying factor See integrating factor.

mutually exclusive events Two events are mutually exclusive if they cannot happen at the same time. Complementary events are always mutually exclusive, but events that are not complementary can be mutually exclusive too. When rolling a die, the outcomes 2 and 4 are not complementary but they are mutually exclusive.

N

natural exponential function The exponential function with the base equal to $e$ and notation $y = e^x$. In calculus, natural exponential functions simplify many calculations. In particular, $(e^x)' = e^x$ and $\int e^x\,dx = e^x + C$.

natural growth law See exponential decay, exponential growth.

natural logarithmic function The logarithmic function with the base equal to $e$ and the special notation $\log_e x = \ln x$. In calculus, natural logarithms simplify many calculations. In particular, we have $(\ln x)' = 1/x$ and
$$\int \ln x\,dx = x\ln x - x + C.$$

natural number The numbers $1, 2, 3, 4, \ldots$ This sequence of numbers continues indefinitely and there is no greatest number in it. Also called counting numbers.

negative angle If the angle is placed in standard position and the terminal side is moved in the negative (clockwise) direction, then the angle is considered negative. See also angle.

negative definite function A function $F(x, y)$ defined on some domain $D$ containing the origin is negative definite if $F(0, 0) = 0$ and $F(x, y) < 0$ everywhere else in $D$. If $F(x, y) \le 0$, then the function is negative semidefinite. See also positive definite.

negative definite matrix A (square) matrix $A$ is negative definite if for any vector $x \ne 0$, $x^T A x < 0$. If the inequality is replaced by $\le$, then the matrix is negative semidefinite. Here $x^T$ indicates the transpose of the column vector $x$. See also positive definite matrix.

negative number A number that is less than zero. Equivalently, any number that could be placed on the number line to the left of the origin.
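The coefficient rule $c_k = \sum_{i} a_i b_{k-i}$ from the entry multiplication of polynomials can be sketched in a few lines. This is only an illustration under stated assumptions: polynomials are stored as coefficient lists with the constant term first, and the function name `poly_mul` is hypothetical, not part of the dictionary.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first).

    Each product a[i] * b[j] contributes to the coefficient of x**(i + j),
    which is the same as summing a[i] * b[k - i] over i for every k.
    """
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


p = [-2, 1, 1]       # x^2 + x - 2
q = [1, -1, 0, 1]    # x^3 - x + 1
print(poly_mul(p, q))  # [-2, 3, 0, -3, 1, 1], i.e. x^5 + x^4 - 3x^3 + 3x - 2
```

The same double loop also multiplies truncated power series, provided only the coefficients up to the common truncation order are kept.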

Newton's law of cooling According to this law, discovered experimentally, the rate of cooling (loss of temperature) of a body is proportional to the difference of the temperatures of the body and the surrounding environment (if this difference is relatively small). Mathematically, this law can be expressed by the differential equation
$$\frac{dT(t)}{dt} = -k(T - T_0),$$
where $T(t)$ is the temperature of the body and $T_0$ is the surrounding temperature. The solution is given by $T(t) = T_0 + (T(0) - T_0)e^{-kt}$.

Newton's laws of motion The laws of classical mechanics established by I. Newton. See also Newton's second law.

Newton's method A method for finding approximate solutions of some equation $f(x) = 0$, where $f$ is some sufficiently smooth function, not necessarily a polynomial. If the solutions of this equation cannot be found by some exact method, then in many cases Newton's method gives good approximations. The procedure is as follows. If we are trying to find the solution of the equation and we have two values $a$ and $b$, where $f(a)$ and $f(b)$ have opposite signs, we know (by the intermediate value theorem) that there is a solution to the equation $f(x) = 0$ in the interval $(a, b)$. Choose any of the endpoints to be our first approximation to the solution and denote it by $x_1$. If $f(x_1) \ne 0$, then we choose the second point of approximation using the intersection of the tangent line to $f$ at the point $x_1$ with the x-axis. This point, which we denote by $x_2$, will be given by the equation $x_2 = x_1 - f(x_1)/f'(x_1)$. Continuing the same way, we have a sequence given by the recurrence formula
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \quad n = 1, 2, 3, \ldots$$
This procedure does not always give a good approximation to the solution. The following statement indicates conditions under which the procedure converges to the solution and also gives an estimate of how exact the approximation is.
Theorem. Suppose $f'(x)$ and $f''(x)$ do not change their signs, $|f'(x)| \ge m > 0$ and $|f''(x)| \le M$ for all values of $x$ in some interval $(x_0, x_1)$. Assume that $f(x_0)$ and $f(x_1)$ have opposite signs, while $f(x_1)$ and $f''(x_1)$ have the same sign. Then there exists a point $c$ in $(x_0, x_1)$ with $f(c) = 0$. Moreover, if $|x_1 - x_0| \le m/M$, then the successive approximations by Newton's method $x_1, x_2, x_3, \ldots$ approach $c$ and $|x_{n+1} - c| \le |x_{n+1} - x_n|$.

Newton's second law The force on the object is equal to its mass multiplied by the acceleration of that object.

nilpotent matrix A square matrix $A$ such that $A^n = 0$ for some positive integer $n \ge 1$.

nondegenerate conic section The opposite of degenerate conic sections. All the regular conic sections (ellipses, parabolas, hyperbolas) are nondegenerate.

nondifferentiable function A function that is not differentiable. Among the reasons for a function to be nondifferentiable are: failure to be continuous at the point, having "wedges" at the point. There are continuous functions that are not differentiable at any point.

nonhomogeneous algebraic equations A system of linear equations where not all right sides are zero. See also homogeneous algebraic equations.

nonhomogeneous linear differential equation An equation of the form
$$y'' + p(x)y' + q(x)y = g(x),$$
or a similar one with higher order derivatives. In the case when $g(x) = 0$, the equation is homogeneous.

noninvertible matrix Also called singular matrix. A matrix such that the inverse does not exist. See also corresponding entry.

nonlinear differential equation Any differential equation where the unknown function or any of its derivatives appear in a nonlinear form. Examples could be
$$y'' + 2(y')^2 = 0, \quad y'' + y'y = 2.$$
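Returning to the Newton's method entry on this page, the recurrence $x_{n+1} = x_n - f(x_n)/f'(x_n)$ can be sketched as follows. This is a minimal illustration, not a robust solver: the function name `newton`, the tolerance and the iteration cap are assumptions made here, and the derivative is supplied by the caller.

```python
def newton(f, df, x1, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n) / f'(x_n), starting from x1.

    Stops when two successive approximations differ by less than tol
    (an assumed stopping rule) and raises if that does not happen.
    """
    x = x1
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0:
            return x                      # landed exactly on a root
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) vanished; the tangent is horizontal")
        x_next = x - fx / dfx             # tangent line meets the x-axis here
        if abs(x_next - x) < tol:
            return x_next
        x = x
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")


# f(x) = x^2 - 2 on (1, 2): f(1) < 0 < f(2), and f(2), f''(2) are both positive,
# so the right endpoint 2 is used as the first approximation.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 2.0)
print(root)  # an approximation of the square root of 2
```

The starting point is chosen at the endpoint where $f$ and $f''$ have the same sign, in the spirit of the theorem quoted in the entry; other starting points may or may not converge.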

The nonlinear equations could be categorized by the highest order of the derivative involved, by the existence or non-existence of the function on the right side, and many other criteria, just as in the case of linear equations. The theory of nonlinear equations is not as complete as the theory of linear equations and solving different equations requires different methods. These include the method of linearization, and solutions of autonomous equations and systems. For more details see the corresponding entries.

nonsingular matrix A (square) matrix that has an inverse. Opposite of the singular matrix.

nontrivial solution Many differential equations have two types of solutions: one is a meaningful function and the other one is a constant or even identically zero solution. Solutions of the last type are called trivial and solutions of the first type are nontrivial solutions. Example: The equation $y' = -y^2$ has the trivial solution $y = 0$ and nontrivial solutions $y = 1/(x + C)$ for any real constant $C$.

norm of a vector If $v$ is a vector in some vector space $V$ with coordinates $(v_1, v_2, \ldots, v_n)$, then its norm is defined to be the non-negative number

$$\|v\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}.$$

In the case when $V$ is an inner product vector space, $\|v\| = \langle v, v\rangle^{1/2}$. Also called the length or magnitude of the vector.

normal The same as normal vector.

normal density function The function that describes the normal distribution. Mathematically it is given by the function of the form

$$f(x) = c\,e^{-(x-a)^2},$$

where $c$ and $a$ are constants specifically chosen to satisfy the conditions of probability distributions. For a more precise definition see bell shaped curve.

normal derivative The directional derivative in the direction of the normal vector.

normal distribution The continuous probability distribution expressed by the normal density function. The probabilities for this distribution are calculated as areas under the curve. For example, if we want to know the probability of choosing a point between the values $a$ and $b$, then we calculate the area under the curve between $a$ and $b$. These areas are calculated with the use of tables, graphing calculators or computers. See also standard normal distribution.

normal matrix A matrix $A$ is normal if $AA^* = A^*A$. Here $A^*$ denotes the conjugate of the transpose of the matrix. In the case of real matrices, $A^*$ is just the transpose of $A$.

normal probability curve The curve that is the graph of the normal density function. See the definition and the picture above.

normal system For any system of linear equations written in the matrix form $Ax = b$ the associated equation

$$A^T A x = A^T b$$

is called the normal system. Here $A^T$ is the transpose of $A$. The normal system is important in finding the least square solutions of the system.

normal vector A vector that is orthogonal to a curve or a surface. In the case of a plane curve this means that the vector is perpendicular to the tangent line at a given point of the curve. In three dimensional space orthogonality means that the vector is perpendicular to the tangent plane to the surface at some point of that surface. A line in the direction of the normal vector is called normal line.

normalization To make a system of orthogonal vectors orthonormal, the procedure of normalization is necessary. For that we just divide each vector from the orthogonal system by its norm and the resulting vectors will have norm 1. See Gram-Schmidt process for the details. This term can also be used in other contexts where the notion of "normal" is appropriately defined.

nullity Let $T$ be a linear transformation in some vector space $V$ and let $K$ denote the kernel of that transformation. The dimension of the kernel is called nullity.

null hypothesis In statistics, a hypothesis (statement, assumption) that a certain population parameter has some value. This hypothesis is usually denoted by $H_0$ and has the form parameter = some value. For example, if the hypothesis is about the population mean, then $H_0 : \mu = \mu_0$. Sometimes the hypothesis is given also in the form $\mu \geq \mu_0$ or $\mu \leq \mu_0$. See also alternative hypothesis, hypothesis testing.

null set The set that contains no elements. Has the same meaning as empty set and the same notation $\emptyset$.

nullspace Let $A$ be an $m \times n$ matrix and $x$ be a vector in $R^n$. The solution space of the equation $Ax = 0$ is the nullspace of the matrix. The nullspace is the subspace of $R^n$ which is mapped to zero by the matrix (linear transformation) $A$. See also kernel of linear transformation.

number The most basic object in mathematics. Numbers are impossible to define; rather, they can be described. We distinguish two major sets of numbers: the real numbers and the complex numbers, with the understanding that the first set is a subset of the second one. Additionally, the real numbers consist of subsets of irrational and rational numbers. Rational numbers further have the subsets of integers, whole numbers, natural numbers and prime numbers. See the corresponding entries for definitions. The visual presentation of real numbers is done by expressing them as points on the real line, and the complex numbers can be expressed as points on the complex plane.

number theory The branch of mathematics that deals with the properties of whole numbers. Despite the fact that this is one of the oldest branches of mathematics, there are several unresolved problems concerning properties of whole numbers, among them famous open problems about the properties of prime numbers.

numerator For a fraction or a rational function, the top part in the fractional expression. In the number $\frac{7}{13}$, 7 is the numerator and 13 is called the denominator. In the function

$$\frac{3x^4 - 2x^2 + x - 5}{x^5 + 2x^2 - 1}$$

the polynomial $3x^4 - 2x^2 + x - 5$ is the numerator.

numerical coefficient Same as coefficient.

numerical integration Calculation of definite integrals approximately, when exact calculations are impossible or difficult. The same as approximate integration.

numerical methods General term for indicating various methods of solving equations when analytic (sometimes algebraic) solutions are either impossible or very complicated. For solutions of algebraic equations see, e.g., Newton’s method. For numerical solutions of differential equations see Euler method.

O
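The recurrence from the Newton’s method entry, $x_{n+1} = x_n - f(x_n)/f'(x_n)$, can be tried out with a few lines of Python. This is only a minimal sketch: the function name `newton`, the stopping tolerance `tol` and the iteration cap `max_iter` are illustrative assumptions and are not part of the dictionary entry.

```python
def newton(f, df, x1, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n), starting from x1.

    tol and max_iter are illustrative stopping rules (assumptions).
    """
    x = x1
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0:
            # The tangent line is horizontal and never meets the x-axis.
            raise ZeroDivisionError("f'(x) is zero at the current point")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")


# f(1) < 0 < f(2) for f(x) = x^2 - 2, so a root lies in (1, 2).
# The endpoint 2 is taken as the first approximation x1.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x1=2.0)
print(root)
```

The example equation $x^2 - 2 = 0$ is chosen only because its positive solution, $\sqrt{2}$, is easy to check by squaring the result.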

All of the above cases happen when in the general equation the term $Bxy$ is missing. In the case $B \neq 0$ the result is still a parabola, which is the result of rotation of one of the previous simpler cases.
The parabola could also be given by its polar equation:

$$r = \frac{d}{1 \pm \cos\theta} \quad \text{or} \quad r = \frac{d}{1 \pm \sin\theta},$$

where $d > 0$.

parabolic cylinder The three dimensional surface given by (for example) the equation $a^2x^2 = b^2z$.

paraboloid See elliptic paraboloid.

parallelepiped A three dimensional solid, similar to a rectangular box. The difference is that each face of a rectangular solid is a rectangle, while each face of a parallelepiped is a parallelogram.

parallel lines Two lines on the plane that do not intersect. If two lines are parallel then they have the same slope. Example: The lines $y = -2x + 1$ and $y = -2x - 27$ are parallel.

parallel planes Two planes in the space are parallel if they do not intersect. If $a_1x + b_1y + c_1z = d_1$ and $a_2x + b_2y + c_2z = d_2$ are equations of two planes, then they are parallel if and only if $a_1/a_2 = b_1/b_2 = c_1/c_2$ but are not equal to $d_1/d_2$.

See addition and subtraction of vectors.

parameter In mathematics parameters can be described as something in between constants and variables. This means that depending on the situation the parameter could either be fixed at some numeric value or start changing its values. In the expression

$$\int_0^\infty e^{-\lambda x}\,dx$$

$\lambda$ is a parameter and $x$ is the variable of integration. For another case where the term parameter is used in a slightly different meaning see parametric equations.

parametric curve See parametric equation.

parametric equation Many curves that for different reasons cannot be expressed as a graph of a function can be expressed by parametric equations. Suppose $C$ is a plane curve and the coordinates $(x, y)$ of the curve are given by the functions $x = f(t)$, $y = g(t)$, where $f$ and $g$ are some functions of the variable $t$ on some interval $I$. The equations for $x$ and $y$ are called parametric equations of the curve $C$ and $t$ plays the role of parameter in this case.
Examples: (1) The circle given by $x^2 + y^2 = 1$ could be written in parametric form as $x = \cos\theta$, $y = \sin\theta$, $0 \leq \theta \leq 2\pi$.

parallel vectors Two vectors u and v are parallel

In the particular case when $f(x) = x$ we have $\lim_{x\to a} x^n = a^n$.

power series A series formed by the combinations of power functions:

$$c_0 + c_1x + c_2x^2 + c_3x^3 + \cdots + c_nx^n + \cdots$$

The numbers $c_0, c_1, c_2, \ldots$ are called the coefficients of the power series. Each power series converges on some interval of the form $-R < x < R$, where $R$ is called the radius of convergence. Depending on the coefficients, the radius $R$ may be zero (series converges only at $x = 0$), infinite (series converges for any real $x$), or finite. In this last case the series may or may not converge at the endpoints $x = -R$ or $x = R$ (see also radius of convergence). The power series could be differentiated or integrated term-by-term just as any polynomial. In the more general case, power series centered at an arbitrary point $x = a$ are also considered:

$$c_0 + c_1(x - a) + c_2(x - a)^2 + c_3(x - a)^3 + \cdots$$

This series has the exact same properties as the series centered at zero.

prime number A natural number greater than 1 that cannot be divided evenly by any other natural numbers except 1 and that number itself. All the other natural numbers greater than 1 are called composite. The number 1 is considered neither prime nor composite. The first few prime numbers are: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, ... By the Fundamental theorem of arithmetic any natural number greater than 1 is either prime or the product of prime numbers.

principal nth root Let $a$ be a real number that has at least one nth root. The principal nth root of $a$ is the one that has the same sign as $a$ and is denoted by $\sqrt[n]{a}$. For $a = 4$ the number $\sqrt{4} = 2$ is the principal square root because it is positive. For $a = -27$ the number $\sqrt[3]{-27} = -3$ is the principal cubic root because it is negative.

principle of mathematical induction See mathematical induction.

principle of superposition for linear homogeneous differential equations states that if $y_1$ and $y_2$ are solutions of that equation then so is their sum function $y_1 + y_2$.

probability (1) One of the main branches of mathematics dealing with events whose outcome is governed by chance.
(2) Probability of an event is the "likelihood" that that event will happen. It is measured by a numeric value that varies between 0 and 1. The event that cannot happen (an impossible event) has probability zero and an event that is certain to happen has probability one. For discrete variables the probability of some event $A$ can be calculated by the formula

$$P(A) = \frac{\text{number of times } A \text{ happened}}{\text{number of total experiments}},$$

where by "experiment" we understand observations we produced to see if $A$ happens. The simplest example is tossing a coin multiple times and observing how many times we get the "tail" (the event $A$).
If the number of outcomes is some fixed number, such as in the case of tossing a coin (exactly two possible outcomes: "head" or "tail"), or rolling a die (six possible outcomes: the values from 1 to 6), then the probability of an event could be calculated by the formula

$$P(A) = \frac{\text{number of ways } A \text{ happens}}{\text{number of total outcomes}}.$$

For operations with probability values see multiplication rule for probabilities, addition rule for probabilities. For probability distributions see the entry distribution and the related entries mentioned in that article. See also conditional probability.

probability model A mathematical model where the numeric values used to construct that model are probabilities of some events. For examples see binomial distribution, Poisson distribution, normal distribution.

probability density function A function that is determined by the probabilities of some event(s). The defining properties of these functions are:
(1) The function is always non-negative;
(2) The area under that function is exactly one.
See also entries related to specific probability models.

product The result of multiplication of two (or more) objects. The objects may be real or complex numbers, functions, matrices, vectors, etc. For specific definitions in all of these cases see the entries multiplication of complex numbers, of fractions, of functions, of matrices and also cross product, dot product, scalar product.

product formulas Also called product-to-sum formulas. In trigonometry, the formulas

$$\sin x \sin y = \tfrac{1}{2}[\cos(x - y) - \cos(x + y)]$$
$$\cos x \cos y = \tfrac{1}{2}[\cos(x + y) + \cos(x - y)]$$
$$\sin x \cos y = \tfrac{1}{2}[\sin(x + y) + \sin(x - y)]$$
$$\cos x \sin y = \tfrac{1}{2}[\sin(x + y) - \sin(x - y)].$$

Similar formulas are also possible for other functions but hardly ever used. See also trigonometric identities.

product rule For differentiation. If the functions $f$ and $g$ are differentiable, then

$$[f(x)g(x)]' = f(x)g'(x) + g(x)f'(x).$$

Examples: (1) $(x^2 \sin x)' = 2x \sin x + x^2 \cos x$,
(2) $(\tan x \ln x)' = \sec^2 x \ln x + \frac{1}{x}\tan x$.

projection The mapping of a point onto some line or plane. The most common projections are the orthogonal projections. See the corresponding entry for the details and examples.

proper value See eigenvalue.

proportion A statement that two or more ratios are equal. Ratios may contain just numbers or variables or even functions. Examples of proportions are $\frac{3}{5} = \frac{6}{10}$, $\frac{x}{3} = \frac{7}{12}$ or the Law of Sines:

$$\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c}.$$

To solve a proportion means to find the unknown quantity. For example, to solve the proportion $\frac{x}{3} = \frac{7}{12}$ we cross-multiply and get the equation $12x = 7 \cdot 3$ and $x = 7/4$.

P-value Also sometimes called probability value. One of the methods of hypothesis testing. The P-value is the probability of observing a value like the given value (or even less likely) assuming that the null hypothesis is true. The smaller the P-value, the more evidence there is to reject the null hypothesis. Big P-values indicate that there is not enough evidence to reject the null hypothesis (so we fail to reject it). Visually, the P-value is the area under the normal probability curve from the observed value to infinity (in the case of a right-tailed test), the area from the observed value to minus infinity (left-tailed test), or the sum of areas in both directions (two-tailed test).

Pythagorean identities The trigonometric identities

$$\sin^2\theta + \cos^2\theta = 1$$
$$\tan^2\theta + 1 = \sec^2\theta$$
$$1 + \cot^2\theta = \csc^2\theta.$$

See also trigonometric identities.

Pythagorean theorem One of the most famous theorems of geometry that states that the square of the length of the hypotenuse of any right triangle is equal to the sum of the squares of the lengths of the two legs. Symbolically, if $a$, $b$ represent the lengths of the legs and $c$ is the length of the hypotenuse, then $a^2 + b^2 = c^2$.
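The description of the P-value as an area under the normal probability curve can be turned into a short computation. The following is a sketch under stated assumptions: the observed value is taken to be already standardized (a z-score under the standard normal curve), and the function name `p_value` and its `tail` argument are hypothetical names introduced only for this illustration.

```python
from statistics import NormalDist


def p_value(z, tail="right"):
    """Area under the standard normal curve beyond the observed z-score.

    Assumes z is already standardized; `tail` selects the kind of test.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        # area from the observed value to infinity
        return 1.0 - std_normal.cdf(z)
    if tail == "left":
        # area from minus infinity to the observed value
        return std_normal.cdf(z)
    if tail == "two":
        # sum of the areas in both directions
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left' or 'two'")


print(p_value(1.96, "two"))    # close to 0.05
print(p_value(0.0, "right"))   # half of the total area
```

A result below the chosen significance level would count as evidence against the null hypothesis; a larger result would not.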

quartic equation Algebraic equation of the fourth degree that could be written in the form

$$\alpha y^4 + \beta y^3 + \gamma y^2 + \delta y + \varepsilon = 0, \quad \alpha \neq 0.$$

There is a formula for solving these types of equations which is extremely complicated and difficult to use in practice, unlike the quadratic formula (see also cubic equation). Here is a quick outline of the solution process. First, we divide the equation by $\alpha \neq 0$ and get a monic equation

where $\sqrt[n]{r}$ means the positive nth root of the positive number $r$.
Example: To find the four fourth roots of the number $z = -2 + 2\sqrt{3}\,i$ we write $z = 4(\cos 120^\circ + i \sin 120^\circ)$ and $w_k$, $k = 0, 1, 2, 3$ are given by

$$\sqrt[4]{4}\left(\cos\frac{120^\circ + 360^\circ k}{4} + i \sin\frac{120^\circ + 360^\circ k}{4}\right),$$

from where we find $w_0 = \frac{\sqrt{6}}{2} + i\frac{\sqrt{2}}{2}$, $w_1 = -\frac{\sqrt{2}}{2} + i\frac{\sqrt{6}}{2}$, $w_2 = -\frac{\sqrt{6}}{2} - i\frac{\sqrt{2}}{2}$, $w_3 = \frac{\sqrt{2}}{2} - i\frac{\sqrt{6}}{2}$.

roots of unity The nth roots of the number 1 considered as a complex number. We can write the number 1 in the trigonometric form as $1 = \cos 0 + i \sin 0$ and apply the formula for the roots of a complex number and get

$$w_k = \cos\frac{2\pi k}{n} + i \sin\frac{2\pi k}{n}, \quad k = 0, 1, \ldots, n - 1.$$

The first root is always 1 ($w_0 = 1$). Geometrically, the roots of unity are points located on the unit circle at equal distance from each other.

root test A method of determining the absolute convergence or divergence of a (numeric or functional) series.
(a) If $\lim_{n\to\infty} \sqrt[n]{|a_n|} = a < 1$, then the series $\sum_{n=1}^\infty a_n$ is absolutely convergent.
(b) If $\lim_{n\to\infty} \sqrt[n]{|a_n|} = a > 1$, then the series is divergent.
(c) In the case $a = 1$ the test cannot give a definite answer.

root law for limits If $n$ is a positive integer, then

$$\lim_{x\to a} \sqrt[n]{f(x)} = \sqrt[n]{\lim_{x\to a} f(x)}.$$

In the case when $n$ is even, we need additionally to assume that $\lim_{x\to a} f(x) > 0$. In the special case $f(x) = x$ we have $\lim_{x\to a} \sqrt[n]{x} = \sqrt[n]{a}$.

rotation operator Also called rotation transformation. The operator in the Euclidean space that rotates each vector of that space with respect to some fixed point or some fixed axis by some specific angle. The rotation operator in the plane corresponding to rotation about the origin by the angle $\theta$ is given by the matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

There are more rotation operators in three dimensional Euclidean space, corresponding to rotation about the origin or about some axis. For example, the counterclockwise rotation about the positive x-axis through an angle $\theta$ is given by the matrix

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}.$$

Similar matrices work for rotations about the y-axis or z-axis.

round-off error Also called rounding error. The difference between the actual (exact) value and the numeric approximation value. Round-off errors are inevitable when representing irrational (and some rational) numbers because their numeric value contains infinitely many digits (too many digits, respectively). Example: The exact decimal value of the number $\sqrt{2}$ contains infinitely many digits (and, as a result, will never be known). Depending on the situation we may use the numeric approximations 1.41, 1.4142, 1.414213562, etc. The round-off errors for these values are approximately 0.0042, 0.00001356, 0.000000000373.

row-echelon form A matrix is in the row-echelon form if the following conditions are satisfied:
(1) If a row does not consist of zeros only, then the first non-zero element is 1 (leading 1);
(2) Rows consisting of zeros only are grouped at the bottom;
(3) In two non-zero rows, the leading 1 in the bottom row is to the right of the leading 1 of the row above.
The matrices

$$\begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 3 & 0 & -2 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

are in the row-echelon form.

row equivalence If a matrix $A$ can be obtained from another matrix $B$ by a finite sequence of elementary row operations, then matrix $B$ can be obtained from matrix $A$ by the same number of elementary row operations. Because every elementary row operation is invertible, the second sequence of operations is just the inverses of the first sequence (in reverse order). Matrices of this kind are called row equivalent.

row reduction The process of applying the Gauss or Gauss-Jordan elimination procedure to a given matrix. This process is useful, in particular, in evaluating the determinants of matrices.

row space For an $m \times n$ matrix $A$ the vectors formed by the $m$ rows of that matrix are called row vectors and the subspace of the Euclidean space $R^n$ spanned by those vectors is the row space of the matrix. The row spaces of row equivalent matrices are the same.

ruled surface A surface that could be generated by the motion of a straight line. Examples of ruled surfaces are cylinders, cones, hyperbolic paraboloids.

S

saddle point Suppose the point $(a, b)$ is a critical point for a function $f(x, y)$. If it is neither a maximum nor minimum value for the function, then the corresponding point on the graph of $f$ (a point on the surface determined by the function) is called a saddle point.

sample A part of the population gathered with the intention of getting information about the population. The process of gathering samples is called sampling. In order to have statistical (and scientific) value all samples should be random. See the entry simple random sample. In addition to the simple random sample we have the following special methods of sampling:
1) Probability samples give each element of the population equal chance to be chosen;
2) In a stratified random sample we first divide the population into groups of similar objects (strata) and then use the simple random sample in each of these groups. The results should be put together;
3) In systematic samples we choose every nth element of the population.
There are some other methods of sampling, all of which assure the unbiasedness of the sample.

sample mean The mean (average) of all sample values. If $x_1, x_2, \ldots, x_n$ are the sample values then the mean is

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^n x_i}{n}.$$

sample space When considering a random phenomenon, the collection of all possible outcomes is the sample space. For example, when rolling a die, the sample space is the collection of the outcomes 1, 2, 3, 4, 5, and 6.

sample standard deviation The square root of the sample variance: $s = \sqrt{s^2}$. See also standard deviation.
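The entries for sample mean, sample variance and sample standard deviation fit together in one short computation. The sketch below uses the definition of $s^2$ with the divisor $n - 1$ and the shorthand formula $\frac{n\sum x_i^2 - (\sum x_i)^2}{n(n-1)}$; the data list and the function names are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math
import statistics


def sample_variance(values):
    """Definition: sum of squared deviations from the mean over n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shorthand(values):
    """Shorthand formula: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))."""
    n = len(values)
    return (n * sum(x * x for x in values) - sum(values) ** 2) / (n * (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample values
s2 = sample_variance(data)        # sample variance
s = math.sqrt(s2)                 # sample standard deviation

# Both formulas, and the standard library, should agree up to rounding.
print(s2, sample_variance_shorthand(data), statistics.variance(data), s)
```

For this sample the mean is 5 and the squared deviations add up to 32, so both formulas give $32/7$.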

sample variance If $x_1, x_2, \ldots, x_n$ are all the values of the sample and $\bar{x}$ is the sample mean, then the variance is given by the formula

$$s^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n - 1}.$$

There is another formula, called the shorthand formula, that is sometimes more convenient to use:

$$s^2 = \frac{n\sum_{i=1}^n x_i^2 - \left(\sum_{i=1}^n x_i\right)^2}{n(n - 1)}.$$

See also variance.

sampling distribution Suppose we have a distribution from a large population. Choose a simple random sample of size $n$ from that population. After that we find the mean of that sample and denote it by $\bar{x}_1$. Then we do the same process the second time and denote the mean of the second sample of size $n$ by $\bar{x}_2$. If we repeat this process many times we will get a sequence, or set of the values $\{\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots\}$. The distribution formed from these values is the sampling distribution. According to the Central limit theorem, the sampling distribution is always normal (more precisely, approaches normal as $n$ increases). The same theorem also establishes the relationship between the means and standard deviations of these distributions.

sawtooth wave The function defined as $f(t) = t$, $0 \leq t < 1$ and then repeated periodically with the period 1: $f(t + 1) = f(t)$.

scalar The same as a constant. A quantity that does not change. Real and complex numbers are scalars.

scalar equation of a plane The same as equation of the plane and in three dimensional space is given by $ax + by + cz + d = 0$.

scalar field Or the field of scalars. In choosing the scalars (constants) to use in different situations, we usually choose among the fields of rational numbers, real numbers or complex numbers. When we need a real vector space, then the choice is the field of real numbers. Also we consider polynomials using coefficients from the field of rational numbers.

scalar product Has the same meaning as the inner product. Let $x$, $y$ be two vectors. Then (1) $\langle x, y\rangle = \langle y, x\rangle$; (2) For any real $a$, $\langle ax, y\rangle = a\langle x, y\rangle$; (3) $\langle x, x\rangle \geq 0$ and $\langle x, x\rangle = 0$ if and only if $x = 0$.

scalar multiplication The operation of multiplying vectors or matrices by a scalar. In the case of matrices, to multiply by a scalar means to multiply each entry of the matrix by the same constant. Hence,

$$2 \cdot \begin{pmatrix} 1 & 0 & -2 \\ -1 & 3 & 4 \\ 5 & -2 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 0 & -4 \\ -2 & 6 & 8 \\ 10 & -4 & 0 \end{pmatrix}.$$

In the case when vectors are given in the column or row form, scalar multiplication works exactly as in the case of matrices. If the vectors are given in an abstract vector space, then scalar multiplication is defined in the form of axioms (part of the axioms of a vector space):
(1) If $k$ is a scalar and $u$ is in the space $V$, then $ku$ also is in $V$;
(2) $k(u + v) = ku + kv$;
(3) $(k + l)u = ku + lu$;
(4) $k(lu) = (kl)u$.

scatterplot Also called scatter diagram or scatter graph. If the statistical data comes in pairs (paired data), then they could be viewed as ordered pairs and presented on the Cartesian plane as a number of points. The result is called a scatterplot. The figure shows the scatterplot of paired data {(2, 24), (3, 37), (5, 45), (8, 31), (9, 78), (11, 61), (12, 82)}.

scientific notation Notation used to write very large or very small numbers in a more compact form.

In this form the number is written as a number between 1 and 10 multiplied by some power of 10. Examples: (1) The number 4250000000 can be written as $4.25 \times 10^9$. (2) The number 0.0000716 could be written as $7.16 \times 10^{-5}$.

secant function One of the six trigonometric functions. Geometrically, the secant of an angle in a right triangle is the ratio of the hypotenuse of the triangle to the adjacent side. Also could be defined as the reciprocal of the cosine function. The function $\sec x$ could be extended to all real values exactly as the $\cos x$ function. The domain of $\sec x$ is all real values, except $x = \pi/2 + \pi n$, $n$ any integer, and the range is $(-\infty, -1] \cup [1, \infty)$. $\sec x$ is $2\pi$-periodic.
One of the Pythagorean identities relates the secant function to the tangent function: $1 + \tan^2\theta = \sec^2\theta$.
The following are the calculus related formulas:

$$\frac{d}{dx}(\sec x) = \sec x \tan x,$$
$$\int \sec x\,dx = \ln|\sec x + \tan x| + C.$$

secant line A line that connects two points on some curve. The segment of the secant line between these two points is the chord.

second derivative test A test for finding local maximums or minimums that can replace the first derivative test in cases when the second derivative at the critical points exists.
Assume $f''(x)$ is continuous near the point $c$.
(a) If $f'(c) = 0$ and $f''(c) > 0$, then $f$ has a local minimum at the point $c$;
(b) If $f'(c) = 0$ and $f''(c) < 0$, then $f$ has a local maximum at the point $c$.

sector of a circle A part of a circle that is bounded by two radii and the arc of the circumference connecting the endpoints of the radii.

semi-circle One half of a circle. Also could be viewed as a sector formed by two radii making a $180^\circ$ angle.

separable differential equations In the most general setting, the equation is separable if it could be written in the form

$$M(x) + N(y)\frac{dy}{dx} = 0.$$

Another way of describing separable equations is to write them in the form

$$\frac{dy}{dx} = \frac{f(x)}{g(y)}.$$

To solve this equation we write $g(y)\,dy = f(x)\,dx$ and integrate to find the general solution

$$\int g(y)\,dy = \int f(x)\,dx.$$

Example: Solve the equation $dy/dx = x^3/y^2$. Solution: $y^2\,dy = x^3\,dx$,

$$\int y^2\,dy = \int x^3\,dx,$$

$y^3/3 = x^4/4 + C$, and $y = \sqrt[3]{3x^4/4 + 3C} = \sqrt[3]{3x^4/4 + K}$.

sequence A sequence is a list of numbers, functions, vectors, etc. written in some order. Terms (or members) of a sequence are numbered using the set of natural or whole numbers or all integers.
(1) Numeric sequences. A sequence could be viewed as a function of a natural (whole, integer) variable. Hence, we write $f(1) = a_1$, $f(2) = a_2$, .... A sequence is finite if it has only a finite number of terms. Otherwise it is called infinite. A sequence $\{a_n\}$ is bounded if there is a number $M > 0$ such that $|a_n| \leq M$ for all $n$. $\{a_n\}$ is called increasing if any next term is greater than the previous one: $a_{n+1} > a_n$. Similarly, it is called decreasing if $a_{n+1} < a_n$. The general name for increasing and decreasing sequences is monotonic.

Limits for sequences are defined like limits for functions. A sequence $\{a_n\}$ is said to have a limit $L$ as $n \to \infty$ if for any $\varepsilon > 0$ there is an $N > 0$ such that $|a_n - L| < \varepsilon$ as soon as $n > N$. We write $\lim_{n\to\infty} a_n = L$. A sequence that has a limit as $n \to \infty$ is called convergent. Otherwise it is divergent. In the case when the sequence is divergent in the specific way that it increases without bound, we say that it goes to infinity. Formally, $\lim_{n\to\infty} a_n = \infty$ means that for any $M > 0$ there is an integer $N$ such that $a_n > M$ as soon as $n > N$. The case $\lim_{n\to\infty} a_n = -\infty$ is defined similarly.
(2) Sequences of functions could be treated similarly to numeric sequences. Again, we have the notions of bounded, increasing or decreasing sequences. The most common functional sequences are sequences of monomials $\{x^n\}$ and trigonometric functions $\{\sin nx, \cos nx\}$.

series The sum of the terms of a finite or infinite, numeric or functional sequence. Accordingly, we have numeric or functional series. To write the series we use the summation (sigma) notation.
1) Numeric series. The series $\sum_{n=0}^\infty a_n$ is called convergent if the sequence of its partial sums $S_n = \sum_{k=0}^n a_k$ is a convergent sequence. The precise definition is: We say that the series $\sum_{n=0}^\infty a_n$ is convergent and converges to some finite number $S$, if for any given $\varepsilon > 0$ there exists a natural number $N$ such that $|S_n - S| < \varepsilon$ as soon as $n \geq N$. In order to find out if a series is convergent or not, various convergence tests may be used. For positive series see integral test, comparison test, limit comparison test. For alternating series see the alternating series test. Finally, for absolute convergence of the series see the ratio test or the root test.
2) Functional series. If the terms of the series consist of functions, then it is called a functional series. A functional series may be convergent for some values of the variable and divergent for other values. The definition of convergence of a functional series at any point is exactly the same as the definition of the convergence of a numeric series. The most common functional series are power series and trigonometric series. Besides them, other functional series are also considered (such as Bessel series or series of Laguerre polynomials). See also binomial series, Taylor series, MacLaurin series.

series solution For linear differential equations. Assume that we are solving the equation

$$y^{(n)} + a_1(x)y^{(n-1)} + \cdots + a_n(x)y = 0,$$

where the coefficients $a_1(x), a_2(x), \ldots, a_n(x)$ are analytic functions, hence can be represented by their Taylor-MacLaurin series. The idea behind the series solutions of this equation is that it is natural to look for analytic solutions, i.e. solutions represented by power series. This idea extends also to some cases when the coefficients are not analytic but have some "regularity" properties. The method works best when the coefficients $a_i(x)$ are polynomials, or even better, if they are monomials. We will demonstrate this method for the case of the second order equation

$$P(x)y'' + Q(x)y' + R(x)y = 0. \qquad (1)$$

A point $x = x_0$ is called an ordinary point for this equation if $P(x_0) \neq 0$, otherwise it is called a singular point.
(A) The case of the ordinary point. Dividing the equation by $P(x)$, we will have a simpler form near the ordinary point $x_0$ (because $P(x_0) \neq 0$)

$$y'' + p(x)y' + q(x)y = 0 \qquad (2)$$

and will look for a solution

$$y(x) = \sum_{n=0}^\infty a_n(x - x_0)^n,$$

where the coefficients $a_n$ are still to be determined. If we calculate the first and second derivatives of this unknown function and also write power series representations of the functions $p(x)$ and $q(x)$ around the point $x_0$, then we will get

$$\sum_{n=2}^\infty n(n-1)a_n(x - x_0)^{n-2} + \left(\sum_{j=0}^\infty p_j(x - x_0)^j\right)\left(\sum_{n=1}^\infty n a_n(x - x_0)^{n-1}\right) + \left(\sum_{i=0}^\infty q_i(x - x_0)^i\right)\left(\sum_{n=0}^\infty a_n(x - x_0)^n\right) = 0.$$

Finally, if we perform all multiplications and additions and equate the resulting coefficients of all powers to zero (because on the right side we have the identically zero function), then we will find expressions for the coefficients $a_n$ with the help of the known coefficients $p_j$, $q_i$ of the functions $p(x)$ and $q(x)$.
Example: Find the series solution of Airy’s equation

$$y'' - xy = 0, \quad -\infty < x < \infty.$$

Proceeding as described before and using the fact that the functions $p(x) = 0$ and $q(x) = -x$ are very simple, we will have the series solution presentation centered at $x = 0$ (here the summation index is changed to a more convenient form):

$$2a_2 + \sum_{n=1}^\infty (n + 2)(n + 1)a_{n+2}x^n = \sum_{n=1}^\infty a_{n-1}x^n.$$

Now, clearly $a_2 = 0$, because the right side does not have a constant term, and from the recurrence relation

$$(n + 2)(n + 1)a_{n+2} = a_{n-1}$$

that follows from equating the coefficients of equal powers, we have also $a_5 = a_8 = \cdots = 0$, or $a_{3n-1} = 0$, $n = 1, 2, 3, \ldots$. Similarly, we find that

$$a_{3n} = \frac{a_0}{2 \cdot 3 \cdot 5 \cdot 6 \cdots (3n - 1)(3n)}$$

and

$$a_{3n+1} = \frac{a_1}{3 \cdot 4 \cdot 6 \cdot 7 \cdots (3n)(3n + 1)}$$

and the solution of the equation is given by the formula

$$y = a_0\left[1 + \frac{x^3}{2 \cdot 3} + \frac{x^6}{2 \cdot 3 \cdot 5 \cdot 6} + \cdots\right] + a_1\left[x + \frac{x^4}{3 \cdot 4} + \frac{x^7}{3 \cdot 4 \cdot 6 \cdot 7} + \cdots\right].$$

This solution depends on the constants $a_0$ and $a_1$. If we need to find a unique solution then it is necessary to have two initial or boundary values given.
(B) The case of the singular point. If the function $P(x)$ from the equation (1) above becomes zero at some point, then the described method does not work at that point. There is however an exceptional case where a similar method gives a satisfactory solution. If $P(x_0) = 0$ and

$$\lim_{x\to x_0} (x - x_0)\frac{Q(x)}{P(x)}, \qquad \lim_{x\to x_0} (x - x_0)^2\frac{R(x)}{P(x)}$$

are both finite, then the point $x_0$ is called a regular singular point. Now assuming for simplicity that the point $x = 0$ is a regular singular point for the equation (1), we divide that equation by $P(x)$ and then, multiplying everything by $x^2$, get the simplified equation

$$x^2y'' + x[xp(x)]y' + [x^2q(x)]y = 0. \qquad (3)$$

Here we are seeking solutions in the form

$$y = x^r\sum_{n=0}^\infty a_nx^n,$$

where both the coefficients $a_n$ and the index $r$ are to be determined. Proceeding as in the case of ordinary points and equating the coefficients of equal powers, we will have a recurrence relationship for the coefficients and the index:

$$F(r + n)a_n + \sum_{k=0}^{n-1} a_k[(r + k)p_{n-k} + q_{n-k}] = 0,$$

where $F(r) = r(r - 1) + p_0r + q_0$ and $p_n$, $q_n$ are the power series coefficients of $xp(x)$ and $x^2q(x)$.
Theorem. (a) Let $r_1 \geq r_2$ be the two real roots of the indicial equation $F(r) = 0$. Then there exists a solution of the form

$$y_1 = |x|^{r_1}\left[1 + \sum_{n=1}^\infty a_n(r_1)x^n\right],$$

where $a_n(r_1)$ are found from the recurrence relation above.
(b) If $r_1 - r_2$ is not zero or a positive integer, then there exists a second linearly independent solution of the form

$$y_2 = |x|^{r_2}\left[1 + \sum_{n=1}^\infty a_n(r_2)x^n\right].$$

(c) If $r_1 = r_2$, then the second solution is given by

\[ y_2 = y_1(x)\ln|x| + |x|^{r_1}\sum_{n=1}^{\infty} b_n(r_1)\,x^n. \]

(d) If $r_1 - r_2 = N$ is a positive integer, then

\[ y_2 = a\,y_1(x)\ln|x| + |x|^{r_2}\left[1 + \sum_{n=1}^{\infty} c_n(r_2)\,x^n\right]. \]

The coefficients $a_n(r_1)$, $b_n(r_1)$, $c_n(r_2)$ and the constant $a$ can be determined by substituting the solutions into the equation (3). The constant $a$ may equal zero, in which case the corresponding term will be missing.

sets One of the most basic notions of mathematics. By a set we understand any collection of objects of any nature: numbers, people, countries, etc. Objects of a set are called its elements or members. Two sets $A$ and $B$ are considered equal (more precisely, identical) if they have exactly the same elements. A set $B$ belongs to another set $A$ (is a subset of $A$, notation: $B \subset A$), if all elements of $B$ are also elements of $A$, but the opposite is not necessarily true. For a set $A$, its complement $\overline{A}$ is the set of all elements that do not belong to $A$. For example, in the larger set of all real numbers, the complement of the set of all rational numbers is the set of all irrational numbers. For the operations of intersection and union of sets see corresponding entries.

shift of index of summation In summation notation for functions it is sometimes important or convenient to "shift" the index of summation to be able to combine two or more sums. For example, the sum

\[ \sum_{k=1}^{n} a_{k-1}x^k \]

could also be written as $\sum_{k=0}^{n-1} a_k x^{k+1}$ using the substitution $m = k - 1$ and then returning to the index $k$, because the notation does not make the sum any different.

shifted conics The three major conic sections (ellipses, parabolas, hyperbolas) may be shifted horizontally or vertically or both. The result is a shifted conic. See the equations in the corresponding entries.

shift of a function The process of moving the graph of a function horizontally or vertically or both. Algebraically the vertical shift is achieved by adding or subtracting a number from the function: $g(x) = f(x) + c$ is the vertical shift of the function $f$. If $c > 0$ then the function is shifted up and if $c < 0$ then the shift is down. The horizontal shift of the function $f$ is given by the formula $g(x) = f(x - c)$. If $c > 0$ then the function shifts to the right and if $c < 0$ it shifts to the left. For example, the graph of the function $g(x) = (x + 3)^2 + 2$ could be found from the graph of the parabola $f(x) = x^2$ by shifting it left 3 units and shifting up 2 units.

sieve of Eratosthenes The ancient method of finding prime numbers. Suppose we need to find all the prime numbers up to some number $N$. Make the list (table) of all these numbers and start eliminating the numbers that are not prime as follows: First throw away the number 1 because it is not prime. Next, leave 2 because it is prime but throw away all multiples of 2 (all even numbers) up to $N$. In the third step leave 3 but eliminate all the multiples of 3 that are still in the list (all even numbers that are multiples of 3 were eliminated in the previous step). Continuing this way we will throw away all non-prime numbers and what is left will be all prime numbers up to $N$. During this process our number list gets lots of "holes", which is the reason for the name "sieve".

sigma notation The same as summation notation.

signed elementary product An elementary product multiplied by +1 or -1. Used to calculate determinants.

signed number A term that is sometimes used to indicate a negative number, that is, a number preceded by the "minus" sign.

significance level Also called alpha level and denoted by the Greek letter $\alpha$. For tests of significance, a numeric level such that any event with probability below that level is considered rare. It is set arbitrarily but the usual choices are 0.1, 0.05 and 0.001. The choice of the significance level depends on the importance of the problem.

significance tests See test of significance.

similar terms The same as like terms.

similar matrices Square matrices $A$ and $B$ are similar if there exists an invertible matrix $P$ such that $B = P^{-1}AP$. Similar matrices have many properties and characteristics in common. In particular, the determinants, traces, eigenvalues and nullity of similar matrices are the same. If the two matrices are the standard matrices of some transformations $T_1$ and $T_2$, then the transformation bringing one to the other is called a similarity transformation.

simple curve A curve is simple if it does not intersect itself, except maybe at the endpoints. If the curve is given by the parametric equations $x = f(t)$, $y = g(t)$, $a \le t \le b$, then being simple means $(f(c), g(c)) \ne (f(d), g(d))$ for any two distinct parameter values $c \ne d$, except possibly when $c$ and $d$ are the endpoints $a$ and $b$. If the endpoints coincide, i.e., if $f(a) = f(b)$ and $g(a) = g(b)$, then we call it a simple closed curve.

simple region A term sometimes used to describe regions that lie between the graphs of two functions. Similarly, a simple solid region is a region that lies between two surfaces described by functions of two variables.

simple eigenvalues See eigenvalue of matrix.

simple harmonic motion If a particle or an object moves (oscillates) according to the equation $y = A\cos(\omega t - \delta)$, then this kind of motion is simple harmonic. Here $A$ is the amplitude, $\omega/2\pi$ is the frequency and $\delta$ is the phase shift.

simple interest If the bank pays interest on the principal amount only, then the interest is called simple. Mathematically, if $P$ is the amount of money invested at the yearly rate $r$, then after $n$ years that principal will become $P + nrP = P(1 + nr)$. See also compound interest.

simple random sample One of the most important notions in statistics. A simple random sample of size 2 from a given population $P$ is a sample where every possible pair of values has equal chance to be chosen. A simple random sample of size 3 is a sample from the population where every possible triple of values from $P$ has equal chance to be chosen. More generally, a simple random sample of size $n$ is a sample where every possible set of $n$ values from $P$ has exactly the same chance to be chosen.

simply connected region A connected region (one in which any two points can be joined by a continuous curve that is completely inside the region) is called simply connected if every simple closed curve in the region encloses only points from the region. In other words, the region consists of one "piece" and has no holes.

Simpson's rule One of the methods of approximate integration. To calculate the approximate value of the integral $\int_a^b f(x)\,dx$ using Simpson's rule we divide the interval $[a, b]$ into an even number $n$ of equal intervals of length $\Delta x = (b - a)/n$ with the points $x_0 = a$, $x_1 = a + \Delta x$, $x_2 = a + 2\Delta x$, $\dots$, $x_n = b$. Then we have the approximation

\[ \int_a^b f(x)\,dx \approx \frac{\Delta x}{3}\left[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)\right]. \]

For the degree of accuracy of this approximation see error estimate for Simpson's rule.

sine function One of the six trigonometric functions. Geometrically, the sine of an angle in a right triangle is the ratio of the opposite side to the hypotenuse of the triangle. A more general approach allows us to define the function $\sin x$ for any real $x$: Let $P = (a, b)$ be any point on the plane other than the origin and let $\theta$ be the angle formed by the positive half of the $x$-axis and the terminal side, connecting the origin and $P$. Then $\sin\theta = \dfrac{b}{\sqrt{a^2 + b^2}}$. Next, after establishing a one-to-one correspondence between angles and real numbers, we can have the sine function defined for all real numbers. The range of $\sin x$ is $[-1, 1]$ and it is $2\pi$-periodic.
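As an illustration, the point-based definition of the sine and the approximation from the entry Simpson's rule can be tried together in a few lines of Python. This is only a sketch: the function names `sine_from_point` and `simpson` and the choice $n = 10$ are assumptions made for this example, not part of the Dictionary.

```python
import math


def sine_from_point(a, b):
    """Sine of the angle whose terminal side passes through (a, b) != (0, 0)."""
    return b / math.hypot(a, b)  # b / sqrt(a^2 + b^2)


def simpson(f, a, b, n):
    """Composite Simpson's rule on [a, b] with an even number n of equal intervals."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between the weights 4 (odd i) and 2 (even i).
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(a + i * dx)
    return total * dx / 3


# The point (3, 4) lies at distance 5 from the origin, so its sine is 4/5.
ratio = sine_from_point(3.0, 4.0)

# The exact value of the integral of sin x over [0, pi] is 2.
approx = simpson(math.sin, 0.0, math.pi, 10)
print(ratio, approx)
```

Even with only ten intervals the Simpson approximation should agree with the exact value 2 to about three decimal places, because the error of the rule decreases like the fourth power of $\Delta x$.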

The sine function is related to other trigonometric functions by various identities. The most important is the Pythagorean identity $\sin^2 x + \cos^2 x = 1$. The derivative and indefinite integral of this function are:

\[ \frac{d}{dx}(\sin x) = \cos x, \qquad \int \sin x\,dx = -\cos x + C. \]

sine integral function The function defined by the integral

\[ \mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt. \]

This integral cannot be expressed in terms of elementary functions and the values of this function are found by approximate (numeric) integration.

singular matrix The opposite of an invertible matrix. For a square matrix $A$ to be singular means there is no matrix $B$ such that $A \cdot B = I$, where $I$ is the identity matrix. An easy criterion for singularity is that the determinant of a singular matrix is zero.

singular point A point where the function cannot be defined because it has an infinite limit (from the left, from the right, or both). The point $x = 1$ is singular for the function $f(x) = 1/(x - 1)^2$ because

\[ \lim_{x \to 1} \frac{1}{(x - 1)^2} = \infty. \]

A function has a singular point at infinity if $\lim_{x \to \infty} f(x) = \pm\infty$. For the use of the term singular in differential equations see irregular singular point, regular singular point.

size of a matrix Size of a matrix is the number of rows and columns it has. So, if the matrix has 4 rows and 5 columns then we say that it is of size $4 \times 5$.

skew lines Two lines in three dimensional space are called skew lines if they are not parallel and do not intersect. Skew lines do not lie in any one plane.

skew-symmetric matrix A square matrix is skew-symmetric if it is equal to the opposite of its transpose: $A = -A^T$.

slant asymptote An asymptote that is neither horizontal nor vertical. For the exact definition see the entry asymptote.

slope One of the most important characteristics of a line on a plane. Shows the level of "steepness" or "flatness" of the line. If the points $(x_1, y_1)$ and $(x_2, y_2)$ are on the line, then, by definition, the slope of the line is the number $m = (y_2 - y_1)/(x_2 - x_1)$. For a horizontal line the slope is zero and for a vertical line the slope is not defined because the denominator in the definition becomes zero.

slope field Same as direction field.

slope-intercept equation of a line If the slope $m$ and the $y$-intercept $b$ of the line are known, then the equation of the line is given by the formula $y = mx + b$. Example: If the slope of the line is $m = 2$ and the $y$-intercept is the point $(0, -1)$, then the equation of the line is $y = 2x - 1$. See also line.

smooth curve A curve created by a smooth function or smooth functions.

smooth function Usually, a function that has a continuous derivative. In some other settings smooth function might mean a function that has more than one derivative. In particular, infinitely differentiable functions are also called smooth. The notion of the smooth function extends also to functions of several variables.

smooth surface A surface given by a differentiable function $f(x, y)$ of two variables. Geometrically this means that the surface has no "corners" or "edges".

solid Usually, a three dimensional object, geometric figure. For example, a cone means the figure that we receive when we connect a fixed point with all the points of a closed simple curve in space. Solid of revolution means a solid that is received by rotating a plane region about some line, such as a coordinate axis.

solution Depending on the type of equation the solution would be a number, set of numbers, set of ordered pairs or ordered triples, a matrix, a vector, a function or set of functions, etc. All these solutions have the same common property: substitution into the equation (or system of equations) results in a true statement, an identity. For details of how to find solutions of equations see the articles quadratic equation, cubic equation, quartic equation, rational equations, radical equations, systems of linear equations, linear ordinary differential equation and many others where the solutions of the corresponding equations are presented.

solution curve The graph of the solution of a differential equation.

solution set The set of all solutions of the given equation.

solution space Let $Ax = 0$ be a homogeneous system of linear equations. Each solution $x$ of this equation is called a solution vector. The set of solutions forms a linear space called the solution space.

solving triangles To solve a triangle means to find one or more elements (side or angle) of the triangle knowing the other elements. To solve a triangle we need to know at least three elements (including at least one side), because otherwise the problem becomes impossible to solve. The following four cases are possible in solving triangles: (1) One side and two angles are known (AAS or ASA); (2) Two sides and one angle not formed by them are known (SSA); (3) Two sides and the angle formed by them are known (SAS), and (4) Three sides are known (SSS). The first two cases are handled by the Law of sines and the other two cases are solved by the Law of cosines. Also, the second case is called ambiguous, because it does not always result in a definite solution. For that case see the separate entry ambiguous case. In the following examples the angles are denoted by $A, B, C$ and the sides are denoted by $a, b, c$ and we follow the convention that side $a$ is opposite to angle $A$, side $b$ is opposite to angle $B$ and side $c$ is opposite to angle $C$.

(1) The case AAS. Let $C = 102.3^\circ$, $B = 28.7^\circ$, $b = 27.4$. First we find the angle $A$ using the fact that the sum of all angles is $180^\circ$: $A = 49.0^\circ$. Then using the Law of sines we have $a = b\sin A/\sin B \approx 43.06$. Similarly, $c = b\sin C/\sin B \approx 55.75$ and the triangle is solved.

(2) The case SAS. Let $A = 115^\circ$, $b = 15$, $c = 10$. First, we use the Law of cosines to find the side $a$: $a^2 = b^2 + c^2 - 2bc\cos A \approx 451.79$ and $a \approx 21.26$. After this, we have the choice of using either the Law of cosines again or the Law of sines to find one of the missing angles. The Law of sines will give $\sin B = (b/a)\sin A \approx 0.63945$ and $B \approx 39.75^\circ$. Now, $C \approx 25.25^\circ$ and the triangle is solved.

(3) The case SSS. Let $a = 8$, $b = 19$, $c = 14$. Using the Law of cosines we have, for example,

\[ \cos B = \frac{a^2 + c^2 - b^2}{2ac} \approx -0.45089, \]

from where $B \approx 116.80^\circ$. Now, the rest follows as in the previous case and we can find the remaining angles as before: $A \approx 22.08^\circ$, $C \approx 41.12^\circ$. The solution is complete.

space In mathematics the term has different meanings depending on context. Most commonly space means the three dimensional space we live in, which is mathematically the three dimensional Euclidean space. In other cases space refers to vector space. See corresponding entries for more details.

space curve A curve in three dimensional space could be given by different methods. Suppose $t$ is a variable on some interval $[a, b]$ and the functions $f, g, h$ are defined and continuous on that interval. Then the set of all points $(x, y, z)$ in space where

\[ x = f(t), \quad y = g(t), \quad z = h(t) \]

is a space curve given parametrically by these equations. The same curve could also be given as a vector with the equation

\[ \mathbf{r}(t) = f(t)\mathbf{i} + g(t)\mathbf{j} + h(t)\mathbf{k}, \]

where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are the standard basis vectors in the three dimensional space. A space curve is called continuous, differentiable, etc. if the functions $f, g, h$ have the same properties.

sphere One of the most common geometric figures. The term has two meanings, to indicate the geometric solid or its boundary. In the first case the term ball is more commonly used. The equation of the ball (solid sphere) of radius $r$ centered at the point $(a, b, c)$ is $(x - a)^2 + (y - b)^2 + (z - c)^2 \le r^2$. The equation of its boundary (the sphere) is $(x - a)^2 + (y - b)^2 + (z - c)^2 = r^2$.

spherical coordinates One of the most common coordinate systems in three dimensional space along with the Cartesian and cylindrical systems. It is the three dimensional analog of the polar coordinate system on the plane. The points in the spherical coordinate system are determined by three parameters: the distance from the origin (usually denoted by $\rho$) and two angles. The first angle, denoted by $\theta$, is measured on the $xy$-plane from the positive $x$-axis and the second angle, denoted by $\phi$, is measured from the positive $z$-axis. This way $\rho \ge 0$, $0 \le \theta \le 2\pi$, $0 \le \phi \le \pi$. The relations between the spherical and rectangular coordinates are given by the formulas

\[ x = \rho\cos\theta\sin\phi, \quad y = \rho\sin\theta\sin\phi, \quad z = \rho\cos\phi. \]

spiral of Archimedes See Archimedian spiral.

spread of a distribution The meaning of this term depends on the situation but as a rule the spread of any distribution is described by its standard deviation or variance.

square matrix A matrix such that the number of rows is equal to the number of columns. There are many matrix operations that are possible for square matrices but not for other types of matrices. For example, the notions of determinant and inverse of a matrix do not make sense if the matrix is not square. For other entries specifically applied to square matrices see nilpotent matrix, normal matrix, skew-symmetric, symmetric, unitary matrices.

square root The radical of the second order. The square root of a non-negative number $a \ge 0$ is defined to be the non-negative number $b \ge 0$ such that $b^2 = a$. The notation for the square root of the number $a$ is $\sqrt{a}$. If the number $a$ is negative then its square root is not a real number, it is imaginary. See also the entry root for more details.

squeeze theorem Assume that three functions $f(x), g(x), h(x)$ satisfy the inequalities $f(x) \le g(x) \le h(x)$ near some point $x = c$ but not necessarily at that point. Assume also that

\[ \lim_{x \to c} f(x) = \lim_{x \to c} h(x) = L; \]

then $\lim_{x \to c} g(x) = L$.

Example: Evaluate the limit

\[ \lim_{x \to 0} x\cos(1/x). \]

Because the function $\cos(1/x)$ has no limit at zero, the usual product law for limits does not apply. However, since $|\cos(1/x)| \le 1$,

\[ -|x| \le x\cos(1/x) \le |x|. \]

Now, $\lim_{x \to 0} |x| = \lim_{x \to 0} (-|x|) = 0$ and by the squeeze theorem our limit is also zero.

standard basis vectors In many common vector spaces certain sets of vectors that form a basis are considered to be standard. For example, in the Euclidean space $R^n$ the set of vectors $e_1 = (1, 0, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, $\dots$, $e_n = (0, 0, 0, \dots, 1)$ is considered (and called) the standard basis. In the space $P_n$ of all polynomials of degree $n$ or less the set of

functions $(1, x, x^2, \dots, x^n)$ is the standard basis.

standard deviation For a set of values $x_1, x_2, x_3, \dots, x_n$ with the mean $\mu$, the standard deviation is defined by the formula

\[ \sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}}. \]

The square of this expression is called variance. There are other, sometimes easier formulas for calculating standard deviation. See sample standard deviation.

standard equations A term used in different situations with specific meanings. For example, the standard equation of a line is $ax + by = c$, or the standard equation of an ellipse centered at the point $(h, k)$ is

\[ \frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1. \]

standard error In Statistics the standard deviation of the sampling distribution is usually unknown. To estimate it we use values of the sample and call the estimate the standard error because it almost never equals the actual standard deviation. The standard error for the sample mean $\bar{x}$ is given by the formula $SE(\bar{x}) = s/\sqrt{n}$, where $s$ is the sample standard deviation and $n$ is the size of the sample. The standard error for proportions is

\[ SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}, \]

where $\hat{p}$ is the sample proportion and $n$ is the sample size.

standard matrix Suppose $T$ is a linear transformation from the Euclidean space $R^n$ to $R^m$. Then the vectors $x = (x_1, x_2, \dots, x_n)$ are transferred to vectors $y = (y_1, y_2, \dots, y_m)$ by linear equations

\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= y_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= y_2 \\ \cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= y_m \end{aligned} \]

The matrix $A = [a_{ij}]$, $1 \le i \le m$, $1 \le j \le n$, formed by the coefficients of these equations is called the standard matrix of the transformation $T$. The notion of the standard matrix could be extended also to transformations in general linear vector spaces.

standard normal distribution The special case of the normal distribution, when the mean is equal to 0 and the standard deviation is equal to 1. If the random variable describing some normal distribution with the mean $\mu$ and standard deviation $\sigma$ is $x$, then the variable $z = (x - \mu)/\sigma$ describes the corresponding standard normal distribution. The graph of the standard normal distribution is also the normal density curve centered at 0.

standard position Usually refers to angles or vectors. In the case of angles, the standard position is when the vertex is located at the origin and the initial side coincides with the positive side of the $x$-axis. In the case of vectors the standard position is when the initial point coincides with the origin. See also angles, vectors.

standardized value The same as the z-score.

statistical inference Also called inferential statistics. The use of statistical information (data) to draw conclusions about a larger population.

statistics One of the main branches of Mathematics. It is concerned with collecting and interpreting data. By other definitions, it is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. Some other definitions of this science also include the fact that Statistics most of the time deals with random phenomena and needs to draw conclusions from randomly received data. See different entries about Statistics throughout this Dictionary.

stemplot Also called stem-and-leaf plot. An easy and fast way of establishing the form of the distribution for a relatively small data set. For example, to organize the set of values {44 46 47 49 63 64 66 68 68 72 72 75 76 81 84 88 106}, we separate the "tens" digits and combine all the "ones" digits that correspond to each particular tens digit. The result is

     4 | 4 6 7 9
     5 |
     6 | 3 4 6 8 8
     7 | 2 2 5 6
     8 | 1 4 8
     9 |
    10 | 6

step function The function $f(x) = [[x]]$ defined as the greatest integer less than or equal to $x$. Different variations of this function also may be called step function or staircase function.

Stokes' theorem One of the most important theorems of vector calculus. It could be viewed as the generalization of the Fundamental Theorem of Calculus and also of Green's theorem. This theorem relates surface integrals of some functions with integrals on the boundary curve of that surface.

Stokes' theorem. Let $S$ be an oriented piecewise-smooth surface bounded by a simple, closed, piecewise-smooth boundary curve $C$ with positive orientation. Let $\mathbf{F}$ be a vector field with continuously differentiable components in a region $G \subset R^3$ that contains $S$. Then

\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \iint_S \operatorname{curl}\mathbf{F} \cdot d\mathbf{S}. \]

For definitions of vector field and curl of a vector field see corresponding entries. See also divergence theorem for related results.

stretching a function One of the possible transformations of functions. The vertical stretch of the function $f$ is given by the formula $g(x) = cf(x)$, where $|c| > 1$. In the case when $|c| < 1$ the term squeezing or compression is used. The horizontal stretching is given by the formula $g(x) = f(cx)$ where $|c| < 1$. In the case $|c| > 1$ the transformation is squeezing or compression.

straight angle An angle that is $180^\circ$ in degree measure or $\pi$ in radian measure.

straight line Or simply a line. One of the most important geometric figures (notions). A line is not defined but assumed to be understood, like other fundamental geometric or mathematical notions. In algebra, a line is given by its equation. See the entry line for all the details.

submatrix Any part of a given matrix could be considered as a matrix, called a submatrix.

subset A set $B$ is called a subset of the set $A$ (written $B \subset A$), if any element of $B$ is also an element of $A$. Example: The set $\{2, 4, 5\}$ is a subset of the set $\{1, 2, 3, 4, 5\}$.

subspace A subset $W$ of a vector space $V$ is called a subspace if $W$ itself is a vector space under the same rules of addition and scalar multiplication as for the space $V$.

substitution method or substitution rule. One of the main methods of evaluating indefinite or definite integrals. The essence of the method is to find some kind of substitution that will make the function under the integral sign simpler and the integral easier to evaluate. This method works only in cases when the function could be presented or viewed as a product of a composite function and the derivative of the function that is the argument of that function. Formally: Let $u = g(x)$ be differentiable, then

\[ \int f(g(x))\,g'(x)\,dx = \int f(u)\,du. \]

Examples: (1) $\int \sqrt{3x - 1}\,dx$. The substitution $u = 3x - 1$ results in $du = 3\,dx$ and the integral transforms to

\[ \int \sqrt{u}\,\frac{du}{3} = \frac{1}{3}\int u^{1/2}\,du = \frac{2}{9}u^{3/2} + C = \frac{2}{9}(3x - 1)^{3/2} + C. \]

(2) $\int \sin^4 x\cos x\,dx$. Here the substitution $u = \sin x$, $du = \cos x\,dx$ gives

\[ \int \sin^4 x\cos x\,dx = \int u^4\,du = \frac{u^5}{5} + C = \frac{\sin^5 x}{5} + C. \]
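The two antiderivatives found by substitution in the examples above can be checked numerically: the derivative of each result should reproduce the original integrand. The following Python sketch does this with a central difference quotient. The helper names, the step size `h` and the sample points are assumptions chosen for this illustration only.

```python
import math


def derivative(F, x, h=1e-6):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


# Example (1): antiderivative (2/9)(3x - 1)^(3/2) of sqrt(3x - 1), valid for x > 1/3.
def F1(x):
    return (2 / 9) * (3 * x - 1) ** 1.5


def f1(x):
    return math.sqrt(3 * x - 1)


# Example (2): antiderivative sin^5(x)/5 of sin^4(x) cos(x).
def F2(x):
    return math.sin(x) ** 5 / 5


def f2(x):
    return math.sin(x) ** 4 * math.cos(x)


def max_error(F, f, points):
    """Largest gap between the numerical derivative of F and the integrand f."""
    return max(abs(derivative(F, x) - f(x)) for x in points)


points = [0.5, 1.0, 1.5, 2.0]  # all larger than 1/3, so sqrt(3x - 1) is defined
err1 = max_error(F1, f1, points)
err2 = max_error(F2, f2, points)
print(err1, err2)
```

Both printed errors should be very small, limited only by the accuracy of the difference quotient. A check of this kind does not replace the derivation, but it catches a wrong constant such as $\frac{2}{3}$ in place of $\frac{2}{9}$.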

The version of the substitution method for definite integrals is:

\[ \int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \]

subtended angle An angle is subtended by an arc (usually, of a circle) if the two rays forming the angle pass through the endpoints of the arc.

subtraction One of the four main arithmetic and algebraic operations. By subtracting two numbers we find their difference. Example: $-2 - (-5) = 3$. See also addition and subtraction of complex numbers, fractions, functions, vectors, matrices.

subtraction formulas for trigonometric functions See addition and subtraction formulas for trigonometric functions.

sum The result of addition of two or more numbers, functions, matrices, etc. Example: The sum of the functions $f(x) = 2x^2 + 3x - 1$ and $g(x) = x^2 + 5$ is the function $(f + g)(x) = 3x^2 + 3x + 4$. For the sum of infinitely many numbers or functions see series and power series.

summation notation A shorthand notation for summation of many numbers or functions. Uses the Greek letter sigma $\Sigma$. The notation

\[ \sum_{n=1}^{10} n^2 \]

means that we are adding the squares of all natural numbers from 1 to 10. The variable $n$ is called summation index or variable. In case the index takes all natural numbers, the notation is $\sum_{n=1}^{\infty} a_n$.

sum rule See differentiation rules.

sum-to-product formulas In trigonometry, the formulas

\[ \sin u + \sin v = 2\sin\left(\frac{u + v}{2}\right)\cos\left(\frac{u - v}{2}\right), \]
\[ \sin u - \sin v = 2\cos\left(\frac{u + v}{2}\right)\sin\left(\frac{u - v}{2}\right), \]
\[ \cos u + \cos v = 2\cos\left(\frac{u + v}{2}\right)\cos\left(\frac{u - v}{2}\right), \]
\[ \cos u - \cos v = -2\sin\left(\frac{u + v}{2}\right)\sin\left(\frac{u - v}{2}\right). \]

Similar formulas are also possible for other functions but hardly ever used.

supplementary angles Two angles that add up to $180^\circ$ (or $\pi$ in radian measure). Two supplementary angles form one straight angle.

surfaces In the simplest cases a surface is a two dimensional object in three dimensional space that could be given by some equation $z = f(x, y)$. Depending on properties of the function $f$ we get different types of surfaces. For example, if $f(x, y)$ has continuous partial derivatives with respect to both variables then the surface would be called smooth. If the function is a second degree polynomial with respect to its variables then the surface would be called a quadric surface. The meaning of the term closed surface should be understandable by itself. For explanations of the terms oriented surface, parametric surface see the corresponding entries. For functions of three variables the equation $f(x, y, z) = k$ for some constant $k$ is called the level surface.

surface area Suppose some surface $S$ is given by the equation $z = f(x, y)$ and the function has continuous partial derivatives $f_x, f_y$ in some domain $D$. Then the area of the surface $S$ could be calculated by the formula

\[ A(S) = \iint_D \sqrt{f_x^2(x, y) + f_y^2(x, y) + 1}\;dA, \]

where $dA = dx\,dy$ is the element of the area of $D$. This formula is the three dimensional generalization of the arc-length formula and has a similar format.

surface integral Suppose a surface $S$ in three dimensional space is given by the function $z = g(x, y)$ over the domain $D$ in the $xy$-plane. The integral of the function $f(x, y, z)$ over that surface (under certain regularity conditions for that surface) is defined to be

\[ \iint_S f(x, y, z)\,dS = \iint_D f(x, y, g(x, y))\sqrt{g_x^2 + g_y^2 + 1}\;dA. \]

Here $dA = dx\,dy$ is the element of the plane measure, $dS$ is the element of the measure on the surface $S$, and $g_x, g_y$ are the partial derivatives of the function $g(x, y)$. In the particular case when $f(x, y, z) = 1$ we will get the surface area of $S$ (see above). Similar formulas are valid for surfaces given by parametric equations and for integrals of vector functions with appropriate changes. For calculations of surface integrals of vector functions with the help of volume (triple) integrals see divergence theorem.

surface of revolution A surface received by rotation of some curve about an axis, such as a coordinate axis. Many common surfaces (paraboloids, etc.) could be viewed as surfaces of revolution.

symmetry The term may mean different things depending on the situation. Most commonly, symmetry with respect to one of the coordinate axes or the origin is considered. Functions symmetric with respect to the $y$-axis are the even functions and functions symmetric with respect to the origin are the odd functions.

symmetric matrices A matrix $A$ is symmetric if it is equal to its transpose: $A = A^T$. A symmetric matrix is necessarily a square matrix. Example:

\[ \begin{pmatrix} 1 & -1 & 2 \\ -1 & -2 & 0 \\ 2 & 0 & 3 \end{pmatrix} \]

synthetic division A simplified algorithm of division of any polynomial by a binomial of the form $x - c$. This method allows us to avoid writing (and repeating) the variable $x$ in the process of division. In order to be able to use this algorithm it is necessary to write the polynomial we are dividing in the form of decreasing powers and it is also mandatory to "fill in" the missing powers. For example, the polynomial $P(x) = 2x^3 + 3x^2 - 4$ should be written as $P(x) = 2x^3 + 3x^2 + 0x - 4$. Now, suppose we want to divide this polynomial by $x + 1$. The synthetic division procedure looks as below:

\[ \begin{array}{r|rrrr} -1 & 2 & 3 & 0 & -4 \\ & & -2 & -1 & 1 \\ \hline & 2 & 1 & -1 & -3 \end{array} \]

The bottom row means that the result of the division is the polynomial $2x^2 + x - 1$ and the remainder is $-3$:

\[ \frac{2x^3 + 3x^2 - 4}{x + 1} = 2x^2 + x - 1 - \frac{3}{x + 1}. \]

systems of linear algebraic equations The general system of $m$ linear equations with $n$ unknowns is

\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ \cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \]

where $a_{ij}$, $1 \le i \le m$, $1 \le j \le n$ and $b_k$, $1 \le k \le m$ are numeric constants and $x_1, x_2, \dots, x_n$ are the unknowns. For different methods of solving these kinds of equations see Cramer's rule, Gaussian elimination, Gauss-Jordan elimination, matrix method.

systems of linear differential equations (homogeneous) Systems of this type are usually written in the following form

\[ \begin{aligned} x_1' &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ x_2' &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ & \cdots\cdots\cdots\cdots\cdots\cdots \\ x_n' &= a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n \end{aligned} \]

where $x_1, x_2, \dots, x_n$ are the unknown functions. Using vector and matrix notations, this system could be written more compactly as $x' = Ax$. The solution of this system depends essentially on the eigenvalues and eigenvectors of the matrix $A$.

(1) The matrix $A$ has $n$ distinct real eigenvalues

$r_1, r_2, \dots, r_n$ with corresponding (distinct) eigenvectors $\xi^{(1)}, \xi^{(2)}, \dots, \xi^{(n)}$. Then the general solution of the system $x' = Ax$ is given by

$$x(t) = c_1\xi^{(1)}e^{r_1 t} + c_2\xi^{(2)}e^{r_2 t} + \cdots + c_n\xi^{(n)}e^{r_n t}.$$

Example: To solve the system

$$x' = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} x$$

we find the eigenvalues $r_1 = 3$, $r_2 = -1$ and the corresponding eigenvectors

$$\xi^{(1)} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad \xi^{(2)} = \begin{pmatrix} 1 \\ -2 \end{pmatrix},$$

and the two linearly independent solutions would be

$$x^{(1)}(t) = \begin{pmatrix} 1 \\ 2 \end{pmatrix} e^{3t}, \quad x^{(2)}(t) = \begin{pmatrix} 1 \\ -2 \end{pmatrix} e^{-t}.$$

(2) If the (real) matrix $A$ has a complex eigenvalue then it also has the complex conjugate eigenvalue: $r_1 = \lambda + i\mu$, $r_2 = \lambda - i\mu$. The corresponding eigenvectors are also conjugate in the sense that if $\xi^{(1)} = a + ib$ then $\xi^{(2)} = a - ib$. The real vector solutions corresponding to these complex eigenvalues are the real and imaginary parts of $\xi^{(1)}e^{r_1 t}$ and can be written in the following form:

$$u(t) = e^{\lambda t}(a\cos\mu t - b\sin\mu t),$$
$$v(t) = e^{\lambda t}(a\sin\mu t + b\cos\mu t).$$

Example: The system of equations

$$x' = \begin{pmatrix} -1/2 & 1 \\ -1 & -1/2 \end{pmatrix} x$$

has eigenvalues $r_1 = -1/2 + i$, $r_2 = -1/2 - i$. The two linearly independent vector solutions are

$$u(t) = e^{-t/2}\begin{pmatrix} \cos t \\ -\sin t \end{pmatrix}, \quad v(t) = e^{-t/2}\begin{pmatrix} \sin t \\ \cos t \end{pmatrix}.$$

(3) If any of the eigenvalues of the matrix $A$ is repeated then the situation is more complicated. Example: For the system

$$x' = \begin{pmatrix} 1 & -1 \\ 1 & 3 \end{pmatrix} x$$

$r = 2$ is a repeated eigenvalue and

$$x^{(1)}(t) = \begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{2t}$$

is its corresponding vector solution. The second linearly independent vector solution is given by the formula

$$x^{(2)}(t) = \begin{pmatrix} 1 \\ -1 \end{pmatrix} te^{2t} + \begin{pmatrix} 0 \\ -1 \end{pmatrix} e^{2t}.$$

systems of linear differential equations (non-homogeneous) These systems can be written in matrix form as

$$x' = Ax + g(t),$$

where $g(t)$ is a vector function. The general solution of this system can be found by adding the general solution of the homogeneous system to any particular solution of the non-homogeneous system. Among the many methods for finding particular solutions are the method of undetermined coefficients, variation of parameters, the Laplace transform method and the method of diagonalization. The first three methods (although for equations, not for systems) are described in separate articles. Here we will describe the diagonalization method.

If the matrix $A$ is diagonalizable, then there is an invertible matrix $T$, whose columns are the eigenvectors of $A$, that diagonalizes $A$. Denoting $x = Ty$ we get an equation in the new variable:

$$Ty' = ATy + g(t).$$

After applying the inverse of $T$ from the left to this equation we arrive at the new equation

$$y' = Dy + T^{-1}g(t),$$

where $D = T^{-1}AT$ is a diagonal matrix. The last system is in fact a system of $n$ equations, each of which contains only one unknown function and has the form

$$y_k'(t) = r_k y_k(t) + h_k(t), \quad 1 \le k \le n.$$

Here $r_k$ are the eigenvalues of the matrix $A$ and $h_k(t)$ is the $k$th component of $T^{-1}g(t)$. Each of these equations is a simple first order equation and can be solved by the method of integrating factors, for example. After solving them we just need to multiply the solution vector $y$ by the matrix $T$ to find the vector $x$, and that will be a particular solution of the non-homogeneous system.

systems of non-linear algebraic equations There is no general theory for solutions of non-linear algebraic systems, but in many cases the substitution or elimination methods (but not Gaussian elimination) work here too. Example: Solve the system of equations

$$3x - y = 1$$
$$x^2 - y = 5$$

Solving the first equation for $y$ and substituting into the second equation results in the quadratic equation $x^2 - 3x - 4 = 0$. This equation has the solutions $x = 4, -1$. Substituting these values into either of the original equations gives $y = 11, -4$. Finally, the system has two solutions $(4, 11)$, $(-1, -4)$.

T

t-distribution Suppose the random variable $x$ represents some distribution (normal or not) with mean $\mu$ and standard deviation $\sigma$, and denote by $\bar{x}$ the means of samples of size $n$ from that distribution. A new distribution formed by the values

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}},$$

where $s$ is the sample standard deviation, is called t-distribution. This distribution is, generally speaking, not normal even if the original distribution of the $x$ values is normal, because of the presence of the standard error $SE(\bar{x}) = s/\sqrt{n}$ in the denominator. The t-distribution is different for different values of
sample size $n$, and it approaches the standard normal distribution as the sample size increases. The values (probabilities) corresponding to the t-distribution are found from special tables, by graphing calculators, or computers. It is generally acceptable to approximate t-distributions by the standard normal distribution if $n \ge 30$. The value $t$ in the formula above is called the one-sample t-statistic.

tangent function One of the six trigonometric functions. Geometrically, the tangent of an angle in a right triangle is the ratio of the opposite side of the triangle to the adjacent side. It can also be defined as the reciprocal of the cotangent function. The function $\tan x$ can be extended to a function of all real numbers with the exception of $x = \pi/2 + \pi n$, $n$ any integer, and its range is all of $\mathbb{R}$. The function $\tan x$ is $\pi$-periodic.

The tangent function is related to the other trigonometric functions by various identities. The most important of these are the identities $\tan x = \sin x/\cos x$, $\tan x = 1/\cot x$ and a version of the Pythagorean identity, $1 + \tan^2 x = \sec^2 x$. The derivative and integral of this function are given by the formulas

$$\frac{d}{dx}\tan x = \sec^2 x, \qquad \int \tan x\,dx = \ln|\sec x| + C.$$
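The derivative and integral formulas for $\tan x$ above can be checked numerically. The following is a minimal Python sketch and is not part of the original entry; the function names, the step size `h` and the number of subintervals `n` are illustrative assumptions.

```python
import math

def tan_derivative_numeric(x, h=1e-6):
    """Central-difference approximation of d/dx tan x (h is an assumed step size)."""
    return (math.tan(x + h) - math.tan(x - h)) / (2 * h)

def sec_squared(x):
    """Closed form sec^2 x from the entry."""
    return 1.0 / math.cos(x) ** 2

def tan_integral_numeric(a, b, n=10_000):
    """Composite trapezoidal approximation of the integral of tan x over [a, b]."""
    dx = (b - a) / n
    total = 0.5 * (math.tan(a) + math.tan(b))
    for i in range(1, n):
        total += math.tan(a + i * dx)
    return total * dx

def tan_antiderivative(x):
    """Closed form ln|sec x| from the entry (constant C taken as 0)."""
    return math.log(abs(1.0 / math.cos(x)))

x0 = 0.7
print(tan_derivative_numeric(x0), sec_squared(x0))
print(tan_integral_numeric(0.0, 1.0),
      tan_antiderivative(1.0) - tan_antiderivative(0.0))
```

Each printed pair should agree to several decimal places, provided the chosen points stay away from the singularities $x = \pi/2 + \pi n$.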

tangential component Any vector coming out from some point of a curve can be presented as a combination of tangent and normal vectors. The component in the direction of the tangent vector is the tangential component.

tangent line A line is tangent to some curve at a point of that curve, if the line and curve coincide at that point and have no other common points in some neighborhood of that point. If the curve is given by a differentiable function $f(x)$, then the tangent line to this curve at the point $(a, f(a))$ has the slope

$$f'(a) = \lim_{x \to a}\frac{f(x) - f(a)}{x - a}$$

and can be given by the equation $y - f(a) = f'(a)(x - a)$. If the function is not differentiable at some point then the corresponding curve does not have a tangent line at that point.

tangent line method The same as the Euler method for approximate solutions of differential equations.

tangent plane A plane in three dimensional space is tangent to some surface at some point of that surface if they coincide at that point but have no other common points in some neighborhood of that point. If the surface is given by a differentiable function $z = f(x, y)$, then the tangent plane at the point $(x_0, y_0, z_0)$ is given by the equation

$$z - z_0 = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$

tangent vector A vector (usually a unit vector) in the direction of the tangent to a curve at some point. The tangent vector is orthogonal to the normal vector at the same point. If a space curve is given by the parametric vector function $r(t)$, then its derivative $r'(t_0)$, if it exists and is not zero, is the tangent vector at the point $r(t_0)$.

tautochrone This is a curve down which a particle slides freely under just the force of gravity and reaches the bottom of the curve in the same amount of time no matter where its starting point on the curve was. It turns out that the only curve satisfying this condition is the cycloid.

Taylor series Let the function $f(x)$ be defined on some interval around the point $x = a$ and have derivatives of arbitrary order at that point. Then the power series

$$\sum_{n=0}^{\infty}\frac{f^{(n)}(a)}{n!}(x - a)^n = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$$

is called the Taylor series of $f$. The Taylor series of a function may or may not converge to the function itself. Functions whose Taylor series converge to them are called analytic.

A finite part of the Taylor series with summation from $n = 0$ to $n = N$ is called the Taylor polynomial of order $N$. The $N$th Taylor polynomial is usually denoted by $T_N(x)$. If we denote the remaining part of the Taylor series (the sum from $n = N + 1$) by $R_N(x)$, then the question of when the Taylor series converges to the function is resolved by the following

Theorem. If $f(x) = T_N(x) + R_N(x)$ and

$$\lim_{N \to \infty} R_N(x) = 0$$

for $|x - a| < R$, then $f$ is equal to the sum of its Taylor series on $|x - a| < R$.

A criterion showing when the remainder of the Taylor series converges to zero is given by the following Taylor's inequality: If $|f^{(N+1)}(x)| \le M$ for $|x - a| \le d$, then the remainder of the Taylor series satisfies

$$|R_N(x)| \le \frac{M}{(N + 1)!}|x - a|^{N+1}, \quad |x - a| \le d.$$

The special case of the Taylor series when $a = 0$ is called the MacLaurin series. See the corresponding entry for examples of Taylor-MacLaurin series.
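As an illustration of Taylor's inequality, the sketch below compares the actual remainder with the bound for $f(x) = e^x$ about $a = 0$. The choice of function, the function names and the sample values $N = 5$, $d = 1$ are assumptions made only for this example; for $e^x$ one may take $M = e^{a+d}$, since every derivative of $e^x$ is $e^x$.

```python
import math

def taylor_polynomial_exp(x, N, a=0.0):
    """T_N(x) for f(x) = e^x about x = a; every derivative of e^x at a equals e^a."""
    return sum(math.exp(a) * (x - a) ** n / math.factorial(n) for n in range(N + 1))

def taylor_bound_exp(x, N, d, a=0.0):
    """Bound M |x - a|^(N+1) / (N+1)! from Taylor's inequality,
    with M = e^(a+d), which bounds |f^(N+1)(x)| = e^x on |x - a| <= d."""
    M = math.exp(a + d)
    return M * abs(x - a) ** (N + 1) / math.factorial(N + 1)

x, N, d = 1.0, 5, 1.0
remainder = abs(math.exp(x) - taylor_polynomial_exp(x, N))
bound = taylor_bound_exp(x, N, d)
print(remainder, bound)  # the remainder should not exceed the bound
```

Increasing `N` drives both numbers toward zero, which is the convergence statement of the Theorem above for this particular function.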

Taylor's inequality See Taylor series.

term of a sequence In a sequence $a_1, a_2, a_3, \dots$ each member (element, entry) is called a term. Similarly, in the series $\sum_{n=1}^{\infty} a_n$ each $a_n$ is a term of the series. The same word is used to describe the elements of any polynomial, which are also called monomials.

terminal side Of the two sides forming an angle in standard position, one is called the initial side and the other is the terminal side. See the entry angle for all the details.

term-by-term differentiation and integration If a power series $\sum c_n(x - a)^n$ has radius of convergence $R > 0$ then the function

$$f(x) = \sum_{n=0}^{\infty} c_n(x - a)^n$$

is differentiable on the interval $(a - R, a + R)$ and

$$f'(x) = \sum_{n=1}^{\infty} n c_n(x - a)^{n-1},$$

$$\int f(x)\,dx = C + \sum_{n=0}^{\infty} c_n\frac{(x - a)^{n+1}}{n + 1}.$$

The differentiated and integrated series have the same radius of convergence as the original series.

tests of convergence of series Different tests to establish convergence or divergence of numeric or power series. Each of them has a separate entry. See alternating series convergence test, comparison test, integral test, limit comparison test, ratio test, root test.

test of significance Statistical test, establishing the significance of the sample results. As a rule, this means that the P-value computed from some test statistic is compared to some pre-determined small value (called the $\alpha$-value). The finding is considered significant if the P-value is smaller than this $\alpha$-value.

time plot A type of graph, mostly used in Statistics, where the points are found from observations taken during some time period and then connected by segments of straight lines.

torus A geometric three dimensional surface that is the result of rotation of a circle around an axis, lying in the plane of the circle, that does not cross or touch that circle.

total differential For a function $z = f(x, y)$ of two variables, the expression

$$dz = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy.$$

The total differential for functions of three or more variables is defined in a similar way.
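The total differential formula can be illustrated by comparing $dz$ with the actual change $\Delta z$ for small increments. The sketch below is only an example under assumed data: the function $z = x^2 y$, the point $(1, 2)$ and the increments are chosen for illustration and do not come from the entry.

```python
def f(x, y):
    """Sample function z = x^2 * y (an assumption for this example)."""
    return x ** 2 * y

def total_differential(x, y, dx, dy):
    """dz = (dz/dx) dx + (dz/dy) dy for z = x^2 y, using the exact partials 2xy and x^2."""
    return 2 * x * y * dx + x ** 2 * dy

x0, y0, dx, dy = 1.0, 2.0, 0.01, -0.02
dz = total_differential(x0, y0, dx, dy)
delta_z = f(x0 + dx, y0 + dy) - f(x0, y0)
print(dz, delta_z)  # dz approximates the actual change delta_z
```

The two values differ only by terms of second order in `dx` and `dy`, which is why $dz$ is used as a linear approximation of $\Delta z$.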
tetrahedron A geometric object composed of four triangular faces, three of which meet at each vertex. A regular tetrahedron is one in which the four triangles are regular, or equilateral.

trace of a matrix For a square matrix $A$ the trace is the sum of all elements on the main diagonal. For the matrix

$$\begin{pmatrix} 2 & -2 & 0 \\ 5 & -1 & -4 \\ 3 & 0 & 4 \end{pmatrix}$$

the trace is $2 + (-1) + 4 = 5$.

transcendental function Any function that is not algebraic. Transcendental functions include, but are not limited to, exponential, logarithmic, trigonometric, hyperbolic, and inverse trigonometric and hyperbolic functions. Besides, most of the functions defined with the use of power series or integrals are also transcendental.

transcendental number A number that is not a solution of an algebraic equation with rational coefficients. Examples are the numbers $e$ and $\pi$.

transformation See linear transformation.

transformations of a function The combined name for operations on functions including horizontal shift, vertical shift, reflection, stretching, shrinking (or compressing). The most general transformation of a given function $f(x)$ would be the function $g(x) = af(bx - c) + d$. Here the constant $b$ indicates the horizontal stretching (if $|b| < 1$) or shrinking (if $|b| > 1$). The number $c/b$ indicates the horizontal shift, $d$ is the vertical shift and the constant $a$ is the vertical stretching (if $|a| > 1$) or shrinking (if $|a| < 1$). Also, multiplication by the constant $a$ results in reflection with respect to the $x$-axis if $a$ is negative. Similarly, if $b < 0$ then the graph of the function $f$ is reflected with respect to the $y$-axis. A typical example of a function with all of the above transformations is the function

$$y = -2\sin(3x + \pi/4) - 1.$$

The term translations of functions is also sometimes used.

transition matrix Suppose $V$ is a vector space and $B = \{u_1, u_2, \dots, u_n\}$ and $B' = \{u_1', u_2', \dots, u_n'\}$ are two bases in that space. For some vector $v \in V$ let $[v]_B$ and $[v]_{B'}$ denote the coordinate matrices of that vector with respect to the bases $B$ and $B'$ respectively. Then there exists an $n \times n$ matrix $P$ such that $[v]_B = P[v]_{B'}$. The matrix $P$ is the transition matrix from the basis $B$ to the basis $B'$ and can be written in the form

$$P = \big[\,[u_1']_B \mid [u_2']_B \mid \cdots \mid [u_n']_B\,\big].$$

transpose of a matrix The transpose of a matrix $A$ is a matrix such that its rows are the columns of $A$ and its columns are the rows of $A$. Hence, if $A$ is an $m \times n$ matrix, then its transpose is an $n \times m$ matrix. The most common notation for the transpose is $A^T$. Example: For the matrix

$$\begin{pmatrix} 1 & -2 & 0 & -2 \\ 5 & 1 & -4 & 3 \\ 3 & 0 & 1 & 2 \end{pmatrix}$$

the transpose is

$$\begin{pmatrix} 1 & 5 & 3 \\ -2 & 1 & 0 \\ 0 & -4 & 1 \\ -2 & 3 & 2 \end{pmatrix}.$$

transposition In any set of elements the operation that switches any two elements but leaves all the others unchanged is called a transposition. It is a special type of permutation.

transverse axis of hyperbola The axis (line) passing through the two foci of the hyperbola.

trapezoid A quadrilateral with one pair of parallel sides.

trapezoidal rule One of the methods of approximate integration. To calculate the approximate value of the integral $\int_a^b f(x)\,dx$ using the trapezoidal rule we divide the interval $[a, b]$ into $n$ equal intervals of length $\Delta x = (b - a)/n$ with the points $x_0 = a$, $x_1 = a + \Delta x$, $x_2 = a + 2\Delta x$, $\dots$, $x_n = b$. Then

$$\int_a^b f(x)\,dx \approx \frac{\Delta x}{2}\left[f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)\right].$$

For the degree of accuracy of this approximation see error estimate for the trapezoidal rule.

triangle A geometric figure that consists of three points (vertices) and three intervals connecting these points. Any two adjacent intervals form an angle, hence the name. The triangle is the simplest special case of a polygon.

triangle inequality For any two vectors $u$ and $v$ in the Euclidean space $\mathbb{R}^n$

$$\|u + v\| \le \|u\| + \|v\|,$$

where the symbol $\|\cdot\|$ indicates the norm (length, magnitude) of a vector. The name comes from the fact that this inequality generalizes the geometric fact that any side of a triangle is shorter than the sum of the other two sides. The triangle inequality has further generalizations to the so-called metric spaces.

triangular matrix A matrix where all the elements above or below the main diagonal are zeros. See also lower triangular, upper triangular matrices.

trigonometric equations Equations involving trigonometric functions. Unlike identities, equations are not true for all values of the variable, but may be true only for a limited number of values if considered on a finite interval. In cases when we are looking for solutions without restrictions on the interval, we usually get infinitely many solutions, but they are always periodic repetitions of the solutions on the finite interval. Examples: The equation

$$\sin x = 1/2$$

has only two solutions $x = \pi/6$, $x = 5\pi/6$ on the interval $[0, 2\pi)$. If considered on the real axis $(-\infty, \infty)$, it has infinitely many solutions found by adding an arbitrary multiple of $2\pi$ to the two basic solutions: $x = \pi/6 + 2\pi k$, $x = 5\pi/6 + 2\pi k$, $k = 0, \pm 1, \pm 2, \dots$. Similarly, the equation

$$4\cos^2 x - 1 = 0$$

is equivalent to $\cos x = \pm 1/2$ and has the solutions $x = \pi/3$, $2\pi/3$, $4\pi/3$, $5\pi/3$ on $[0, 2\pi)$, and we get the general solution by adding a multiple of $2\pi$ to these solutions.

trigonometric form of the complex number Let $z = a + ib$ be a complex number. Then it can be represented on the complex plane as a point with coordinates $a$ and $b$. The distance of this point from the origin is the modulus of the number, $r = \sqrt{a^2 + b^2}$. Now, if we connect the origin with the point $z$, that segment will form some angle $\theta$ with the real axis, called the argument of $z$. The argument of a number can be found from the formula $\tan\theta = b/a$, with $\theta$ chosen in the quadrant containing $z$. The trigonometric form of the number $z$ is

$$z = r(\cos\theta + i\sin\theta).$$

The Euler modification of this form is $z = re^{i\theta}$. The trigonometric form is convenient when multiplying, dividing, raising to a whole power and especially for finding the roots of complex numbers. See multiplication of complex numbers, division of complex numbers, DeMoivre's theorem, roots of a complex number.

trigonometric functions The six functions $\sin x$, $\cos x$, $\tan x$, $\cot x$, $\sec x$ and $\csc x$. For the definitions and graphs of each of them see the corresponding entries.

trigonometric identities Identities involving trigonometric functions. Identities can be true for all values of the variable(s) or for all values except where the functions are not defined. Trigonometric formulas are also identities. Examples:

$$\sin^2\theta + \cos^2\theta = 1$$
$$\cos x + \sin x\tan x = \sec x$$
$$\cos(x - y) = \cos x\cos y + \sin x\sin y.$$

trigonometric integrals A group of indefinite integrals that contain different combinations of trigonometric functions. For most of these combinations there are methods to evaluate the integrals, but not all trigonometric integrals can be expressed in the form of elementary functions.

1) Integrals of the form $\int \sin^n x\cos^m x\,dx$.

(a) If $m = 2k + 1$ is odd, then write

$$\int \sin^n x(\cos^2 x)^k\cos x\,dx = \int \sin^n x(1 - \sin^2 x)^k\cos x\,dx$$

and, using the substitution $u = \sin x$, reduce the integral to an algebraic one: $\int u^n(1 - u^2)^k\,du$.

(b) If $n = 2s + 1$ is odd, then, similarly to the previous case, write

$$\int (\sin^2 x)^s\cos^m x\sin x\,dx = \int (1 - \cos^2 x)^s\cos^m x\sin x\,dx$$

and the substitution $u = \cos x$ again reduces the integral to an algebraic one.

(c) If both $n$ and $m$ are even, then the use of the half-angle formulas

$$\sin^2 x = \frac{1}{2}(1 - \cos 2x), \quad \cos^2 x = \frac{1}{2}(1 + \cos 2x)$$

(possibly multiple times) reduces the integral to an easy to evaluate form.

2) Integrals of the form $\int \tan^n x\sec^m x\,dx$.

(a) If $m = 2k$ is even, then write

$$\int \tan^n x(\sec^2 x)^{k-1}\sec^2 x\,dx = \int \tan^n x(1 + \tan^2 x)^{k-1}\sec^2 x\,dx$$

and substitute $u = \tan x$ using the fact that $(\tan x)' = \sec^2 x$. The integral reduces to an algebraic one.

(b) If $n = 2s + 1$ is odd, then write

$$\int (\tan^2 x)^s\sec^{m-1} x\sec x\tan x\,dx = \int (\sec^2 x - 1)^s\sec^{m-1} x\sec x\tan x\,dx$$

and substitute $u = \sec x$ using the fact that $(\sec x)' = \sec x\tan x$. Once again, the integral reduces to an algebraic one.

3) In evaluating the integrals of the form $\int \sin nx\cos mx\,dx$, $\int \sin nx\sin mx\,dx$ or $\int \cos nx\cos mx\,dx$ it is useful to use the product-to-sum formulas.

Many other trigonometric integrals can be calculated by different methods. Among the trigonometric integrals that cannot be expressed with the use of elementary functions is the integral $\int \sin(x^2)\,dx$.

Examples:

$$\int \sin^5 x\cos^2 x\,dx = \int (\sin^2 x)^2\cos^2 x\sin x\,dx = \int (1 - \cos^2 x)^2\cos^2 x\sin x\,dx$$
$$= -\int (1 - u^2)^2u^2\,du = -\frac{u^3}{3} + \frac{2u^5}{5} - \frac{u^7}{7} + C = -\frac{\cos^3 x}{3} + \frac{2\cos^5 x}{5} - \frac{\cos^7 x}{7} + C.$$

$$\int \tan^6 x\sec^4 x\,dx = \int \tan^6 x(1 + \tan^2 x)\sec^2 x\,dx = \int u^6(1 + u^2)\,du = \frac{u^7}{7} + \frac{u^9}{9} + C = \frac{\tan^7 x}{7} + \frac{\tan^9 x}{9} + C.$$

$$\int \sin 7x\cos 4x\,dx = \frac{1}{2}\left[\int \sin 3x\,dx + \int \sin 11x\,dx\right] = -\frac{1}{6}\cos 3x - \frac{1}{22}\cos 11x + C.$$

trigonometric series A finite or infinite series of the form

$$c + \sum_{n=1}^{\infty}(a_n\cos n\theta + b_n\sin n\theta).$$

In the case when the coefficients $c$, $a_n$, $b_n$ have special relations to some function $f(x)$, the series becomes a Fourier series. See the corresponding entry.

trigonometric substitutions See integration by trigonometric substitution.

trinomial A polynomial that has three terms. A trinomial might have one or more variables. Examples: $2x^2 - 5x + 6$, $3x^3y^2 + 4x^2y - 6xy^4$.

triple product Or scalar triple product of three vectors $u$, $v$, $w$ in three dimensional space is defined as $u \cdot (v \times w)$ and can be calculated by the formula

$$u \cdot (v \times w) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.$$

trivial solution An obvious, very simple solution of an equation or system of equations. As a rule, the trivial solution is a zero solution. Any homogeneous linear system has a trivial solution. Many differential equations have a trivial (identically zero) solution along with non-zero solutions.

twisted cubic A space curve given by the vector-function $r(t) = (t, t^2, t^3)$.

U

unbiased statistic A sample statistic whose sampling distribution has a mean value equal to the value of the population parameter that it is estimating. The sample mean, proportion and variance are unbiased statistics but the sample standard deviation is not.

unbounded function A function that is not bounded from above, below, or both. The functions $f(x) = \ln x$, $g(x) = x^2$ are unbounded but the function $h(x) = \sin x$ is bounded.

unbounded region A plane region is called bounded if it can be enclosed in some circle of finite radius centered at the origin. If there is no such circle, then the region is unbounded. Any region given by an inequality $y > ax + b$ is an example of an unbounded region. Unbounded regions in three dimensional space are defined similarly.

undefined function A function that is not or cannot be defined for certain values of the variable. For example, the function $f(x) = 1/x$ is undefined for $x = 0$ because division by zero is not defined.

undercoverage (of population) In statistics, when sampling, sometimes certain parts of the population are covered less than the others. For example, when conducting telephone polls, people without phones are not covered at all.

underdetermined linear system A system of linear equations that has more unknowns than equations. If this system is consistent, then it has infinitely many solutions. See also overdetermined linear systems.

undetermined coefficients Also called the method of undetermined coefficients for certain types of equations.
In using this method we either know or guess the form of the solution but do not know the value of the constant(s) and are trying to determine that value. One of the versions of this method is described in the definition of partial fraction decomposition. The second application is to the solution of non-homogeneous linear differential equations, working mainly for equations with constant coefficients.

Suppose we are trying to find a particular solution of the equation

$$y'' - 3y' - 4y = 3e^{2t}.$$

We guess that the solution should have the form $Y(t) = Ae^{2t}$, where $A$ is a coefficient yet to be determined. After we plug this supposed solution into the equation we get $(4A - 6A - 4A)e^{2t} = 3e^{2t}$, so $A = -1/2$ and the coefficient is found. A similar method also works for systems of linear equations where, of course, we have to solve systems of linear algebraic equations with respect to a number of undetermined coefficients.

unit step function Similar to the Heaviside function, defined by the formula

$$u_c(t) = \begin{cases} 1 & \text{if } t \ge c \\ 0 & \text{if } t < c \end{cases}$$

for $c \ge 0$. The Laplace transform of this function is $L[u_c(t)] = e^{-cs}/s$, $s > 0$.

unit circle A circle with radius equal to one. In the Cartesian coordinate system such a circle can be given by the equation $(x - a)^2 + (y - b)^2 = 1$, where $(a, b)$ are the coordinates of the center.

[Figure: the unit circle with center at the origin.]
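The coefficient $A = -1/2$ found in the undetermined coefficients entry above can be reproduced with a short computation. The sketch below is an illustration only: the helper name `exp_forcing_coefficient` and its parameters are hypothetical, and it covers just the case $y'' + by' + cy = ke^{rt}$ with the guess $Y = Ae^{rt}$, where substitution gives $(r^2 + br + c)A = k$.

```python
from fractions import Fraction

def exp_forcing_coefficient(b, c, k, r):
    """For y'' + b y' + c y = k e^{rt}, substitute Y = A e^{rt}:
    (r^2 + b r + c) A = k. The guess fails if r is a characteristic root."""
    denom = Fraction(r) ** 2 + Fraction(b) * r + Fraction(c)
    if denom == 0:
        raise ValueError("r is a characteristic root; the guess A e^{rt} does not work")
    return Fraction(k) / denom

# The equation from the entry: y'' - 3y' - 4y = 3 e^{2t}
A = exp_forcing_coefficient(b=-3, c=-4, k=3, r=2)
print(A)  # -1/2
```

Exact fractions are used so that the result matches the hand computation without rounding. When the exponent coincides with a root of the characteristic polynomial (here $r = 4$ or $r = -1$), the simple guess has to be modified, which the sketch only reports as an error.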

uniform distribution A continuous probability distribution where the density curve is a function that is constant on some interval $[a, b]$ and is zero everywhere else. Because of the requirement that the area under the curve should be 1, the height of the line is $1/(b - a)$.

union of sets If $A$ and $B$ are two sets of some objects, then their union $A \cup B$ is defined to be the set of all elements that belong to $A$ or $B$ or both of them. Example: If $A = \{1, 3, 5, 7\}$ and $B = \{2, 4, 6, 8\}$ then $A \cup B = \{1, 2, 3, 4, 5, 6, 7, 8\}$. In the case when the two sets have common elements, these elements are counted only once in the union. Hence, if $A = \{1, 2, 3\}$ and $B = \{2, 3, 4\}$, then $A \cup B = \{1, 2, 3, 4\}$.

uniqueness theorems See existence and uniqueness theorems.

unit sphere A sphere with radius one. In the Cartesian coordinate system the equation of the unit sphere centered at the origin is $x^2 + y^2 + z^2 = 1$. The notion of the sphere extends also to higher dimensional Euclidean spaces.

unit vector A vector that has norm (length, magnitude) equal to 1. If $v$ is an arbitrary non-zero vector in the vector space $V$, then the vector $u = v/\|v\|$ has unit norm and hence is a unit vector. See also normal vector, tangent vector.

unitary diagonalization Suppose $A$ is some square matrix and $P$ is a unitary matrix of the same size. Then $A$ is said to be unitary diagonalizable if the matrix $P^{-1}AP$ is diagonal.

unitary matrix A matrix $A$ with complex entries is unitary if its inverse $A^{-1}$ is equal to its conjugate transpose $A^*$. The matrix

$$\frac{1}{2}\begin{pmatrix} 1 + i & 1 + i \\ 1 - i & -1 + i \end{pmatrix}$$

is a unitary matrix.

unlike fractions Two (or more) fractions that have different denominators. The fractions $\frac{3}{4}$ and $\frac{4}{7}$ are unlike fractions. On the other hand, the fractions $\frac{3}{8}$ and $\frac{6}{16}$ are not unlike fractions because the second one can be reduced to simplest terms as $\frac{3}{8}$, with the same denominator.

unlike terms Two terms in an algebraic expression that do not have identical variable parts (factors). In the expression $2xy^2 + 2xy$ the two terms are unlike terms, because $xy^2$ and $xy$ are not identical. The constant factors do not play a role in deciding if the terms are like terms or unlike terms.

upper triangular matrix A square matrix where all the entries below the main diagonal are zeros. Example:

$$\begin{pmatrix} 2 & 4 & 2 \\ 0 & 1 & -3 \\ 0 & 0 & 3 \end{pmatrix}.$$

The determinant of such a matrix is just the product of all diagonal elements. See also lower triangular matrix.

V

value of a function The numeric expression of a given function at a given point. Examples: (1) The value of the function $f(x) = 3x^3 - 2x^2 + 5x - 1$ at the point $x = 2$ is determined by substituting the value $x = 2$ in the expression of the function: $f(2) = 3 \cdot 2^3 - 2 \cdot 2^2 + 5 \cdot 2 - 1 = 25$. (2) The value of the function $f(x) = \sin x$ at the point $x = \pi/6$ is $1/2$. To calculate the values of more complicated functions, approximate methods and/or calculators are needed.

variable One of the basic mathematical objects along with constants, functions, etc. Variables are quantities that do not have a fixed value (as constants do) but rather have the ability to change. For example, time is a variable quantity because it does not stand still. Variables are denoted by symbols such as $x$, $y$, $z$, $t$, $u$, and others. In certain situations there might be two or more variable quantities, and if they are related to each other by some rule or formula, we call that a function or relation. Accordingly, the variables in such relations would be independent or dependent. See also the entries explanatory, quantitative, qualitative, response, random, lurking variables.

variance For a set of numeric values $x_1, x_2, x_3, \dots, x_n$ with the mean $\mu$, the variance is defined by the formula

$$\sigma^2 = \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n - 1}.$$

The square root of this quantity is called the standard deviation.

variation Specific types of relationships between two or more variables.

(1) Direct variation. In this case one of the variables is directly proportional to another variable in some form. For example, if $y = kx$ ($k$ is some numeric constant) we say that $y$ varies directly with $x$. If $y = kx^2$ then $y$ varies directly with $x^2$.

(2) Inverse variation. In this case one variable is inversely proportional to some form of the other variable. If $y = k/x$ we say that $y$ varies inversely with $x$. If $y = k/\sqrt{x}$, then $y$ varies inversely with $\sqrt{x}$, and so on.

(3) Joint variation. In this case one variable is directly or inversely proportional to some forms of two or more other variables. Example: $z = kx^2/y^3$.

variation of parameters A method of finding a particular solution for non-homogeneous linear differential equations. This method is a development of the method of undetermined coefficients and allows us to find particular solutions for much wider classes of equations.

Suppose we want to find the general solution of the equation

$$y'' + p(t)y' + q(t)y = g(t). \qquad (1)$$

First of all, the general solution of this equation is the sum of the general solution of the corresponding homogeneous equation

$$y'' + p(t)y' + q(t)y = 0 \qquad (2)$$

and any particular solution of the equation (1). Assume also that we know the general solution of the homogeneous equation (2) and it is

$$y_h(t) = c_1y_1(t) + c_2y_2(t),$$

where $y_1$, $y_2$ are two linearly independent solutions. The idea behind the method of variation of parameters is to replace the constants $c_1$, $c_2$ by some yet to be determined functions $u_1(t)$, $u_2(t)$ and try to find a particular solution in the form

$$Y(t) = u_1(t)y_1(t) + u_2(t)y_2(t).$$

Theorem. A particular solution of the equation (1) can be written in the form

$$Y(t) = -y_1(t)\int_{t_0}^{t}\frac{y_2(s)g(s)}{W(y_1, y_2)(s)}\,ds + y_2(t)\int_{t_0}^{t}\frac{y_1(s)g(s)}{W(y_1, y_2)(s)}\,ds,$$

where $W(y_1, y_2)$ is the Wronskian of $y_1$ and $y_2$.

Example: To solve the equation

$$y'' + 4y = 3\csc t$$

we notice that the corresponding homogeneous equation $y'' + 4y = 0$ has the general solution $y(t) = c_1\cos 2t + c_2\sin 2t$ and look for a particular solution of the non-homogeneous equation in the form $Y(t) = u_1(t)\cos 2t + u_2(t)\sin 2t$. Now, the Wronskian of $\cos 2t$ and $\sin 2t$ is $W = 2$ and by the Theorem above, $u_1(t) = -3\sin t$ and $u_2(t) = \frac{3}{2}\ln|\csc t - \cot t| + 3\cos t$. Plugging these functions into the previous formula we find a particular solution of the non-homogeneous equation:

$$Y(t) = -3\sin t\cos 2t + 3\cos t\sin 2t + \frac{3}{2}\sin 2t\ln|\csc t - \cot t|.$$

A similar method works also for equations of higher order and for systems of non-homogeneous equations.

vector One of the basic mathematical objects, vectors can be viewed as quantities that have both some numeric value and a direction. Many physical notions (such as force) have these properties, and vectors are abstract mathematical representations of these facts. As with the other mathematical objects (numbers, variables, functions, matrices), many operations can be performed with vectors too. As objects, vectors may exist in the plane, in three dimensional space, or more generally, in any vector space.

1) Addition of vectors in the plane is defined by the parallelogram rule (see addition and subtraction of vectors). More generally, if $v$ and $u$ are two vectors in some vector space $V$, then

$$v + u = (v_1 + u_1, v_2 + u_2, \dots, v_n + u_n),$$

where $v = (v_1, v_2, \dots, v_n)$, $u = (u_1, u_2, \dots, u_n)$.

2) The scalar product of a number $c$ and the vector $v$ is defined as $cv = (cv_1, cv_2, \dots, cv_n)$.

3) It is impossible to define a product of two vectors in any space $V$ in such a manner that the result is another vector from the same space and at the same time satisfies all the expected properties of multiplication (commutative, associative, distributive). Instead, the inner product (or dot product) of two vectors can be defined, which is not a vector but a number (real or complex, depending on the space