Applied Mathematical Sciences, Vol. 10, 2016, no. 23, 1111-1130
HIKARI Ltd, www.m-hikari.com
http://dx.doi.org/10.12988/ams.2016.512735

Geometric Constructions Relating Various Lines of Regression

Brian J. McCartin†

Applied Mathematics, Kettering University 1700 University Avenue, Flint, MI 48504-6214 USA

Copyright © 2016 Brian J. McCartin. This article is distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

As a direct consequence of the Galton-Pearson-McCartin Theorem [10, Theorem 2], the concentration ellipse provides a unifying thread for the Euclidean construction of various lines of regression. These include lines of coordinate regression [7], orthogonal regression [13], λ-regression [8] and (λ, µ)-regression [9], whose geometric constructions are afforded a unified treatment in the present paper.

Mathematics Subject Classification: 51M15, 51N20, 62J05

Keywords: geometric construction, linear regression, concentration ellipse, Galton-Pearson-McCartin Theorem

1 Introduction

In fitting a linear relationship [6] to a set of (x, y) data (Figure 1), it is often assumed that one ('independent') variable, say x, is known exactly while the other ('dependent') variable, say y, is subject to error. The line L (Figure 2) is then chosen to minimize the total square vertical deviation $\Sigma d_y^2$. This is known as the line of regression of y-on-x. Reversing the roles of x and y minimizes instead the total square horizontal deviation $\Sigma d_x^2$, thereby yielding the line of regression of x-on-y. Either method may be termed coordinate regression.

†Deceased January 29, 2016

This procedure may be generalized to the situation where both variables are subject to independent errors with zero means, albeit with equal variances. In this case, L is chosen to minimize the total square orthogonal deviation $\Sigma d_\perp^2$ (Figure 2). This is referred to as orthogonal regression.

Figure 1: Linear Fitting of Discrete Data

This approach may be further extended to embrace the case where the error variances are unequal. Defining $\sigma_u^2$ = variance of the x-error, $\sigma_v^2$ = variance of the y-error and $\lambda = \sigma_v^2/\sigma_u^2$, a weighted least squares procedure [1, 16] is employed whereby $\frac{1+m^2}{\lambda+m^2}\cdot\Sigma d_\perp^2$ is minimized in order to determine L. This is referred to as λ-regression. Finally, if the errors are correlated with $p_{uv}$ = covariance of the errors and $\mu = p_{uv}/\sigma_u^2$, then $\frac{1+m^2}{\lambda-2\mu m+m^2}\cdot\Sigma d_\perp^2$ is minimized in order to determine L. This is referred to as (λ, µ)-regression.

In the next section, the concept of oblique linear least squares approximation is introduced (Figure 3). In this mode of approximation, one first selects a direction specified by the slope s. One then obliquely projects each data point in this direction onto the line L. Finally, L is chosen so as to minimize the total square oblique deviation $\Sigma d_i^2$. It is known that this permits a unified treatment of all of the previously described flavors of linear regression [10].

Galton [7] first provided a geometric interpretation of coordinate regression in terms of the best-fitting concentration ellipse, which has the same first moments and second moments about the centroid as the experimental data [3]. Pearson [13] then extended this geometric characterization to orthogonal regression. McCartin [8] further extended this geometric characterization to

Figure 2: Coordinate and Orthogonal Regression

λ-regression. Finally, McCartin [9] succeeded in extending this geometric characterization to the general case of (λ, µ)-regression. In a similar vein, a geometric characterization of oblique linear least squares approximation [10] further generalizes these Galton-Pearson-McCartin geometric characterizations of linear regression. This result is known as the Galton-Pearson-McCartin Theorem and is reviewed below (Theorem 1).

The primary purpose of the present paper is to utilize the Galton-Pearson-McCartin Theorem to provide a unified approach to the Euclidean construction of these various lines of regression. As such, the concentration ellipse will serve as the adhesive which binds together these varied geometric constructions. However, before this can be accomplished, the basics of oblique least squares, linear regression and the concentration ellipse must be reviewed.

2 Oblique Least Squares

Consider the experimentally 'observed' data $\{(x_i, y_i)\}_{i=1}^{n}$ where $x_i = X_i + u_i$ and $y_i = Y_i + v_i$. Here $(X_i, Y_i)$ denote theoretically exact values with corresponding random errors $(u_i, v_i)$. It is assumed that $E(u_i) = 0 = E(v_i)$, that successive observations are independent, and that $\mathrm{Var}(u_i) = \sigma_u^2$, $\mathrm{Var}(v_i) = \sigma_v^2$, $\mathrm{Cov}(u_i, v_i) = p_{uv}$ irrespective of i.

Next, define the statistics of the sample data corresponding to the above population statistics. The mean values are given by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i; \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad (1)$$

Figure 3: Oblique Projection

the sample variances are given by
$$\sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2; \qquad \sigma_y^2 = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad (2)$$
the sample covariance is given by
$$p_{xy} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})\cdot(y_i - \bar{y}), \qquad (3)$$
and (assuming that $\sigma_x \cdot \sigma_y \neq 0$) the sample correlation coefficient is given by

$$r_{xy} = \frac{p_{xy}}{\sigma_x \cdot \sigma_y}. \qquad (4)$$

Note that if $\sigma_x^2 = 0$ then the data lie along the vertical line $x = \bar{x}$ while if $\sigma_y^2 = 0$ then they lie along the horizontal line $y = \bar{y}$. Hence, we assume without loss of generality that $\sigma_x^2 \cdot \sigma_y^2 \neq 0$ so that $r_{xy}$ is always well defined. Furthermore, by the Cauchy-Buniakovsky-Schwarz inequality,

$$p_{xy}^2 \leq \sigma_x^2 \cdot \sigma_y^2 \quad (\text{i.e. } -1 \leq r_{xy} \leq 1) \qquad (5)$$
with equality if and only if $(y_i - \bar{y}) \propto (x_i - \bar{x})$, in which case the data lie on the line
$$y - \bar{y} = \frac{p_{xy}}{\sigma_x^2}(x - \bar{x}) = \frac{\sigma_y^2}{p_{xy}}(x - \bar{x}), \qquad (6)$$
since $p_{xy} \neq 0$ in this instance. Thus, we may also restrict $-1 < r_{xy} < 1$.

Turning our attention to Figure 3, the oblique linear least squares approximation, L, corresponding to the specified slope s may be expressed as
$$L: y - \bar{y} = m(x - \bar{x}), \qquad (7)$$
since it is straightforward to show that it passes through the centroid of the data, $(\bar{x}, \bar{y})$ [10]. L is to be selected so as to minimize the total square oblique deviation
$$S := \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} \left[(x_i - x_i^*)^2 + (y_i - y_i^*)^2\right], \qquad (8)$$
where $(x_i^*, y_i^*)$ denotes the oblique projection of $(x_i, y_i)$ onto L (Figure 3). Furthermore, this optimization problem may be reduced [10] to choosing m to minimize

$$S(m) = \frac{1+s^2}{(m-s)^2}\cdot\sum_{i=1}^{n}\left[m(x_i - \bar{x}) - (y_i - \bar{y})\right]^2, \qquad (9)$$
thereby producing the extremizing slope [10]

$$m = \frac{\sigma_y^2 - s\cdot p_{xy}}{p_{xy} - s\cdot\sigma_x^2}. \qquad (10)$$
Equations (7) and (10) together comprise a complete solution to the problem of oblique linear least squares approximation.
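As a computational companion to Equations (1)-(4) and (10), the following minimal Python sketch (an added illustration, not part of the original paper; the function name and the NumPy dependency are assumptions) computes the oblique least squares slope directly from the data:

```python
import numpy as np

def oblique_slope(x, y, s):
    """Extremizing slope m of Equation (10) for projection direction of
    slope s; s = np.inf recovers the limiting case s -> infinity."""
    xbar, ybar = x.mean(), y.mean()             # Equation (1)
    sx2 = np.mean((x - xbar) ** 2)              # sigma_x^2, Equation (2)
    sy2 = np.mean((y - ybar) ** 2)
    pxy = np.mean((x - xbar) * (y - ybar))      # p_xy, Equation (3)
    if np.isinf(s):                             # limit s -> infinity
        return pxy / sx2
    return (sy2 - s * pxy) / (pxy - s * sx2)    # Equation (10)

# Example: s = 0 and s -> infinity reproduce the coordinate regression
# slopes of Equations (11) and (12) of the next section.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(scale=0.3, size=200)
print(oblique_slope(x, y, 0.0), oblique_slope(x, y, np.inf))
```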

3 Linear Regression

Some of the implications of oblique linear least squares approximation for linear regression are next considered, beginning with the case of coordinate regression. Setting s = 0 in Equation (10) produces regression of x on y:

$$m_x^+ := \frac{\sigma_y^2}{p_{xy}}, \qquad m_x^- := 0. \qquad (11)$$

Likewise, letting s → ∞ in Equation (10) produces regression of y on x:

$$m_y^+ := \frac{p_{xy}}{\sigma_x^2}, \qquad m_y^- := \infty. \qquad (12)$$
Unsurprisingly, we see that the coordinate regression lines [7] are special cases of oblique regression.

The more substantial case of orthogonal regression is now considered. Setting $s = -\frac{1}{m}$ in Equation (10) produces

$$p_{xy}\cdot m^2 - (\sigma_y^2 - \sigma_x^2)\cdot m - p_{xy} = 0, \qquad (13)$$
whose roots provide both the best orthogonal regression line
$$m_\perp^+ = \frac{(\sigma_y^2 - \sigma_x^2) + \sqrt{(\sigma_y^2 - \sigma_x^2)^2 + 4p_{xy}^2}}{2p_{xy}}, \qquad (14)$$
as well as the worst orthogonal regression line
$$m_\perp^- = \frac{(\sigma_y^2 - \sigma_x^2) - \sqrt{(\sigma_y^2 - \sigma_x^2)^2 + 4p_{xy}^2}}{2p_{xy}}. \qquad (15)$$
Thus, orthogonal regression [13] is also a special case of oblique regression.

Turning now to the case of λ-regression, set $s = -\frac{\lambda}{m}$ in Equation (10):

$$p_{xy}\cdot m^2 - (\sigma_y^2 - \lambda\sigma_x^2)\cdot m - \lambda p_{xy} = 0, \qquad (16)$$
whose roots provide both the best λ-regression line
$$m_\lambda^+ = \frac{(\sigma_y^2 - \lambda\sigma_x^2) + \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4\lambda p_{xy}^2}}{2p_{xy}}, \qquad (17)$$
as well as the worst λ-regression line
$$m_\lambda^- = \frac{(\sigma_y^2 - \lambda\sigma_x^2) - \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4\lambda p_{xy}^2}}{2p_{xy}}. \qquad (18)$$
Thus, λ-regression [8] is also a special case of oblique regression.

Proceeding to the case of (λ, µ)-regression, set $s = \frac{\mu m - \lambda}{m - \mu}$ in Equation (10):

$$(p_{xy} - \mu\sigma_x^2)\cdot m^2 - (\sigma_y^2 - \lambda\sigma_x^2)\cdot m - (\lambda p_{xy} - \mu\sigma_y^2) = 0, \qquad (19)$$
whose roots provide both the best (λ, µ)-regression line
$$m_{\lambda,\mu}^+ = \frac{(\sigma_y^2 - \lambda\sigma_x^2) + \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4(p_{xy} - \mu\sigma_x^2)(\lambda p_{xy} - \mu\sigma_y^2)}}{2(p_{xy} - \mu\sigma_x^2)}, \qquad (20)$$
as well as the worst (λ, µ)-regression line
$$m_{\lambda,\mu}^- = \frac{(\sigma_y^2 - \lambda\sigma_x^2) - \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4(p_{xy} - \mu\sigma_x^2)(\lambda p_{xy} - \mu\sigma_y^2)}}{2(p_{xy} - \mu\sigma_x^2)}. \qquad (21)$$
Thus, (λ, µ)-regression [9] is also a special case of oblique regression.
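Since Equations (11)-(18) are special cases of Equations (20)-(21), a single routine suffices to compute all of these slopes. The following sketch (an added illustration; the function name is an assumption) implements the general formulas:

```python
import numpy as np

def lambda_mu_slopes(sx2, sy2, pxy, lam, mu=0.0):
    """Best and worst (lambda, mu)-regression slopes, Equations (20)-(21).
    mu = 0 gives lambda-regression (17)-(18); lam = 1, mu = 0 gives
    orthogonal regression (14)-(15); lam = 0 gives regression of x-on-y."""
    A = pxy - mu * sx2                 # leading coefficient of Equation (19)
    B = sy2 - lam * sx2
    disc = np.sqrt(B ** 2 + 4.0 * A * (lam * pxy - mu * sy2))
    return (B + disc) / (2.0 * A), (B - disc) / (2.0 * A)

# Example: coordinate regression of x-on-y, Equation (11), via lam = 0.
sx2, sy2, pxy = 2.0, 1.0, 0.6
print(lambda_mu_slopes(sx2, sy2, pxy, lam=0.0))   # (sy2/pxy, 0.0)
```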

4 Concentration Ellipse

As established in the previous section, all of the garden varieties of linear regression are subsumed under the umbrella of oblique linear least squares approximation. The generalization of the Galton-Pearson-McCartin geometric characterizations of linear regression to oblique linear least squares approximation is now considered.

Figure 4: Concentration Ellipse

Define the concentration ellipse, E, (Figure 4) via

$$\frac{(x-\bar{x})^2}{\sigma_x^2} - 2\cdot\frac{p_{xy}}{\sigma_x\sigma_y}\cdot\frac{(x-\bar{x})}{\sigma_x}\cdot\frac{(y-\bar{y})}{\sigma_y} + \frac{(y-\bar{y})^2}{\sigma_y^2} = 4\left(1 - \frac{p_{xy}^2}{\sigma_x^2\sigma_y^2}\right), \qquad (22)$$
which has the same centroid and second moments about that centroid as does the data [3, pp. 283-285] (see [10] for proof). In this sense, it is the ellipse which is most representative of the data points without any a priori statistical assumptions concerning their origin. The concentration ellipse is homothetic to the inertia ellipse with ratio of similitude $4n(\sigma_x^2\sigma_y^2 - p_{xy}^2)$ [8].

In order to achieve our characterization, the concept of conjugate diameters of an ellipse is required [14, p. 146]. A diameter of an ellipse is a chord passing through the center connecting two antipodal points on its periphery. The conjugate diameter to a given diameter is that diameter which is parallel to the tangent to the ellipse at either peripheral point of the given diameter (Figure 5). If the ellipse is described by

$$ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0 \quad (h^2 < ab), \qquad (23)$$
then, according to [14, p. 146], the slope of the conjugate diameter is

$$s = -\frac{a + h\cdot m}{h + b\cdot m}, \qquad (24)$$


Figure 5: Conjugate Semidiameters

where m is the slope of the given diameter. Consequently,
$$m = -\frac{a + h\cdot s}{h + b\cdot s}, \qquad (25)$$
thus establishing that conjugacy is a symmetric relation. As a result, these conjugacy conditions may be rewritten symmetrically as

$$b\,(m\cdot s) + h\,(m + s) + a = 0. \qquad (26)$$

For the concentration ellipse, Equation (22), take

$$a = \sigma_y^2, \qquad h = -p_{xy}, \qquad b = \sigma_x^2, \qquad (27)$$
so that
$$s = -\frac{\sigma_y^2 - p_{xy}\cdot m}{-p_{xy} + \sigma_x^2\cdot m}; \qquad m = -\frac{\sigma_y^2 - p_{xy}\cdot s}{-p_{xy} + \sigma_x^2\cdot s}, \qquad (28)$$
which implies that s and m satisfy the symmetric conjugacy condition

$$\sigma_x^2\cdot(m\cdot s) - p_{xy}\cdot(m + s) + \sigma_y^2 = 0. \qquad (29)$$
Direct comparison of Equations (28) and (10) demonstrates that the best and worst slopes of oblique regression satisfy Equation (29) and are thus conjugate to one another. Hence, the following pivotal result has been established [10]:

Theorem 1 (Galton-Pearson-McCartin) The best and worst lines of oblique regression are conjugate diameters of the concentration ellipse whose slopes satisfy the conjugacy relation Equation (29).

Note that $s = \frac{\mu m - \lambda}{m - \mu}$ may be rearranged to reproduce the geometric characterization for (λ, µ)-regression of McCartin [9]: $(m - \mu)\cdot(s - \mu) = \mu^2 - \lambda$. Furthermore, subsequently setting $\mu = 0$ reproduces the geometric characterization for λ-regression of McCartin [8]: $m\cdot s = -\lambda$. Now, setting $\lambda = 1$ reproduces Pearson's geometric characterization for orthogonal regression [13]: it is the major axis of the concentration ellipse. Finally, setting $\lambda = 0, \infty$ reproduces Galton's geometric characterization for coordinate regression [7]: they are the lines displayed in Figure 6 connecting the points of horizontal/vertical tangencies, respectively, to the centroid.
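Theorem 1 is easy to spot-check numerically. The sketch below (an added illustration; names are assumptions) confirms that the best and worst λ-regression slopes of Equations (17)-(18) satisfy the conjugacy relation, Equation (29):

```python
import numpy as np

def conjugacy_residual(sx2, sy2, pxy, m, s):
    """Left-hand side of Equation (29); zero (up to rounding) exactly
    when the directions m and s are conjugate in the concentration ellipse."""
    return sx2 * (m * s) - pxy * (m + s) + sy2

# Best/worst lambda-regression slopes, Equations (17)-(18).
sx2, sy2, pxy, lam = 2.0, 1.0, 0.6, 0.5
B = sy2 - lam * sx2
disc = np.sqrt(B ** 2 + 4.0 * lam * pxy ** 2)
m_best, m_worst = (B + disc) / (2 * pxy), (B - disc) / (2 * pxy)
print(conjugacy_residual(sx2, sy2, pxy, m_best, m_worst))   # prints ~0.0
```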

5 Geometric Constructions

Previous work on constructing an ellipse from its moments [12] will now be exploited in order to construct various lines of regression. This is possible because, as has been demonstrated above, all of these regression lines are associated with the concentration ellipse.

Once the moments $\langle\bar{x}, \bar{y}, \sigma_x^2, p_{xy}, \sigma_y^2\rangle$ have been computed from the discrete data set $\{(x_i, y_i)\}_{i=1}^{n}$, the equation of the concentration ellipse, E, is given by the positive definite quadratic form

" 2 #−1 " # h i σx pxy x − x¯ x − x¯ y − y¯ 2 = 4, (30) pxy σy y − y¯ and the elliptical area may be expressed as

$$A_E = 4\pi\sqrt{\sigma_x^2\sigma_y^2 - p_{xy}^2}. \qquad (31)$$
The slopes of the principal axes of the concentration ellipse are the roots of the quadratic
$$m^2 + \frac{\sigma_x^2 - \sigma_y^2}{p_{xy}}\cdot m - 1 = 0, \qquad (32)$$
so that
$$m_\perp^+ = \frac{(\sigma_y^2 - \sigma_x^2) + \sqrt{(\sigma_y^2 - \sigma_x^2)^2 + 4p_{xy}^2}}{2p_{xy}}, \qquad (33)$$
$$m_\perp^- = \frac{(\sigma_y^2 - \sigma_x^2) - \sqrt{(\sigma_y^2 - \sigma_x^2)^2 + 4p_{xy}^2}}{2p_{xy}}, \qquad (34)$$
since $m_\perp^+$ has the same sign as $p_{xy}$ [8].

The corresponding squared magnitudes of the principal semiaxes [14, p. 158] are the roots of the quadratic

$$(\Sigma_\perp^2)^2 - 4(\sigma_x^2 + \sigma_y^2)\cdot\Sigma_\perp^2 + 16(\sigma_x^2\sigma_y^2 - p_{xy}^2) = 0, \qquad (35)$$
so that
$$(\Sigma_\perp^+)^2 = 2\left[(\sigma_x^2 + \sigma_y^2) + \sqrt{(\sigma_x^2 - \sigma_y^2)^2 + 4p_{xy}^2}\right], \qquad (36)$$

$$(\Sigma_\perp^-)^2 = 2\left[(\sigma_x^2 + \sigma_y^2) - \sqrt{(\sigma_x^2 - \sigma_y^2)^2 + 4p_{xy}^2}\right]. \qquad (37)$$
Equation (22) may be expanded thereby yielding

$$\sigma_y^2\cdot(x - \bar{x})^2 - 2p_{xy}\cdot(x - \bar{x})(y - \bar{y}) + \sigma_x^2\cdot(y - \bar{y})^2 = 4(\sigma_x^2\sigma_y^2 - p_{xy}^2) \qquad (38)$$
as the equation of the concentration ellipse. Setting $y = \bar{y}$ yields the corresponding horizontal points of intersection

$$\left(\bar{x} \pm 2\sqrt{\sigma_x^2 - p_{xy}^2/\sigma_y^2},\ \bar{y}\right), \qquad (39)$$
while setting $x = \bar{x}$ yields the corresponding vertical points of intersection

$$\left(\bar{x},\ \bar{y} \pm 2\sqrt{\sigma_y^2 - p_{xy}^2/\sigma_x^2}\right). \qquad (40)$$

The points of vertical and horizontal tangency are [10], respectively,

$$\left(\bar{x} \pm 2\sigma_x,\ \bar{y} \pm 2\frac{p_{xy}}{\sigma_x}\right); \qquad \left(\bar{x} \pm 2\frac{p_{xy}}{\sigma_y},\ \bar{y} \pm 2\sigma_y\right). \qquad (41)$$

With reference to Figure 5, recall that two semidiameters are conjugate to one another if they are parallel to each other's tangents [11]. For this to occur, it is necessary and sufficient that their slopes m and s satisfy the symmetric conjugacy condition, Equation (29). Thus, the semidiameter to a point of vertical/horizontal tangency is conjugate to the semidiameter to a point of vertical/horizontal intersection, respectively. (See Figure 6.)

In the succeeding subsections, straightedge-compass constructions of the various lines of regression, given the moments $\langle\bar{x}, \bar{y}, \sigma_x^2, p_{xy}, \sigma_y^2\rangle$, will be developed. In this endeavor, certain primitive arithmetic/algebraic Euclidean constructions [2, pp. 121-122], as next described, will be required.

Proposition 1 (Primitive Constructions) Given two line segments of nonzero lengths a and b, one may construct line segments of length $a + b$, $a - b$, $a\cdot b$, $a/b$ and $\sqrt{a}$ using only straightedge and compass.
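As a numerical illustration of the last (and least obvious) of these primitives, the classical geometric-mean construction of $\sqrt{a}$ may be emulated in coordinates. The sketch below is an added aside, not part of the original paper:

```python
import math

def sqrt_by_semicircle(a):
    """Geometric-mean construction of sqrt(a): on a segment of length
    a + 1, erect a perpendicular at the point dividing it into pieces of
    lengths a and 1; its chord with the semicircle on the whole segment
    has height sqrt(a)."""
    r = (a + 1.0) / 2.0                          # radius of the semicircle
    foot = a                                     # foot of the perpendicular
    return math.sqrt(r ** 2 - (foot - r) ** 2)   # altitude of the right triangle

print(sqrt_by_semicircle(2.0), math.sqrt(2.0))   # both ~1.41421
```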


Figure 6: Coordinate Regression: y-on-x (Left); x-on-y (Right)

5.1 Coordinate Regression

As previously noted, the semidiameter to a point of vertical intersection is conjugate to the semidiameter to a point of vertical tangency. Thus, by Equations (40) and (41), the vectors

$$\left\langle 0,\ 2\sqrt{\sigma_y^2 - p_{xy}^2/\sigma_x^2}\right\rangle; \qquad \left\langle 2\sigma_x,\ 2\frac{p_{xy}}{\sigma_x}\right\rangle \qquad (42)$$
define a pair of conjugate semidiameters emanating from the centroid, $(\bar{x}, \bar{y})$. (See Figure 6, Left.)

Construction 1 (Regression of y-on-x) Given the first and second moments $\langle\bar{x}, \bar{y}, \sigma_x^2, p_{xy}, \sigma_y^2\rangle$, the best and worst lines of regression of y-on-x are given, respectively, by the pair of conjugate semidiameters

$$\langle x_y^+, y_y^+\rangle := \left\langle 2\sigma_x,\ 2\frac{p_{xy}}{\sigma_x}\right\rangle; \qquad \langle x_y^-, y_y^-\rangle := \left\langle 0,\ 2\sqrt{\sigma_y^2 - p_{xy}^2/\sigma_x^2}\right\rangle,$$
which may be constructed using Proposition 1.

The corresponding best and worst mean-squared vertical deviations are given, respectively, by [15, pp. 220-222]

$$\frac{1}{n}\cdot\left(S_y^+\right)^2 = \frac{4(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(x_y^+)^2} = \frac{\sigma_x^2\sigma_y^2 - p_{xy}^2}{\sigma_x^2}, \qquad \frac{1}{n}\cdot\left(S_y^-\right)^2 = \frac{4(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(x_y^-)^2} = \infty. \qquad (43)$$

Likewise, the semidiameter of a point of horizontal intersection is conjugate to the semidiameter to a point of horizontal tangency. Thus, by Equations (39) and (41), the vectors

$$\left\langle 2\sqrt{\sigma_x^2 - p_{xy}^2/\sigma_y^2},\ 0\right\rangle; \qquad \left\langle 2\frac{p_{xy}}{\sigma_y},\ 2\sigma_y\right\rangle \qquad (44)$$
define a pair of conjugate semidiameters emanating from the centroid, $(\bar{x}, \bar{y})$. (See Figure 6, Right.)

Construction 2 (Regression of x-on-y) Given the first and second moments $\langle\bar{x}, \bar{y}, \sigma_x^2, p_{xy}, \sigma_y^2\rangle$, the best and worst lines of regression of x-on-y are given, respectively, by the pair of conjugate semidiameters

$$\langle x_x^+, y_x^+\rangle := \left\langle 2\frac{p_{xy}}{\sigma_y},\ 2\sigma_y\right\rangle; \qquad \langle x_x^-, y_x^-\rangle := \left\langle 2\sqrt{\sigma_x^2 - p_{xy}^2/\sigma_y^2},\ 0\right\rangle,$$
which may be constructed using Proposition 1.

The corresponding best and worst mean-squared horizontal deviations are given, respectively, by [15, pp. 220-222]

$$\frac{1}{n}\cdot\left(S_x^+\right)^2 = \frac{4(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(y_x^+)^2} = \frac{\sigma_x^2\sigma_y^2 - p_{xy}^2}{\sigma_y^2}, \qquad \frac{1}{n}\cdot\left(S_x^-\right)^2 = \frac{4(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(y_x^-)^2} = \infty. \qquad (45)$$
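Constructions 1 and 2 translate directly into coordinates. The following sketch (an added illustration; function and variable names are assumptions) returns the four semidiameter endpoints, each anchored at the centroid:

```python
import numpy as np

def coordinate_semidiameters(xbar, ybar, sx2, sy2, pxy):
    """Endpoints of the conjugate semidiameters of Constructions 1 and 2,
    measured from the centroid (xbar, ybar)."""
    sx, sy = np.sqrt(sx2), np.sqrt(sy2)
    best_yx  = (xbar + 2 * sx,       ybar + 2 * pxy / sx)    # Construction 1, best
    worst_yx = (xbar,                ybar + 2 * np.sqrt(sy2 - pxy**2 / sx2))
    best_xy  = (xbar + 2 * pxy / sy, ybar + 2 * sy)          # Construction 2, best
    worst_xy = (xbar + 2 * np.sqrt(sx2 - pxy**2 / sy2), ybar)
    return best_yx, worst_yx, best_xy, worst_xy

# The best y-on-x endpoint has slope pxy/sx2 from the centroid, matching
# Equation (12); the best x-on-y endpoint has slope sy2/pxy, Equation (11).
pts = coordinate_semidiameters(0.0, 0.0, 2.0, 1.0, 0.6)
print(pts[0][1] / pts[0][0], pts[2][1] / pts[2][0])   # 0.3, 1.666...
```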

5.2 Orthogonal Regression

The following beautiful construction due to Carlyle [5, p. 99] and Lill [4, pp. 355-356] will be invoked in order to construct the remaining lines of regression.

Proposition 2 (Graphical Solution of a Quadratic Equation) Draw a circle having as diameter the line segment joining (0, 1) and (a, b). The abscissae of the points of intersection of this circle with the x-axis are the roots of the quadratic $x^2 - ax + b = 0$.
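The Carlyle-Lill circle is easily checked in coordinates. The sketch below (an added illustration, not from the original paper) intersects the circle with the x-axis and recovers the roots:

```python
import numpy as np

def carlyle_lill_roots(a, b):
    """Abscissae where the circle with diameter from (0, 1) to (a, b)
    meets the x-axis: the real roots of x^2 - a x + b = 0 (Proposition 2)."""
    cx, cy = a / 2.0, (1.0 + b) / 2.0           # center of the circle
    r2 = (a ** 2 + (b - 1.0) ** 2) / 4.0        # squared radius
    h2 = r2 - cy ** 2                           # squared half-chord on the x-axis
    if h2 < 0:
        return ()                               # circle misses the axis: no real roots
    h = np.sqrt(h2)
    return (cx - h, cx + h)

print(carlyle_lill_roots(5.0, 6.0))   # (2.0, 3.0), the roots of x^2 - 5x + 6
```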

A double application of Proposition 2, in concert with Equations (32) and (35), produces the following construction for the best and worst orthogonal regression lines.

Construction 3 (Orthogonal Regression) Given the first and second moments $\langle\bar{x}, \bar{y}, \sigma_x^2, p_{xy}, \sigma_y^2\rangle$, the best and worst lines of orthogonal regression are given by the pair of principal axes determined as follows:


Figure 7: Carlyle-Lill Construction: Directions (Left); Magnitudes (Right)

1. According to Equation (32), apply the Carlyle-Lill construction, Proposition 2, with $(a, b) = \left(\frac{\sigma_y^2 - \sigma_x^2}{p_{xy}},\ -1\right)$ to construct the directions of the principal axes. (See Figure 7, Left.)

2. According to Equation (35), apply the Carlyle-Lill construction, Proposition 2, with $(a, b) = \left(4(\sigma_x^2 + \sigma_y^2),\ 16(\sigma_x^2\sigma_y^2 - p_{xy}^2)\right)$ to construct the squared magnitudes of the principal semiaxes. (See Figure 7, Right.)

3. Apply the $\sqrt{\cdot}$-operator, Proposition 1, to these squared magnitudes in order to construct the magnitudes, $\{\Sigma_\perp^+, \Sigma_\perp^-\}$, of the principal semiaxes emanating from the centroid, $(\bar{x}, \bar{y})$. (See Figure 8.)

4. Denote by $\langle x_\perp^\pm, y_\perp^\pm\rangle$ the tips of these principal semiaxes lying on the concentration ellipse, E.
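Steps 1-3 amount to solving Equations (32) and (35), which the following numerical analogue makes explicit (an added sketch; names are assumptions, and the two Carlyle-Lill circles are replaced by the equivalent closed-form roots):

```python
import numpy as np

def orthogonal_construction(sx2, sy2, pxy):
    """Numerical analogue of Construction 3: principal-axis slopes from
    Equations (33)-(34), semiaxis magnitudes from Equations (36)-(37)."""
    R = np.sqrt((sy2 - sx2) ** 2 + 4 * pxy ** 2)    # shared discriminant
    m_best, m_worst = ((sy2 - sx2) + R) / (2 * pxy), ((sy2 - sx2) - R) / (2 * pxy)
    S_best, S_worst = np.sqrt(2 * ((sx2 + sy2) + R)), np.sqrt(2 * ((sx2 + sy2) - R))
    return (m_best, S_best), (m_worst, S_worst)

# The tip of each semiaxis satisfies the ellipse Equation (38); centroid at origin.
sx2, sy2, pxy = 2.0, 1.0, 0.6
(mb, Sb), _ = orthogonal_construction(sx2, sy2, pxy)
x, y = Sb / np.hypot(1, mb), Sb * mb / np.hypot(1, mb)
print(sy2 * x**2 - 2 * pxy * x * y + sx2 * y**2, 4 * (sx2 * sy2 - pxy**2))  # equal
```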

By Equation (36), the corresponding best mean-squared orthogonal deviation is given by

$$\frac{1}{n}\cdot\left(S_\perp^+\right)^2 = \frac{4(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(x_\perp^+)^2 + (y_\perp^+)^2} = \frac{2(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(\sigma_x^2 + \sigma_y^2) + \sqrt{(\sigma_x^2 - \sigma_y^2)^2 + 4p_{xy}^2}}, \qquad (46)$$
while, by Equation (37), the corresponding worst mean-squared orthogonal deviation is given by

$$\frac{1}{n}\cdot\left(S_\perp^-\right)^2 = \frac{4(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(x_\perp^-)^2 + (y_\perp^-)^2} = \frac{2(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(\sigma_x^2 + \sigma_y^2) - \sqrt{(\sigma_x^2 - \sigma_y^2)^2 + 4p_{xy}^2}}. \qquad (47)$$


Figure 8: Orthogonal Regression

Equations (46) and (47) may be combined and further simplified to read
$$\frac{1}{n}\cdot\left(S_\perp^\pm\right)^2 = \frac{1}{2}\cdot\left[(\sigma_x^2 + \sigma_y^2) \mp \sqrt{(\sigma_x^2 - \sigma_y^2)^2 + 4p_{xy}^2}\right]. \qquad (48)$$

6 λ-Regression

Turning to the case of λ-regression corresponding to the slope $s = -\frac{\lambda}{m}$ in Figure 3, it is shown in [8] that the x-expansion (λ > 1) / compression (λ < 1)
$$\xi := \sqrt{\lambda}\cdot x \;\Rightarrow\; \sigma_\xi^2 = \lambda\cdot\sigma_x^2, \quad p_{\xi y} = \sqrt{\lambda}\cdot p_{xy}, \quad \hat{m} = \frac{m}{\sqrt{\lambda}} \qquad (49)$$
transforms λ-regression into orthogonal regression so that the slopes of the best and worst λ-regression lines are, by Equation (32), roots of the quadratic
$$m^2 + \frac{\lambda\sigma_x^2 - \sigma_y^2}{p_{xy}}\cdot m - \lambda = 0. \qquad (50)$$
Specifically, by Equation (33), the slope of the best λ-regression line is
$$m_\lambda^+ = \frac{(\sigma_y^2 - \lambda\sigma_x^2) + \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4\lambda p_{xy}^2}}{2p_{xy}}, \qquad (51)$$
while, by Equation (34), the slope of the worst λ-regression line is
$$m_\lambda^- = \frac{(\sigma_y^2 - \lambda\sigma_x^2) - \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4\lambda p_{xy}^2}}{2p_{xy}}. \qquad (52)$$
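The reduction (49) can be exercised directly on data: stretch the abscissae by $\sqrt{\lambda}$, fit the orthogonal slope there, and undo the stretch. A minimal sketch follows (an added illustration; names are assumptions, with np.var giving the 1/n variances of Equation (2)):

```python
import numpy as np

def lambda_slope_via_stretch(x, y, lam):
    """Best lambda-regression slope via the reduction of Equation (49):
    orthogonal regression in (xi, y) with xi = sqrt(lam) * x, then
    m = sqrt(lam) * m_hat."""
    xi = np.sqrt(lam) * x
    sx2, sy2 = np.var(xi), np.var(y)               # 1/n variances, Equation (2)
    pxy = np.mean((xi - xi.mean()) * (y - y.mean()))
    R = np.sqrt((sy2 - sx2) ** 2 + 4 * pxy ** 2)
    m_hat = ((sy2 - sx2) + R) / (2 * pxy)          # Equation (33) in (xi, y)
    return np.sqrt(lam) * m_hat                    # invert m_hat = m / sqrt(lam)

# Agrees with the closed form, Equation (51), on any sample.
```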


Figure 9: λ-Regression

By Equation (35), the squared magnitudes of the corresponding conjugate semidiameters are the roots of the quadratic

$$(\Sigma_\lambda^2)^2 - 4(\lambda\sigma_x^2 + \sigma_y^2)\cdot\Sigma_\lambda^2 + 16\lambda(\sigma_x^2\sigma_y^2 - p_{xy}^2) = 0, \qquad (53)$$
so that
$$(\Sigma_\lambda^+)^2 = 2\left[(\lambda\sigma_x^2 + \sigma_y^2) + \sqrt{(\lambda\sigma_x^2 - \sigma_y^2)^2 + 4\lambda p_{xy}^2}\right], \qquad (54)$$

$$(\Sigma_\lambda^-)^2 = 2\left[(\lambda\sigma_x^2 + \sigma_y^2) - \sqrt{(\lambda\sigma_x^2 - \sigma_y^2)^2 + 4\lambda p_{xy}^2}\right]. \qquad (55)$$
Thus, Construction 3 may be modified to accommodate λ-regression.

Construction 4 (λ-Regression) Given the first and second moments $\langle\bar{x}, \bar{y}, \sigma_x^2, p_{xy}, \sigma_y^2\rangle$, the best and worst lines of λ-regression are given by the pair of conjugate semidiameters determined as follows:

1. According to Equation (50), apply the Carlyle-Lill construction, Proposition 2, with $(a, b) = \left(\frac{\sigma_y^2 - \lambda\sigma_x^2}{p_{xy}},\ -\lambda\right)$ to construct the directions of the conjugate semidiameters.

2. According to Equation (53), apply the Carlyle-Lill construction, Proposition 2, with $(a, b) = \left(4(\lambda\sigma_x^2 + \sigma_y^2),\ 16\lambda(\sigma_x^2\sigma_y^2 - p_{xy}^2)\right)$ to construct the squared magnitudes of the conjugate semidiameters.

3. Apply the $\sqrt{\cdot}$-operator, Proposition 1, to these squared magnitudes in order to construct the magnitudes, $\{\Sigma_\lambda^+, \Sigma_\lambda^-\}$, of the conjugate semidiameters emanating from the centroid, $(\bar{x}, \bar{y})$. (See Figure 9.)

4. Denote by $\langle x_\lambda^\pm, y_\lambda^\pm\rangle$ the tips of these conjugate semidiameters lying on the concentration ellipse, E.

By Equation (46), the corresponding best mean-squared λ-deviation is given by

$$\frac{1}{n}\cdot\left(S_\lambda^+\right)^2 = \max(1, \lambda^{-1})\cdot\frac{4\lambda(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{\lambda(x_\lambda^+)^2 + (y_\lambda^+)^2} = \frac{\max(1, \lambda^{-1})\cdot 2\lambda(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(\lambda\sigma_x^2 + \sigma_y^2) + \sqrt{(\lambda\sigma_x^2 - \sigma_y^2)^2 + 4\lambda p_{xy}^2}}$$
$$= \frac{1}{2}\max(1, \lambda^{-1})\cdot\left[(\lambda\sigma_x^2 + \sigma_y^2) - \sqrt{(\lambda\sigma_x^2 - \sigma_y^2)^2 + 4\lambda p_{xy}^2}\right], \qquad (56)$$
while, by Equation (47), the corresponding worst mean-squared λ-deviation is given by

$$\frac{1}{n}\cdot\left(S_\lambda^-\right)^2 = \max(1, \lambda^{-1})\cdot\frac{4\lambda(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{\lambda(x_\lambda^-)^2 + (y_\lambda^-)^2} = \frac{\max(1, \lambda^{-1})\cdot 2\lambda(\sigma_x^2\sigma_y^2 - p_{xy}^2)}{(\lambda\sigma_x^2 + \sigma_y^2) - \sqrt{(\lambda\sigma_x^2 - \sigma_y^2)^2 + 4\lambda p_{xy}^2}}$$
$$= \frac{1}{2}\max(1, \lambda^{-1})\cdot\left[(\lambda\sigma_x^2 + \sigma_y^2) + \sqrt{(\lambda\sigma_x^2 - \sigma_y^2)^2 + 4\lambda p_{xy}^2}\right]. \qquad (57)$$

7 (λ, µ)-Regression

Turning to the case of (λ, µ)-regression corresponding to the slope $s = \frac{\mu m - \lambda}{m - \mu}$ in Figure 3, it is shown in [9] that the linear mapping

$$\xi := \sqrt{\lambda - \mu^2}\cdot x, \quad \eta := -\mu\cdot x + y \;\Rightarrow\; \hat{m} = \frac{m - \mu}{\sqrt{\lambda - \mu^2}}, \qquad (58)$$
$$\sigma_\xi^2 = (\lambda - \mu^2)\cdot\sigma_x^2, \quad \sigma_\eta^2 = \mu^2\cdot\sigma_x^2 - 2\mu\cdot p_{xy} + \sigma_y^2, \quad p_{\xi\eta} = \sqrt{\lambda - \mu^2}\cdot(p_{xy} - \mu\sigma_x^2) \qquad (59)$$
transforms (λ, µ)-regression into orthogonal regression so that the slopes of the best and worst (λ, µ)-regression lines are, by Equation (32), roots of the quadratic
$$m^2 + \frac{\lambda\sigma_x^2 - \sigma_y^2}{p_{xy} - \mu\sigma_x^2}\cdot m - \frac{\lambda p_{xy} - \mu\sigma_y^2}{p_{xy} - \mu\sigma_x^2} = 0. \qquad (60)$$
Specifically, by Equation (33), the slope of the best (λ, µ)-regression line is

$$m_{\lambda,\mu}^+ = \frac{(\sigma_y^2 - \lambda\sigma_x^2) + \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4(p_{xy} - \mu\sigma_x^2)(\lambda p_{xy} - \mu\sigma_y^2)}}{2(p_{xy} - \mu\sigma_x^2)}, \qquad (61)$$


Figure 10: (λ, µ)-Regression

while, by Equation (34), the slope of the worst (λ, µ)-regression line is
$$m_{\lambda,\mu}^- = \frac{(\sigma_y^2 - \lambda\sigma_x^2) - \sqrt{(\sigma_y^2 - \lambda\sigma_x^2)^2 + 4(p_{xy} - \mu\sigma_x^2)(\lambda p_{xy} - \mu\sigma_y^2)}}{2(p_{xy} - \mu\sigma_x^2)}. \qquad (62)$$
By Equation (35), the squared magnitudes of the corresponding conjugate semidiameters are the roots of the quadratic

$$(\Sigma_{\lambda,\mu}^2)^2 - 4\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right]\cdot\Sigma_{\lambda,\mu}^2 + 16(\lambda - \mu^2)\left[\sigma_x^2(\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) - (p_{xy} - \mu\sigma_x^2)^2\right] = 0, \qquad (63)$$
so that

$$(\Sigma_{\lambda,\mu}^+)^2 = 2\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right] + 2\sqrt{\left[(\lambda - \mu^2)\sigma_x^2 - (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right]^2 + 4(\lambda - \mu^2)(p_{xy} - \mu\sigma_x^2)^2}, \qquad (64)$$

$$(\Sigma_{\lambda,\mu}^-)^2 = 2\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right] - 2\sqrt{\left[(\lambda - \mu^2)\sigma_x^2 - (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right]^2 + 4(\lambda - \mu^2)(p_{xy} - \mu\sigma_x^2)^2}. \qquad (65)$$
Thus, Construction 3 may be modified to accommodate (λ, µ)-regression.

Construction 5 ((λ, µ)-Regression) Given the first and second moments $\langle\bar{x}, \bar{y}, \sigma_x^2, p_{xy}, \sigma_y^2\rangle$, the best and worst lines of (λ, µ)-regression are given by the pair of conjugate semidiameters determined as follows:

1. According to Equation (60), apply the Carlyle-Lill construction, Proposition 2, with $(a, b) = \left(\frac{\sigma_y^2 - \lambda\sigma_x^2}{p_{xy} - \mu\sigma_x^2},\ \frac{\mu\sigma_y^2 - \lambda p_{xy}}{p_{xy} - \mu\sigma_x^2}\right)$ to construct the directions of the conjugate semidiameters.

2. According to Equation (63), apply the Carlyle-Lill construction, Proposition 2, with $(a, b) = \left(4\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right],\ 16(\lambda - \mu^2)\left[\sigma_x^2(\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) - (p_{xy} - \mu\sigma_x^2)^2\right]\right)$ to construct the squared magnitudes of the conjugate semidiameters.

3. Apply the $\sqrt{\cdot}$-operator, Proposition 1, to these squared magnitudes in order to construct the magnitudes, $\{\Sigma_{\lambda,\mu}^+, \Sigma_{\lambda,\mu}^-\}$, of the conjugate semidiameters emanating from the centroid, $(\bar{x}, \bar{y})$. (See Figure 10.)

4. Denote by $\langle x_{\lambda,\mu}^\pm, y_{\lambda,\mu}^\pm\rangle$ the tips of these conjugate semidiameters lying on the concentration ellipse, E.

By Equation (46), the corresponding best mean-squared (λ, µ)-deviation is given by

$$\frac{1}{n}\cdot\left(S_{\lambda,\mu}^+\right)^2 = \max(1, \lambda^{-1})\cdot\frac{4(\lambda - \mu^2)\left[\sigma_x^2(\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) - (p_{xy} - \mu\sigma_x^2)^2\right]}{\lambda(x_{\lambda,\mu}^+)^2 - 2\mu x_{\lambda,\mu}^+ y_{\lambda,\mu}^+ + (y_{\lambda,\mu}^+)^2}$$
$$= \frac{\max(1, \lambda^{-1})\cdot 2(\lambda - \mu^2)\left[\sigma_x^2(\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) - (p_{xy} - \mu\sigma_x^2)^2\right]}{\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right] + \sqrt{\left[(\lambda - \mu^2)\sigma_x^2 - (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right]^2 + 4(\lambda - \mu^2)(p_{xy} - \mu\sigma_x^2)^2}}$$
$$= \frac{1}{2}\max(1, \lambda^{-1})\cdot\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) - \sqrt{\left[(\lambda - \mu^2)\sigma_x^2 - (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right]^2 + 4(\lambda - \mu^2)(p_{xy} - \mu\sigma_x^2)^2}\right],$$
while, by Equation (47), the corresponding worst mean-squared (λ, µ)-deviation is given by

$$\frac{1}{n}\cdot\left(S_{\lambda,\mu}^-\right)^2 = \max(1, \lambda^{-1})\cdot\frac{4(\lambda - \mu^2)\left[\sigma_x^2(\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) - (p_{xy} - \mu\sigma_x^2)^2\right]}{\lambda(x_{\lambda,\mu}^-)^2 - 2\mu x_{\lambda,\mu}^- y_{\lambda,\mu}^- + (y_{\lambda,\mu}^-)^2}$$
$$= \frac{\max(1, \lambda^{-1})\cdot 2(\lambda - \mu^2)\left[\sigma_x^2(\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) - (p_{xy} - \mu\sigma_x^2)^2\right]}{\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right] - \sqrt{\left[(\lambda - \mu^2)\sigma_x^2 - (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right]^2 + 4(\lambda - \mu^2)(p_{xy} - \mu\sigma_x^2)^2}}$$
$$= \frac{1}{2}\max(1, \lambda^{-1})\cdot\left[(\lambda - \mu^2)\sigma_x^2 + (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2) + \sqrt{\left[(\lambda - \mu^2)\sigma_x^2 - (\mu^2\sigma_x^2 - 2\mu p_{xy} + \sigma_y^2)\right]^2 + 4(\lambda - \mu^2)(p_{xy} - \mu\sigma_x^2)^2}\right].$$

8 Conclusion

In the foregoing, Euclidean constructions were supplied for various lines of regression. These included lines of coordinate regression [7], orthogonal regression [13], λ-regression [8] and (λ, µ)-regression [9]. The adhesive which bound together these geometric constructions was provided by the concentration ellipse [3]. This unified approach was, in turn, a direct consequence of the Galton-Pearson-McCartin Theorem [10, Theorem 2], which establishes that, as special cases of oblique regression [10], all of the associated best and worst regression lines form pairs of conjugate diameters of the concentration ellipse.

Acknowledgements. I would like to thank my devoted wife Mrs. Barbara A. McCartin for her indispensable aid in constructing the figures.

References

[1] F. S. Acton, Analysis of Straight-Line Data, Dover, New York, 1966.

[2] R. Courant and H. Robbins, Geometrical Constructions, Chapter III in What Is Mathematics?, Oxford University Press, New York, 1941, 117-164.

[3] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, Princeton, New Jersey, 1946.

[4] L. E. Dickson, Constructions with Ruler and Compasses; Regular Polygons, Chapter in Monographs on Topics of Modern Mathematics Relevant to the Elementary Field (J. W. A. Young, Editor), Dover, New York, 1955, 353-386.

[5] H. Eves, An Introduction to the History of Mathematics, Sixth Edition, HBJ-Saunders, New York, 1990.

[6] R. W. Farebrother, Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900, Springer, New York, 1999. http://dx.doi.org/10.1007/978-1-4612-0545-6

[7] F. Galton, Family Likeness in Stature, with Appendix by J. D. H. Dickson, Proc. Roy. Soc. London, 40 (1886), 42-73. http://dx.doi.org/10.1098/rspl.1886.0009

[8] B. J. McCartin, A Geometric Characterization of Linear Regression, Statistics, 37 (2003), no. 2, 101-117. http://dx.doi.org/10.1080/0223188031000112881

[9] B. J. McCartin, The Geometry of Linear Regression with Correlated Errors, Statistics, 39 (2005), no. 1, 1-11. http://dx.doi.org/10.1080/02331880412331328260

[10] B. J. McCartin, Oblique Linear Least Squares Approximation, Applied Mathematical Sciences, 4 (2010), no. 58, 2891-2904.

[11] B. J. McCartin, A Matrix Analytic Approach to the Conjugate Diameters of an Ellipse, Applied Mathematical Sciences, 7 (2013), no. 36, 1797-1810.

[12] B. J. McCartin, Geometric Construction of an Ellipse from its Moments, Applied Mathematical Sciences, 10 (2016), no. 3, 127-135. http://dx.doi.org/10.12988/ams.2016.511704

[13] K. Pearson, On Lines and Planes of Closest Fit to Systems of Points in Space, Philosophical Magazine Series 6, 2 (1901), 559-572. http://dx.doi.org/10.1080/14786440109462720

[14] G. Salmon, A Treatise on Conic Sections, Sixth Edition, Chelsea, New York, 1954.

[15] W. M. Smart, Combination of Observations, Cambridge University Press, Cambridge, 1958.

[16] P. Sprent, Models in Regression, Methuen, London, 1969.

Received: January 1, 2016; Published: March 30, 2016