Multicollinearity


See updates and corrections at http://www.stat.cmu.edu/~cshalizi/mreg/

Lecture 17: Multicollinearity
36-401, Fall 2015, Section B
27 October 2015

Contents

1 Why Collinearity Is a Problem
  1.1 Dealing with Collinearity by Deleting Variables
  1.2 Diagnosing Collinearity Among Pairs of Variables
  1.3 Why Multicollinearity Is Harder
  1.4 Geometric Perspective
2 Variance Inflation Factors
  2.1 Why VIF_i ≥ 1
3 Matrix-Geometric Perspective on Multicollinearity
  3.1 The Geometric View
  3.2 Finding the Eigendecomposition
  3.3 Using the Eigendecomposition
    3.3.1 Example
  3.4 Principal Components Regression
4 Ridge Regression
  4.1 Some Words of Advice about Ridge Regression
  4.2 Penalties vs. Constraints
  4.3 Ridge Regression in R
  4.4 Other Penalties/Constraints
5 High-Dimensional Regression
  5.1 Demo
6 Further Reading

1 Why Collinearity Is a Problem

Remember our formula for the estimated coefficients in a multiple linear regression:

$$\hat{\beta} = (\mathbf{x}^T \mathbf{x})^{-1} \mathbf{x}^T \mathbf{y}$$

This is obviously going to lead to problems if x^T x isn't invertible. Similarly, the variance of the estimates,

$$\mathrm{Var}\left[\hat{\beta}\right] = \sigma^2 (\mathbf{x}^T \mathbf{x})^{-1},$$

will blow up when x^T x is singular. If that matrix isn't exactly singular, but is close to being non-invertible, the variances will become huge.

There are several equivalent conditions for any square matrix, say u, to be singular or non-invertible:

• The determinant, det u or |u|, is 0.
• At least one eigenvalue of u is 0. (This is because the determinant of a matrix is the product of its eigenvalues.) [1]
• u is rank deficient, meaning that one or more of its columns (or rows) is equal to a linear combination of the others. [2]

Since we're not concerned with any old square matrix, but specifically with x^T x, we have an additional equivalent condition:

• x is column-rank deficient, meaning one or more of its columns is equal to a linear combination of the others.

The last condition explains why we call this problem collinearity: it looks like we have p different predictor variables, but really some of them are linear combinations of the others, so they don't add any information. The real number of distinct variables is q < p, the column rank of x. If the exact linear relationship holds among more than two variables, we talk about multicollinearity; collinearity can refer either to the general situation of a linear dependence among the predictors, or, by contrast to multicollinearity, a linear relationship among just two of the predictors.

Again, if there isn't an exact linear relationship among the predictors, but they're close to one, x^T x will be invertible, but (x^T x)^{-1} will be huge, and the variances of the estimated coefficients will be enormous. This can make it very hard to say anything at all precise about the coefficients, but that's not necessarily a problem.

[1] You learned about eigenvalues and eigenvectors in linear algebra; if you are rusty, now is an excellent time to refresh your memory.
[2] The equivalence of this condition to the others is not at all obvious, but, again, is proved in linear algebra.

1.1 Dealing with Collinearity by Deleting Variables

Since not all of the p variables are actually contributing information, a natural way of dealing with collinearity is to drop some variables from the model. If you want to do this, you should think very carefully about which variable to delete. As a concrete example: if we try to include all of a student's grades as predictors, as well as their over-all GPA, we'll have a problem with collinearity (since GPA is a linear function of the grades). But depending on what we want to predict, it might make more sense to use just the GPA, dropping all the individual grades, or to include the individual grades and drop the average. [3]

[3] One could also drop just one of the individual class grades from the average, but it's harder to think of a scenario where that makes sense.
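To make the grades/GPA example concrete, here is a small R sketch (an addition, not part of the original notes; the variable names grade1, grade2, grade3, gpa and the simulated numbers are hypothetical stand-ins). Because gpa is an exact linear combination of the grades, the design matrix is column-rank deficient: lm() should drop the redundant predictor and report its coefficient as NA, and inverting x^T x directly should fail.

```r
set.seed(36401)                                   # hypothetical seed, for reproducibility
n <- 50
grade1 <- rnorm(n, mean = 80, sd = 10)
grade2 <- rnorm(n, mean = 75, sd = 10)
grade3 <- rnorm(n, mean = 85, sd = 10)
gpa <- (grade1 + grade2 + grade3) / 3             # exact linear combination of the grades
y <- 5 + 0.1 * grade1 + 0.2 * grade2 + rnorm(n)   # made-up response

# lm() detects the rank deficiency, drops the redundant column,
# and reports its coefficient as NA
coef(lm(y ~ grade1 + grade2 + grade3 + gpa))

# Inverting x^T x by hand fails, because the matrix is (numerically) singular
x <- cbind(1, grade1, grade2, grade3, gpa)
try(solve(t(x) %*% x))
```

Dropping either the average or one of the individual grades restores a full-column-rank design, which is exactly the deletion strategy described above.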
1.2 Diagnosing Collinearity Among Pairs of Variables

Linear relationships between pairs of variables are fairly easy to diagnose: we make the pairs plot of all the variables, and we see if any of them fall on a straight line, or close to one. Unless the number of variables is huge, this is by far the best method. If the number of variables is huge, look at the correlation matrix, and worry about any entry off the diagonal which is (nearly) ±1.

1.3 Why Multicollinearity Is Harder

A multicollinear relationship involving three or more variables might be totally invisible on a pairs plot. For instance, suppose X_1 and X_2 are independent Gaussians, of equal variance σ², and X_3 is their average, X_3 = (X_1 + X_2)/2. The correlation between X_1 and X_3 is

$$\mathrm{Cor}(X_1, X_3) = \frac{\mathrm{Cov}[X_1, X_3]}{\sqrt{\mathrm{Var}[X_1]\,\mathrm{Var}[X_3]}} \qquad (1)$$

$$= \frac{\mathrm{Cov}[X_1, (X_1 + X_2)/2]}{\sqrt{\sigma^2 \cdot \sigma^2/2}} \qquad (2)$$

$$= \frac{\sigma^2/2}{\sigma^2/\sqrt{2}} \qquad (3)$$

$$= \frac{1}{\sqrt{2}} \qquad (4)$$

This is also the correlation between X_2 and X_3. A correlation of 1/√2 isn't trivial, but is hardly perfect, and doesn't really distinguish itself on a pairs plot (Figure 1).

```r
# Simulation: two independent Gaussians
x1 <- rnorm(100, mean=70, sd=15)
x2 <- rnorm(100, mean=70, sd=15)
# Add in a linear combination of X1 and X2
x3 <- (x1+x2)/2
pairs(cbind(x1,x2,x3))
cor(cbind(x1,x2,x3))
```

```
##            x1         x2        x3
## x1 1.00000000 0.03788452 0.7250514
## x2 0.03788452 1.00000000 0.7156686
## x3 0.72505136 0.71566863 1.0000000
```

[Figure 1: pairs plot of x1, x2, and x3, illustrating that a perfect multicollinear relationship might not show up on a pairs plot or in a correlation matrix.]
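Although none of the pairwise correlations comes close to ±1, the three-variable dependence is easy to confirm once we look for it. The following short R sketch (an addition, not part of the original notes) reuses the simulated x1, x2, x3 from the code above: regressing x3 on the other two predictors gives an R² of 1, and the correlation matrix has a (numerically) zero eigenvalue, matching the singularity conditions listed in Section 1.

```r
# R^2 from regressing each variable on the remaining ones;
# a value at (or near) 1 flags an exact (or near) linear dependence
summary(lm(x3 ~ x1 + x2))$r.squared

# Exact collinearity also shows up as a zero eigenvalue
# of the correlation matrix (or of x^T x)
eigen(cor(cbind(x1, x2, x3)))$values
```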
1.4 Geometric Perspective

The predictors X_1, ..., X_p form a p-dimensional random vector X. Ordinarily, we expect this random vector to be scattered throughout p-dimensional space. When we have collinearity (or multicollinearity), the vectors are actually confined to a lower-dimensional subspace. The column rank of a matrix is the number of linearly independent columns it has. If x has column rank q < p, then the data vectors are confined to a q-dimensional subspace. It looks like we've got p different variables, but really by a change of coordinates we could get away with just q of them.

2 Variance Inflation Factors

If the predictors are correlated with each other, the standard errors of the coefficient estimates will be bigger than if the predictors were uncorrelated.

If the predictors were uncorrelated, the variance of β̂_i would be

$$\mathrm{Var}\left[\hat{\beta}_i\right] = \frac{\sigma^2}{n s^2_{X_i}} \qquad (5)$$

just as it is in a simple linear regression. With correlated predictors, however, we have to use our general formula for the least squares estimates:

$$\mathrm{Var}\left[\hat{\beta}_i\right] = \sigma^2 \left(\mathbf{x}^T\mathbf{x}\right)^{-1}_{i+1,\,i+1} \qquad (6)$$

(Why are the subscripts on the matrix i + 1 instead of i?) The ratio between Eqs. 6 and 5 is the variance inflation factor for the i-th coefficient, VIF_i. The average of the variance inflation factors across all predictors is often written $\overline{\mathrm{VIF}}$, or just VIF.

Folklore says that VIF_i > 10 indicates "serious" multicollinearity for the predictor. I have been unable to discover who first proposed this threshold, or what the justification for it is. It is also quite unclear what to do about this. Large variance inflation factors do not, after all, violate any model assumptions.

2.1 Why VIF_i ≥ 1

Let's take the case where p = 2, so x^T x is a 3 × 3 matrix. As you saw in the homework,

$$\frac{1}{n}\mathbf{x}^T\mathbf{x} = \begin{bmatrix} 1 & \overline{x_1} & \overline{x_2} \\ \overline{x_1} & \overline{x_1^2} & \overline{x_1 x_2} \\ \overline{x_2} & \overline{x_1 x_2} & \overline{x_2^2} \end{bmatrix}$$

After tedious but straightforward algebra, we get for the inverse (deep breath)

$$\left(\mathbf{x}^T\mathbf{x}\right)^{-1} = \frac{1}{n}\,\frac{1}{\widehat{\mathrm{Var}}[X_1]\,\widehat{\mathrm{Var}}[X_2] - \widehat{\mathrm{Cov}}[X_1,X_2]^2}\;\times$$

$$\begin{bmatrix}
\widehat{\mathrm{Var}}[X_1]\widehat{\mathrm{Var}}[X_2] - \widehat{\mathrm{Cov}}[X_1,X_2]^2 + \widehat{\mathrm{Var}}[\overline{x_2}X_1 - \overline{x_1}X_2] & \overline{x_2}\widehat{\mathrm{Cov}}[X_1,X_2] - \overline{x_1}\widehat{\mathrm{Var}}[X_2] & \overline{x_1}\widehat{\mathrm{Cov}}[X_1,X_2] - \overline{x_2}\widehat{\mathrm{Var}}[X_1] \\
\overline{x_2}\widehat{\mathrm{Cov}}[X_1,X_2] - \overline{x_1}\widehat{\mathrm{Var}}[X_2] & \widehat{\mathrm{Var}}[X_2] & -\widehat{\mathrm{Cov}}[X_1,X_2] \\
\overline{x_1}\widehat{\mathrm{Cov}}[X_1,X_2] - \overline{x_2}\widehat{\mathrm{Var}}[X_1] & -\widehat{\mathrm{Cov}}[X_1,X_2] & \widehat{\mathrm{Var}}[X_1]
\end{bmatrix}$$

where the hats on the variances and covariances indicate that they are sample, not population, quantities.

Notice that the pre-factor to the matrix contains the reciprocal of the determinant of n^{-1} x^T x, which blows up when X_1 and X_2 are either perfectly correlated or perfectly anti-correlated, which is as it should be, since then we'll have exact collinearity.

The variances of the estimated slopes are, using this inverse,

$$\mathrm{Var}\left[\hat{\beta}_1\right] = \frac{\sigma^2\,\widehat{\mathrm{Var}}[X_2]}{n\left(\widehat{\mathrm{Var}}[X_1]\,\widehat{\mathrm{Var}}[X_2] - \widehat{\mathrm{Cov}}[X_1,X_2]^2\right)} = \frac{\sigma^2}{n\left(\widehat{\mathrm{Var}}[X_1] - \widehat{\mathrm{Cov}}[X_1,X_2]^2/\widehat{\mathrm{Var}}[X_2]\right)}$$

and

$$\mathrm{Var}\left[\hat{\beta}_2\right] = \frac{\sigma^2\,\widehat{\mathrm{Var}}[X_1]}{n\left(\widehat{\mathrm{Var}}[X_1]\,\widehat{\mathrm{Var}}[X_2] - \widehat{\mathrm{Cov}}[X_1,X_2]^2\right)} = \frac{\sigma^2}{n\left(\widehat{\mathrm{Var}}[X_2] - \widehat{\mathrm{Cov}}[X_1,X_2]^2/\widehat{\mathrm{Var}}[X_1]\right)}$$

Notice that if the sample covariance between X_1 and X_2 is zero, these reduce to

$$\mathrm{Var}\left[\hat{\beta}_1\right] = \frac{\sigma^2}{n\,\widehat{\mathrm{Var}}[X_1]}, \qquad \mathrm{Var}\left[\hat{\beta}_2\right] = \frac{\sigma^2}{n\,\widehat{\mathrm{Var}}[X_2]},$$

exactly as we'd see in simple linear regressions.
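As a quick numerical check on this definition, here is a short R sketch (an addition, not part of the original notes; the seed and the simulated, nearly-collinear data are made up). It computes VIF_i directly as the ratio of Eq. 6 to Eq. 5, using the 1/n convention for the sample variance to match the matrix above, and compares the result with the standard identity VIF_i = 1/(1 - R_i²), where R_i² comes from regressing the i-th predictor on the remaining ones (an identity not derived in this excerpt); the two calculations should agree up to rounding.

```r
set.seed(17)                                     # hypothetical seed
n <- 100
x1 <- rnorm(n, mean = 70, sd = 15)
x2 <- rnorm(n, mean = 70, sd = 15)
x3 <- (x1 + x2)/2 + rnorm(n, sd = 2)             # nearly, but not exactly, collinear
x <- cbind(1, x1, x2, x3)                        # design matrix with an intercept column
xtx.inv <- solve(t(x) %*% x)

# VIF_i = (Eq. 6)/(Eq. 5) = n * s^2_{X_i} * (x^T x)^{-1}_{i+1, i+1}
vif <- sapply(1:3, function(i) {
  s2 <- mean((x[, i + 1] - mean(x[, i + 1]))^2)  # sample variance with the 1/n convention
  n * s2 * xtx.inv[i + 1, i + 1]
})
vif

# The same numbers via 1/(1 - R_i^2)
r2 <- c(summary(lm(x1 ~ x2 + x3))$r.squared,
        summary(lm(x2 ~ x1 + x3))$r.squared,
        summary(lm(x3 ~ x1 + x2))$r.squared)
1/(1 - r2)
```

Because x3 is nearly a linear combination of x1 and x2, all three variance inflation factors should come out large, and they grow without bound as the added noise term shrinks toward zero.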