Feature Extraction Techniques

Unsupervised Learning II (9/7/2017)

Unsupervised methods can also be used to find features useful for categorization: there are unsupervised methods that represent a form of smart feature extraction.

Feature Extraction
• Transforming the input data into a set of features that still describes the data with sufficient accuracy.
• In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction.

What is feature reduction?
Feature reduction maps the original data to reduced data by a linear transformation:
$$G \in \mathbb{R}^{p \times d}: \quad X \in \mathbb{R}^p \;\mapsto\; Y = G^T X \in \mathbb{R}^d, \qquad d < p.$$

Why feature reduction?
• Most machine learning and data mining techniques may not be effective for high-dimensional data.
• The input data to an algorithm may be too large to be processed and suspected to be redundant (much data, but not much information).
• Analysis with a large number of variables generally requires a large amount of memory and computation power, or a classification algorithm that overfits the training sample and generalizes poorly to new samples.
• The important dimension may be small. For example, the number of genes responsible for a certain type of disease may be small.
• Visualization: projection of high-dimensional data onto 2D or 3D.
• Data compression: efficient storage and retrieval.
• Noise removal: positive effect on query accuracy.

Feature reduction versus feature selection
• Feature reduction
– All original features are used.
– The transformed features are linear combinations of the original features.
• Feature selection
– Only a subset of the original features is used.
• Continuous versus discrete.

Applications of feature reduction
• Face recognition
• Handwritten digit recognition
• Text mining
• Image retrieval
• Microarray data analysis
• Protein classification

Feature Extraction Techniques: Algorithms
• Principal component analysis
• Singular value decomposition
• Non-negative matrix factorization
• Independent component analysis

What is Principal Component Analysis?
• Principal component analysis (PCA):
– reduces the dimensionality of a data set by finding a new set of variables, smaller than the original set, that retains most of the sample's information;
– is useful for the compression and classification of data.
• By information we mean the variation present in the sample, given by the correlations between the original variables.
• The new variables, called principal components (PCs), are uncorrelated and are ordered by the fraction of the total information each retains.

Geometric picture of principal components (PCs)
• The 1st PC $z_1$ is a minimum-distance fit to a line in the space of the original variables.
• The 2nd PC $z_2$ is a minimum-distance fit to a line in the plane perpendicular to the 1st PC.
• PCs are a series of linear least-squares fits to a sample, each orthogonal to all the previous ones.

Principal Components Analysis (PCA)
• Principle
– Linear projection method to reduce the number of parameters.
– Transforms a set of correlated variables into a new set of uncorrelated variables.
– Maps the data into a space of lower dimensionality.
– A form of unsupervised learning (see the sketch below).
• Properties
– Can be viewed as a rotation of the existing axes to new positions in the space defined by the original variables.
– The new axes are orthogonal and represent the directions of maximum variability.
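The procedure above can be condensed into a few lines of code. Below is a minimal sketch of PCA via an eigendecomposition of the sample covariance matrix, assuming NumPy is available; the function name `pca` and the toy data are illustrative, not part of the original slides.

```python
import numpy as np

def pca(X, d):
    """Project the rows of X (n samples in R^p) onto the top d principal components."""
    # Center the data: the PCs are directions of variation about the mean.
    Xc = X - X.mean(axis=0)
    # p x p covariance matrix of the original variables.
    cov = np.cov(Xc, rowvar=False)
    # Symmetric eigendecomposition; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # The d eigenvectors with the largest eigenvalues are the orthogonal
    # directions of maximum variability -- the columns of G in Y = G^T X.
    G = eigvecs[:, ::-1][:, :d]
    return Xc @ G, G

# Toy usage: 200 correlated 3-D samples reduced to 2 uncorrelated components.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.2, 0.1]])
Y, G = pca(X, 2)
print(np.round(np.cov(Y, rowvar=False), 3))   # ~diagonal: the PCs are uncorrelated
```

Because the samples are stored as rows here, the per-sample projection $Y = G^T X$ becomes the matrix product `Xc @ G`.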
Background Mathematics
• Linear Algebra
• Calculus
• Probability and Computing

Eigenvalues & eigenvectors

Example. Consider the matrix
$$A = \begin{pmatrix} 1 & 2 & 1 \\ 6 & -1 & 0 \\ -1 & -2 & -1 \end{pmatrix}$$
and the three column matrices
$$C_1 = \begin{pmatrix} 1 \\ 6 \\ -13 \end{pmatrix}, \qquad C_2 = \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}, \qquad C_3 = \begin{pmatrix} 2 \\ 3 \\ -2 \end{pmatrix}.$$
We have
$$AC_1 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \qquad AC_2 = \begin{pmatrix} 4 \\ -8 \\ -4 \end{pmatrix}, \qquad AC_3 = \begin{pmatrix} 6 \\ 9 \\ -6 \end{pmatrix}.$$
In other words,
$$AC_1 = 0\,C_1, \qquad AC_2 = -4\,C_2, \qquad AC_3 = 3\,C_3,$$
so 0, −4 and 3 are eigenvalues of A, and C_1, C_2 and C_3 are eigenvectors, each satisfying $AC = \lambda C$.

Definition. Let A be a square matrix. A non-zero vector C is called an eigenvector of A if and only if there exists a number (real or complex) λ such that
$$AC = \lambda C.$$
If such a number λ exists, it is called an eigenvalue of A, and the vector C is called an eigenvector associated to the eigenvalue λ.

Remark. The eigenvector C must be non-zero, since
$$A\,0 = 0 = \lambda\,0$$
for any number λ (here 0 is the zero vector).

In the example above, C_1 is an eigenvector of A associated to the eigenvalue 0, C_2 is an eigenvector of A associated to the eigenvalue −4, while C_3 is an eigenvector of A associated to the eigenvalue 3.

Now consider the matrix P whose columns are C_1, C_2 and C_3:
$$P = \begin{pmatrix} 1 & -1 & 2 \\ 6 & 2 & 3 \\ -13 & 1 & -2 \end{pmatrix}.$$
We have det(P) = 84, so this matrix is invertible. Easy calculations give
$$P^{-1} = \frac{1}{84}\begin{pmatrix} -7 & 0 & -7 \\ -27 & 24 & 9 \\ 32 & 12 & 8 \end{pmatrix}.$$
Next we evaluate the matrix $P^{-1}AP$:
$$P^{-1}AP = \frac{1}{84}\begin{pmatrix} -7 & 0 & -7 \\ -27 & 24 & 9 \\ 32 & 12 & 8 \end{pmatrix}\begin{pmatrix} 1 & 2 & 1 \\ 6 & -1 & 0 \\ -1 & -2 & -1 \end{pmatrix}\begin{pmatrix} 1 & -1 & 2 \\ 6 & 2 & 3 \\ -13 & 1 & -2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 0 & 3 \end{pmatrix}.$$
In other words, using matrix multiplication we obtain
$$A = P\begin{pmatrix} 0 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 0 & 3 \end{pmatrix}P^{-1},$$
which implies that A is similar to a diagonal matrix. In particular, we have
$$A^n = P\begin{pmatrix} 0 & 0 & 0 \\ 0 & (-4)^n & 0 \\ 0 & 0 & 3^n \end{pmatrix}P^{-1} \qquad \text{for } n = 1, 2, \ldots$$
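The diagonalization above is easy to check numerically. Here is a quick NumPy verification of the running example (assumed environment: NumPy; nothing new mathematically, it just re-derives $P^{-1}AP$ and the power formula):

```python
import numpy as np

A = np.array([[ 1,  2,  1],
              [ 6, -1,  0],
              [-1, -2, -1]])
# P has the eigenvectors C1, C2, C3 from the example as its columns.
P = np.array([[  1, -1,  2],
              [  6,  2,  3],
              [-13,  1, -2]])

print(np.round(np.linalg.det(P)))   # 84 -> P is invertible
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 12))              # diag(0, -4, 3)

# A^n = P diag(0, (-4)^n, 3^n) P^{-1}, checked here for n = 5.
n = 5
Dn = np.diag([0.0, (-4.0) ** n, 3.0 ** n])
print(np.allclose(np.linalg.matrix_power(A, n), P @ Dn @ np.linalg.inv(P)))  # True
```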
Determinants

Determinant of order 2 (easy to remember, for order 2 only):
$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21},$$
i.e. the product along the main diagonal (+) minus the product along the anti-diagonal (−).
Example. Evaluate the determinant:
$$\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = 1 \cdot 4 - 2 \cdot 3 = -2.$$

Eigenvalues & eigenvectors: the characteristic equation

For a square matrix A of order n, the number λ is an eigenvalue if and only if there exists a non-zero vector C such that
$$AC = \lambda C.$$
Using the matrix multiplication properties, this equation translates into
$$(A - \lambda I_n)C = 0.$$
We also know that this system has exactly one solution if and only if the matrix coefficient $A - \lambda I_n$ is invertible, i.e. $\det(A - \lambda I_n) \neq 0$. Since the zero vector is a solution and C is not the zero vector, we must have
$$\det(A - \lambda I_n) = 0.$$

Example. Consider the matrix
$$A = \begin{pmatrix} 1 & 2 \\ 2 & 0 \end{pmatrix}.$$
The equation $\det(A - \lambda I_2) = 0$ translates into
$$(1-\lambda)(0-\lambda) - 4 = 0,$$
which is equivalent to the quadratic equation
$$\lambda^2 - \lambda - 4 = 0.$$
Solving this equation (use the quadratic formula) leads to
$$\lambda = \frac{1+\sqrt{17}}{2} \qquad \text{and} \qquad \lambda = \frac{1-\sqrt{17}}{2}.$$
In other words, the matrix A has only two eigenvalues.

In general, for a square matrix A of order n, the equation
$$\det(A - \lambda I_n) = 0$$
gives the eigenvalues of A. Its left-hand side is a polynomial function in λ of degree n, so this equation has at most n roots. Hence a square matrix of order n has at most n eigenvalues.

Example. Consider the diagonal matrix
$$D = \begin{pmatrix} a & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ 0 & 0 & c & 0 \\ 0 & 0 & 0 & d \end{pmatrix}.$$
Its characteristic polynomial is
$$\det(D - \lambda I_4) = (a-\lambda)(b-\lambda)(c-\lambda)(d-\lambda),$$
so the eigenvalues of D are a, b, c and d, i.e. the entries on the diagonal.

Computation of Eigenvectors

Let A be a square matrix of order n and λ one of its eigenvalues. Let X be an eigenvector of A associated to λ. We must have
$$AX = \lambda X \qquad \text{or} \qquad (A - \lambda I_n)X = 0.$$
This is a linear system whose matrix coefficient is $A - \lambda I_n$. Since the zero vector is a solution, the system is consistent.

Remark. If X is a vector which satisfies $AX = \lambda X$, then the vector $Y = cX$ (for any arbitrary number c) satisfies the same equation, i.e. $AY = \lambda Y$. In other words, if we know that X is an eigenvector, then cX is also an eigenvector associated to the same eigenvalue.

Example. Consider the matrix
$$A = \begin{pmatrix} 1 & 2 & 1 \\ 6 & -1 & 0 \\ -1 & -2 & -1 \end{pmatrix}.$$
First we look for the eigenvalues of A. These are given by the characteristic equation $\det(A - \lambda I_3) = 0$, i.e.
$$\begin{vmatrix} 1-\lambda & 2 & 1 \\ 6 & -1-\lambda & 0 \\ -1 & -2 & -1-\lambda \end{vmatrix} = 0.$$
If we expand this determinant along the third column, we obtain
$$\begin{vmatrix} 6 & -1-\lambda \\ -1 & -2 \end{vmatrix} + (-1-\lambda)\begin{vmatrix} 1-\lambda & 2 \\ 6 & -1-\lambda \end{vmatrix} = 0.$$
By algebraic manipulations, we get
$$-\lambda(\lambda + 4)(\lambda - 3) = 0,$$
which implies that the eigenvalues of A are 0, −4 and 3.

EIGENVECTORS ASSOCIATED WITH THE EIGENVALUES

1. Case λ = 0. The associated eigenvectors are given by the linear system
$$AX = 0,$$
which may be rewritten as
$$\begin{cases} x + 2y + z = 0 \\ 6x - y = 0 \\ -x - 2y - z = 0. \end{cases}$$
The third equation is identical to the first. From the second equation we have y = 6x, so the first equation reduces to 13x + z = 0. This system is therefore equivalent to
$$y = 6x, \qquad z = -13x,$$
so the unknown vector X is given by
$$X = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ 6x \\ -13x \end{pmatrix} = x\begin{pmatrix} 1 \\ 6 \\ -13 \end{pmatrix}.$$
Therefore, any eigenvector X of A associated to the eigenvalue 0 is given by
$$X = c\begin{pmatrix} 1 \\ 6 \\ -13 \end{pmatrix},$$
where c is an arbitrary number.

2. Case λ = −4. The associated eigenvectors are given by the linear system
$$AX = -4X \qquad \text{or} \qquad (A + 4I_3)X = 0,$$
which may be rewritten as
$$\begin{cases} 5x + 2y + z = 0 \\ 6x + 3y = 0 \\ -x - 2y + 3z = 0. \end{cases}$$
We then use elementary row operations to reduce the system to an upper-triangular form. First we interchange the first row with the last one:
$$\begin{pmatrix} -1 & -2 & 3 & 0 \\ 6 & 3 & 0 & 0 \\ 5 & 2 & 1 & 0 \end{pmatrix}.$$
Next, we use the first row to eliminate the 6 and the 5 in the first column. (Carrying the elimination through gives y = −2x and z = −x, so every eigenvector associated to the eigenvalue −4 is a scalar multiple of $C_2 = (-1, 2, 1)^T$, in agreement with the example above.)
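As a cross-check of the whole example, here is a short NumPy verification (an illustrative sketch, assuming NumPy; note that `np.linalg.eig` normalizes eigenvectors to unit length, so they match $C_1$, $C_2$, $C_3$ only up to the scalar $c$ from the remark above):

```python
import numpy as np

A = np.array([[ 1.0,  2.0,  1.0],
              [ 6.0, -1.0,  0.0],
              [-1.0, -2.0, -1.0]])

# np.poly(A) gives the coefficients of det(lambda*I - A), highest degree first:
# lambda^3 + lambda^2 - 12*lambda, i.e. -1 times the polynomial derived above.
print(np.round(np.poly(A), 6))        # [  1.   1. -12.   0.]

eigvals, eigvecs = np.linalg.eig(A)
print(np.round(eigvals, 6))           # 0, -4 and 3, in some order

# Rescale the eigenvector for lambda = 0 so its first entry is 1,
# then compare with C1 = (1, 6, -13).
v = eigvecs[:, np.argmin(np.abs(eigvals))]
print(np.round(v / v[0], 6))          # [  1.   6. -13.]
```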
