
Principal Component Analysis (PCA)

• Pattern recognition in high-dimensional spaces

- Problems arise when performing recognition in a high-dimensional space (e.g., the curse of dimensionality).

- Significant improvements can be achieved by first mapping the data into a lower-dimensional space:

      x = [a_1, a_2, \ldots, a_N]^T  --(reduce dimensionality)-->  y = [b_1, b_2, \ldots, b_K]^T      (K << N)

- The goal of PCA is to reduce the dimensionality of the data while retaining as much as possible of the variation present in the original dataset.

• Dimensionality reduction

- PCA allows us to compute a linear transformation that maps data from a high-dimensional space to a lower-dimensional space:

      b_1 = t_{11} a_1 + t_{12} a_2 + \ldots + t_{1N} a_N
      b_2 = t_{21} a_1 + t_{22} a_2 + \ldots + t_{2N} a_N
      \vdots
      b_K = t_{K1} a_1 + t_{K2} a_2 + \ldots + t_{KN} a_N

  or y = T x, where

      T = \begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1N} \\ t_{21} & t_{22} & \cdots & t_{2N} \\ \vdots & \vdots & & \vdots \\ t_{K1} & t_{K2} & \cdots & t_{KN} \end{bmatrix}      (a K x N matrix)

• Lower-dimensional basis

- Approximate the vectors by finding a basis in an appropriate lower-dimensional space.

  (1) Higher-dimensional space representation:

      x = a_1 v_1 + a_2 v_2 + \ldots + a_N v_N,   where v_1, v_2, \ldots, v_N is a basis of the N-dimensional space

  (2) Lower-dimensional space representation:

      \hat{x} = b_1 u_1 + b_2 u_2 + \ldots + b_K u_K,   where u_1, u_2, \ldots, u_K is a basis of the K-dimensional space

- Note: if both bases have the same size (N = K), then x = \hat{x}.

  Example:

      v_1 = [1, 0, 0]^T,  v_2 = [0, 1, 0]^T,  v_3 = [0, 0, 1]^T   (standard basis)

      x_v = [3, 3, 3]^T = 3 v_1 + 3 v_2 + 3 v_3

      u_1 = [1, 0, 0]^T,  u_2 = [1, 1, 0]^T,  u_3 = [1, 1, 1]^T   (some other basis)

      x_u = [3, 3, 3]^T = 0 u_1 + 0 u_2 + 3 u_3

      thus, x_v = x_u

• Information loss

- Dimensionality reduction implies information loss!

- Preserve as much information as possible, that is, minimize the error ||x - \hat{x}||.

• How to determine the best lower-dimensional space?

- The best low-dimensional space is determined by the "best" eigenvectors of the covariance matrix of x (i.e., the eigenvectors corresponding to the "largest" eigenvalues, also called the "principal components").

• Methodology

- Suppose x_1, x_2, \ldots, x_M are N x 1 vectors. (The steps below are also illustrated by a short code sketch after the properties section.)

  Step 1: compute the mean:  \bar{x} = \frac{1}{M} \sum_{i=1}^{M} x_i

  Step 2: subtract the mean:  \Phi_i = x_i - \bar{x}

  Step 3: form the N x M matrix A = [\Phi_1 \; \Phi_2 \; \ldots \; \Phi_M], then compute

      C = \frac{1}{M} \sum_{n=1}^{M} \Phi_n \Phi_n^T = \frac{1}{M} A A^T

      (the sample covariance matrix, N x N; it characterizes the scatter of the data)

  Step 4: compute the eigenvalues of C:  \lambda_1 > \lambda_2 > \ldots > \lambda_N

  Step 5: compute the eigenvectors of C:  u_1, u_2, \ldots, u_N

- Since C is symmetric, u_1, u_2, \ldots, u_N form a basis, i.e., any vector x, or actually (x - \bar{x}), can be written as a linear combination of the eigenvectors:

      x - \bar{x} = b_1 u_1 + b_2 u_2 + \ldots + b_N u_N = \sum_{i=1}^{N} b_i u_i

  Step 6: (dimensionality reduction step) keep only the terms corresponding to the K largest eigenvalues:

      \hat{x} - \bar{x} = \sum_{i=1}^{K} b_i u_i,   where K << N

- The representation of \hat{x} - \bar{x} in the basis u_1, u_2, \ldots, u_K is thus [b_1, b_2, \ldots, b_K]^T.

• Linear transformation implied by PCA

- The linear transformation R^N -> R^K that performs the dimensionality reduction is:

      \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_K \end{bmatrix} = \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_K^T \end{bmatrix} (x - \bar{x}) = U^T (x - \bar{x})

• An example (see Castleman's appendix, pp. 648-649)

• Geometrical interpretation

- PCA projects the data along the directions where the data varies the most.

- These directions are determined by the eigenvectors of the covariance matrix corresponding to the largest eigenvalues.

- The magnitude of the eigenvalues corresponds to the variance of the data along the eigenvector directions.

• Properties and assumptions of PCA

- The new variables (i.e., the b_i's) are uncorrelated; their covariance matrix is diagonal:

      U^T C U = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_K \end{bmatrix}

- The covariance matrix represents only second-order statistics among the vector values.

- Since the new variables are linear combinations of the original variables, it is usually difficult to interpret their meaning.
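The methodology and the uncorrelatedness property above can be illustrated with a short numerical sketch. The NumPy example below is only a sketch under stated assumptions: the helper names (pca_fit, pca_project) and the synthetic toy data are illustrative and do not appear in the notes. It follows Steps 1-6 and then checks numerically that the covariance of the new variables b_i is (approximately) diagonal, with the largest eigenvalues on the diagonal.

```python
import numpy as np

def pca_fit(X, K):
    """PCA following Steps 1-6 of the notes.

    X : (M, N) array with one N-dimensional sample per row.
    K : number of principal components to keep (K << N).
    Returns the mean, the N x K matrix of top-K eigenvectors, and all eigenvalues.
    """
    x_bar = X.mean(axis=0)                    # Step 1: mean vector
    Phi = X - x_bar                           # Step 2: subtract the mean
    C = Phi.T @ Phi / X.shape[0]              # Step 3: C = (1/M) A A^T, N x N
    eigvals, eigvecs = np.linalg.eigh(C)      # Steps 4-5: eigen-decomposition of symmetric C
    order = np.argsort(eigvals)[::-1]         # sort so lambda_1 >= lambda_2 >= ...
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return x_bar, eigvecs[:, :K], eigvals     # Step 6: keep the K largest

def pca_project(X, x_bar, U):
    """Linear transformation b = U^T (x - x_bar), applied row-wise."""
    return (X - x_bar) @ U

# Toy data: 500 samples in N = 5 dimensions that lie near a 2-D subspace.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))

x_bar, U, eigvals = pca_fit(X, K=2)
B = pca_project(X, x_bar, U)

# Covariance of the new variables: approximately diag(lambda_1, lambda_2).
print(np.cov(B, rowvar=False, bias=True))
print(np.diag(eigvals[:2]))
```

Using eigh rather than eig exploits the symmetry of C and returns real eigenvalues in ascending order, which is why the sort is reversed before the top K are kept.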
• How to choose the principal components?

- To choose K, use the following criterion:

      \frac{\sum_{i=1}^{K} \lambda_i}{\sum_{i=1}^{N} \lambda_i} > \text{Threshold}   (e.g., 0.9 or 0.95)

  (This criterion and the reconstruction error below are illustrated in the first sketch at the end of these notes.)

• What is the error due to dimensionality reduction?

- We saw above that an original vector x can be reconstructed using its principal components:

      \hat{x} - \bar{x} = \sum_{i=1}^{K} b_i u_i    or    \hat{x} = \sum_{i=1}^{K} b_i u_i + \bar{x}

- It can be shown that the low-dimensional basis based on the principal components minimizes the reconstruction error:

      e = ||x - \hat{x}||

- It can be shown that this error is equal to:

      e = \frac{1}{2} \sum_{i=K+1}^{N} \lambda_i

• Standardization

- The principal components depend on the units used to measure the original variables, as well as on the range of values they assume.

- We should therefore always standardize the data prior to applying PCA.

- A common standardization method is to transform all the data to have zero mean and unit standard deviation:

      \frac{x_i - \mu}{\sigma}    (\mu and \sigma are the mean and standard deviation of the x_i's)

  (The second sketch at the end of these notes shows the effect of this step.)

• PCA and classification

- PCA is not always an optimal dimensionality-reduction procedure for classification purposes, since the directions of maximum variance are not necessarily the directions that best separate the classes.

• Other problems

- How to handle occlusions?

- How to handle different views of a 3D object?
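The threshold criterion for choosing K and the reconstruction error can be illustrated the same way. The sketch below is self-contained; the toy data and the 0.95 threshold are hypothetical choices for illustration only. The final comparison uses the standard identity that the average squared reconstruction error equals the sum of the discarded eigenvalues.

```python
import numpy as np

# Hypothetical toy data: 400 samples in N = 6 dimensions, close to a 3-D subspace.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3)) @ rng.normal(size=(3, 6)) + 0.05 * rng.normal(size=(400, 6))

# Eigen-decomposition of the sample covariance matrix C = (1/M) A A^T.
x_bar = X.mean(axis=0)
Phi = X - x_bar
eigvals, U = np.linalg.eigh(Phi.T @ Phi / X.shape[0])
eigvals, U = eigvals[::-1], U[:, ::-1]           # descending: lambda_1 >= lambda_2 >= ...

# Choose K with the threshold criterion: (sum of first K eigenvalues) / (sum of all) > 0.95.
ratio = np.cumsum(eigvals) / eigvals.sum()
K = int(np.searchsorted(ratio, 0.95)) + 1
print("chosen K:", K, "retained variance:", ratio[K - 1])

# Reconstruct x_hat = x_bar + sum_{i<=K} b_i u_i and measure the error directly.
B = Phi @ U[:, :K]                               # b = U^T (x - x_bar)
X_hat = x_bar + B @ U[:, :K].T
sq_err = ((X - X_hat) ** 2).sum(axis=1)

# Average squared reconstruction error vs. sum of the discarded eigenvalues.
print("mean ||x - x_hat||^2:", sq_err.mean())
print("sum of discarded eigenvalues:", eigvals[K:].sum())
```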
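Finally, a brief sketch of the standardization step. The two-variable data set is hypothetical and chosen only to show the effect: without z-scoring, the variable with the larger scale dominates the leading eigenvalue.

```python
import numpy as np

# Hypothetical data whose two variables live on very different scales
# (e.g., one measured in metres, one in millimetres).
rng = np.random.default_rng(2)
raw = np.column_stack([rng.normal(0.0, 1.0, 300),       # scale ~ 1
                       rng.normal(0.0, 1000.0, 300)])    # scale ~ 1000

# Standardize each variable: subtract its mean, divide by its standard deviation.
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0)

for name, data in [("raw", raw), ("standardized", Z)]:
    C = np.cov(data, rowvar=False, bias=True)             # sample covariance (1/M)
    lam = np.sort(np.linalg.eigvalsh(C))[::-1]             # eigenvalues, descending
    print(f"{name:>12}: fraction of variance on first PC = {lam[0] / lam.sum():.3f}")
```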