Principal Composite Kernel Feature Analysis: Data-Dependent Kernel Approach

Yuichi Motai, Senior Member, IEEE, and Hiroyuki Yoshida, Member, IEEE

Y. Motai is with the Sensory Intelligent Laboratory, Department of Electrical and Computer Engineering, School of Engineering, Virginia Commonwealth University, PO Box 843068, 601 West Main Street, Richmond, VA 23284-3068. E-mail: [email protected].
H. Yoshida is with 3D Imaging Research, Department of Radiology, Harvard Medical School and Massachusetts General Hospital, 25 New Chardon Street, Suite 400C, Boston, MA 02114. E-mail: [email protected].

Abstract—Principal composite kernel feature analysis (PC-KFA) is presented to show kernel adaptations for nonlinear features of medical image data sets (MIDS) in computer-aided diagnosis (CAD). The proposed algorithm PC-KFA extends existing studies on kernel feature analysis (KFA), which extracts salient features from a sample of unclassified patterns by use of a kernel method. The principal composite process of PC-KFA is applied here to kernel principal component analysis [34] and to our previously developed accelerated kernel feature analysis [20]. Unlike other kernel-based feature selection algorithms, PC-KFA iteratively constructs a linear subspace of a high-dimensional feature space by maximizing a variance condition for the nonlinearly transformed samples, which we call the data-dependent kernel approach. The resulting kernel subspace is first chosen by principal component analysis, and then processed into a composite kernel subspace through the efficient combination representations used for further reconstruction and classification. Numerical experiments on several MID feature spaces of cancer CAD data show that PC-KFA generates an efficient and effective feature representation, and yields better classification performance for the proposed composite kernel subspace using a simple pattern classifier.

Index Terms—Principal component analysis, data-dependent kernel, nonlinear subspace, manifold structures

1 INTRODUCTION

Capitalizing on the recent success of kernel methods in pattern classification [62], [63], [64], [65], Schölkopf and Smola [34] developed and studied a feature selection algorithm in which principal component analysis (PCA) was effectively applied to a sample of $n$ $d$-dimensional patterns that are first injected into a high-dimensional Hilbert space using a nonlinear embedding. Heuristically, embedding input patterns into a high-dimensional space may elucidate salient nonlinear features in the input distribution, in the same way that nonlinearly separable classification problems may become linearly separable in higher dimensional spaces, as suggested by the Vapnik-Chervonenkis theory [14]. Both the PCA and the nonlinear embedding are facilitated by a Mercer kernel of two arguments, $k: \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$, which effectively computes the inner product of the transformed arguments. This algorithm, called kernel principal component analysis (KPCA), thus avoids the problem of representing transformed vectors in the Hilbert space, and enables the computation of the inner product of two transformed vectors of arbitrarily high dimension in constant time. Nevertheless, KPCA has two deficiencies: 1) the computation of the principal components involves the solution of an eigenvalue problem that requires $O(n^3)$ computations, and 2) each principal component in the Hilbert space depends on every one of the $n$ input patterns, which defeats the goal of obtaining both an informative and concise representation.
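To make the $O(n^3)$ bottleneck concrete, the following minimal Python/NumPy sketch of KPCA is offered as our own illustration, not the authors' implementation: it builds a Gram matrix (here with a Gaussian kernel, an assumed choice; see (3) in Section 2.1), centers it in feature space, and solves the dense symmetric eigenvalue problem whose cubic cost motivates the sparse alternatives discussed next. All function names are ours.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian RBF Gram matrix.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kpca(X, n_components, sigma=1.0):
    n = X.shape[0]
    K = gaussian_gram(X, sigma)
    # Center the Gram matrix in feature space:
    # Kc = K - 1n K - K 1n + 1n K 1n, where 1n is the n x n matrix of 1/n.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Dense symmetric eigendecomposition: the O(n^3) step criticized above.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Scale eigenvectors so each feature-space component has unit norm,
    # then project the training samples onto the leading components.
    alphas = eigvecs[:, order] / np.sqrt(np.maximum(eigvals[order], 1e-12))
    return Kc @ alphas
```

A call such as `kpca(X, 10)` returns the $n \times 10$ matrix of kernel principal component scores for the training set; note that every score depends on all $n$ patterns, which is the second deficiency noted above.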
Both of these deficiencies have been addressed in subsequent investigations that seek sets of salient features that depend only upon sparse subsets of the transformed input patterns. Tipping [43] applied a maximum-likelihood technique to approximate the transformed covariance matrix in terms of such a sparse subset. Franc and Hlaváč [21] proposed a greedy method that approximates the mapped space representation by selecting a subset of the input data; it iteratively extracts data in the mapped space until the reconstruction error in the mapped high-dimensional space falls below a threshold value. Its computational complexity is $O(nm^3)$, where $n$ is the number of input patterns and $m$ is the cardinality of the subset. Zheng et al. [56] split the input data into $M$ groups of similar size and applied KPCA to each group; a set of eigenvectors was obtained for each group, and KPCA was then applied to a subset of these eigenvectors to obtain a final set of features. Although these studies proposed useful approaches, none provided a method that is both computationally efficient and accurate.

To avoid the $O(n^3)$ eigenvalue problem, Mangasarian et al. [16] proposed sparse kernel feature analysis (SKFA), which extracts $\ell$ features, one by one, using an $\ell_1$-constraint on the expansion coefficients. SKFA requires only $O(\ell^2 n^2)$ operations and is thus a significant improvement over KPCA if the number of dominant features is much smaller than the data size. However, if $\ell > \sqrt{n}$, the computational cost of SKFA is likely to exceed that of KPCA.

In this paper, we propose an accelerated kernel feature analysis (AKFA) that generates $\ell$ sparse features from a data set of $n$ patterns using $O(\ell n^2)$ operations. Since AKFA is based on both KPCA and SKFA, we analyze those two algorithms and then describe AKFA in Section 2.

We have also evaluated existing multiple kernel learning (MKL) approaches [66], [68], and found that they rely little on the data sets themselves when combining and choosing kernel functions. The choice of an appropriate kernel function reflects prior knowledge about the problem at hand. However, it is often difficult to exploit such prior knowledge when choosing a kernel function, and how to choose the best kernel function for a given data set remains an open question. According to the no free lunch theorem [40] of machine learning, there is no superior kernel function in general; the performance of a kernel function depends on the application and, specifically, on the data set. The five kernel functions, namely linear, polynomial, Gaussian, Laplace, and sigmoid, are chosen because they are known to perform well [40], [41], [42], [43], [44], [45].

The main contribution of this paper is principal composite kernel feature analysis (PC-KFA), described in Section 3. In this new approach, kernel adaptation is applied to the kernel algorithms above, KPCA and AKFA, in three forms: selecting the best kernel, engineering a composite kernel as a combination of data-dependent kernels, and determining the optimal number of kernels in the combination (a minimal illustration of such a composite follows this section's outline). Other MKL approaches combine basic kernels, whereas our proposed PC-KFA specifically chooses data-dependent kernels as linear composites.

In Section 4, we summarize numerical evaluation experiments based on medical image data sets (MIDs) in computer-aided diagnosis (CAD) using the proposed PC-KFA:

1. to choose the kernel function,
2. to evaluate feature representation by calculating reconstruction errors,
3. to choose the number of kernel functions,
4. to composite the multiple kernel functions,
5. to evaluate feature classification using a simple classifier, and
6. to analyze the computation time.

Our conclusions appear in Section 5.
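Before the formal treatment in Section 3, here is a minimal sketch (our illustration, not the PC-KFA algorithm itself) of the algebraic fact the composite-kernel idea rests on: a nonnegative weighted sum of positive semidefinite Gram matrices is again positive semidefinite, so data-dependent kernels can be combined linearly. The function name and example weights are assumptions.

```python
import numpy as np

def composite_gram(grams, rho):
    # grams: list of n x n positive semidefinite Gram matrices, one per
    # candidate kernel; rho: nonnegative combination weights.
    # A nonnegative combination of positive semidefinite matrices is again
    # positive semidefinite, so the composite is still a valid Mercer kernel.
    rho = np.asarray(rho, dtype=float)
    if rho.min() < 0 or len(rho) != len(grams):
        raise ValueError("need one nonnegative weight per Gram matrix")
    return sum(r * K for r, K in zip(rho, grams))
```

PC-KFA derives both the weights and the number of combined kernels from the data (Section 3); uniform weights such as `rho = [0.5, 0.5]` would serve only as a placeholder.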
2 KERNEL FEATURE ANALYSIS

2.1 Kernel Basics

Using Mercer's theorem [15], a nonlinear, positive-definite kernel $k: \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ of an integral operator can be computed by the inner product of the transformed vectors $\langle \Phi(x), \Phi(y) \rangle$, where $\Phi: \mathbb{R}^d \to H$ denotes a nonlinear embedding (induced by $k$) into a possibly infinite-dimensional Hilbert space $H$. Given $n$ sample points in the domain $X_n = \{x_i \in \mathbb{R}^d \mid i = 1, \ldots, n\}$, the image $Y_n = \{\Phi(x_i) \mid i = 1, \ldots, n\}$ of $X_n$ spans a linear subspace of at most $(n-1)$ dimensions. By mapping the sample points into the higher dimensional space $H$, the dominant linear correlations in the mapped data may reveal salient nonlinear features of the input distribution. The kernel (Gram) matrix is defined between any finite sequence of inputs $x := \{x_j : j \in \mathbb{N}_n\}$ as

$$K := (K(x_i, x_j) : i, j \in \mathbb{N}_n) = (\langle \Phi(x_i), \Phi(x_j) \rangle).$$

Commonly used kernel matrices are as follows [34]:

• The linear kernel:
$$K(x, x_i) = x^T x_i, \qquad (1)$$

• The polynomial kernel:
$$K(x, x_i) = (x^T x_i + \mathrm{offset})^d, \qquad (2)$$

• The Gaussian RBF kernel:
$$K(x, x_i) = \exp\left(-\|x - x_i\|^2 / 2\sigma^2\right), \qquad (3)$$

• The Laplace RBF kernel:
$$K(x, x_i) = \exp(-\sigma \|x - x_i\|), \qquad (4)$$

• The sigmoid kernel:
$$K(x, x_i) = \tanh(\beta_0 x^T x_i + \beta_1), \qquad (5)$$

• The ANOVA RB kernel:
$$K(x, x_i) = \left(\sum_{k=1}^{n} \exp\left(-\sigma (x^k - x_i^k)^2\right)\right)^d, \qquad (6)$$

• The linear spline kernel in one dimension:
$$K(x, x_i) = 1 + x x_i + x x_i \min(x, x_i) - \frac{x + x_i}{2} \min(x, x_i)^2 + \frac{\min(x, x_i)^3}{3}. \qquad (7)$$

Kernel selection is heavily dependent on the specific data set. Currently, the most commonly used kernel functions for general purposes, when prior knowledge of the data is not available, are the Gaussian and Laplace RBF kernels. The Gaussian kernel avoids a sparse distribution, whereas a high-degree polynomial kernel may spread the distribution across the large feature space. The polynomial kernel is widely used in image processing, while the ANOVA RB kernel is often used for regression tasks. The spline kernels are useful for continuous signal-processing algorithms that involve B-spline inner products or the convolution of several spline basis functions. Thus, in this paper, we adopt only the first five kernels, (1)-(5).
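As a concrete companion to (1)-(5), the sketch below writes the five adopted kernels as plain Python functions. This is our own illustration; the parameter names and defaults (`offset`, `degree`, `sigma`, `beta0`, `beta1`) are placeholders rather than the paper's settings, since in practice such parameters are tuned per data set.

```python
import numpy as np

# Illustrative implementations of the five adopted kernels (1)-(5).
# Inputs x and xi are 1-D NumPy arrays of equal length.

def k_linear(x, xi):
    return float(x @ xi)                                              # (1)

def k_polynomial(x, xi, offset=1.0, degree=3):
    return float((x @ xi + offset) ** degree)                         # (2)

def k_gaussian(x, xi, sigma=1.0):
    return float(np.exp(-np.sum((x - xi) ** 2) / (2 * sigma ** 2)))   # (3)

def k_laplace(x, xi, sigma=1.0):
    return float(np.exp(-sigma * np.linalg.norm(x - xi)))             # (4)

def k_sigmoid(x, xi, beta0=1.0, beta1=0.0):
    return float(np.tanh(beta0 * (x @ xi) + beta1))                   # (5)

# Candidate pool from which a data-dependent choice or composite is built.
KERNELS = {"linear": k_linear, "polynomial": k_polynomial,
           "gaussian": k_gaussian, "laplace": k_laplace,
           "sigmoid": k_sigmoid}
```

Evaluating each candidate over all sample pairs yields the Gram matrices that the selection and composition steps of the following sections operate on.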