Journal of Machine Learning Research 7 (2006) 1531–1565
Submitted 10/05; Published 7/06

Large Scale Multiple Kernel Learning

Sören Sonnenburg
[email protected]
Fraunhofer FIRST.IDA
Kekuléstrasse 7
12489 Berlin, Germany

Gunnar Rätsch
[email protected]
Friedrich Miescher Laboratory of the Max Planck Society
Spemannstrasse 39
Tübingen, Germany

Christin Schäfer
[email protected]
Fraunhofer FIRST.IDA
Kekuléstrasse 7
12489 Berlin, Germany

Bernhard Schölkopf
[email protected]
Max Planck Institute for Biological Cybernetics
Spemannstrasse 38
72076 Tübingen, Germany

Editors: Kristin P. Bennett and Emilio Parrado-Hernández

Abstract

While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm works for hundreds of thousands of examples or hundreds of kernels to be combined, and helps with automatic model selection, improving the interpretability of the learning result. In a second part, we discuss general speed-up mechanisms for SVMs, especially when used with sparse feature maps as they appear for string kernels, allowing us to train a string kernel SVM on a real-world splice data set with 10 million examples from computational biology.