Algorithms for Multiclass Classification and Regularized Regression
Total Page:16
File Type:pdf, Size:1020Kb
442 GERRIT JAN JOHANNES VAN DEN BURG Multiclass classification and regularized regression problems are very common in modern statistical and machine learning applications. On the one hand, multiclass classification problems require the prediction of class labels: given observations of objects that belong to certain classes, can we predict to which class a new object belongs? On the other hand, the regularized regression problem is a variation of the common Algorithms for Multiclass regression problem, which measures how changes in independent variables influence an observed outcome. In regularized regression, constraints are placed on the coefficients of the regression model to DEN BURG GERRIT JAN JOHANNES VAN enforce certain properties in the solution, such as sparsity or limited size. Classification and Regularized In this dissertation several new algorithms are presented for both multiclass classification and regularized regression problems. For multiclass classification the GenSVM method is presented. This method extends the binary support vector machine to multiclass classification problems in a way that is both flexible and Regression general, while maintaining competitive performance and training time. In a different chapter, accurate estimates of the Bayes error are applied to both meta-learning and the construction of so-called classification hierarchies: structures in which a multiclass classification problem is decomposed into several binary classification problems. For regularized regression problems a new algorithm is presented in two parts: first for the sparse regression problem and second as a general algorithm for regularized regression where the regularization function is a measure of the size of the coefficients. In the proposed algorithm graduated nonconvexity is used to slowly introduce the nonconvexity in the problem while iterating towards a solution. The empirical performance and theoretical convergence properties of the algorithm are analyzed with numerical experiments that demonstrate the ability for the algorithm to obtain globally optimal solutions. - Algorithms for Multiclass Classification and Regularized Regression The Erasmus Research Institute of Management (ERIM) is the Research School (Onderzoekschool) in the field of management of the Erasmus University Rotterdam. The founding participants of ERIM are Rotterdam School of Management (RSM), and the Erasmus School of Economics (ESE). ERIM was founded in 1999 and is officially accredited by the Royal Netherlands Academy of Arts and Sciences (KNAW). The research undertaken by ERIM is focused on the management of the firm in its environment, its intra- and interfirm relations, and its business processes in their interdependent connections. The objective of ERIM is to carry out first rate research in management, and to offer an advanced doctoral programme in Research in Management. Within ERIM, over three hundred senior researchers and PhD candidates are active in the different research programmes. From a variety of academic backgrounds and expertises, the ERIM community is united in striving for excellence and working at the forefront of creating new business knowledge. ERIM PhD Series Research in Management Erasmus University Rotterdam (EUR) Erasmus Research Institute of Management Mandeville (T) Building Burgemeester Oudlaan 50 3062 PA Rotterdam, The Netherlands P.O. Box 1738 3000 DR Rotterdam, The Netherlands T +31 10 408 1182 E [email protected] W www.erim.eur.nl 28807_dissertatie_Cover_Gertjan_van_den_Burg.indd Alle pagina's 24-10-17 14:13 ALGORITHMS FOR MULTICLASS CLASSIFICATION AND REGULARIZED REGRESSION Algorithms for Multiclass Classification and Regularized Regression Algoritmes voor classificatie en regressie Thesis to obtain the degree of Doctor from the Erasmus University Rotterdam by command of the rector magnificus Prof.dr. H.A.P. Pols and in accordance with the decision of the Doctorate Board. The public defence shall be held on Friday January 12, 2018 at 09:30 hrs by GERRIT JAN JOHANNES VAN DEN BURG born in ’s-Gravenhage. Doctoral Committee Promotor: Prof.dr. P.J.F. Groenen Other members: Prof.dr. D. Fok Prof.dr. A.O. Hero Prof.dr. H.A.L. Kiers Copromotor: dr. A. Alfons Erasmus Research Institute of Management – ERIM The joint research institute of the Rotterdam School of Management (RSM) and the Erasmus School of Economics (ESE) at the Erasmus University Rotterdam. Internet: https://www.erim.eur.nl. ERIM Electronic Series Portal: https://repub.eur.nl ERIM PhD Series in Research in Management, 442 ERIM reference number: EPS-2017-442-MKT ISBN 978-90-5892-499-5 © 2017, G.J.J. van den Burg Design: G.J.J. van den Burg. This publication (cover and interior) is printed by Tuijtel on recycled paper, BalanceSilk®. The ink used is produced from renewable resources and alcohol free fountain solution. Certifications for the paper and the printing production process: Recycle, EU Ecolabel, FSC®, ISO14001. More info: https://www.tuijtel.com. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the author. The cost of perfection is infinite. – C.G.P. Grey Acknowledgements The road to this dissertation started about seven years ago when I attended a lecture on Multivariate Statistics, taught by prof. Patrick Groenen. During this lecture the “support vector machine” classification method was introduced and I remember that I was immediately very excited about it. A few weeks later, I emailed prof. Groenen to ask whether it would be possible to start a thesis project related to these “SVMs”. Luckily for me, this was possible and we soon started working on a project that would eventually become my MSc thesis and, after some more work, Chapter 2 of this dissertation. After finishing my two MSc theses, I took some time to travel before starting my PhD at Erasmus University Rotterdam in December 2012, with Patrick Groenen as my doctoral advisor. The first task of my PhD was to improve my MSc thesis on multiclass SVMs to such a level that it could be submitted to a top academic journal. This involved a lot more research, programming, and testing but eventually we managed to submit the paper in December 2014 to the Journal of Machine Learning Research, a respected journal in the machine learning field. After a few rounds of revision, the paper was accepted and it got published in December 2016. While the first paper was under review, I started a second project that focused on regularized regression (Chapter 4 of this dissertation). For this project Patrick and I teamed up with Andreas Alfons, who had extensive experience with robust statistics and regularized methods. This collaboration really helped improve the paper and take it to the next level. This paper was submitted to a machine learning conference in 2015 but was unfortunately rejected for various reasons. In the end, we managed to extend the work and make the method much more general, as can be seen in Chapter 5. I am confident that we will be able to submit this work to a good statistics journal in the near future. vii In the winter of 2016 I had the wonderful opportunity to visit the University of Michigan in Ann Arbor for three months to join the research group of prof. Alfred Hero at the EECS department. It was quite the adventure living in the U.S. for three months (especially when cycling to the university through a thick layer of snow!). More importantly however, it was an incredible learning experience to join a different research group for a little while and to work with the people in the group. I am also very happy that this visit resulted in an actual research project that grew out to become Chapter 3 of this dissertation. As the above paragraphs already highlight, I am indebted to a great number of people who contributed in one way or another to my time as a PhD candidate. First of all, I would like to thank Patrick Groenen for taking me on as an MSc student and as a PhD student and for giving me many opportunities to develop myself both as an independent researcher and as a teacher. I’ve enjoyed and learned a lot from our collaboration over the years. I especially admired your way of looking at a problem from a more geometric point of view, which often helped us to gain a lot more intuition about a problem. I certainly hope that even though my PhD ended, our collaboration will not. I would also like to thank Andreas Alfons for being my co-advisor for the last few years of my PhD. Your keen eye for detail and your way of looking at the material has greatly improved Chapters 4 and 5 of this dissertation. Thanks also to prof. Alfred Hero for giving me the opportunity to visit the University of Michigan twice to work on a joined research project and for being part of my doctoral committee. I also want to extend my gratitude to all the members of my doctoral committee for taking the time to read this dissertation and for coming to Rotterdam for the defense. During my time as a PhD student, I had the pleasure of sharing my office with two great colleagues. Niels Rietveld, thank you for welcoming me to the university and for showing me the ropes during the first few months. I’ll never forget the first few days of my PhD when we worked on the window sill because the new desks hadn’t arrived yet! Thomas Visser, thanks for being a nice office mate as well; despite our very different research topics I think we still learned from each other. Besides these two actual office mates, I would also like to thank Alexandra Rusu for being a quasi office mate two doors down the hall – thanks for the friendship and the many “interruptions” during the day that made the life of a PhD student looking at his computer screen all day much less boring! There are two colleagues from the statistics department that I would like to thank here as well.