The Teaching Dimension of Kernel Perceptron

The Teaching Dimension of Kernel Perceptron Akash Kumar Hanqi Zhang Adish Singla Yuxin Chen MPI-SWS University of Chicago MPI-SWS University of Chicago Abstract et al.(2020)). One of the most studied scenarios is the case of teaching a version-space learner (Goldman & Kearns, 1995; Anthony et al., 1995; Zilles et al., Algorithmic machine teaching has been stud- 2008; Doliwa et al., 2014; Chen et al., 2018; Mansouri ied under the linear setting where exact teach- et al., 2019; Kirkpatrick et al., 2019). Upon receiving ing is possible. However, little is known for a sequence of training examples from the teacher, a teaching nonlinear learners. Here, we estab- version-space learner maintains a set of hypotheses that lish the sample complexity of teaching, aka are consistent with the training examples, and outputs teaching dimension, for kernelized perceptrons a random hypothesis from this set. for different families of feature maps. As a warm-up, we show that the teaching complex- As a canonical example, consider teaching a 1- ity is Θ(d) for the exact teaching of linear dimensional binary threshold function fθ∗ (x) = d k 1 perceptrons in R , and Θ(d ) for kernel per- x θ∗ for x [0; 1]. For a learner with a finite (or f − g 2 i ceptron with a polynomial kernel of order k. countable infinite) version space, e.g., θ i=0;:::;n 2 f n g Furthermore, under certain smooth assump- where n Z+ (see Fig. 1a), a smallest training set i 2 i+1 i i+1 tions on the data distribution, we establish a is ; 0 ; ; 1 where θ∗ < ; thus the f n n g n ≤ n rigorous bound on the complexity for approxi- teaching dimension is 2. However, when the version mately teaching a Gaussian kernel perceptron. space is continuous, the teaching dimension becomes We provide numerical examples of the opti- , because it is no longer possible for the learner to 1 mal (approximate) teaching set under several pick out a unique threshold θ∗ with a finite training canonical settings for linear, polynomial and set. This is due to two key (limiting) modeling assump- Gaussian kernel perceptrons. tions of the version-space learner: (1) all (consistent) hypotheses in the version space are treated equally, and (2) there exists a hypothesis in the version space 1 Introduction that is consistent with all training examples. As one can see, these assumptions fail to capture the behavior of many modern learning algorithms, where the best Machine teaching studies the problem of finding an opti- hypotheses are often selected via optimizing certain mal training sequence to steer a learner towards a target loss functions, and the data is not perfectly separable concept (Zhu et al., 2018). An important learning- (i.e. not realizable w.r.t. the hypothesis/model class). theoretic complexity measure of machine teaching is the teaching dimension (Goldman & Kearns, 1995), To lift these modeling assumptions, a more realistic which specifies the minimal number of training ex- teaching scenario is to consider the learner as an empir- amples required in the worst case to teach a target ical risk minimizer (ERM). In fact, under the realizable concept. Over the past few decades, the notion of setting, the version-space learner could be viewed as an teaching dimension has been investigated under a va- ERM that optimizes the 0-1 loss—one that finds all hy- riety of learner’s models and teaching protocols (e.g,. potheses with zero training error. Recently, Liu & Zhu Cakmak & Lopes(2012); Singla et al.(2013; 2014); Liu (2016) studied the teaching dimension of linear ERM, et al.(2017); Haug et al.(2018); Tschiatschek et al. and established values of teaching dimension for several (2019); Liu et al.(2018); Kamalaruban et al.(2019); classes of linear (regularized) ERM learners, including Hunziker et al.(2019); Devidze et al.(2020); Rakhsha support vector machine (SVM), logistic regression and ridge regression. As illustrated in Fig. 1b, for the previ- th Proceedings of the 24 International Conference on Artifi- ous example it suffices to use (θ∗ , 0) ; (θ∗ + , 1) cial Intelligence and Statistics (AISTATS) 2021, San Diego, f − g with any min(1 θ∗; θ∗) as training set to teach California, USA. PMLR: Volume 130. Copyright 2021 by θ ≤ − l the author(s). ∗ as an optimizer of the SVM objective (i.e., 2 regu- <latexit sha1_base64="2bie5j0FklXVgHszbRALhkB9Hm0=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEsN4KXjy2YD+gDWWznbRrN5uwuxFK6C/w4kERr/4kb/4bt20O2vpg4PHeDDPzgkRwbVz32ylsbG5t7xR3S3v7B4dH5eOTto5TxbDFYhGrbkA1Ci6xZbgR2E0U0igQ2Akmd3O/84RK81g+mGmCfkRHkoecUWOlpjsoV9yquwBZJ15OKpCjMSh/9YcxSyOUhgmqdc9zE+NnVBnOBM5K/VRjQtmEjrBnqaQRaj9bHDojF1YZkjBWtqQhC/X3REYjradRYDsjasZ61ZuL/3m91IQ1P+MySQ1KtlwUpoKYmMy/JkOukBkxtYQyxe2thI2poszYbEo2BG/15XXSvqp619Xb5nWlXsvjKMIZnMMleHADdbiHBrSAAcIzvMKb8+i8OO/Ox7K14OQzp/AHzucPefeMtg==</latexit> <latexit sha1_base64="GqPwQjZ2r3YPS+cj+fJ5jHVTlCo=">AAAB73icbVDLSgNBEOyNrxhfUY9eBoMgHsKuBIy3gBePEcwDkjXMTmaTIbOz60yvEEJ+wosHRbz6O978GyfJHjSxoKGo6qa7K0ikMOi6305ubX1jcyu/XdjZ3ds/KB4eNU2casYbLJaxbgfUcCkUb6BAyduJ5jQKJG8Fo5uZ33ri2ohY3eM44X5EB0qEglG0UruLQ4704aJXLLlldw6ySryMlCBDvVf86vZjlkZcIZPUmI7nJuhPqEbBJJ8WuqnhCWUjOuAdSxWNuPEn83un5MwqfRLG2pZCMld/T0xoZMw4CmxnRHFolr2Z+J/XSTGs+hOhkhS5YotFYSoJxmT2POkLzRnKsSWUaWFvJWxINWVoIyrYELzll1dJ87LsVcrXd5VSrZrFkYcTOIVz8OAKanALdWgAAwnP8ApvzqPz4rw7H4vWnJPNHMMfOJ8/wiSPxg==</latexit> <latexit sha1_base64="2bie5j0FklXVgHszbRALhkB9Hm0=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEsN4KXjy2YD+gDWWznbRrN5uwuxFK6C/w4kERr/4kb/4bt20O2vpg4PHeDDPzgkRwbVz32ylsbG5t7xR3S3v7B4dH5eOTto5TxbDFYhGrbkA1Ci6xZbgR2E0U0igQ2Akmd3O/84RK81g+mGmCfkRHkoecUWOlpjsoV9yquwBZJ15OKpCjMSh/9YcxSyOUhgmqdc9zE+NnVBnOBM5K/VRjQtmEjrBnqaQRaj9bHDojF1YZkjBWtqQhC/X3REYjradRYDsjasZ61ZuL/3m91IQ1P+MySQ1KtlwUpoKYmMy/JkOukBkxtYQyxe2thI2poszYbEo2BG/15XXSvqp619Xb5nWlXsvjKMIZnMMleHADdbiHBrSAAcIzvMKb8+i8OO/Ox7K14OQzp/AHzucPefeMtg==</latexit> <latexit sha1_base64="QrxaAz2yQLwjgOwEUUvZ/CWYkT0=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEsN4KXjy2YD+gDWWznbRrN5uwuxFK6C/w4kERr/4kb/4bt20O2vpg4PHeDDPzgkRwbVz32ylsbG5t7xR3S3v7B4dH5eOTto5TxbDFYhGrbkA1Ci6xZbgR2E0U0igQ2Akmd3O/84RK81g+mGmCfkRHkoecUWOlpjcoV9yquwBZJ15OKpCjMSh/9YcxSyOUhgmqdc9zE+NnVBnOBM5K/VRjQtmEjrBnqaQRaj9bHDojF1YZkjBWtqQhC/X3REYjradRYDsjasZ61ZuL/3m91IQ1P+MySQ1KtlwUpoKYmMy/JkOukBkxtYQyxe2thI2poszYbEo2BG/15XXSvqp619Xb5nWlXsvjKMIZnMMleHADdbiHBrSAAcIzvMKb8+i8OO/Ox7K14OQzp/AHzucPe3uMtw==</latexit> <latexit sha1_base64="GqPwQjZ2r3YPS+cj+fJ5jHVTlCo=">AAAB73icbVDLSgNBEOyNrxhfUY9eBoMgHsKuBIy3gBePEcwDkjXMTmaTIbOz60yvEEJ+wosHRbz6O978GyfJHjSxoKGo6qa7K0ikMOi6305ubX1jcyu/XdjZ3ds/KB4eNU2casYbLJaxbgfUcCkUb6BAyduJ5jQKJG8Fo5uZ33ri2ohY3eM44X5EB0qEglG0UruLQ4704aJXLLlldw6ySryMlCBDvVf86vZjlkZcIZPUmI7nJuhPqEbBJJ8WuqnhCWUjOuAdSxWNuPEn83un5MwqfRLG2pZCMld/T0xoZMw4CmxnRHFolr2Z+J/XSTGs+hOhkhS5YotFYSoJxmT2POkLzRnKsSWUaWFvJWxINWVoIyrYELzll1dJ87LsVcrXd5VSrZrFkYcTOIVz8OAKanALdWgAAwnP8ApvzqPz4rw7H4vWnJPNHMMfOJ8/wiSPxg==</latexit> <latexit sha1_base64="2bie5j0FklXVgHszbRALhkB9Hm0=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEsN4KXjy2YD+gDWWznbRrN5uwuxFK6C/w4kERr/4kb/4bt20O2vpg4PHeDDPzgkRwbVz32ylsbG5t7xR3S3v7B4dH5eOTto5TxbDFYhGrbkA1Ci6xZbgR2E0U0igQ2Akmd3O/84RK81g+mGmCfkRHkoecUWOlpjsoV9yquwBZJ15OKpCjMSh/9YcxSyOUhgmqdc9zE+NnVBnOBM5K/VRjQtmEjrBnqaQRaj9bHDojF1YZkjBWtqQhC/X3REYjradRYDsjasZ61ZuL/3m91IQ1P+MySQ1KtlwUpoKYmMy/JkOukBkxtYQyxe2thI2poszYbEo2BG/15XXSvqp619Xb5nWlXsvjKMIZnMMleHADdbiHBrSAAcIzvMKb8+i8OO/Ox7K14OQzp/AHzucPefeMtg==</latexit> <latexit sha1_base64="uTQlYwH9iubcAeQoIo2VSlEv+2M=">AAAB+nicbVBNS8NAEN34WetXq0cvwSKIYEmkYL0VvHisYD+giWWznbRLNx/sTpQS+1O8eFDEq7/Em//GTZuDtj4YeLw3w8w8LxZcoWV9Gyura+sbm4Wt4vbO7t5+qXzQVlEiGbRYJCLZ9agCwUNoIUcB3VgCDTwBHW98nfmdB5CKR+EdTmJwAzoMuc8ZRS31S2UHR4D0/uzcgVhxkWkVq2rNYC4TOycVkqPZL305g4glAYTIBFWqZ1sxuimVyJmAadFJFMSUjekQepqGNADlprPTp+aJVgamH0ldIZoz9fdESgOlJoGnOwOKI7XoZeJ/Xi9Bv+6mPIwThJDNF/mJMDEysxzMAZfAUEw0oUxyfavJRlRShjqtog7BXnx5mbQvqnatenVbqzTqeRwFckSOySmxySVpkBvSJC3CyCN5Jq/kzXgyXox342PeumLkM4fkD4zPHwF3k9Q=</latexit> <latexit sha1_base64="QrxaAz2yQLwjgOwEUUvZ/CWYkT0=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEsN4KXjy2YD+gDWWznbRrN5uwuxFK6C/w4kERr/4kb/4bt20O2vpg4PHeDDPzgkRwbVz32ylsbG5t7xR3S3v7B4dH5eOTto5TxbDFYhGrbkA1Ci6xZbgR2E0U0igQ2Akmd3O/84RK81g+mGmCfkRHkoecUWOlpjcoV9yquwBZJ15OKpCjMSh/9YcxSyOUhgmqdc9zE+NnVBnOBM5K/VRjQtmEjrBnqaQRaj9bHDojF1YZkjBWtqQhC/X3REYjradRYDsjasZ61ZuL/3m91IQ1P+MySQ1KtlwUpoKYmMy/JkOukBkxtYQyxe2thI2poszYbEo2BG/15XXSvqp619Xb5nWlXsvjKMIZnMMleHADdbiHBrSAAcIzvMKb8+i8OO/Ox7K14OQzp/AHzucPe3uMtw==</latexit> <latexit sha1_base64="GqPwQjZ2r3YPS+cj+fJ5jHVTlCo=">AAAB73icbVDLSgNBEOyNrxhfUY9eBoMgHsKuBIy3gBePEcwDkjXMTmaTIbOz60yvEEJ+wosHRbz6O978GyfJHjSxoKGo6qa7K0ikMOi6305ubX1jcyu/XdjZ3ds/KB4eNU2casYbLJaxbgfUcCkUb6BAyduJ5jQKJG8Fo5uZ33ri2ohY3eM44X5EB0qEglG0UruLQ4704aJXLLlldw6ySryMlCBDvVf86vZjlkZcIZPUmI7nJuhPqEbBJJ8WuqnhCWUjOuAdSxWNuPEn83un5MwqfRLG2pZCMld/T0xoZMw4CmxnRHFolr2Z+J/XSTGs+hOhkhS5YotFYSoJxmT2POkLzRnKsSWUaWFvJWxINWVoIyrYELzll1dJ87LsVcrXd5VSrZrFkYcTOIVz8OAKanALdWgAAwnP8ApvzqPz4rw7H4vWnJPNHMMfOJ8/wiSPxg==</latexit> <latexit sha1_base64="+KldzglugnuR7NcCxtMwkEsn9xU=">AAAB+nicbVBNS8NAEN34WetXq0cvwSKIQkmkYL0VvHisYD+giWWznbRLNx/sTpQS+1O8eFDEq7/Em//GTZuDtj4YeLw3w8w8LxZcoWV9Gyura+sbm4Wt4vbO7t5+qXzQVlEiGbRYJCLZ9agCwUNoIUcB3VgCDTwBHW98nfmdB5CKR+EdTmJwAzoMuc8ZRS31S2UHR4D0/uzcgVhxkWkVq2rNYC4TOycVkqPZL305g4glAYTIBFWqZ1sxuimVyJmAadFJFMSUjekQepqGNADlprPTp+a <latexit sha1_base64="wi34xSZ/KS5JCNIeyECh4VoWQys=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU02kYL0VvHisaG2hDWWznbRLN5uwuxFK6E/w4kERr/4ib/4bt20O2vpg4PHeDDPzgkRwbVz32ymsrW9sbhW3Szu7e/sH5cOjRx2nimGLxSJWnYBqFFxiy3AjsJMopFEgsB2Mb2Z++wmV5rF8MJME/YgOJQ85o8ZK9/xC9ssVt+rOQVaJl5MK5Gj2y1+9QczSCKVhgmrd9dzE+BlVhjOB01Iv1ZhQNqZD7FoqaYTaz+anTsmZVQYkjJUtachc/T2R0UjrSRTYzoiakV72ZuJ/Xjc1Yd3PuExSg5ItFoWpICYms7/JgCtkRkwsoUxxeythI6ooMzadkg3BW355lTxeVr1a9fquVmnU8ziKcAKncA4eXEEDbqEJLWAwhGd4hTdHOC/Ou/OxaC04+cwx/IHz+QMJOI2g</latexit>

The Teaching Dimension of Kernel Perceptron

Designing Algorithms for Machine Learning and Data Mining

Mistake Bounds for Binary Matrix Completion

Kernel Methods and Factorization for Image and Video Analysis

COMS 4771 Perceptron and Kernelization

Kernel Methods for Pattern Analysis

Fast Kernel Classifiers with Online and Active Learning

Online Passive-Aggressive Algorithms on a Budget

Mistake Bounds for Binary Matrix Completion

Output Neuron

A Review of Learning Planning Action Models Ankuj Arora, Humbert Fiorino, Damien Pellier, Marc Etivier, Sylvie Pesty

Explaining RNN Predictions for Sentiment Classification

Outline of Machine Learning