Intuition Learning

Anush Sankaran(1), Mayank Vatsa(2), Richa Singh(2)
(1) IBM Research
(2) IIIT, Delhi, India
[email protected], [email protected], [email protected]

[Figure 1: The proposed solution where an intuition learning algorithm can supplement a supervised or semi-supervised algorithm at decision level. Panels: (a) supervised learning, (b) transfer learning, (c) semi-supervised learning, (d) self-taught learning, and (e) intuition learning; each panel shows how labeled data, unlabeled data, a feature subspace, and a supervised or reinforcement based representation learner interact to produce labels.]
Abstract

"By reading the title and abstract, do you think this paper will be accepted in IJCAI 2018?" A common impromptu reply would be "I don't know, but I have an intuition that this paper might get accepted". Intuition is often employed by humans to solve challenging problems without explicit effort. Intuition is not trained but is learned from one's own experience and observation. The aim of this research is to provide intuition to an algorithm, apart from what it is trained to know in a supervised manner. We present a novel intuition learning framework that learns to perform a task completely from unlabeled data. The proposed framework uses a continuous-state reinforcement learning mechanism to learn a feature representation and a data-label mapping function using unlabeled data. The mapping function and feature representation are succinct and can be used to supplement any supervised or semi-supervised algorithm. Experiments on the CIFAR-10 database show challenging cases where intuition learning improves the performance of a given classifier.

1 Introduction

Intuition refers to knowledge acquired without inference and/or the use of reason [Simpson et al., 1989]. Philosophically, there are several definitions of intuition; the most popularly used one is "thoughts that are reached with little apparent effort, and typically without conscious awareness" [Hogarth, 2001], and it is considered the opposite of a rational process. From a machine learning perspective, training a supervised classifier is a rational process in which the classifier is trained with labeled data, allowing it to learn a decision boundary. Traditional unsupervised methods, on the other hand, do not map the learnt patterns to their corresponding class labels. Semi-supervised approaches bridge this gap by leveraging unlabeled data to better perform supervised learning tasks. However, the final task (say, classification) is performed only by a supervised classifier using labeled data with some additional knowledge from unsupervised learning. The notion of intuition would mean that the system performs the task using only unlabeled data, without any supervised (rational) learning. In other words, intuition is context dependent guesswork that can be incorrect at times.

In a typical learning pipeline, the concept of intuition can be used for a variety of purposes, from training data selection up to and including decision making. Heuristics are the simplest form of intuition; they bypass, or are used in conjunction with, rational decisions to obtain quick approximate results. For example, heuristics can be used in (1) choosing new data points in an online active learning scenario [Chu et al., 2011], (2) feature representation [Coates et al., 2011], (3) feature selection [Guyon and Elisseeff, 2003], or (4) the choice of classifier and its parameters [Cavalin et al., 2012].

Table 1 shows a comparison of existing popular machine learning paradigms. Supervised learning attempts to learn an input-output mapping function on a feature space using a set of labeled training data. Transfer learning aims to improve the target learning function using the knowledge in a source (related) domain and source learning tasks [Pan and Yang, 2010]. Many types of knowledge transfer, such as classification parameters [Lawrence and Platt, 2004], feature representations [Evgeniou and Pontil, 2007], and training instances [Huang et al., 2006], have been tested to improve the performance of supervised learning tasks. Semi-supervised learning utilizes additional knowledge from unlabeled data drawn from the same distribution and having the same task labels as the labeled data. Much of this research has focused on unsupervised feature learning, i.e., creating a feature subspace using the unlabeled data, onto which the labeled data can be projected to obtain a new feature representation [Chapelle et al., 2006]. In 2007, Raina et al. [Raina et al., 2007] proposed a framework termed "self-taught learning" to create the generic feature subspace using sparse auto-encoders irrespective of the task labels. Self-taught learning dismisses the same-class-label assumption of semi-supervised learning and forms a generic high-level feature subspace from the unlabeled data, onto which the labeled data can be projected.

Paradigm | Input data | Learnt function | Comments
Supervised [Bishop and Nasrabadi, 2006] | data, labels | data-label mapping |
Unsupervised [Bishop and Nasrabadi, 2006] | data | clusters |
Semi-supervised [Chapelle et al., 2006] | data, labels, unlabeled data | data-label mapping | unlabeled data follow the same distribution
Reinforcement [Kaelbling et al., 1996] | state, action, reward function (or value) | policy | need a teacher to provide reward
Active [Settles, 2010] | | data-label mapping, new data selection | need human annotator (Oracle) or expert algorithm to provide labels for new data
Transfer [Pan and Yang, 2010] | | targetData-targetLabel mapping | transfer can be data instances, classification parameters, or features
Imitation [Natarajan et al., 2011] | sourceData, sourceData-sourceLabel mapping | targetData-targetLabel mapping | need a teacher to provide reward
Self taught [Raina et al., 2007] | data, labels, unlabeled data | data-label mapping | unlabeled data need not follow the same distribution and label as data
Deep learning [Bengio, 2009] | data, labels, unlabeled data | data-label mapping | complex architecture to learn robust data representations
Intuition | data, unlabeled data, reward function (or value) | data-label mapping | unlabeled data need not follow the same distribution, need a reward function

Table 1: Comparison of existing popular machine learning paradigms along with the proposed intuition learning paradigm.

As shown in Figure 1, we postulate a framework that supplements a supervised or semi-supervised classifier with intuition decisions at the decision level. The decisions drawn by the intuition learning block in Figure 1 are called intuition because they are learnt only from the unlabeled data with an indirect reward from a teacher. Existing algorithms broadly require training labels for building a classifier or borrow the classifier parameters from an already trained classifier. Direct or indirect training is not always possible, as obtaining data labels is very costly. To address this challenge, we propose a novel paradigm for an unsupervised task performance mechanism learnt from cumulative experience. Intuition is modeled as a learning framework which provides the ability to learn a task completely from unlabeled data. By using continuous state reinforcement learning as a classifier, the framework learns to perform the classification task without the need for explicit labeled data. Reinforcement learning helps in adapting a randomly initialized feature space to the specific task at hand, where a parallel supervised classifier is used as a teacher. As the proposed framework is able to learn a mapping function from the input data to the output class labels without the requirement for explicit training, it functions similar to human intuition, and we term this approach Intuition Learning.

1.1 Research Contributions

This research proposes an intuition learning framework to enable algorithms to learn a specific classification or regression task completely from unlabeled data. The major contributions of this research are as follows:
• A continuous state reinforcement learning based classification framework is proposed to map input data to output class labels without explicit training.
• A residual Q-learning based function approximation method is proposed for learning the feature representation of task specific data. A novel reward function, which does not require class labels, is designed to provide feedback to the reinforcement based classification system.
• A context dependent addition framework is proposed, where the result of the intuition framework is added based on the confidence of the trained supervised or semi-supervised mapping function.

[Figure 2: A block diagram outlining how a feature space can be adapted using a reinforcement learning algorithm with feedback from a supervised classifier trained on limited task-specific data. A large set of unlabeled data forms a generic feature subspace; for each of Task #1 to Task #m, task-specific features are derived and an RL-based classifier, guided by limited task-specific data, predicts the label.]

2 An Intuition Learning Algorithm

The basic idea of the proposed intuition learning framework is presented in Figure 2. Given a large set of unlabeled data, different kinds of feature representations are extracted to describe the data, irrespective of the task at hand. To further leverage the knowledge interpreted from the unlabeled data, a continuous state reinforcement learning mechanism is used to perform the given classification task. As reinforcement is a continuous learning process, the classification task improves with time through a reward based feedback mechanism. The reinforcement learner, on the one hand, acts as a classifier, while on the other hand it continuously adapts the feature representation with respect to the given task. Thus, given multiple tasks, the proposed intuition learning framework can adapt the generic feature space to be consistent with the corresponding task.

[Figure 3: Overall scheme of the proposed intuition learning algorithm that aids a supervised classifier. From the unlabeled data {Iu(1), ..., Iu(p)}: (I) extract r features, (II) cluster every feature to form the intuition based feature subspace, and (III) calculate the current state; the state drives feature adaptation and a continuous state reinforcement learner that outputs a predicted label, in parallel with a supervised classifier trained on hand-crafted or unsupervised learnt features of the labeled data.]

Let {(I_l^(1), y^(1)), (I_l^(2), y^(2)), ..., (I_l^(m), y^(m))} be the set of m labeled training data drawn i.i.d. from a distribution D. The labeled data are represented as {(x_l^(1), y^(1)), (x_l^(2), y^(2)), ..., (x_l^(m), y^(m))}, where x_l^(i) ∈ R^n is the feature representation of the data I_l^(i) and y^(i) ∈ [1, 2, ..., C] denotes the class label corresponding to x_l^(i). Let the set of unlabeled data be {I_u^(1), I_u^(2), ..., I_u^(p)}, where the subscript u denotes that the data are unlabeled. Contrary to self-taught learning [Raina et al., 2007], we do not assume that the labeled and unlabeled data are drawn from the same distribution D or have the same class labels; however, they should be derived from the same modality. Given a set of labeled data and a large set of unlabeled data, the aim of intuition learning is to learn a hypothesis h_0 : (X → R) ∈ [1, 2, ..., C] that predicts the label for a given input representation of the data. However, the hypothesis h_0 is learnt without the direct use of the labels y^(i) and is used as a supplement for the hypothesis h learnt using (x_l^(i), y^(i)) in a supervised (or semi-supervised) manner.

2.1 Adapting Feature Representation

From a large set of unlabeled data, many different kinds of feature representations are extracted. Each representation may correspond to a different property of the data that we try to capture. For image data, the features could be color, texture, and shape, while for text data, the features could be n-grams, bag-of-words, and word embeddings. The features can also be a set of different color features or a set of hierarchical n-grams. If the large set of unlabeled data is seen as the world (or the universal set), the features are the different observations made by the algorithm from the world. Similar to human intuition, the set of extracted feature representations is task independent, and later, depending on the learning task, a subset of these features could be dominantly used. This task independent feature space is similar to the human intuition learnt by observing the environment.

Figure 3 provides a detailed description of the proposed intuition learning framework. From the set of unlabeled data I_u, we extract r different kinds of feature representations, {X_u1, X_u2, ..., X_ur}, where X_ui = {x_ui^(1), x_ui^(2), ..., x_ui^(p)} and x_ui^(j) ∈ R^{n_i}. For every feature representation q ∈ [1, 2, ..., r], we cluster the representation [x_uq^(1), x_uq^(2), ..., x_uq^(p)] into C clusters using k-means clustering¹. The cluster centroids for the qth feature representation are given as [z_uq^(1), z_uq^(2), ..., z_uq^(C)]. This collection of centroids, for q = [1, 2, ..., r], is called the Intuition based Feature Subspace (IFS), as it clusters the entire set of unlabeled data into groups based on every observation (feature).

¹The best adaptation results are obtained when C is fixed to the number of classes in the learning task.
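To make the construction above concrete, the following is a minimal Python sketch (not from the paper) of building the IFS, assuming the r feature representations of the unlabeled data have already been extracted into arrays; it uses scikit-learn's k-means, and names such as build_ifs and feature_views are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_ifs(feature_views, C, seed=0):
    """Build the Intuition based Feature Subspace (IFS).

    feature_views : list of r arrays, each of shape (p, n_q), holding one
                    feature representation of the p unlabeled samples.
    C             : number of clusters (Sec. 2.1 suggests C = number of classes).
    Returns a list of r centroid matrices, each of shape (C, n_q).
    """
    ifs = []
    for X_uq in feature_views:                      # one view per feature type q
        km = KMeans(n_clusters=C, n_init=10, random_state=seed).fit(X_uq)
        ifs.append(km.cluster_centers_)             # [z_uq^(1), ..., z_uq^(C)]
    return ifs

# Example with r = 2 synthetic feature views of p = 1000 unlabeled samples.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = [rng.normal(size=(1000, 64)), rng.normal(size=(1000, 512))]
    ifs = build_ifs(views, C=10)
    print([z.shape for z in ifs])                   # [(10, 64), (10, 512)]
```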

2.2 Classification using Reinforcement Learning

For a given set of m labeled training data, {I_l^(1), I_l^(2), ..., I_l^(m)}, the same set of r features (as used for the unlabeled data) is extracted as [x_lq^(1), x_lq^(2), ..., x_lq^(m)], where q = [1, 2, ..., r]. The extracted features are then projected onto the Intuition based Feature Subspace (IFS) by calculating the distance of the features from the corresponding cluster centroids:

    s_q^(i) = || x_lq^(i) − z_uq^(j) ||_2    (1)

for j = [1, 2, ..., C], q = [1, 2, ..., r], and i = [1, 2, ..., m]. The representation of data point i is given by concatenating the distances corresponding to all the features,

    s^(i) = [s_1^(i), s_2^(i), ..., s_r^(i)]    (2)

The obtained representation is succinct, with a fixed length of rC × 1, where r is the number of different feature types extracted and C is the number of clusters; in essence, every value represents the distance from a cluster centroid. In a typical semi-supervised (or self-taught) learning scheme, the mapping between the intuition based representation and the output class labels, {(s^(1), y^(1)), (s^(2), y^(2)), ..., (s^(m), y^(m))}, would be learnt in a supervised manner. In the proposed intuition learning, however, we attempt to learn the data-label mapping without using the class labels, using reinforcement learning. The aim of reinforcement learning is to learn an action policy π : s → a, where s ∈ S is the current state of the system and a is the action performed in that state. As the setup involves a continuous state environment, the optimal action policy is learnt using a model free, off-policy Temporal Difference (TD) algorithm called Q-learning, where the Q(s, a)-value denotes the effectiveness of a state-action pair. The TD(0) Q-learning update is given by

    Q(s_t, a) = Q(s_t, a) + α [ r_t + γ max_{a'} Q(s_{t+1}, a') − Q(s_t, a) ]    (3)

where r_t ∈ R is the immediate reward obtained for performing action a in state s_t, γ ∈ [0, 1] is the factor with which future rewards are discounted, and α ∈ [0, 1] is the learning rate. In our problem, reinforcement learning is formulated as a classification problem: the IFS based representation is the current state s and the action a is the output label to be predicted, so the policy π learns the data-label relation for the given data. Due to the large, probabilistic, and continuous definition of the space s, the Q-values are approximated using a universal function approximator, i.e., a neural network [Sutton and Barto, 1998],

    Q(s, a) = ψ(s, a, θ) = Σ_i φ_i(s, a) · θ_i = φ(s, a)^T · θ    (4)

where φ is the approximation (basis) function. Using the residual Q-learning algorithm [Paletta and Pinz, 2000], the free parameters θ are updated as

    θ_{t+1} = θ_t + α · ψ · Δψ    (5)

which expands to

    θ_{t+1} = θ_t + α [ r_t + γ max_{a'} Q(s_{t+1}, a') − Q(s_t, a) ] × [ βγ (∂/∂θ) max_{a'} Q(s_{t+1}, a') − (∂/∂θ) Q(s_t, a) ]    (6)

where β is a weighting factor called the Bellman residual. Baird [Baird, 1995] guaranteed the convergence of the above approximate Q-learning formulation; the details are skipped for the sake of brevity. An ε-greedy exploration strategy is adopted, where in every state a random action is preferred with a probability of ε. As observed in [Lagoudakis and Parr, 2003], "the crucial factor for a successful approximate algorithm is the choice of the parametric approximation architecture and the choice of the projection (parameter adjustment) method(s)". The choice of the reward function is therefore highly important and directly determines the effectiveness of adaptation; it is explained in the next section.
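The following is a minimal sketch, in the spirit of Eqs. (1)-(6), of the distance-based state construction and a residual-gradient Q-learning classifier. The paper uses a neural network approximator, so the linear form, the class and function names, and the exact sign convention of the residual update are simplifying assumptions made here for illustration; the reward argument is expected to come from the label-free reward function of Section 2.3.

```python
import numpy as np

class ResidualQClassifier:
    """Linear Q(s, a) = theta[a] . s over the IFS state of Eq. (2).

    Actions are the C class labels. Default hyperparameters follow the
    settings reported in Section 3.3 of the paper.
    """

    def __init__(self, state_dim, n_actions, alpha=0.99, gamma=0.95,
                 beta=0.2, epsilon=0.05, seed=0):
        self.theta = np.zeros((n_actions, state_dim))
        self.alpha, self.gamma, self.beta, self.epsilon = alpha, gamma, beta, epsilon
        self.rng = np.random.default_rng(seed)

    def q_values(self, s):
        return self.theta @ s                       # linear stand-in for Eq. (4)

    def act(self, s):
        # epsilon-greedy exploration over the class labels
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(len(self.theta)))
        return int(np.argmax(self.q_values(s)))

    def update(self, s, a, reward, s_next):
        # TD error: delta = r + gamma * max_a' Q(s', a') - Q(s, a)   (Eq. 3)
        q_next = self.q_values(s_next)
        a_next = int(np.argmax(q_next))
        delta = reward + self.gamma * q_next[a_next] - self.q_values(s)[a]
        # Residual-gradient style parameter update (cf. Eqs. 5-6):
        # grad of Q(s, a) w.r.t. theta[a] is s; grad of max Q(s', .) is s_next.
        self.theta[a] += self.alpha * delta * s
        self.theta[a_next] -= self.alpha * self.beta * self.gamma * delta * s_next


def ifs_state(sample_views, ifs):
    """Eqs. (1)-(2): concatenate distances of each feature view to its IFS centroids."""
    parts = [np.linalg.norm(centroids - x, axis=1)   # ||x_lq - z_uq^(j)||_2, j = 1..C
             for x, centroids in zip(sample_views, ifs)]
    return np.concatenate(parts)                     # state vector of length r*C
```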

Type | Feature | Dimension
Color | Color Harris [Van De Weijer et al., 2006] | 10 × 2
Color | Color Autocorrelograms [Huang et al., 1997] | 64 × 1
Local Texture | Local Binary Pattern (LBP) [Ojala et al., 2002] | 59 × 4
Global Texture | GIST [Oliva and Torralba, 2001] | 512 × 1
Saliency | Region covariances [Erdem and Erdem, 2013] | 32 × 32
Shape | Multilayer autoencoder [Vincent et al., 2008] | 10 × 1

Table 2: Details of different features extracted from the image data.

2.3 Design of Reward Function

The Intuition based Feature Subspace (IFS) is defined by the cluster centroid points obtained using the unlabeled data for every feature q as [z_uq^(1), z_uq^(2), ..., z_uq^(C)], where q = [1, 2, ..., r]. This space provides an organized definition of how the entire set of unlabeled data is observed and inferred. From the various features of the labeled training data [(x_lq^(1), y^(1)), (x_lq^(2), y^(2)), ..., (x_lq^(m), y^(m))], where q ∈ [1, 2, ..., r], the centroid points for every feature and every class are calculated as [z_lq^(1), z_lq^(2), ..., z_lq^(C)], for q = [1, 2, ..., r]. This space, called the Labeled data Feature Subspace (LFS), formed by these centroid points, provides the inference for the particular learning task to be performed. It is to be noted that:
• Apart from the unlabeled data, every labeled training data point (and even every testing data point) gets incrementally added to the IFS, as the observed data affects the overall understanding of the features.
• The aim of this incremental learning is to shape the IFS as close as possible to the LFS while learning the feature-label mapping function using reinforcement learning.

The incremental update of the IFS for the ith training example belonging to the jth class is given by

    z_uq^(j) = z_uq^(j) + ( x_lq^(i) − z_uq^(j) ) / n_q^j    (7)

for q = [1, 2, ..., r], where n_q^j is the number of data points in the jth cluster for the qth feature. Further, to make the learning from this incremental update effective, the reward function is defined in terms of the distance between the current IFS and the LFS:

    r_t = || z_{uq,t}^(j) − z_lq^(j) ||_2^{−1}    (8)

for q = [1, 2, ..., r] and j = [1, 2, ..., C] at a given time t.
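A small sketch of the incremental IFS update (Eq. 7) and the inverse-distance reward (Eq. 8) follows; the per-cluster count bookkeeping, the eps guard, and the summation of the reward over the r feature types are implementation choices assumed here, not prescribed by the paper.

```python
import numpy as np

def incremental_ifs_update(ifs, counts, x_views, j):
    """Eq. (7): pull the j-th IFS centroid of every feature toward a new sample.

    ifs     : list of r centroid matrices, shape (C, n_q) each.
    counts  : list of r integer arrays of shape (C,), points per cluster (n_q^j).
    x_views : the r feature views of the new (labeled or test) sample.
    j       : cluster/class index the sample is assigned to.
    """
    for q, x in enumerate(x_views):
        counts[q][j] += 1
        ifs[q][j] += (x - ifs[q][j]) / counts[q][j]


def reward(ifs, lfs, j, eps=1e-8):
    """Eq. (8): inverse distance between the current IFS and the LFS centroids
    of class j, summed over the r feature types (the summation and eps guard
    are choices made for this sketch)."""
    return sum(1.0 / (np.linalg.norm(ifs[q][j] - lfs[q][j]) + eps)
               for q in range(len(ifs)))
```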

2.4 Context Dependent Addition Mechanism

The intuition learning framework acts as a supplement to (and not a complement of) supervised learning. The need for intuition arises only when the confidence of the supervised learner falls below a particular threshold. Therefore, a context dependent mechanism is designed so that supervised learning is supplemented with intuition only when required. For the given labeled training data {I_l^(1), I_l^(2), ..., I_l^(m)}, some hand-crafted or unsupervised features are extracted, {(x_l^(1), y^(1)), (x_l^(2), y^(2)), ..., (x_l^(m), y^(m))}, and a supervised model is learnt, H_s : x_l^(i) → ŷ_s. Based on the supervised learning algorithm, the classification confidence computed for the ith data point is given as conf_s^(i) = [c_s1^(i), c_s2^(i), ..., c_sC^(i)]. The mechanism used to calculate the classification confidence depends on the supervised learning model used. Similarly, the intuition learner can be represented as H_int : s^(i) → ŷ_int, and its classification confidence is the output of the last layer of the value function approximation neural architecture, given as conf_int^(i) = [c_int1^(i), c_int2^(i), ..., c_intC^(i)]. A label switching mechanism is performed to give the final predicted label ŷ as follows,

    ŷ = ŷ_s      if Δ > th
    ŷ = ŷ_new    otherwise    (9)

where th is the threshold for using intuition and the context condition Δ is calculated as

    Δ = max_j ( c_sj^(i) ) − max_{l≠j} ( c_sl^(i) )    (10)

In the cases where intuition is used to boost the confidence of the supervised classifier, the new label is computed as follows:

    c_newk^(i) = λ · c_sk^(i) + (1 − λ) · c_intk^(i)    (11)

    ŷ_new = arg max_j ( c_newj^(i) )    (12)

where λ is the trade-off parameter between intuition and supervised learning. Thus, in simple words, we add the feeling of intuition to an algorithm.
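The context dependent addition mechanism of Eqs. (9)-(12) can be sketched as follows, assuming both models expose per-class confidence vectors; the function name combined_prediction is illustrative, and the default values follow the settings reported in Section 3.3.

```python
import numpy as np

def combined_prediction(conf_s, conf_int, th=0.9, lam=0.5):
    """Label switching with confidence blending (Eqs. 9-12), as a sketch.

    conf_s   : per-class confidences of the supervised model, shape (C,).
    conf_int : per-class confidences of the intuition (RL) model, shape (C,).
    th, lam  : threshold and trade-off parameter.
    """
    top2 = np.sort(conf_s)[-2:]
    delta = top2[1] - top2[0]                     # Eq. (10): supervised margin
    if delta > th:
        return int(np.argmax(conf_s))             # Eq. (9): keep the supervised label
    c_new = lam * conf_s + (1 - lam) * conf_int   # Eq. (11): blend confidences
    return int(np.argmax(c_new))                  # Eq. (12): label after adding intuition
```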

3 Experimental Analysis

3.1 Dataset

The proposed intuition learning algorithm is applied to a 10-class classification problem using the CIFAR-10 database [Krizhevsky and Hinton, 2009]. The database contains 60,000 labeled color images, each of size 32 × 32, pertaining to 10 classes (i.e., 6,000 images per class). There are 50,000 training images and 10,000 test images. The dataset contains small images, leading to limited and noisy information content, and thus provides a relevant case study to demonstrate the effectiveness of the proposed paradigm. The STL-10 database [Coates et al., 2011] is used as the unlabeled image dataset; it has one million color images of size 96 × 96. As shown in Table 2, six different feature representations are extracted from the images. These features comprehensively cover the various types of features that can be extracted from image data. For all the experiments, five-times random cross-validation is performed and the best model accuracy is reported.

Training size | Semi-supervised | Supervised | Supervised intuition
10000 | 38.90 | 29.64 | 36.19
5000 | 37.13 | 24.34 | 34.78
3000 | 14.12 | 17.07 | 25.65
1000 | 10.34 | 16.54 | 19.61

Table 3: The performance accuracy (%) of supervised intuition learning compared with supervised (neural network) and semi-supervised (self-taught) learning methods. The significance of intuition is studied by varying the amount of available training data. Five-times random cross validation is performed and the best model's performance is reported.

[Figure 4: Data clusters for each of the extracted features (Color Autocorrelogram, Color Harris, Shape, Saliency, Texture LBP, Texture GIST), with a grid depicting the cluster density at local regions; axes are the first two components, and panel (c) plots dissimilarity against iteration number. (a) shows the IFS of all the unlabeled data, (b) shows the adapted task specific feature space after 300 epochs of learning, and (c) shows the amount of change happening in the clusters after the addition of an image. Best viewed in color.]

3.2 Interpreting the Intuition based Feature Subspace

The primary aim of the approach is to construct the feature subspace completely from unlabeled data and to adapt it to a specific learning task. Figure 4 shows the clusters of the entire unlabeled data corresponding to every extracted feature. The concatenation of the feature spaces put together in Figure 4(a) represents the IFS. Figure 4(b) shows the adapted task-specific feature subspace after performing 300 epochs of learning with the given labeled data. Figure 4(c) shows the amount of update in the clusters after adding an image, calculated as the dissimilarity between the cluster centroids before and after the addition of the image. Cluster dissimilarity is calculated for the rth feature representation as follows:

    C_dis = Σ_{j=1}^{C} (1 / D_j) · || z_ur^(j)|_{t+1} − z_ur^(j)|_{t} ||_2    (13)

where D_j is the density of the jth cluster. It can be visually observed from the plot that the shape, GIST, and LBP feature spaces are updated (learn) after the addition of each image, indicating that these features contribute more towards the classification task. However, both the color Harris and autocorrelogram features are not updated much by the training data.
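For reference, a one-function sketch of the cluster dissimilarity of Eq. (13) is given below; interpreting the density D_j as the number of points in cluster j is an assumption made here, since the paper does not define it further.

```python
import numpy as np

def cluster_dissimilarity(centroids_before, centroids_after, density):
    """Eq. (13): density-weighted centroid shift for one feature type.

    centroids_before, centroids_after : arrays of shape (C, n_r), the IFS
        centroids of feature r at time t and t+1.
    density : array of shape (C,); taken here as the number of points per
        cluster (one plausible reading of D_j).
    """
    shift = np.linalg.norm(centroids_after - centroids_before, axis=1)
    return float(np.sum(shift / density))
```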

3.3 Performance Analysis

It is to be noted that the intuition learning framework is used to supplement any supervised or semi-supervised learning mechanism. In this paper, we show results in the following scenarios:
1. Using two supervised learning algorithms (back-propagation neural network and multi-class SVM) with Uniform Circular Local Binary Pattern (UCLBP) [Ojala et al., 2002] as features. Labeled data, [(x_lq^(1), y^(1)), (x_lq^(2), y^(2)), ..., (x_lq^(m), y^(m))], from CIFAR-10 is used to train the supervised algorithms.
2. Using a semi-supervised learning algorithm, with a neural network as the classifier and UCLBP features trained on the CIFAR-10 dataset. The semi-supervised algorithm used for comparison is one approach to self-taught learning [Raina et al., 2007], with unlabeled data from the STL-10 dataset, {(s^(1), y^(1)), (s^(2), y^(2)), ..., (s^(m), y^(m))}.
3. Using the intuition learning framework only, with the intuition based task specific feature representation combined with continuous state reinforcement learning (Q-learning), Equation 4, for classification.
4. Using a supervised intuition framework, where the output of the supervised learning algorithm and the intuition learning framework are combined using the context dependent addition mechanism in Equation 12.
5. Using a semi-supervised intuition framework, where the output of the semi-supervised learning algorithm and the intuition learning framework are combined using the context dependent addition mechanism.

The optimized values of the various parameters used in our framework are as follows: α = 0.99, γ = 0.95, β = 0.2, th = 0.9, λ = 0.5, and ε = 0.05. Preprocessing of features is done using z-score normalization. All the experiments are performed on an Intel Xeon E5-2640 0, 2.50 GHz, 64 GB RAM server.

As already discussed, intuition has better significance in challenging problems with limited training data. Table 3, Table 4, and Table 5 show the performance of the proposed intuition learning in comparison with other learning methods, varying the training size as a parameter². It can be observed that with enough training data, the supervised algorithms (both neural network and SVM) yield the best classification performance. However, as the size of the training data decreases, the performance of all three approaches (supervised, semi-supervised, and intuition learning) reduces. The results show that in such a scenario, incorporating intuition with the supervised or semi-supervised algorithm yields improved results. This supports our hypothesis that adding intuition improves performance under challenging circumstances such as limited training data. Similarly, from a human's perspective, in the presence of all data and information one may take correct decisions; however, when the background information is limited, intuition learning helps.

²For a given training size, the same subset of images is used across all the classifiers to avoid any training bias.

Training size | Intuition | Intuition supervised | Intuition semi-supervised
10000 | 12.11 | 36.19 | 29.21
5000 | 10.00 | 34.78 | 23.85
3000 | 10.00 | 25.65 | 22.49
1000 | 08.99 | 19.61 | 20.53

Table 4: The influence of supplementing intuition to the supervised and semi-supervised algorithms is shown by the improvement in the performance accuracy (%).

Training size | Supervised | Intuition supervised
10000 | 44.57 | 41.83
5000 | 43.21 | 40.77
3000 | 13.56 | 17.48
1000 | 06.19 | 09.78

Table 5: The performance accuracy (%) of the supervised and supervised intuition frameworks using an SVM classifier.

[Figure 5: Examples of (a) success and (b) failure cases of the proposed intuition learning. AL = actual ground truth label, SL = label predicted by the supervised neural network learner, and IL = label predicted when intuition is combined with the supervised neural network learner. Success cases: (AL: Ship, SL: Truck, IL: Ship), (AL: Cat, SL: Horse, IL: Cat). Failure cases: (AL: Horse, SL: Horse, IL: Cat), (AL: Truck, SL: Truck, IL: Frog).]

Further, some key analyses are summarized below:
1. To study the effectiveness of the residual learning in Equation 6, the training error over successive epochs is plotted for a training size of 10000. It can be observed that the training error gradually decreases and remains constant after 300 epochs, indicating that the maximum training capacity has been achieved with minimum training error.
2. The computation time required for intuition learning depends on the complexity of the r features that are extracted. However, for one sample, under the assumption that the feature extraction happens off-line, the overall intuition decision and feature space can be generated in 0.082 s, while the supervised decision takes ∼4 s on average. This shows that intuition is much faster, requiring little effort compared to supervised decision making.
3. In Figure 5, some success and failure cases are shown where (a) intuition helps in correctly classifying a data point where supervised learning fails, and (b) a data point is incorrectly classified because of intuition. As previously discussed, intuition can sometimes go wrong. Upon analyzing the first horse example in the failure cases (Figure 5(b)), it is observed that horses are clustered more towards brown color in the autocorrelogram color feature space. However, as the horse shown in the image is white, it gets clustered along with cat and is misclassified by intuition learning.

4 Conclusion

Inspired by the human capability of instinctive reasoning, this research presents an intuition learning framework that supplements a classifier for improved performance, especially with limited training data. Intuition is modeled as continuous state reinforcement learning that adapts to a particular task using a large amount of unlabeled data and limited task specific data. The performance of intuition is shown on a 10-class image classification problem, in comparison with supervised, semi-supervised, and reinforcement learning. The results indicate that the application of intuition improves the performance of the classifier with limited training data.
References

[Baird, 1995] Leemon Baird. Residual algorithms: Reinforcement learning with function approximation. In ICML, pages 30–37, 1995.
[Bengio, 2009] Yoshua Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127, 2009.
[Bishop and Nasrabadi, 2006] Christopher M Bishop and Nasser M Nasrabadi. PRML, volume 1. 2006.
[Cavalin et al., 2012] Paulo R Cavalin, Robert Sabourin, and Ching Y Suen. LoGID: An adaptive framework combining local and global incremental learning for dynamic selection of ensembles of HMMs. PR, 45(9):3544–3556, 2012.
[Chapelle et al., 2006] Olivier Chapelle, Bernhard Schölkopf, Alexander Zien, et al. Semi-Supervised Learning, volume 2. MIT Press, Cambridge, 2006.
[Chu et al., 2011] Wei Chu, Martin Zinkevich, Lihong Li, Achint Thomas, and Belle Tseng. Unbiased online active learning in data streams. In ACM SIGKDD, pages 195–203, 2011.
[Coates et al., 2011] Adam Coates, Andrew Y Ng, and Honglak Lee. An analysis of single-layer networks in unsupervised feature learning. In ICAIS, pages 215–223, 2011.
[Erdem and Erdem, 2013] Erkut Erdem and Aykut Erdem. Visual saliency estimation by nonlinearly integrating features using region covariances. Journal of Vision, 13(4), 2013.
[Evgeniou and Pontil, 2007] A Evgeniou and Massimiliano Pontil. Multi-task feature learning. In NIPS, volume 19, page 41, 2007.
[Guyon and Elisseeff, 2003] Isabelle Guyon and André Elisseeff. An introduction to variable and feature selection. JMLR, 3:1157–1182, 2003.
[Hogarth, 2001] Robin M Hogarth. Educating Intuition. University of Chicago Press, 2001.
[Huang et al., 1997] Jing Huang, S Ravi Kumar, Mandar Mitra, Wei-Jing Zhu, and Ramin Zabih. Image indexing using color correlograms. In CVPR, pages 762–768, 1997.
[Huang et al., 2006] Jiayuan Huang, Arthur Gretton, Karsten M Borgwardt, Bernhard Schölkopf, and Alex J Smola. Correcting sample selection bias by unlabeled data. In NIPS, pages 601–608, 2006.
[Kaelbling et al., 1996] Leslie Pack Kaelbling, Michael L Littman, and Andrew W Moore. Reinforcement learning: A survey. arXiv cs/9605103, 1996.
[Krizhevsky and Hinton, 2009] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Master's thesis, Department of Computer Science, University of Toronto, 2009.
[Lagoudakis and Parr, 2003] Michail G Lagoudakis and Ronald Parr. Least-squares policy iteration. JMLR, 4:1107–1149, 2003.
[Lawrence and Platt, 2004] Neil D Lawrence and John C Platt. Learning to learn with the informative vector machine. In ICML, pages 65–72, 2004.
[Natarajan et al., 2011] Sriraam Natarajan, Saket Joshi, Prasad Tadepalli, Kristian Kersting, and Jude Shavlik. Imitation learning in relational domains: A functional-gradient boosting approach. In IJCAI, pages 1414–1420, 2011.
[Ojala et al., 2002] Timo Ojala, Matti Pietikainen, and Topi Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE TPAMI, 24(7):971–987, 2002.
[Oliva and Torralba, 2001] Aude Oliva and Antonio Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42(3):145–175, 2001.
[Paletta and Pinz, 2000] Lucas Paletta and Axel Pinz. Active object recognition by view integration and reinforcement learning. Robotics and Autonomous Systems, 31(1):71–86, 2000.
[Pan and Yang, 2010] Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE TKDE, 22(10):1345–1359, 2010.
[Raina et al., 2007] Rajat Raina, Alexis Battle, Honglak Lee, Benjamin Packer, and Andrew Y Ng. Self-taught learning: Transfer learning from unlabeled data. In ICML, pages 759–766, 2007.
[Settles, 2010] Burr Settles. Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin-Madison, 2010.
[Simpson et al., 1989] John Andrew Simpson, Edmund SC Weiner, et al. The Oxford English Dictionary, volume 2. Clarendon Press, Oxford, 1989.
[Sutton and Barto, 1998] Richard S Sutton and Andrew G Barto. Reinforcement Learning: An Introduction. Cambridge University Press, 1998.
[Van De Weijer et al., 2006] Joost Van De Weijer, Theo Gevers, and Arnold WM Smeulders. Robust photometric invariant features from the color tensor. IEEE TIP, 15(1):118–127, 2006.
[Vincent et al., 2008] Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. Extracting and composing robust features with denoising autoencoders. In ICML, pages 1096–1103, 2008.