Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015)

Online Learning to Rank for Content-Based Image Retrieval∗

Ji Wan1,2,3, Pengcheng Wu2, Steven C. H. Hoi2, Peilin Zhao4, Xingyu Gao1,2,3, Dayong Wang5, Yongdong Zhang1, Jintao Li1

1 Key Laboratory of Intelligent Information Processing of CAS, ICT, CAS, China
2 Singapore Management University
3 University of the Chinese Academy of Sciences
4 Institute for Infocomm Research, A*STAR, Singapore
5 Michigan State University, MI, USA

fwanji,gaoxingyu,zhyd,[email protected], fpcwu,[email protected], [email protected], [email protected]

∗This work was supported by Singapore MOE tier 1 research grant (C220/MSS14C003) and the National Natural Science Foundation of China (61428207).

Abstract

A major challenge in Content-Based Image Retrieval (CBIR) is to bridge the semantic gap between low-level image contents and high-level semantic concepts. Although researchers have investigated a variety of retrieval techniques using different types of features and distance functions, no single best retrieval solution can fully tackle this challenge. In a real-world CBIR task, it is often highly desired to combine multiple types of different feature representations and diverse distance measures in order to close the semantic gap. In this paper, we investigate a new framework of learning to rank for CBIR, which aims to seek the optimal combination of different retrieval schemes by learning from large-scale training data in CBIR. We first formulate the problem formally as a learning to rank task, which can be solved in general by applying the existing batch learning to rank algorithms from text information retrieval (IR). To further address the scalability towards large-scale online CBIR applications, we present a family of online learning to rank algorithms, which are significantly more efficient and scalable than classical batch algorithms for large-scale online CBIR. Finally, we conduct an extensive set of experiments, in which encouraging results show that our technique is effective, scalable and promising for large-scale CBIR.

1 Introduction

Content-based image retrieval (CBIR) has been extensively studied for many years in the multimedia and computer vision communities. Extensive efforts have been devoted to various low-level feature descriptors [Jain and Vailaya, 1996] and different distance measures defined on some specific sets of low-level features [Manjunath and Ma, 1996]. Recent years have also witnessed a surge of research on local feature based representations, such as the bag-of-words models [Sivic et al., 2005] using local feature descriptors (e.g., SIFT [Lowe, 1999]).

Although CBIR has been studied extensively for years, it is often hard to find a single best retrieval scheme, i.e., some pair of feature representation and distance measure, which can consistently beat the others in all scenarios. It is thus highly desired to combine multiple types of diverse feature representations and different kinds of distance measures in order to improve the retrieval accuracy of a real-world CBIR task. In practice, however, it is nontrivial to seek an optimal combination of different retrieval schemes, especially in web-scale CBIR applications with millions or even billions of images. Besides, for real-world CBIR applications, the optimal combination weights for different image retrieval tasks may vary across different application domains. Thus, it has become an urgent research challenge to investigate an automated and effective learning solution for seeking the optimal combination of multiple diverse retrieval schemes in CBIR.

To tackle the above challenge, in this paper we investigate a machine learning framework of learning to rank algorithms that seeks the optimal combination of multiple diverse retrieval schemes for CBIR by learning from large-scale training data automatically. In particular, we first formulate the problem as a learning to rank task, which can thus be solved in general by applying the existing batch learning to rank algorithms in text IR. However, to further address the efficiency and scalability issues, we present a family of online learning to rank algorithms to cope with the challenge of large-scale learning in CBIR. We give theoretical analysis of the proposed online learning to rank algorithms, and empirically show that the proposed algorithms are both effective and scalable for large-scale CBIR tasks.

In summary, the main contributions of this paper include: i) We conduct a comprehensive study of applying learning to rank techniques to CBIR, aiming to seek the optimal combination of multiple retrieval schemes; ii) We propose a family of efficient and scalable online learning to rank algorithms for CBIR; iii) We analyze the theoretical bounds of the proposed online learning to rank algorithms, and also examine their empirical performances extensively.

The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 presents our problem formulation and a family of online learning to rank algorithms for CBIR, and Section 4 gives theoretical analysis. Section 5 discusses our experiments and Section 6 concludes this work.

2 Related Work

2.1 Learning to Rank and CBIR

Learning to rank has been extensively studied in text Information Retrieval (IR) [Qin et al., 2010]. In general, most existing approaches can be grouped into three major categories: (i) pointwise, (ii) pairwise, and (iii) listwise approaches. We briefly review related work in each category below.

The first group, the family of pointwise learning to rank approaches, simply treats ranking as a regular classification or regression problem by learning to predict numerical ranking values of individual objects. For example, in [Cooper et al., 1992; Crammer and Singer, 2001; Li et al., 2007], the ranking problem was formulated as a regression task in different forms. In addition, [Nallapati, 2004] formulated the ranking problem as a binary classification of relevance on document objects, and solved it by applying some discriminative models such as SVM.

The second group of learning to rank algorithms, the family of pairwise approaches, treats pairs of documents as training instances and formulates ranking as a task of learning a classification or regression model from the collection of pairwise instances of documents. A variety of pairwise learning to rank algorithms have been proposed by applying different machine learning algorithms [Joachims, 2002; Burges et al., 2005; Tsai et al., 2007]. The well-known algorithms include: SVM-based approaches such as RankSVM [Joachims, 2002], neural network based approaches such as RankNet [Burges et al., 2005], and boosting-based approaches such as RankBoost [Freund et al., 2003], etc. This group is the most widely explored research direction of learning to rank, in which many techniques have been successfully applied in real-world commercial systems. In general, our proposed approaches belong to this category.

The third group, the family of listwise learning to rank approaches, treats a list of documents for a query as a training instance and attempts to learn a ranking model by optimizing some loss functions defined on the predicted list and the ground-truth list. There are two different kinds of approaches in this category. The first is to directly optimize some IR metrics, such as Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG) [Järvelin and Kekäläinen, 2000].

... family of diverse existing studies in CBIR [He et al., 2004; Hoi et al., 2006; Chechik et al., 2010] that usually apply machine learning techniques (supervised or unsupervised learning) to learn a good ranking function on a single type of features or some combined features. Such existing techniques potentially could be incorporated as one component of our scheme, which is out of the scope of the discussions in this work.

2.2 Online Learning

Online learning is a family of efficient and scalable machine learning algorithms [Rosenblatt, 1958; Crammer et al., 2006] that has been extensively studied in machine learning for years. In general, online learning operates in a sequential manner. Consider online classification: at each time step, an online learner processes an incoming example by first predicting its class label; after that, it receives the true class label from the environment, which is then used to measure the loss between the predicted label and the true label; at the end of each time step, the learner is updated whenever the loss is nonzero. Typically, the goal of an online learning task is to minimize the cumulative mistakes over the entire sequence of predictions.

In the literature, a variety of algorithms have been proposed for online learning [Hoi et al., 2014]. The most well-known example is the Perceptron algorithm [Rosenblatt, 1958]. In recent years, various algorithms have been proposed to improve Perceptron [Li and Long, 1999; Crammer et al., 2006], which usually follow the principle of maximum margin learning. A notable approach is the family of Passive-Aggressive (PA) learning algorithms [Crammer et al., 2006], which updates the classifier whenever the online learner fails to produce a large margin on the current instance. These algorithms are often more efficient and scalable than batch learning algorithms. In this work, we aim to extend the existing online learning principle for developing new learning to rank algorithms. In addition, we note that our work is also very different from another study in [Grangier and Bengio, 2008], which focuses on text-based image retrieval by applying PA algorithms. By contrast, our CBIR study focuses on image retrieval based on visual similarity. Finally, the proposed online learning to rank is based on linear models and is thus more scalable than the kernel-based similarity learning ap-
