Building and Using Knowledge Models for Semantic Image Annotation Hichem Bannour
To cite this version: Hichem Bannour. Building and Using Knowledge Models for Semantic Image Annotation. Other. Ecole Centrale Paris, 2013. English. NNT: 2013ECAP0027. tel-00905953
HAL Id: tel-00905953
https://tel.archives-ouvertes.fr/tel-00905953
Submitted on 19 Nov 2013

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

ECOLE CENTRALE DES ARTS ET MANUFACTURES "ECOLE CENTRALE PARIS"
PHD THESIS in candidacy for the degree of Doctor of Ecole Centrale Paris
Specialty: Computer Science
Defended by Hichem Bannour
Building and Using Knowledge Models for Semantic Image Annotation
prepared at Ecole Centrale Paris, MAS Laboratory
defended on March 8th, 2013

Jury:
Chairman: Dr. Marcin Detyniecki, CNRS Paris, France
Reviewers: Pr. Jean-Marc Ogier, University of La Rochelle, France
Dr. Philippe Mulhem, CNRS Grenoble, France
Examiners: Dr. Adrian Popescu, CEA Saclay, France
Dr. Jamal Atif, University of Paris-Sud 11, France
Advisors: Dr. Céline Hudelot, Ecole Centrale Paris, France
Pr. Marc Aiguier, Ecole Centrale Paris, France

A Dissertation Presented at Ecole Centrale Paris in Candidacy for the Degree of Doctor of Philosophy
Recommended for Acceptance by the Department of Mathematics Applied to Systems
Advisors: Dr. Céline Hudelot and Pr. Marc Aiguier
March 2013
© Copyright by Hichem Bannour, 2013. All rights reserved.
To my parents,
To my sisters and my niece Yasmine, who have contributed to my work like nobody else.

Acknowledgements

Firstly, I would like to thank my advisor Dr. Céline Hudelot for offering me the opportunity to pursue my PhD within the MAS laboratory of Ecole Centrale Paris. Thank you, Céline, for your trust, the freedom you gave me to explore the topics that I was interested in, your valuable advice and your unconditional support. At your side I learned many things, both scientific and personal. I also want to thank Pr. Marc Aiguier, who co-directed my PhD work and who was always available to give me priceless advice on my scientific career. I am also thankful to Marc and Céline for offering me the possibility to teach at Ecole Centrale Paris.

Next, I would like to thank all the committee members for agreeing to review my modest work. Specifically, I am grateful to my reviewers Pr. Jean-Marc Ogier and Dr. Philippe Mulhem, the chairman Dr. Marcin Detyniecki and the examiners Dr. Adrian Popescu and Dr. Jamal Atif for their relevant comments and questions, which helped improve this dissertation.

Then, I would like to acknowledge all the members of the MAS laboratory, with special thanks to Frederic Abergel, Nikos Paragios, Pascale Legall, Pascal Laurent, Gilles Faye, Iasonas Kokkinos, Marie-Aude Aufaure, Anirban Chakraborti and the others for their kindness and their support; it was always a pleasure to discuss with them. I am also thankful to Annie Glomeron and Sylvie Dervin, who always took care to provide me with the conditions necessary for the completion of my thesis.

I could not forget to thank Pr. Bechir Ayeb, who introduced me to the field of research and who has often been my source of inspiration. Without his constant support, I could never have completed this work. I would also like to acknowledge Pr. Rahul Singh for the valuable discussions we had and for the interesting position he offered me in his lab in San Francisco.
I hope that we can collaborate very soon. I am also thankful to my colleagues and friends who shared unforgettable moments with me inside and outside the lab, namely Marc-Antoine Arnaud, Nesrine Ben Mustapha, Hassen & Hana Ben Zineb, Dung Bui, Clément Courbet, Emmanuelle Gallet, Nicolas James, Bilal Kanso, Sofiane Karchoudi, Adrian Maglo, Casio Melo, Adith Perez, Florent Pruvost, Rania Soussi, Olivier Teboul, Konstantin Todorov, Amel Znaidia, and the others. It was my pleasure to share lunches, table football and skiing with you. I am also thankful to Arunvady Xayasenh, Ka Ho Yim, Cédric Zaccardi, Géraldine Carbonel and Catherine Lhopital, who worked alongside me within the UJ2CP (association of junior researchers of Ecole Centrale Paris).

I would like to close these acknowledgements with those who are dearest to me: my family. Specifically, I am grateful to my father Abdelaziz and my mother Zakia for providing me with all the support I needed to succeed in this PhD work. I am also thankful to my sister Asma Bennour and my brother-in-law Taoufik Hnia, who always provided me with logistical support and shared joyful moments with me in Paris. I also thank my sister and private doctor Arij Bennour, who was always present to push me forward. Finally, I thank my sister Khaoula, my sweetest niece Yasmine and my brother-in-law Amine Omri, who always brought joy to my life.

Abstract

This dissertation aims at building and using knowledge-driven models in order to improve the accuracy of automatic image annotation. Currently, many image annotation approaches are based on the automatic association between low-level or mid-level visual features and semantic concepts using machine learning techniques. Nevertheless, machine learning alone seems insufficient to bridge the well-known semantic gap, and therefore to achieve efficient systems for automatic image annotation.
Structured knowledge models, such as semantic hierarchies and ontologies, appear to be a good way to improve such approaches. These semantic structures make it possible to model many valuable semantic relations between concepts, for instance subsumption, contextual and spatial relationships. Indeed, these relationships have proved to be of prime importance for the understanding of image semantics. Moreover, such structured knowledge models about high-level concepts make it possible to reduce the complexity of the large-scale image annotation problem.

In this thesis, we propose a new methodology for building and using structured knowledge models for automatic image annotation. Specifically, our first proposals deal with the automatic building of explicit and structured knowledge models, such as semantic hierarchies and multimedia ontologies, dedicated to image annotation. We thus propose a new approach for building semantic hierarchies faithful to image semantics. Our approach is based on a new image-semantic similarity measure between concepts and on a set of rules that connect the concepts with the highest relatedness until the final hierarchy is built. Afterwards, we propose to go further in the modeling of image semantics by building explicit knowledge models that incorporate richer semantic relationships between image concepts. To this end, we propose a new approach for automatically building multimedia ontologies consisting of subsumption relationships between image concepts, as well as other semantic relationships such as contextual and spatial relations. Fuzzy description logics are used as a formalism to represent our ontology and to deal with the uncertainty and the imprecision of concept relationships.

In order to assess the effectiveness of the built structured knowledge models, we subsequently propose to use them in a framework for image annotation.
We therefore propose an approach, based on the structure of semantic hierarchies, to effectively perform hierarchical image classification. Furthermore, we propose a generic approach for image annotation combining machine learning techniques, such as hierarchical image classification, and fuzzy ontological reasoning in order to achieve a semantically relevant image annotation. Empirical evaluations of our approaches have shown significant improvement in image annotation accuracy.

Keywords: Automatic Image Annotation, Hierarchical Image Classification, Multimedia Ontologies, Semantic Hierarchies, Knowledge-Driven Models, Ontological Reasoning, Fuzzy Description Logics.

Résumé

This thesis aims to build and use knowledge-based models for automatic image annotation. Indeed, most image annotation approaches are based on formulating a mapping function between low-level or mid-level features and semantic concepts using machine learning techniques. However, the use of learning algorithms alone seems insufficient to bridge the well-known semantic gap, and therefore to produce effective systems for automatic image annotation. The use of structured knowledge, such as semantic hierarchies and ontologies, seems to be a good way to improve these approaches. These knowledge structures make it possible to model many semantic relations between concepts, such as subsumption relations, contextual relations and spatial relations. These relations have proved to be of prime importance for understanding image semantics. Moreover, the use