Assignment of Master's Thesis

Assignment of Master's Thesis

ASSIGNMENT OF MASTER’S THESIS Title: Comic2vec: Vector representation of comics Student: Bc. Martin Piták Supervisor: MSc. Juan Pablo Maldonado Lopez, Ph.D. Study Programme: Informatics Study Branch: Knowledge Engineering Department: Department of Applied Mathematics Validity: Until the end of summer semester 2019/20 Instructions In this work we explore the feasibility of a vector-space embedding for unstructured data (comics). This can provide the basis for a content-based recommendation system. As online evaluation of recommender systems is costly, we propose an alternative criteria to measure the performance of such system in this setting. 1) Study different approaches for abstract representation of rich media. 2) Provide a comprehensive survey of the existing work. 3) Explore different alternative embeddings. 4) Propose a baseline system based on previous work. Consider possible optimizations and improvements. 5) Use a proxy to measure the performance of the achieved results. References Will be provided by the supervisor. Ing. Karel Klouda, Ph.D. doc. RNDr. Ing. Marcel Jiřina, Ph.D. Head of Department Dean Prague October 12, 2018 Czech Technical University in Prague Faculty of Information Technology Department of Applied Mathematics Master Thesis Comic2vec: Vector representation of comics Bc. Martin Piták Supervisor: MSc. Juan Pablo Maldonado Lopez, Ph.D. Reviewer: Ing. Karel Klouda, Ph.D. May 2019 Acknowledgements I would like show my gratitude to my supervisor MSc. Juan Pablo Maldonado Lopez Ph.D. for his excellent guidance. I would also like to express gratitude to Ing. Karel Klouda Ph.D., Ing Daniel Vašata Ph.D., Ing. Tomáš Bartoň, Ondřej Bíža, Evka Šimková, Štěpánka Jislová and Václav Roháč for their time in which they were kind enough to discuss and find solutions for any issues concerning this paper. I would also like to thank my family and my friends for supporting me along the journey. I would like to thank Lenka Kubičová for helping me create the dataset of comics. And on a final note I must sincerely thank Andrea Malá for the correction of this paper. v Declaration I hereby declare that the presented thesis is my own work and that I have cited all sources of information in accordance with the Guideline for adhering to ethical principles when elaborating an academic final thesis. I acknowledge that my thesis is subject to the rights and obligations stipulated by the Act No. 121/2000 Coll., the Copyright Act, as amended. In accordance with Article 46(6) of the Act, I hereby grant a nonexclusive authorization (license) to utilize this thesis, including any and all computer programs incorporated therein or attached thereto and all corresponding documentation (hereinafter collectively referred to as the “Work”), to any and all persons that wish to utilize the Work. Such persons are entitled to use the Work in any way (including for-profit purposes) that does not detract from its value. This authorization is not limited in terms of time, location and quantity. However, all persons that makes use of the above license shall be obliged to grant a license at least in the same scope as defined above with respect to each and every work that is created (wholly or in part) based on the Work, by modifying the Work, by combining the Work with another work, by including the Work in a collection of works or by adapting the Work (including translation), and at the same time make available the source code of such work at least in a way and scope that are comparable to the way and scope in which the source code of the Work is made available. In Prague May 5, 2019 . vii Abstract The world of comics is receiving a lot of attention lately. Not only are thematic movies based on comics being released almost daily but on the top of that scientists are starting to study comics as a research field now as well. This paper tries to describe comics so the reader can understand them. It explores the possibilities of embedding comics into vector space. It explains various methods and algorithms that will be used in the process of creating and evaluating the accuracy of the embeddings. There are two embeddings created: one for the style and the other one for the text. A special metric is used to measure the accuracy of these embeddings. The style embedding is created using Inception V3 which is a Convolutional neural network (CNN) re-trained on TPU and achieves accuracy of 98%. The text embedding is created using a method named Doc2Vec and achieves accuracy of over 83%. Two datasets are created in the process of making this work, unfortunately the one used for style embedding cannot be made public. Keywords: Comic, TensorFlow, CNN, Recomendation system, Embedding, Clustering ix Abstrakt V dnešní době se komiksům dostává velké popularity. Jedním z faktorů toho nárůstu je fakt, že vycházejí filmy založené na komiksových světech, kterým se dostává velké popularity. Dalším faktorem je zájem vědců o komiksy a jejich následné studování. Tato práce se snaží popsat komiksy tak, aby i komiksový začátečník pochopil, co to přesně komiksy jsou. Zkoumá různé možnosti vnoření komiksů do vektorového prostoru. Vysvětluje různé metody a algoritmy, které jsou použity při tvorbě těchto vnoření a jejich následného vyhodnocení přesnosti. Jsou vytvořena dvě vnoření, jedno pro styl a druhé pro text. Pro vyhodnocení přesnosti je použita speciální metrika. Vnoření stylu je vytvořeno pomocí Inception V3, což je konvoluční neuronová síť (CNN), která byla přetrénovaná pomocí TPU. Toto vnoření dosahuje přesnosti 98%. Vnoření textu je vytvořené pomocí Doc2Vec a dosahuje přesnosti více jak 70%. Při tvorbě této práce byly vytvořeny dva datasety, jeden obsahující panely komiksů a druhý texty. Bohužel dataset panelů komiksů nemůže být zveřejněn. Klíčová slova: Komiks, TensorFlow, CNN, Doporučovací sytémy, Vnoření, Shlukování xi Contents 1 At first a bit of philosophy and history 3 1.1Comic................................................................. 3 1.2 History of comics . 4 1.2.1 Europe . 5 1.2.2 America . 5 1.2.3 Japan . 6 1.2.4 Modern . 7 1.3 Basic categories of comics . 7 1.3.1 Region . 8 1.3.2 Format . 8 1.3.3 Layout . 8 1.3.4Age ................................................................ 9 1.3.5 Gender . 9 1.4 Language of comics . 9 1.4.1 Panels and gutters . 9 1.4.2 Text. 10 1.4.3 Lettering/sounds . 10 1.5Style................................................................. 10 1.5.1 Line Style . 11 1.5.2 Movement Style . 11 1.5.3 Colouring and Shading . 12 2 Related Work 15 2.1 Style embedding . 15 2.1.1 FaceNet . 15 2.1.2 Illustration2Vec . 15 2.1.3 Are Anime Cartoons? . 15 2.1.4 The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives . 16 2.1.5 A Neural Algorithm of Artistic Style . 16 2.1.6 Recognizing Image Style . 16 2.1.7 Item2Vec: Neural Item Embedding for Collaborative Filtering . 16 xiii 2.1.8 A Visual Embedding for the Unsupervised Extraction of Abstract Semantics 17 2.1.9 CNN-based Classification of Illustrator Style in Graphic Novels: Which Features Contribute Most? . 17 2.2 Text Embedding . 17 2.2.1 Efficient Estimation of Word Representations in Vector Space . 17 2.2.2 Distributed Representations of Sentences and Documents . 17 3 Style embedding 19 3.1 Panel vs. Page . 19 3.2 Dataset. 19 3.2.1 Data augmentation . 20 3.2.2 Data split . 21 3.3 Distance Metric . 21 3.4 Clustering algorithms . 21 3.4.1 K-means . 22 3.4.2 HDBScan . 22 3.4.3 Markov clustering . 24 3.5 Dimensionality reduction . 24 3.5.1 PCA . 24 3.5.2 MDS . 25 3.5.3 Isomap . 26 3.5.4 t-SNE . 27 3.5.5 UMAP . 28 3.5.6 Conclusion . 30 3.6 Baseline . ..

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    81 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us